Discourse migration log
This is a post-mortem/runbook of a real migration. I skip common Discourse prep (official docs cover it). I focus on the exact switches, Cloudflare/R2 gotchas, the rails/rake one-liners that mattered, what failed, and how to make the same move low-risk next time.
Target end-state
- Discourse runs on the new host (Docker, single `app` container).
- TLS via Let’s Encrypt.
- Traffic proxied by Cloudflare for `forum.example.com` (orange cloud).
- Uploads + front-end assets live on Cloudflare R2:
  - Bucket `discourse-uploads` (public)
  - Bucket `discourse-backups` (private)
- Custom domain: `https://files.example.com` (an R2 “Custom domain”, not a manual CNAME)
1) Restore DB-only first (fast cutover, tiny backup)
On the old machine (RN):
- Admin → Backups → Enable read-only.
- Create a DB-only backup (uncheck Include uploads).
On the new machine:
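The restore itself, sketched under the assumption of a standard launcher install (the backup filename is a placeholder):

```shell
./launcher enter app
discourse enable_restore
discourse restore <db-only-backup>.tar.gz
discourse disable_restore
```

`discourse enable_restore` flips the allow-restore site setting; turn it back off once the restore finishes.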
If you need “almost zero” content gap, repeat this DB-only hop right before DNS cutover.
2) (Optional) bring local uploads once, before switching to R2
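One way to carry the uploads over, assuming standard launcher paths on both hosts (`old-host` is a placeholder):

```shell
rsync -avz --progress \
  old-host:/var/discourse/shared/standalone/uploads/ \
  /var/discourse/shared/standalone/uploads/
```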
This is just a safety net; we will move all uploads to R2 shortly.
3) Switch production domain
- Ensure in `containers/app.yml` → `env`:
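A minimal sketch of the entries that matter here (the Let’s Encrypt address is a placeholder):

```yaml
DISCOURSE_HOSTNAME: forum.example.com
LETSENCRYPT_ACCOUNT_EMAIL: admin@example.com
```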
Cloudflare DNS:
- `forum.example.com` → A record to the new IP (orange cloud ON), SSL/TLS Full (Strict), Brotli ON.
- Do not page-cache forum HTML.
Sanity:
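A quick probe from outside, as a sketch; exact headers vary with the Cloudflare features you enabled. `/srv/status` is Discourse’s health endpoint and returns `ok` when the app is up:

```shell
curl -sI https://forum.example.com/ | head -n 5
curl -s https://forum.example.com/srv/status
```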
Seeing `HTTP/2 403` for anonymous is often `login_required` (a setting), not a failure.
4) R2: all the knobs that actually matter
4.1 Create buckets + Account API Token
- Buckets: `discourse-uploads` (public), `discourse-backups` (private).
- Create an Account API Token (not a user token):
  - Admin Read & Write, scoped to those two buckets.
  - We need bucket-level ops because Discourse’s asset task touches CORS (`GetBucketCors`/`PutBucketCors`).
- After it works, rotate to Object Read & Write (least privilege) and rebuild.
4.2 Custom domain for R2 (how to avoid CF 1014)
The 1014 (“CNAME Cross-User Banned”) happens when a hostname on Cloudflare tries to CNAME to a target that is also on Cloudflare but belongs to another account. Two safe patterns:
- Correct (no 1014): put the DNS zone (`example.com`) in the same Cloudflare account that owns the R2 buckets. In R2 → Custom domains, add `files.example.com`. R2 binds the subdomain internally; the DNS record appears in the same account. No cross-account CNAME; no 1014.
- Wrong (triggers 1014): manually create a DNS CNAME in Account A pointing to an R2 hostname that lives in Account B (or any other CF-managed target). Cloudflare blocks cross-account CNAMEs.
Checklist to stay clean:
- The domain hosting `files.example.com` is under the same CF account as R2.
- Create the binding in R2 → Custom domain and wait for Status: Active.
- Keep `files.example.com` proxied (orange cloud). Set TLS min 1.2; enable Brotli. No Page Rules needed.
4.3 Bucket CORS
R2 → `discourse-uploads` → CORS:
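A policy along these lines, as a sketch; adjust the origin list to your domains:

```json
[
  {
    "AllowedOrigins": ["https://forum.example.com", "https://files.example.com"],
    "AllowedMethods": ["GET", "HEAD"],
    "AllowedHeaders": ["*"],
    "MaxAgeSeconds": 3600
  }
]
```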
5) Discourse: enable S3 (R2) + push assets there
In `containers/app.yml` → `env`: add:
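A sketch of the block. The access-key placeholders and `DISCOURSE_S3_REGION: auto` are additions here; the rest mirrors the playbook in §10:

```yaml
DISCOURSE_USE_S3: true
DISCOURSE_S3_REGION: auto
DISCOURSE_S3_ENDPOINT: https://<ACCOUNT_ID>.r2.cloudflarestorage.com
DISCOURSE_S3_ACCESS_KEY_ID: <R2 access key id>
DISCOURSE_S3_SECRET_ACCESS_KEY: <R2 secret>
DISCOURSE_S3_BUCKET: discourse-uploads
DISCOURSE_S3_CDN_URL: https://files.example.com
DISCOURSE_BACKUP_LOCATION: s3
DISCOURSE_S3_BACKUP_BUCKET: discourse-backups
AWS_REQUEST_CHECKSUM_CALCULATION: WHEN_REQUIRED
AWS_RESPONSE_CHECKSUM_VALIDATION: WHEN_REQUIRED
```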
Add `hooks` so front-end assets publish to R2 during rebuild:
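The hook stanza, sketched from the standard `app.yml` hook layout for `after_assets_precompile`:

```yaml
hooks:
  after_assets_precompile:
    - exec:
        cd: $home
        cmd:
          - sudo -E -u discourse bundle exec rake s3:upload_assets
          - sudo -E -u discourse bundle exec rake s3:expire_missing_assets
```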
Rebuild:
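Assuming the standard install location:

```shell
cd /var/discourse
./launcher rebuild app
```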
Now CSS/JS/fonts serve from `https://files.example.com/...`.
6) Migrate historical uploads to R2 (one-time)
Run inside the container, always with bundler + discourse user:
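The invocation (the same one the re-run in §7.2 uses; `yes ""` answers the confirmation prompts):

```shell
./launcher enter app
yes "" | AWS_REQUEST_CHECKSUM_CALCULATION=WHEN_REQUIRED AWS_RESPONSE_CHECKSUM_VALIDATION=WHEN_REQUIRED \
  sudo -E -u discourse RAILS_ENV=production bundle exec rake uploads:migrate_to_s3
```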
What to expect:
- “Listing local files / Listing S3 files / Syncing files”
- “Updating the URLs in the database…”
- “Flagging posts for rebake … N posts …”
- Done!
If it screams about posts “not remapped to new S3 upload URL”, see §7.2.
7) What actually broke (and how we fixed it)
7.1 R2 checksum conflict (hard blocker)
Symptom:

```
Aws::S3::Errors::InvalidRequest: You can only specify one non-default checksum at a time.
```
Root cause: newer AWS SDKs auto-add `x-amz-checksum-*` headers; Discourse sometimes adds `Content-MD5`; R2 rejects requests that carry both.
Fix (keep permanently in `env`):
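The two flags, as they also appear in the playbook env block:

```yaml
AWS_REQUEST_CHECKSUM_CALCULATION: WHEN_REQUIRED
AWS_RESPONSE_CHECKSUM_VALIDATION: WHEN_REQUIRED
```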
(An alternative, older switch is `AWS_S3_DISABLE_CHECKSUMS=true`, but the two above are the modern knobs.)
7.2 “X posts are not remapped to new S3 upload URL” (soft blocker)
Symptom (at the end of the migration):

```
FileStore::ToS3MigrationError: 35 posts are not remapped to new S3 upload URL
```
Why:
- Some posts’ `cooked` HTML still contained `/uploads/default/original/...` while the DB URLs were updated.
- In many cases `raw` contained `upload://...` (which expands only on rebake), so a simple string replace in `raw` didn’t touch them.
Fix: do NOT rebake the whole site. Do targeted work:
- List offenders:

```shell
sudo -E -u discourse RAILS_ENV=production bundle exec rails r '
  db = RailsMultisite::ConnectionManagement.current_db
  puts Post.where("cooked LIKE ?", "%/uploads/#{db}/original%").pluck(:id, :topic_id).map { |id, tid| "#{id}:#{tid}" }
'
```
- Option A: directly rebake only those posts:

```shell
sudo -E -u discourse RAILS_ENV=production bundle exec rails r '
  db = RailsMultisite::ConnectionManagement.current_db
  ids = Post.where("cooked LIKE ?", "%/uploads/#{db}/original%").pluck(:id)
  ids.each { |pid| Post.find(pid).rebake! }
  puts "rebaked=#{ids.size}"
'
```
- Option B: if you truly have static old strings, remap and let the task rebake the touched posts:

```shell
sudo -E -u discourse RAILS_ENV=production bundle exec \
  rake "posts:remap[/uploads/default/original,https://files.example.com/original]"
```
- Re-run a quick migration check (fast):

```shell
yes "" | AWS_REQUEST_CHECKSUM_CALCULATION=WHEN_REQUIRED AWS_RESPONSE_CHECKSUM_VALIDATION=WHEN_REQUIRED \
  sudo -E -u discourse RAILS_ENV=production bundle exec rake uploads:migrate_to_s3
```
7.3 “s3:info” doesn’t exist / rake -T shows nothing (false trails)
- Use the proper context, or tasks won’t appear:

```shell
sudo -E -u discourse RAILS_ENV=production bundle exec rake -T s3
sudo -E -u discourse RAILS_ENV=production bundle exec rake -T uploads
```
- To read the current settings (reliable):

```shell
sudo -E -u discourse RAILS_ENV=production bundle exec rails r \
  'puts({ enable_env: ENV["DISCOURSE_USE_S3"], bucket: ENV["DISCOURSE_S3_BUCKET"], endpoint: ENV["DISCOURSE_S3_ENDPOINT"], cdn: ENV["DISCOURSE_S3_CDN_URL"] })'
```
7.4 s3:upload_assets AccessDenied on CORS (permissions)
- The token had only Object permissions.
- Give the bootstrap token Admin Read & Write (bucket-level CORS ops), then rotate down to Object RW after success.
8) Verification
Inside the container:
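A sketch of what to check; the exact queries are illustrative:

```shell
sudo -E -u discourse RAILS_ENV=production bundle exec rails r \
  'puts Discourse.store.class; puts Upload.order(created_at: :desc).limit(3).pluck(:url)'
```

Expect an S3 store class and recent upload URLs pointing at the CDN host.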
Browser
- View-source / Network → assets load from `files.example.com`.
- Old topics show images from `https://files.example.com/original/...`.
Backups
- Admin → Backups → create one; a new object appears in `discourse-backups` on R2.
9) Cleanup (only after you’re sure)
When cooked references are effectively 0 and random old topics look good:
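What the cleanup amounts to, sketched with standard launcher paths; archive before you delete anything:

```shell
cd /var/discourse/shared/standalone
tar -czf /root/uploads-local-archive.tar.gz uploads/
# only after the archive is verified:
rm -rf uploads/*
```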
Rotate secrets:
- Create a new R2 token (Object RW), update `app.yml`, `./launcher rebuild app`, then revoke the old token.
- Rotate the SMTP app password if it ever leaked into logs.
10) Next time (playbook) — R2-first version
- Old → New (DB-only):
  - Old: enable read-only; make a DB-only backup.
  - New: restore DB-only.
- R2 wiring before DNS:
  - Create `discourse-uploads` (public), `discourse-backups` (private).
  - Account API Token (Admin RW, scoped to these buckets).
  - Custom domain `files.example.com` in the R2 UI (same CF account as the DNS zone). Wait for Active.
  - Add CORS (GET/HEAD from forum + files).
- Discourse `env` + `hooks`:
  - In `app.yml` → `env`: set S3/R2 vars + checksum flags:

    ```yaml
    DISCOURSE_USE_S3: true
    DISCOURSE_S3_ENDPOINT: https://<ACCOUNT_ID>.r2.cloudflarestorage.com
    DISCOURSE_S3_BUCKET: discourse-uploads
    DISCOURSE_S3_CDN_URL: https://files.example.com
    DISCOURSE_BACKUP_LOCATION: s3
    DISCOURSE_S3_BACKUP_BUCKET: discourse-backups
    AWS_REQUEST_CHECKSUM_CALCULATION: WHEN_REQUIRED
    AWS_RESPONSE_CHECKSUM_VALIDATION: WHEN_REQUIRED
    ```

  - Add `after_assets_precompile` with `s3:upload_assets` + `s3:expire_missing_assets`.
  - `./launcher rebuild app` (assets go to R2).
- DNS cutover for `forum.example.com` (orange cloud, Full/Strict).
- Migrate uploads to R2:

```shell
./launcher enter app
yes "" | AWS_REQUEST_CHECKSUM_CALCULATION=WHEN_REQUIRED AWS_RESPONSE_CHECKSUM_VALIDATION=WHEN_REQUIRED \
  sudo -E -u discourse RAILS_ENV=production bundle exec rake uploads:migrate_to_s3
```
- Fix stragglers (if any):
  - List posts whose `cooked` still references `/uploads/<db>/original`.
  - Either targeted-rebake those posts, or `posts:remap` the legacy strings.
  - Re-run the migration check once; expect `Done!` with no complaints.
- Speed up rebake (optional), or let Sidekiq process it in the background; the forum stays live:

```shell
sudo -E -u discourse RAILS_ENV=production bundle exec rake posts:rebake_uncooked_posts
```
- Backups to R2: trigger one; verify an object appears in `discourse-backups`.
- Permissions hardening:
  - Rotate the R2 token to Object RW; rebuild.
  - Leave the checksum flags in `env` for good.
- Final cleanup after 1–2 days of smooth traffic:
- Archive then remove local uploads to reclaim disk.
Appendix — run commands the right way
Inside the container, always:
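The pattern, with a placeholder task name:

```shell
./launcher enter app
sudo -E -u discourse RAILS_ENV=production bundle exec rake <task>
```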
Running `rake`/`rails` as root without bundler will hide tasks and cause false errors.
This is everything I actually had to touch. No theory, just the levers that moved.