Vibe Coding a Blog Migration Posted on August 26, 2025 by Dave Fowler
For a number of years I've been meaning to copy content I wrote on my old sites Chartio.com and DataSchool.com over to my blog as well. It's such a messy task with static site generators—moving files and assets, front matter differing, and both link and markdown formats accepted. I'd made a few starts over the years, and even tried to just do it manually—but I never finished.
Recently I finally succeeded at it with the help of Cursor. I took some notes on my process and I've got some prompt advice to share. You'll want to adjust (especially the special instructions) to fit your needs.
Prompt prep
Before migrating from an old site, it helps to use an LLM to organize the docs and standards in your new one (maybe it's just a README.md). A simple prompt in Cursor saying:
Scan through the front matter options supported in the templates and the articles in this blog and update the README with the standard schema for posts
You might notice things like inconsistent front matter names or date formats, and you can just prompt your tool to fix and standardize them. A cleanup step can be helpful for finding inconsistencies and deciding on a structure before starting a full port.
Blog Migration Prompt
I have another blog in {your folder here}
I want to move the posts to this blog.
Special instructions:
- Only port articles with the author matching {your name}
- Move articles with the title "Off the Charts"
- Move all ported articles into the {category dir}, and tag their category in the front matter with {category tag}
First, write a `BLOG_PORT_PLAN.md` with your plan to migrate the article files and dependent assets. In the plan, be sure to include:
Content porting:
- A listing of which front matter keys are accepted and used in this blog
- A description of where the relevant content files are in the other blog
- An analysis of front matter keys and their frequency of use in the other blog
- A proposed mapping from the other blog to this one
- Any filters you'll apply when deciding if a page should be migrated
Asset porting:
- A description of where the assets are in the other blog
- A description of where the assets are in this blog
- A plan for how you will decide which assets require moving
- A plan for how you'll move them
After you've made this plan, I'll review it and you can get started.
GPT-5 will create quite an impresive and comprehensive plan for this. Check out the example below.
Using an LLM in migration
If some editing or more complex mapping requires decision-making in your port, it's probably only a few cents to have an LLM go through all of your posts.
First add an AI key like OPENAI_API_KEY
to a .env
file in your blog's root dir and then add something like this to your prompt:
In the migration script, send each article to an LLM to get back a "summary" and an "image" for the ported files.
Use the $OPENAI_API_KEY in the .env file.
If it's really not that many files, you don't even need to tell the LLM to make a script to process each; you can have a tool like Cursor or Claude Code just do it with a prompt like:
Go through the uncommitted new articles and manually add summaries
And it'll just do it right there in the tool with a nice approve/reject workflow.
Testing tools
Most blog tools and static site generators now have automatic link checkers that can scan through and make sure there aren't any missing or broken links or images. Be sure to set that up with yours—it will help the debugging process greatly. Otherwise you can use external tools like Screaming Frog, or again, just have the AI vibe‑code one for you.
Example Blog Plan
Using the above sample prompt in Cursor with GPT-5 gives quite an incredible migration plan with plenty of details that would have taken me a long time to suss out myself.
## Plan: Port Data School chapters into this blog
### Goal
- Port selected articles from `~/Projects/dataschool.com/_chapters/**` into this Metalsmith blog under a new category `dataschool` at `src/things/dataschool/`.
- Only migrate articles whose author list matches the word "dave" (case-insensitive). See Filters.
---
## Content porting
### Accepted front matter keys in this blog (per README and build)
- `layout`: set via directory `metadata.yaml` in `src/things/**` (do not add per-file)
- `title`: display title
- `description`: short summary for meta tags/previews
- `draft`: optional boolean to exclude from build
- `date`: ISO date string used for sorting/display
- `slug`: optional; if omitted, slug is derived from `title`
- `permalink`: set `false` to skip permalink processing (rare; 404 only)
- `canonical`: absolute canonical URL of original source (auto-filled if missing)
- `canonical_name`: human-friendly source label (shown with title when present)
- `image`: absolute URL to social/preview image (falls back to `site.image`)
Build also derives: `permalinkSlug` (from `slug` or `title`), `formattedDate`, `formattedDateShort`, and sets `canonical` when missing.
### Source content locations (other blog)
- Articles: `~/Projects/dataschool.com/_chapters/**/{*.md,*.MD}` organized by topic folders (e.g., `learn-sql`, `data-governance`, `how-to-design-a-dashboard`, etc.).
- People (author identities): `~/Projects/dataschool.com/_people/*.md` (e.g., `_people/dave.md`).
### Front matter keys seen in the other blog and frequency (approx.)
- Present commonly:
- `title` (~170)
- `description` (~163)
- `image` (~156)
- `section` (~159)
- `meta_title` (~157)
- `number` (~145)
- `reading_time` (~147)
- `is_under_construction` (~134)
- `is_featured` (~132)
- `reviewers` (~72)
- `img_border_on_default` (~72)
- `feedback_doc_url` (present in many; also appears in body content)
- Present occasionally/rarely:
- `authors` (~20)
- `author` (singular; sparse)
- `is_community_story` (~1)
- Not observed in chapters:
- `date`, `layout`, `permalink`, `tags`
Notes:
- `authors` values are usually paths to `_people/*.md` (e.g., `_people/dave.md`). Many files include `dave` in `reviewers` rather than `authors`.
### Proposed mapping (other blog → this blog)
- `title` → `title` (keep)
- `meta_title` → drop (do not carry over)
- `description` → `description` (keep as-is)
- `image` → `image` (normalize to site-absolute path `/assets/...` or absolute URL)
- `authors`/`author` → used only for migration filtering; do not keep in output (not used by templates)
- `section`, `number` → drop (not used here)
- `reviewers`, `feedback_doc_url` → drop
- `is_featured`, `img_border_on_default`, `is_under_construction`, `is_community_story`, `story_intro_blurb`, `reading_time` → drop
- Add new in this blog:
- `canonical_name: Data School`
- `canonical`: original Data School URL (see below)
- `date`: derive from source git history (earliest commit); fallback heuristics if unavailable
Folder structure and tagging:
- Preserve the source subfolder structure under `src/things/dataschool/<subfolders>/*`.
- Do not add tags per-file; instead, create/update a `metadata.yaml` in each subfolder to include `tags: [dataschool, <subfolder-name>]`.
Canonical URL construction:
- Base: `https://dataschool.com/`
- Path: `<topic-folder>/<slug>/` derived from source file path and filename (hyphenated). When exact routing differs, we will still set `canonical` to the best-guess stable public URL or omit to let build backfill.
Date derivation strategy:
1. Use source repo git history for each file: `git log --follow --format=%ad --date=short -n 1 <file>`
- Use the earliest commit date for “originally published”.
2. If git date unavailable, fallback to a date encoded in the file path/name if present; otherwise use a fixed fallback (e.g., `2019-01-01`).
Slug strategy:
- Let the build derive `permalinkSlug` from `slug` or `title`. We will not hardcode `slug` unless the source filename is better than the title-derived slug.
Output placement and naming:
- Destination directory: `src/things/dataschool/<preserve-subfolders>/`
- Filename: `<derived-slug>.md`
### Filters (which pages to migrate)
- Include only when any author matches `/dave/i`:
- `authors` list contains an item with `dave` (e.g., `_people/dave.md`, `dave-fowler`, etc.)
- OR `author` (singular) contains `dave`
- Exclude when:
- `is_under_construction: true`
- Clearly test content (e.g., filenames containing `test` or `sqlbox-test-do-not-publish`)
- Missing `title` or empty body
- In addition, skip the entire `learn-sql` section for now.
---
## Asset porting
### Where assets live (source)
- Images: `~/Projects/dataschool.com/assets/images/**` (chapters reference paths like `/assets/images/...`)
- Other assets: `~/Projects/dataschool.com/assets/sqlbox/**`, `~/Projects/dataschool.com/assets/javascript/**` (not typically embedded in migrated articles)
### Where assets live (this blog)
- Images and other static files are served from `build/assets/**`, sourced from `src/assets/**`.
- Existing convention: images under `src/assets/images/**`.
### Which assets to move
- Only those referenced by the migrated posts:
- Front matter `image`
- Markdown image references: `` and ``
- HTML `<img src="/assets/images/...">`
- Ignore assets outside of `/assets/images/**` unless explicitly referenced (e.g., downloadable files).
### How assets will be moved
1. For each selected chapter, extract unique referenced image paths.
2. Copy from `~/Projects/dataschool.com/assets/images/...` to `~/Projects/metalsmith-system-of-a-dave/src/assets/images/dataschool/...` preserving subpaths.
3. Rewrite references in the migrated Markdown to `/assets/images/dataschool/...` (site-absolute) to match this blog’s serving path.
4. For front matter `image`, rewrite to an absolute URL or to `/assets/images/dataschool/...` consistently.
Idempotency and safety:
- Generate a manifest of copied assets and migrated files.
- Skip copy if destination file with identical size/hash exists.
- Warn on missing source assets; leave the original URL if external, otherwise annotate for manual follow-up.
---
## Implementation outline (post-approval)
1. Enumerate chapters and parse YAML front matter.
2. Apply filters (authors include `dave`, exclude tests/under-construction, skip `learn-sql`).
3. For each selected file (preserving subfolders):
- Build new front matter via the mapping above, deriving `date`, `canonical`, and `canonical_name`.
- Extract and copy required assets; rewrite paths.
- Write to `src/things/dataschool/<subfolders>/<slug>.md`.
- Ensure `metadata.yaml` exists in `src/things/dataschool/` with `category: dataschool`.
- Ensure each `src/things/dataschool/<subfolder>/metadata.yaml` contains `tags: [dataschool, <subfolder-name>]` (merged if already present).
4. Run local build (`./run`) and fix any broken links or missing assets.
5. Review output list and RSS impact.
Decisions confirmed:
- Date: use earliest commit date.
- Filters: skip `learn-sql` section.
- `meta_title`: drop.
Go
What was a daunting task now only took 60 minutes or so of prompting and testing. So you've got no excuse now—it's never been easier to go through and clean up your old files!