The Devil Is in the AI Skills
I wanted to experiment with game design, so I used Rebecca's and my engagement photo as the starting point.
Round one was generic stick figures with no glasses, no recognizable hair, and a walk cycle that looked like a Microsoft Bob avatar. Round two — same underlying image model, with an open-source skill wrapped around it — was us. By the end of the weekend, Bernard was running across an Austin side-scroller to reach Rebecca, with checkpoint lanterns, dash afterimages, and procedural WebAudio. Same flagship models. Different scaffolding. That's the whole thing.
Frontier AI models are commodities. The thing that determines what they can actually do for you is the layer of skills stacked on top — and right now that layer is heavily biz/code, undercooked for creative work.
- Stock Claude Design + GPT-5.5 produced sprites that didn't look like us. The model itself hedged: “solid usable draft, not yet artist-polished.”
- I found agent-sprite-forge, an open-source skill by 0x0funky on GitHub. Same Gemini image model under the hood, but with prompt rules + deterministic postprocessing + a QA repair pass. Output looked like us.
- From the same sprites I had GPT-5.5 build a playable Austin side-scroller. Play it →
- The model didn't get better between attempts. The skill stack around it did. That gap is where the next two years of AI work lives.
Find the chauffeurs. Borrow their toolboxes.
Round one: stock AI
I fed the engagement photo to a flagship harness out of the box — Claude Design — and asked for a sprite sheet. Here is what came back.
The model itself knew. In the Slack thread where it dropped the package, it added a caveat:
This is a solid usable draft, not yet “artist-polished.” The next useful pass would be tightening likeness: her hair shape, your face/glasses, and maybe adding a couple's idle/emote sheet.
Translation: the model could see what was missing, and could enumerate what would fix it — it just couldn't do that work itself. It generated the cells, but it had no opinion about cell consistency, no eye for likeness, no game-engine convention to anchor on.
Why stock AI couldn't draw
Out-of-box AI harnesses ship with a skill library. That library is heavily biased toward business and code, because that's where the early enterprise demand was: PowerPoint decks, Word documents, spreadsheet ops, debugging sessions, code review, GitHub PRs, security scans. The default agent is a suit-and-tie consultant.
Creative work — pixel art, game feel, sprite rigging, animation timing — isn't in the standard kit. The model can wing it, but “wing it” without the right scaffolding is exactly what produces generic NPC walk cycles and faces that don't look like the people in the photo.
This isn't a model-capability problem. It's a model-skill problem. The same weights that can't draw my fiancée can draw her perfectly — if you give them the right wrapper.
Round two: a skill from GitHub
I went looking. A search on GitHub for “sprite agent” surfaced 0x0funky/agent-sprite-forge, an open-source skill purpose-built for game-asset pixel art. The pipeline:
- Prompt rules. Character-consistency scaffolding — same outfit, same proportions across all 16 cells (4 directions × 4 frames).
- Gemini 3 Pro image model for generation. Same underlying weights anyone can call.
- Forge postprocessor. Cell alignment, transparent background extraction, color quantization, frame normalization.
- Deterministic QA repair / reassembly. Detects broken cells and patches them by re-rendering against the canonical pose, then reassembles the final sheet.
I gave it the same engagement photo. It returned a 384×768 sheet plus per-direction walk-cycle previews.
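To make the cell math concrete, here is a minimal sketch of how a 384×768 sheet splits into 16 cells, plus a crude check for cells that came back empty or flooded. It is not agent-sprite-forge's actual code: the names `cell_grid`, `opaque_fraction`, and `suspicious_cells`, the row/column layout, and the fill thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass

SHEET_W, SHEET_H = 384, 768  # sheet size reported above
COLS, ROWS = 4, 4            # assumed layout: 4 frames across, 4 directions down


@dataclass(frozen=True)
class Cell:
    row: int
    col: int
    x: int  # left edge in pixels
    y: int  # top edge in pixels
    w: int
    h: int


def cell_grid(sheet_w=SHEET_W, sheet_h=SHEET_H, cols=COLS, rows=ROWS):
    """Return the cell rectangles of a sheet, in row-major order."""
    if sheet_w % cols or sheet_h % rows:
        raise ValueError("sheet does not divide evenly into cells")
    w, h = sheet_w // cols, sheet_h // rows
    return [Cell(r, c, c * w, r * h, w, h) for r in range(rows) for c in range(cols)]


def opaque_fraction(alpha, cell):
    """Share of non-transparent pixels in a cell.

    `alpha` is a list of rows, each a list of 0..255 alpha values.
    """
    opaque = sum(
        1
        for y in range(cell.y, cell.y + cell.h)
        for x in range(cell.x, cell.x + cell.w)
        if alpha[y][x] > 0
    )
    return opaque / (cell.w * cell.h)


def suspicious_cells(alpha, cells, min_fill=0.05, max_fill=0.95):
    """Flag cells that are nearly empty or nearly solid (hypothetical thresholds)."""
    return [c for c in cells if not (min_fill <= opaque_fraction(alpha, c) <= max_fill)]
```

With the default arguments each cell works out to 96×192 pixels. A real QA pass would presumably look at much more than fill ratio (pose, palette, alignment), but even a check this blunt catches a cell the generator skipped.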
The receipts
These sprites are now wandering at the bottom of my about page, next to my sign-off line. They walk back and forth across the row, bounce off the edges, occasionally pass each other. Scroll to the bottom of that page and you'll see them. Round-one sprites would have looked like a screensaver from 2003. Round-two sprites look like us.
Then I built a game
With characters that actually looked like us, the question stopped being “can the AI draw” and started being “what do you do with them?”
I had GPT-5.5 build a side-scroller. Bernard runs from SoCo past the South Congress food trucks, over Lady Bird Lake beneath the Congress Avenue Bridge bats, past the Capitol dome, to Rebecca on a neon rooftop. Touching her ends the game. No flagpole: the win condition is reaching her.
Three iterations got it from prototype to “tiny finished game.” v2 was pixel sprites + tilemap. v3 added an HD painterly parallax-scrolling Austin backdrop. v4 was the polish pass:
- Game feel: coyote time, jump buffering, variable jump release, dash with recharge, screen shake on dash and damage.
- Save: six checkpoint lanterns that update respawn and reduce death punishment.
- Juice: dash afterimages, speed streaks, landing dust, collectible bursts, checkpoint rings, a heart-and-confetti burst when Rebecca is reached.
- Presentation: progress bar, run timer, zone title toasts, mute badge, end-of-run stats panel.
- Audio: procedural WebAudio that boots after first input — ambient bed plus jump/dash/collect/checkpoint/hurt/win SFX.
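Two of the game-feel items above, coyote time and jump buffering, are small enough to sketch. Coyote time lets a jump still count for a moment after the player has run off a ledge; jump buffering remembers a press made slightly before landing. This is an illustrative, language-agnostic sketch written in Python, not the game's actual code: `JumpState` and both timing constants are hypothetical.

```python
from dataclasses import dataclass

COYOTE_TIME = 0.10  # seconds a jump stays valid after leaving the ground (assumed value)
JUMP_BUFFER = 0.12  # seconds an early jump press is remembered (assumed value)


@dataclass
class JumpState:
    coyote_left: float = 0.0
    buffer_left: float = 0.0

    def update(self, dt: float, on_ground: bool, jump_pressed: bool) -> bool:
        """Advance one frame; return True when a jump should start this frame."""
        # Refill the grace window while grounded, otherwise let it run down.
        if on_ground:
            self.coyote_left = COYOTE_TIME
        else:
            self.coyote_left = max(0.0, self.coyote_left - dt)

        # Remember a fresh press, otherwise let the remembered press expire.
        if jump_pressed:
            self.buffer_left = JUMP_BUFFER
        else:
            self.buffer_left = max(0.0, self.buffer_left - dt)

        # Jump only when a recent press and recent ground contact overlap.
        if self.buffer_left > 0.0 and self.coyote_left > 0.0:
            self.coyote_left = 0.0  # consume both so one press gives one jump
            self.buffer_left = 0.0
            return True
        return False
```

Called once per frame with the frame's delta time, `update` returns True for a press made a frame after walking off a ledge and for a press made a frame before landing, which is most of what makes a platformer feel forgiving instead of strict.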
None of this would have happened if round one had been good enough. Stick-figure sprites wouldn't have inspired me to spend the weekend on it. Recognizable sprites did.
The devil is in the AI skills
The model didn't get better between rounds. The skill stack around it did.
This is the thing nobody briefs you on when they sell you the flagship model. The model itself is a commodity: it'll be replaced by something newer within ninety days. What isn't a commodity is the layer of skills that determines what the model can actually do for you. agent-sprite-forge uses the same Gemini image weights anyone with an API key can call. The difference is the wrapper.
And right now the open-source skill ecosystem is heavily skewed toward business and code, because that's where the early demand was. The skill library is full of things like:
- PPTX skills for generating slide decks
- DOCX skills for writing reports and letters
- XLSX skills for cleaning and analyzing spreadsheets
- PDF skills for extraction and form-filling
- Code-review, debug, GitHub-PR, security-audit skills
What's not in the default kit: pixel art, sprite rigging, game feel tuning, animation timing, audio mixing, video editing, design-system enforcement, music composition, voice direction. The creative side of the skill ecosystem is several years behind the business side. That gap is where the next two years of AI work lives.
Chauffeur knowledge
There's a parable about Max Planck and his chauffeur. After Planck won the Nobel, he gave the same lecture so many times across Germany that his chauffeur memorized it. One night they swapped places — the chauffeur gave the lecture, Planck sat in the audience in the chauffeur's hat. The talk went fine. Then a physicist in the front row asked a real question. The chauffeur shrugged and said, “I'm surprised to hear such a simple question in an advanced city like Munich. I'll let my chauffeur answer it.”
The usual moral is about real expertise vs. surface fluency. But there's an inverse moral worth holding onto: you don't have to be Planck. You just have to know who Planck is and how to find their chauffeur.
For AI skills, the practical version is: interview people who actually ship things in your domain. Ask which libraries they reach for. Ask what they wish existed. They will tell you faster than you can search the GitHub trending feed yourself. One conversation gets you eighty percent of the way.
I found agent-sprite-forge because I described the failure mode out loud to someone who'd already built sprites for a game in the same engine. He sent the link. Hours of fumbling through Hugging Face wouldn't have produced that link. The chauffeur did.
This is how I now do skill discovery for everything new I touch — reels, voice agents, music, 3D. Find the chauffeur. Borrow their toolbox. Skip the long tail.
The takeaway
Stock AI couldn't make a pixel version of my fiancée. A skill did.
The sprites wandering on /about/ are receipts. The Austin side-scroller at /games/run-to-rebecca/ is a receipt too. Both came from finding the right skill, not from picking a better model.
The chauffeur of this month is whoever curated agent-sprite-forge on GitHub. The chauffeur of next month is whoever's curating the creative-stack skill you haven't found yet. Go find them.
The devil is in the AI skills. The model is the easy part.
— Bernard