Kapiko: a 5-day, $150 post-mortem
AI music generation is technically solved; the moat is patience and distribution, not generation.
- Built and shipped Kapiko, a capybara-in-headphones ambient music channel, end-to-end in 5 days for ~$150.
- 57 videos shipped. 3 YouTube subscribers. Top video ~49 views. 4 monthly Spotify listeners. The pipeline worked. The audience did not arrive.
- The Suno API does not exist. Every Suno-powered side-hustle you have seen runs on a reverse-engineered wrapper around the website, and that wrapper breaks every 36 hours. Mine broke every 36 hours too.
- The actual system underneath was a music-scouting machine: hand-pick masterpiece tracks per genre, generate 50-100 Suno candidates per run, have Gemini grade each one against the masters, ship only the 9/10s.
- The genre is real. Lofi Girl, the 10-hour fireplace channel, dog-anxiety music, meeting chimes. Patient operators in functional audio do fine. I wasn’t one.
Below: how it scored on the rubric, why the genre is real, and the part of the build that comes with me.
How Kapiko started
Three subscribers. Fifty-seven videos. Four monthly Spotify listeners. The pipeline worked. The audience did not arrive.
It started a few weeks earlier, on a long drive. Hours of solo piano on shuffle: the Einaudi, Yiruma, Tiersen end of the spectrum, the genre Spotify calls Peaceful Piano. I came home, sat down, and the obvious thought arrived. This is exactly what Suno is good at.
I had just tested every AI music generator I could find: Suno, Udio, MiniMax Music, all of them. Suno had won. So the question wasn’t “can I generate music?” It was “can a single operator with good taste run a music channel?”
Kapiko was the test. A capybara in over-ear headphones. “A capybara, a pair of headphones, and nowhere to be. Solo instruments. Gentle melodies.” Site at kapiko.ai. YouTube channel @kapiko-music. Five days end to end, about $150 total: kapiko.ai domain ($80/yr), DistroKid annual ($23), a month of Suno Pro (~$10), Gemini and Nano Banana and MiniMax tokens, plus openclaw cron compute.
The tracks have real names. Whispers of November’s End. Porch Light Love. Moonlit Petals Drift. All generated, all named, all live on Spotify and Apple Music. An AI named a song Porch Light Love. It is better than what I would have called it.
How the system worked
Kapiko was a one-person talent scout. For each genre I hand-picked the masters of the sound. Suno generated 50-100 candidates per run. Gemini graded each one against the masters. Only the 9/10s shipped.
Step 1: pick a genre, write the prompts
Solo piano first: neo-classical, jazz ballad, ambient piano, Japanese piano, lo-fi piano, classical nocturne, a long tail of subgenres. Then fingerstyle guitar. Handpan next in the queue. A prompt generator wrote 25-50 Suno prompts per run in genre-specific sonic language: recording texture, emotional direction, instrumental technique, tempo, feel. Artist anchors shaped the style internally (Einaudi-leaning piano; Don Ross, Andy McKee, Lance Allen for guitar) but never went into the prompt. Suno blocks named-artist references.
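A minimal sketch of how a prompt generator like this could combine the five descriptor fields named above. The vocabulary, the `build_prompts` name, and the prompt layout are all assumptions for illustration; the post does not show the real word lists or prompts.

```python
import itertools
import random

# Hypothetical vocabulary for one genre; the real word lists are not in the post.
SOLO_PIANO = {
    "texture": ["close-miked felt piano", "warm upright with light room reverb"],
    "emotion": ["wistful", "quietly hopeful"],
    "technique": ["slow arpeggios", "sparse left-hand chords"],
    "tempo": ["60 bpm", "72 bpm"],
    "feel": ["rubato", "steady and unhurried"],
}

def build_prompts(genre, vocab, count, seed=0):
    """Return up to `count` distinct prompts, one descriptor per field.

    Artist names never appear: the anchors only guide which words go
    into the vocabulary, they are not part of the prompt text.
    """
    combos = list(itertools.product(*vocab.values()))
    rng = random.Random(seed)  # seeded so a run can be reproduced
    rng.shuffle(combos)
    return [f"{genre}, " + ", ".join(combo) for combo in combos[:count]]

prompts = build_prompts("solo piano", SOLO_PIANO, count=25)
```

With two options per field this toy vocabulary yields 32 combinations, enough for a 25-prompt run; a real vocabulary would need more variety per field.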
Step 2: generate the candidates
Prompts hit Suno v5 (chirp-crow) through a wrapper I built. It polled for completion, downloaded MP3s, and used NopeCHA to clear hCaptcha when Suno threw it. Suno returns two clips per prompt, so a run produced 50-100 candidate tracks plus a manifest: prompt text, model version, timestamps, file hashes.
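The manifest half of this step can be sketched without touching Suno at all. The following is an illustration, not the wrapper's actual code: the field names, `manifest_entry`, and `write_manifest` are assumptions, and the download step is deliberately left out because there is no documented API to show.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def sha256_of(path):
    """Hash a file in chunks so large MP3s do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def manifest_entry(prompt, model_version, mp3_path):
    """One manifest row per downloaded clip (field names are hypothetical)."""
    return {
        "prompt": prompt,
        "model_version": model_version,
        "file": Path(mp3_path).name,
        "sha256": sha256_of(mp3_path),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

def write_manifest(entries, out_path):
    """Write all rows for a run as a single JSON file."""
    Path(out_path).write_text(json.dumps(entries, indent=2), encoding="utf-8")
```

The file hash is what makes the manifest useful later: it ties a score or a published video back to the exact clip and prompt that produced it, even if files get renamed.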
Step 3: score against the masterpieces
The part that made Kapiko interesting. I uploaded every candidate alongside hand-picked masters for Gemini to grade.
Piano used Einaudi’s “Nuvole Bianche,” Yiruma’s “River Flows in You,” and Tiersen’s “Comptine d’un autre été” as the standard. Five dimensions:
- Playlist fit. Would it sit on Spotify’s Peaceful Piano without a listener skipping?
- Closest reference. Which master did it resemble, and why?
- Quality gap. What got better or worse if you played it immediately after the references?
- Production match. Piano tone realism, reverb and space, dynamics, mastering loudness, artifacts.
- Emotional authenticity. Does this feel performed by a human or algorithmically assembled?
Verdict: PASS, BORDERLINE, or FAIL, weighted toward playlist fit and production match.
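One way to turn those dimensions into a verdict, as a rough sketch. It assumes the grader returns a 0-10 score for each numeric dimension; the weights and thresholds below are invented to show "weighted toward playlist fit and production match" and are not the values Kapiko used. "Closest reference" is a label rather than a score, so it is left out of the sum.

```python
# Hypothetical weights: the post says only that playlist fit and
# production match counted most, not by how much.
WEIGHTS = {
    "playlist_fit": 0.35,
    "production_match": 0.35,
    "quality_gap": 0.15,
    "emotional_authenticity": 0.15,
}

def verdict(scores, pass_at=8.5, borderline_at=7.0):
    """Map per-dimension scores (0-10) to PASS, BORDERLINE, or FAIL.

    The two thresholds are assumptions chosen for illustration.
    """
    total = sum(weight * scores[name] for name, weight in WEIGHTS.items())
    if total >= pass_at:
        return "PASS"
    if total >= borderline_at:
        return "BORDERLINE"
    return "FAIL"

example = verdict({
    "playlist_fit": 8,
    "production_match": 8,
    "quality_gap": 6,
    "emotional_authenticity": 6,
})
```

In this example the heavy dimensions score well but the light ones drag the weighted total to about 7.4, which lands in the BORDERLINE band.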
Guitar used a written rubric instead of uploaded reference tracks. Suno’s guitar output was uneven enough that a structured 8-dimension scorecard caught more failure modes than a master MP3 would have. The dimensions included tone, arrangement, production, and whether the clip was actually acoustic fingerstyle (not a wrong-instrument hallucination), with explicit deductions for full-band drift, vocals, drums, or synth leakage. Across 80 clips the system landed 21 at 9/10 and 36 at 8/10: a 71% hit rate at 8+. Strongest in percussive, flamenco, baroque, and Middle Eastern fusion. Weakest in lullaby and post-rock, where Suno kept drifting into full bands.
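The guitar numbers are worth working through, because the hit rate at 8+ and the rate that clears a 9/10 bar are very different. A small check using only the counts quoted above (the `rate` helper is just for illustration):

```python
def rate(hits, total):
    """Fraction of clips that reached a given score band."""
    return hits / total

clips = 80
at_nine = 21
at_eight = 36

hit_rate_8_plus = rate(at_nine + at_eight, clips)  # 57 of 80 clips
ship_rate_9 = rate(at_nine, clips)                 # 21 of 80 clips

print(f"8+: {hit_rate_8_plus:.0%}, 9 only: {ship_rate_9:.0%}")
```

So 57 of 80 clips scored 8 or better (71%), but only 21 of 80, roughly a quarter, reached 9/10.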
Handpan referenced Hang Massive (“Once Again”), Yuki Koshimoto, and Daniel Waples (“Hang in Balance”), graded on seven dimensions including resonance, rhythmic flow, meditative quality, and instrument authenticity.
Anything below 9 was killed. Ties went to a head-to-head re-grading. One winner per genre per run.
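The gate and the tie-break can be sketched in a few lines. This is an illustration under assumptions: scores are taken to be numbers out of 10, and `regrade` is a stand-in for the head-to-head re-grading, which in the real pipeline would be another model call.

```python
def pick_winner(candidates, regrade, gate=9):
    """Pick one winner for a genre run, or None if nothing clears the gate.

    candidates: list of (clip_id, score) pairs.
    regrade: callable that takes the tied clip ids and returns one of them
             (hypothetical hook for the head-to-head comparison).
    """
    survivors = [(clip_id, score) for clip_id, score in candidates if score >= gate]
    if not survivors:
        return None
    top = max(score for _, score in survivors)
    tied = [clip_id for clip_id, score in survivors if score == top]
    return tied[0] if len(tied) == 1 else regrade(tied)

run = [("clip-a", 9), ("clip-b", 8), ("clip-c", 9), ("clip-d", 6)]
# Toy tie-break for the example only: choose the alphabetically first id.
winner = pick_winner(run, regrade=lambda ids: sorted(ids)[0])
```

Returning `None` when nothing passes matters for a cron-driven pipeline: a run with no 9s should publish nothing rather than fall back to the best of a weak batch.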
Step 4: package the winner
Gemini listened to the winning MP3 and inferred mood, landscape, season, time of day, and color palette. Nano Banana rendered a 1920×1080 still: the capybara plus a matching landscape. Gemini wrote a MiniMax image-to-video prompt with one rule: no camera move, ambient motion only. Clouds drifting, water rippling, grass swaying, mist, light. FFmpeg looped the 6-second clip to the MP3’s runtime, muxed the audio in, overlaid the title and the Kapiko mark in Pacifico, and produced an MP4.
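For the FFmpeg step, a sketch of one way to build the command, not the command Kapiko actually ran. It loops the short clip with `-stream_loop -1`, stops at the end of the audio with `-shortest`, and draws the title with the `drawtext` filter. File names, font size, and text position are assumptions, paths are assumed to be simple relative paths with no characters that need filter escaping, and the Kapiko mark overlay is omitted.

```python
def build_ffmpeg_cmd(clip, mp3, title_file, font_file, out):
    """Return an ffmpeg argument list that loops `clip` under `mp3`.

    The title is read from `title_file` (drawtext's textfile option) so
    apostrophes in song names do not need escaping in the filter string.
    """
    drawtext = (
        f"drawtext=fontfile={font_file}:textfile={title_file}:"
        "fontcolor=white:fontsize=64:x=(w-text_w)/2:y=h-text_h-80"
    )
    return [
        "ffmpeg", "-y",
        "-stream_loop", "-1", "-i", clip,   # repeat the short clip indefinitely
        "-i", mp3,
        "-map", "0:v:0", "-map", "1:a:0",   # video from the clip, audio from the MP3
        "-vf", drawtext,
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-shortest",                        # end output when the audio ends
        out,
    ]

cmd = build_ffmpeg_cmd(
    "loop.mp4", "track.mp3", "title.txt", "Pacifico-Regular.ttf", "final.mp4"
)
```

The list can be passed to `subprocess.run(cmd, check=True)` once the title has been written to `title.txt`. Building the arguments as a list rather than a shell string avoids quoting problems with file names.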
Step 5: distribute
Videos auto-uploaded to YouTube via the official API under kapiko-youtube-refresh-token, in the Music category, with the title pattern “Song Name [Single - Solo Piano] - kapiko”. DistroKid pushed to Spotify, Apple Music, and the long tail. That part stayed manual, not in the cron.
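The title pattern and the upload metadata are simple enough to sketch. The body below follows the shape the YouTube Data API's `videos.insert` call expects (`snippet` and `status`), with category ID 10 for Music. The function names and the `public` privacy setting are assumptions, and the authenticated upload call itself is left out.

```python
MUSIC_CATEGORY_ID = "10"  # YouTube's "Music" video category

def video_title(song, instrument):
    """Apply the channel's title pattern."""
    return f"{song} [Single - {instrument}] - kapiko"

def upload_body(song, instrument, description=""):
    """Metadata for a videos.insert request (part="snippet,status")."""
    return {
        "snippet": {
            "title": video_title(song, instrument),
            "description": description,
            "categoryId": MUSIC_CATEGORY_ID,
        },
        # Assumed: the post does not say which privacy setting was used.
        "status": {"privacyStatus": "public"},
    }

body = upload_body("Porch Light Love", "Solo Piano")
```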
How Kapiko scored as a fully AI autonomous business
The six dimensions are not Kapiko-specific. Every project I touch (Tabiji, Veracity, Kapiko) gets scored on the same rubric, so I can compare them honestly instead of falling in love with whichever idea I am currently working on. The thesis behind it: AI collapsed the cost of generation. It has not yet collapsed distribution. The arbitrage is the gap. Each probe tests one intersection. Kapiko was the music probe.
Three green. Two red. One amber. And the two reds are the same story told twice. Low barrier plus high go-to-market means anyone can play and getting found is the actual job. The rubric flagged both before I started. I underweighted them because I was excited about the generation side.
The genre is real
Two of those scorecard cells, channel durability and TAM, aren’t hypothetical. Ambient and functional audio on YouTube and Spotify has been a quietly enormous category for over a decade. Exactly the evergreen surface where a one-operator AI-content business should work. Four receipts.
Trainers Warehouse meeting chime — 161K views, 17 years old, used in workshops and seminars to signal start, end, and attention. Not entertainment. A tool people play in a room of humans. The freshest reminder that the long tail in this space runs longer than anyone’s patience for it.
Complex’s 10-hour fireplace clip — Fireplace 10 hours full HD, 156M+ views, reportedly $1M+ in ad revenue over a decade. One upload. One channel. Set-it-and-forget-it incarnate.
Relax My Dog — 15 HOURS of Deep Separation Anxiety Music for Dog Relaxation, 66M+ views, “helped 4 million dogs worldwide.” Animal-calming is a durable sub-niche I would not have guessed at without looking.
And Lofi Girl — the genre’s anchor, 15.8M subscribers on one animated loop, 407 videos in.
Patient operators in ambient and functional audio do quite well over time. The niches are non-obvious until you find them. So why did I stop?
Cause of death
Two killers. The wrapper, which makes the cron impossible. And the marketing grind, which every channel in this category was always going to need anyway.
Suno has no public API. I reverse-engineered the site. The wrapper broke every 1-2 days: new auth flow, captcha shape changed, an endpoint moved, a response shape shifted by one field. A Saturday morning in week two, the auth flow shifted and every download in flight 401’d. I patched it before coffee. The next break came Sunday night.
You cannot cron a business that needs babysitting every 36 hours.
The only configuration that works at this barrier to entry, where anyone with the same tools can compound on you weekly, is a daily cron. Without bulletproof automation, you’re a person operating a music label by hand, not a system shipping evergreen inventory.
Then the marketing grind. The receipts up top say the genre is crowded and getting found is the actual job. I was unwilling to do the grind for a channel I was running as a research project. The wrapper is what made me stop. The grind is what would have made me stop anyway.
What’s reusable
The music channel was the surface. The taste-filter engine underneath is the artifact.
What I actually built is a feedback loop: generate many, score each candidate against genre-defining references, keep only what scores 9+. The references are the rubric. The 9/10 gate is the filter. The output medium is incidental.
That loop is portable. Point it at recipe-blog headers. The references become the food magazines that nailed the look: Bon Appétit, Cherry Bombe, Eater feature shots. Same loop: generate fifty, grade against those, keep the 9s. The loop does not care what it is grading. It needs a measurable quality ceiling and a reference set with enough signal.
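Stripped of the music, the loop is small. A sketch under assumptions: `generate` and `grade` are hypothetical callables standing in for whatever generator and grader a given medium uses, and the toy versions in the example exist only to make it runnable.

```python
def taste_filter(generate, grade, references, n, gate=9.0):
    """Generate n candidates, grade each against the references, keep the best.

    generate: callable(n) returning a list of candidates.
    grade: callable(candidate, references) returning a score out of 10.
    Returns (candidate, score) pairs at or above the gate, best first.
    """
    scored = [(candidate, grade(candidate, references)) for candidate in generate(n)]
    kept = [(candidate, score) for candidate, score in scored if score >= gate]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Toy stand-ins: candidates are integers, and a candidate scores higher
# the closer it sits to any reference value.
def toy_generate(n):
    return list(range(n))

def toy_grade(candidate, references):
    return 10 - min(abs(candidate - ref) for ref in references)

keepers = taste_filter(toy_generate, toy_grade, references=[10, 40], n=50)
```

Nothing in `taste_filter` knows about audio or images. Swapping the medium means swapping the two callables and the reference set, which is why the references and the grading rubric carry the value.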
The generator gets better every quarter. The grading rubric is what I get to own. That’s the durable piece.
Next
Kapiko was a research project priced like a coffee tab. Five days, $150, a working pipeline, and a clean answer to “can a single operator with a cron job and good taste run an ambient music channel?” Yes, technically. No, not while Suno’s API is a fiction and the marketing grind is the actual moat.
The scoring loop comes with me. Next probe pointed at an adjacent surface: same scorecard, different medium.
Generation is solved. Distribution is the job. The Suno API is still a fiction.