
Kapiko: a 10-day, $150 post-mortem

Bernard Huang

May 30, 2026 · 5 min read

TL;DR

AI music generation is technically solved; the moat is patience and distribution, not generation.

Built and shipped Kapiko, a capybara-in-headphones ambient music channel, end-to-end in 10 days for ~$150.
57 videos shipped. 3 YouTube subscribers. Top video ~49 views. 4 monthly Spotify listeners. The pipeline worked. The audience did not arrive.
The Suno API does not exist. Every Suno-powered side-hustle you have seen is reverse-engineering a website that breaks every 36 hours. Mine broke every 36 hours too.
The actual system underneath was a music-scouting machine: hand-pick masterpiece tracks per genre, generate 50-100 Suno candidates per run, have Gemini grade each one against the masters, ship only the 9/10s.
The genre is real. Lofi Girl, the 10-hour fireplace channel, dog-anxiety music, meeting chimes. Patient operators in functional audio do fine. I wasn’t one.

Below: how it scored on the rubric, why the genre is real, and the part of the build that comes with me.

How Kapiko started

Three subscribers, and one of them is me. Fifty-seven videos. Four monthly Spotify listeners. Top video: 49 views, also probably me. The pipeline worked. The audience did not arrive.

It started a few weeks earlier, on a long drive. Hours of solo piano on shuffle: the Einaudi, Yiruma, Tiersen end of the spectrum, the genre Spotify calls Peaceful Piano. I came home, sat down, and the obvious thought arrived. This is exactly what Suno is good at.

I had just tested every AI music generator I could find: Suno, Udio, MiniMax Music, all of them. Suno had won. So the question wasn’t “can I generate music?” It was “can a single operator with good taste run a music channel?”

Kapiko was the test. A capybara in over-ear headphones. “A capybara, a pair of headphones, and nowhere to be. Solo instruments. Gentle melodies.” Site at kapiko.ai. YouTube channel @kapiko-music. Ten days end to end, about $150 total: kapiko.ai domain ($80/yr), DistroKid annual ($23), a month of Suno Pro (~$10), Gemini and Nano Banana and MiniMax tokens, plus openclaw cron compute.

The tracks have real names. Whispers of November’s End. Porch Light Love. Moonlit Petals Drift. All generated, all named, all live on Spotify and Apple Music. An AI named a song Porch Light Love. It is better than what I would have called it.

[Image: the kapiko Spotify artist page showing 4 monthly listeners and the top tracks: Whispers of November's End, Glimpse Through Frosted Pa..., Porch Light Love, Sunset's Gentle Invitation, Moonlit Petals Drift.] The other side of the receipts: kapiko on Spotify, 4 monthly listeners.

How the system worked

Kapiko was a one-person talent scout. For each genre I hand-picked the masters of the sound. Suno generated 50-100 candidates per run. Gemini graded each one against the masters. Only the 9/10s shipped.

Step 1: pick a genre, write the prompts

Solo piano first: neo-classical, jazz ballad, ambient piano, Japanese piano, lo-fi piano, classical nocturne, a long tail of subgenres. Then fingerstyle guitar. Handpan was next in the queue. A prompt generator wrote 25-50 Suno prompts per run in genre-specific sonic language: recording texture, emotional direction, instrumental technique, tempo, feel. Artist anchors shaped the style internally (Einaudi-leaning piano; Don Ross, Andy McKee, Lance Allen for guitar) but never went into the prompt, because Suno blocks named-artist references.
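A minimal sketch of what a prompt generator along these lines could look like. The phrase lists, the `build_prompts` name, and the blocklist check are all hypothetical; the post names the five dimensions but not the actual vocabulary or code.

```python
import itertools
import random

# Hypothetical vocabulary: one small phrase list per dimension named in the post.
DIMENSIONS = {
    "texture": ["close-miked felt piano", "warm room recording"],
    "emotion": ["wistful", "quietly hopeful"],
    "technique": ["sparse arpeggios", "slow left-hand ostinato"],
    "tempo": ["60 bpm", "72 bpm"],
    "feel": ["unhurried", "rubato"],
}

# Artist anchors guide how the vocabulary is chosen offline.
# They must never appear in the prompt text itself.
BLOCKED_NAMES = {"einaudi", "yiruma", "tiersen"}


def build_prompts(genre: str, count: int, seed: int = 0) -> list[str]:
    """Return `count` distinct prompts, each with one phrase per dimension."""
    combos = list(itertools.product(*DIMENSIONS.values()))
    random.Random(seed).shuffle(combos)  # seeded, so a run is reproducible
    prompts = []
    for combo in combos[:count]:
        prompt = f"solo {genre}, " + ", ".join(combo)
        if any(name in prompt.lower() for name in BLOCKED_NAMES):
            raise ValueError(f"artist name leaked into prompt: {prompt}")
        prompts.append(prompt)
    return prompts


prompts = build_prompts("neo-classical piano", count=25)
```

With two phrases per dimension there are 32 possible combinations, enough for a 25-prompt run. A real vocabulary would be larger and genre-specific.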

Step 2: generate the candidates

Prompts hit Suno v5 (chirp-crow) through a wrapper I built. It polled for completion, downloaded MP3s, and used NopeCHA to clear hCaptcha when Suno threw it. Suno returns two clips per prompt, so a run produced 50-100 candidate tracks plus a manifest: prompt text, model version, timestamps, file hashes.
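The wrapper itself is not reproducible here, since it depended on reverse-engineered endpoints. The manifest part can be sketched with the standard library alone. The field names and the `manifest_entry` and `write_manifest` helpers below are assumptions, not the original schema.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so a large MP3 is never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_entry(prompt: str, model: str, mp3: Path) -> dict:
    """One record per downloaded clip (hypothetical field names)."""
    return {
        "prompt": prompt,
        "model_version": model,
        "downloaded_at": datetime.now(timezone.utc).isoformat(),
        "file": mp3.name,
        "sha256": sha256_of(mp3),
    }


def write_manifest(entries: list[dict], out: Path) -> None:
    """Persist the run's records as pretty-printed JSON."""
    out.write_text(json.dumps(entries, indent=2))
```

Recording a hash per file makes it possible to tell later whether two clips are byte-identical and which prompt produced a given winner.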

Step 3: score against the masterpieces

The part that made Kapiko interesting. I uploaded every candidate alongside hand-picked masters for Gemini to grade.

Piano used Einaudi’s “Nuvole Bianche,” Yiruma’s “River Flows in You,” and Tiersen’s “Comptine d’un autre été” as the standard. Five dimensions:

Playlist fit. Would it sit on Spotify’s Peaceful Piano without a listener skipping?
Closest reference. Which master did it resemble, and why?
Quality gap. What got better or worse if you played it immediately after the references?
Production match. Piano tone realism, reverb and space, dynamics, mastering loudness, artifacts.
Emotional authenticity. Does this feel performed by a human or algorithmically assembled?

Verdict: PASS, BORDERLINE, or FAIL, weighted toward playlist fit and production match.
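One way to turn those dimensions into a verdict, as a rough sketch. The post says only that the verdict was weighted toward playlist fit and production match; the exact weights, the 0-10 scale per dimension, and the 9 and 8 cutoffs below are assumptions. "Closest reference" is left out because it is a qualitative answer, not a score.

```python
# Hypothetical integer weights summing to 100, tilted toward the two
# dimensions the post says mattered most.
WEIGHTS = {
    "playlist_fit": 35,
    "production_match": 35,
    "quality_gap": 15,
    "emotional_authenticity": 15,
}


def verdict(scores: dict[str, float]) -> tuple[float, str]:
    """Combine per-dimension scores (0-10) into a weighted total and a label."""
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
    if total >= 9.0:
        label = "PASS"
    elif total >= 8.0:
        label = "BORDERLINE"
    else:
        label = "FAIL"
    return round(total, 2), label


example = verdict({
    "playlist_fit": 10,
    "production_match": 9,
    "quality_gap": 7,
    "emotional_authenticity": 7,
})
```

Here a clip that fits the playlist and sounds well produced still lands at BORDERLINE (8.75) because the weaker dimensions drag it under the assumed 9.0 cutoff.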

Guitar used a written rubric instead of uploaded reference tracks. Suno’s guitar output was uneven enough that a structured 8-dimension scorecard caught more failure modes than a master MP3 would have. The dimensions included tone, arrangement, production, and whether the clip was actually acoustic fingerstyle (not a wrong-instrument hallucination), with explicit deductions for full-band drift, vocals, drums, or synth leakage. Across 80 clips the system landed 21 at 9/10 and 36 at 8/10: 57 of 80, a 71% hit rate at 8+. Strongest in percussive, flamenco, baroque, and Middle Eastern fusion. Weakest in lullaby and post-rock, where Suno kept drifting into full bands.
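The guitar numbers check out, and the same arithmetic shows how much smaller the shippable share was than the headline hit rate. The variable names are illustrative only.

```python
clips = 80
nines = 21   # clips graded 9/10
eights = 36  # clips graded 8/10

hit_rate = (nines + eights) / clips  # 57 / 80: share scoring 8 or better
ship_rate = nines / clips            # 21 / 80: share clearing a 9/10 gate

print(f"8+ hit rate: {hit_rate:.0%}")      # 71%
print(f"9/10 ship rate: {ship_rate:.0%}")  # 26%
```

So roughly one guitar clip in four was ship-grade under a 9/10 gate, which is why a run needed 50-100 candidates to yield a winner per genre.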

Handpan referenced Hang Massive (“Once Again”), Yuki Koshimoto, and Daniel Waples (“Hang in Balance”), graded on seven dimensions including resonance, rhythmic flow, meditative quality, and instrument authenticity.

Anything below 9 was killed. Ties went to a head-to-head re-grading. One winner per genre per run.
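The gate and tie-break can be sketched as a small selection function. The `regrade` callback stands in for the head-to-head re-grading, whose actual mechanics the post does not describe; `pick_winner` and its signature are hypothetical.

```python
from typing import Callable, Optional


def pick_winner(
    scores: dict[str, int],
    regrade: Callable[[list[str]], str],
    gate: int = 9,
) -> Optional[str]:
    """Return the single winning clip for a genre, or None if nothing clears the gate."""
    survivors = {clip: s for clip, s in scores.items() if s >= gate}
    if not survivors:
        return None  # nothing ships this run
    best = max(survivors.values())
    tied = sorted(clip for clip, s in survivors.items() if s == best)
    if len(tied) == 1:
        return tied[0]
    # Several clips share the top score: defer to a head-to-head re-grade.
    return regrade(tied)


winner = pick_winner(
    {"clip_a": 9, "clip_b": 8, "clip_c": 9},
    regrade=lambda tied: tied[0],  # placeholder for the real re-grading step
)
```

Returning `None` when no clip reaches the gate keeps "ship nothing" as a valid outcome of a run, which is what makes the gate a real filter.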

Step 4: package the winner

Gemini listened to the winning MP3 and inferred mood, landscape, season, time of day, color palette. Nano Banana rendered a 1920×1080 still: the capybara plus a matching landscape. Gemini wrote a MiniMax image-to-video prompt with one rule: no camera move, ambient motion only. Clouds drifting, water rippling, grass swaying, mist, light. FFmpeg looped the 6-second clip to the MP3’s runtime, muxed the audio in, overlaid the title and the Kapiko mark in Pacifico, and produced an MP4.
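The loop-and-mux step can be approximated with a single FFmpeg invocation. This sketch only builds the argument list; the file names, codec choices, and the `ffmpeg_loop_cmd` helper are assumptions, and the title overlay (a `drawtext` filter pointing at a Pacifico font file) is left out.

```python
def ffmpeg_loop_cmd(clip: str, audio: str, out: str) -> list[str]:
    """Build an ffmpeg command that loops a short clip under a full-length track."""
    return [
        "ffmpeg", "-y",
        "-stream_loop", "-1", "-i", clip,  # input option: repeat the clip indefinitely
        "-i", audio,
        "-map", "0:v:0",                   # video from the looped clip
        "-map", "1:a:0",                   # audio from the MP3
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-shortest",                       # stop when the finite audio stream ends
        out,
    ]


cmd = ffmpeg_loop_cmd("loop.mp4", "track.mp3", "video.mp4")
# To execute: subprocess.run(cmd, check=True), with ffmpeg on PATH.
```

Because the looped video never ends on its own, `-shortest` is what ties the output length to the MP3's runtime.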

Step 5: distribute

YouTube auto-uploaded via the official API under kapiko-youtube-refresh-token, Music category, title pattern “Song Name [Single - Solo Piano] - kapiko”. DistroKid pushed to Spotify, Apple Music, and the long tail. That part stayed manual, not in the cron.
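A sketch of the metadata side of that upload. The dictionary follows the shape of the YouTube Data API `videos.insert` request body, where category ID "10" is Music. The helper names, the empty description, and the public privacy status are assumptions. The authenticated upload call itself is omitted.

```python
MUSIC_CATEGORY_ID = "10"  # YouTube's "Music" video category


def video_title(song: str, instrument: str) -> str:
    """Apply the channel's title pattern."""
    return f"{song} [Single - {instrument}] - kapiko"


def upload_body(song: str, instrument: str, description: str = "") -> dict:
    """Request body for a videos.insert call with part='snippet,status'."""
    return {
        "snippet": {
            "title": video_title(song, instrument),
            "description": description,
            "categoryId": MUSIC_CATEGORY_ID,
        },
        "status": {"privacyStatus": "public"},  # assumed; the post does not say
    }


body = upload_body("Porch Light Love", "Solo Piano")
```

Keeping the title pattern in one function means every upload in a cron run is named consistently, which matters once a channel has dozens of near-identical thumbnails.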

How Kapiko scored as a fully AI autonomous business

Barrier to entry: LOW

Low barrier sounds good. It is not. Suno plus a video model and anyone can ship. The same low barrier that lets me play lets every new operator compound on top of me weekly.

Channel durability: HIGH

It will take years for AI to displace YouTube as a surface. Video is the slowest substrate to be replaced.

Market size: HIGH

YouTube keeps growing. Ambient and functional audio is a durable, multi-billion-hour annual demand pool.

Product durability: HIGH

Once a track ships, it’s evergreen. The fireplace clip is still earning a decade in.

GTM / marketing: HIGH

High effort, not high leverage. You have to grind for subscribers. This is where it’s hard.

Business model: MEDIUM

1,000 subscribers to enter the creator program. Pennies per view until a sticky niche locks in.

The six dimensions are not Kapiko-specific. Every project I touch (Tabiji, Veracity, Kapiko) gets scored on the same rubric, so I can compare them honestly instead of falling in love with whichever idea I am currently working on. The thesis behind it: AI collapsed the cost of generation. It has not yet collapsed distribution. The arbitrage is the gap. Each probe tests one intersection. Kapiko was the music probe.

Three green. Two red. One amber. And the two reds are the same story told twice. Low barrier plus high go-to-market means anyone can play and getting found is the actual job. The rubric flagged both before I started. I underweighted them because I was excited about the generation side.
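The tally can be reproduced from the six ratings with one inference: for most dimensions a HIGH rating is good news, while for go-to-market HIGH means high effort and counts against the project. That color rule is read off the text, not a published spec, and the names below are illustrative.

```python
from collections import Counter

SCORECARD = {
    "Barrier to entry": "LOW",
    "Channel durability": "HIGH",
    "Market size": "HIGH",
    "Product durability": "HIGH",
    "GTM / marketing": "HIGH",
    "Business model": "MEDIUM",
}

# Dimensions where HIGH is bad news (high effort, not high leverage).
INVERTED = {"GTM / marketing"}

NORMAL = {"HIGH": "green", "MEDIUM": "amber", "LOW": "red"}
FLIPPED = {"HIGH": "red", "MEDIUM": "amber", "LOW": "green"}


def color(dimension: str, rating: str) -> str:
    """Map a rating to a traffic-light color, flipping inverted dimensions."""
    return (FLIPPED if dimension in INVERTED else NORMAL)[rating]


tally = Counter(color(dim, rating) for dim, rating in SCORECARD.items())
reds = [dim for dim, rating in SCORECARD.items() if color(dim, rating) == "red"]
```

Under that rule the two reds come out as barrier to entry and go-to-market, the pair the text calls the same story told twice.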

The genre is real

Two of those scorecard cells, channel durability and market size, aren’t hypothetical. Ambient and functional audio on YouTube and Spotify has been a quietly enormous category for over a decade. Exactly the evergreen surface where a one-operator AI-content business should work. Four receipts.

Trainers Warehouse meeting chime — 161K views, 17 years old, used in workshops and seminars to signal start, end, and attention. Not entertainment. A tool people play in a room of humans. The freshest reminder that the long tail in this space runs longer than anyone’s patience for it.

[Image: YouTube video of the Trainers Warehouse Meeting Chime sound, 161K views over 17 years.] Functional audio: 17 years old, still earning its keep one workshop at a time.

Complex’s 10-hour fireplace clip — Fireplace 10 hours full HD, 156M+ views, reportedly $1M+ in ad revenue over a decade. One upload. One channel. Set-it-and-forget-it incarnate.

[Image: Reddit r/passive_income post about Complex's 10-hour YouTube fireplace video reportedly earning the creator over $1 million.] One upload, ten years, reportedly $1M+. The category has receipts.

Relax My Dog — 15 HOURS of Deep Separation Anxiety Music for Dog Relaxation, 66M+ views, “helped 4 million dogs worldwide.” Animal-calming is a durable sub-niche I would not have guessed at without looking.

[Image: YouTube search results for dog separation anxiety music; top result has 66 million views.] Animal-calming audio: an entire sub-niche I’d never have surfaced without looking.

And Lofi Girl — the genre’s anchor, 15.8M subscribers on one animated loop, 407 videos in.

[Image: Lofi Girl YouTube channel header showing 15.8M subscribers and 407 videos.] Lofi Girl: the genre’s anchor tenant.

Patient operators in ambient and functional audio do quite well over time. The niches are non-obvious until you find them. So why did I stop?

Cause of death

Two killers. The wrapper, which made the cron impossible. And the marketing grind, which every channel in this category was always going to need anyway.

Suno has no public API. I reverse-engineered the site. The wrapper broke every 1-2 days: a new auth flow, a changed captcha shape, a moved endpoint, a response shape shifted by one field. One Saturday morning in week two, the auth flow shifted and every download in flight 401’d. I patched it before coffee. The next break came Sunday night.

I kept the crons running for about a week. Each break was small. Together they added up to a part-time job I never agreed to.

You cannot cron a business that needs babysitting every 36 hours.

The only configuration that works at this barrier to entry, where anyone with the same tools can compound on you weekly, is a daily cron. Without bulletproof automation, you’re a person operating a music label by hand, not a system shipping evergreen inventory.

Then the marketing grind. The receipts above say the genre is real, which also means it is crowded, and getting found is the actual job. I was unwilling to do the grind for a channel I was running as a research project. The wrapper is what made me stop. The grind is what would have made me stop anyway.

What’s reusable

The music channel was the surface. The taste-filter engine underneath is the artifact.

What I actually built is a feedback loop: generate many, score each candidate against genre-defining references, keep only what scores 9+. The references are the rubric. The 9/10 gate is the filter. The output medium is incidental.

That loop is portable. Point it at recipe-blog headers. The references become the food magazines that nailed the look: Bon Appétit, Cherry Bombe, Eater feature shots. Same loop: generate fifty, grade against those, keep the 9s. The loop does not care what it is grading. It needs a measurable quality ceiling and a reference set with enough signal.
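Stripped of the music, the loop is a few lines. This is a generic sketch, not the Kapiko code: `taste_filter`, its parameters, and the toy integer "candidates" are hypothetical stand-ins for a real generator and a real grader.

```python
import itertools
from typing import Callable, TypeVar

T = TypeVar("T")


def taste_filter(
    generate: Callable[[], T],
    grade: Callable[[T, list[T]], int],
    references: list[T],
    n: int = 50,
    gate: int = 9,
) -> list[T]:
    """Generate n candidates, grade each against the references, keep those at or above the gate."""
    candidates = [generate() for _ in range(n)]
    return [c for c in candidates if grade(c, references) >= gate]


# Toy demonstration: candidates are integers, references are "ideal" values,
# and the grade is 10 minus the distance to the nearest reference.
counter = itertools.count()
kept = taste_filter(
    generate=lambda: next(counter),
    grade=lambda c, refs: 10 - min(abs(c - r) for r in refs),
    references=[20, 40],
)
print(kept)  # [19, 20, 21, 39, 40, 41]
```

Swapping the medium means swapping `generate`, `grade`, and `references`; the gate and the loop stay as they are.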

The generator gets better every quarter. The grading rubric is what I get to own. That’s the durable piece.

Kapiko was a research project priced like a coffee tab. Ten days, $150, a working pipeline, and a clean answer to “can a single operator with a cron job and good taste run an ambient music channel?” Yes, technically. No, not while Suno’s API is a fiction and the marketing grind is the actual moat.

The scoring loop comes with me. The next probe points at an adjacent surface: same scorecard, different medium.

Generation is solved. Distribution is the job. The Suno API is still a fiction.