How I built a product video with AI

React components, not keyframes. Claude Code and Remotion turned 2 hours into 39 seconds of polished video.

After creating "The Open Window" with AI, I needed a hero video for SpecStory. This time, instead of traditional video editing, I built it programmatically with Remotion and Claude Code in just 2 hours. This is a very detailed step-by-step of how I had a conversation and produced a product video that would have otherwise been nearly impossible for me. I hope it inspires you to try your hand!🚀 Get the complete source code and chat historygithub.com/specstoryai/zero-to-product-video-hero

The Initial Vision

The finished video took shape in those 2 hours of back-and-forth, but the original plan was different. Here's what plan.md outlined:

## Total Duration: 35 seconds

### Scene 0: Medieval Monk Video (0-5 seconds)
**Visual:** Existing medieval monk video with text overlays
**Text Overlays:**
- "We've always preserved our intent" (1-3s)
- "Now, let AI preserve yours" (3-5s)
**Transition:** Fade to black with time warp effect

### Scene 1: Time Transition (5-7 seconds)
**Visual:** Swirling time vortex effect morphing medieval to modern
**Audio:** Whoosh/time travel sound effect
**Purpose:** Bridge the gap between past and present

### Scene 2: Modern Developer (7-11 seconds)
**Visual:** Developer coding with AI assistance, prompts floating away
**Text Overlay:** "But your AI prompts vanish..."
**Animation:** Show prompts detaching from code, floating into void
**Purpose:** Establish the problem

### Scene 3: Intent is the New Source Code (11-13 seconds)
**Visual:** Split screen - monk writing / developer coding
**Main Text:** "Intent is the new source code"
**Subtext:** "From medieval scribes to modern developers"
**Purpose:** Core message delivery

### Scene 4: SpecStory Solution (13-23 seconds)
**Visual:** Product demo showing extension in action
**Animation Sequence:**
1. Extension icon magnetically pulls prompts back (13-15s)
2. Prompts attach to code like puzzle pieces (15-17s)
3. Timeline shows prompt evolution (17-19s)
4. Team avatars appear, sharing context (19-21s)
5. Quick text overlays: "Auto-saves every prompt" • "Preserves your why" • "Share intent with your team" (21-23s)

### Scene 5: Call to Action (23-35 seconds)
**Visual:** Three extension logos orbiting central text
**Main Text:** "Get SpecStory Extensions"
**Subtext:** "Never lose your intent again"
**Counter:** Install counter ticking up
**End Card:** specstory.com

What actually happened was both simpler and more complex. Simpler because ambitious animations like "swirling time vortex effect" became a 4-second MP4 file. More complex because it did take 191 exchanges with Claude Code to discover what the video wanted to be.

Looking back at my SpecStory history (which I exported to help me write this recap), I noticed a lot of patterns. Here's a taste of what I detail in the feedback loop section:

  • "bigger logo" - said three times for the same logo until it was finally right
  • "sorry it should count up from 90%" - the immediate correction that made the stats feel more authentic
  • "a little" - appeared over 8 times, showing how progress happens in small nudges

Each piece of feedback shaped what came next. When I said "the zoom should be a little longer and not zoom back out," I was discovering what felt right through iteration.

The Timeline

Unlike my 10-hour journey with "The Open Window," this project was a focused sprint. The commit messages might not win any awards for clarity, but they capture the energy of the session - pure creative flow powered by Remotion.dev and a HEALTHY amount of yapping with Claude Code.

Git Commit Timeline - July 28, 2025 (1h 45m, 191 messages with Claude Code)

  • 14:41 - c14d13e (START)
  • 15:04 - 20bdfd0, 494cea2
  • 15:32 - 12b10d5
  • 15:57 - 2291b47
  • 16:06 - a8747ba
  • 16:26 - 09ac4a7 (SHIP IT! 🚀)

7 commits in 105 minutes, averaging 27.3 pieces of feedback per commit.

The Technology Stack

Instead of a traditional editing suite, this project combined programmatic animation with AI-generated assets:

  • Remotion (Remotion.dev) served as the foundation - a React-based programmatic video creation framework. Think of it as "React for videos," where each frame is just a component rendered at a specific time. It handled the entire video composition, scene transitions, and frame-perfect timing.
  • Claude Code was my tireless AI development partner, handling 191 pieces of feedback over 2 hours without a single complaint about my "make it bigger" obsession. It wrote all the Remotion code, debugged animations, and implemented every micro-adjustment. I leveraged Opus 4 extensively throughout this process; its enhanced reasoning capabilities made the conversational development flow seamless.
  • Gamma (Gamma.app) helped with initial character concept designs and visual exploration, allowing quick iteration on character styles before committing to final assets.
  • Midjourney generated the character animations, creating both the medieval scribe GIF and the modern developer illustration with a consistent artistic style.
  • Google Veo 3 generated the ethereal time tunnel transition video. When code-based transitions proved too complex, Veo 3 created the perfect 4-second bridge between medieval and modern scenes.
  • CapCut provided the final audio polish, adding Paganini's violin concerto and handling the subtle fade effects that make the video feel professional.

And about that music: Paganini's Violin Concerto No.2 in B minor. The dramatic violin perfectly matches the journey from medieval to modern, adding emotional weight to the technological transformation.

What is Remotion.dev?

Before we dive into the development process, let's talk about the magic that made this possible. Remotion is not your typical video editor - it's React for videos. Yes, you read that right. Videos built with React components.

Think of it this way: traditional video editing is like painting - you manipulate pixels on a timeline. Remotion is like programming - you write code that generates frames. Each frame is just a React component rendered at a specific point in time.

Here's how my SpecStory video maps to Remotion's core concepts:

// HeroVideo.tsx - The composition definition
<Composition
  id="HeroVideo"
  component={HeroAnimation}
  durationInFrames={1170}  // 39 seconds at 30fps
  fps={30}
  width={1920}
  height={1080}
/>

Those 1170 frames? Remotion rendered my React component 1170 times, incrementing the frame count each time. The genius is in how you use that frame count:

// Inside any component
const frame = useCurrentFrame();
const opacity = interpolate(frame, [0, 30], [0, 1]);

// Frame 0: opacity = 0 (invisible)
// Frame 15: opacity = 0.5 (half visible)  
// Frame 30: opacity = 1 (fully visible)

This is the fundamental shift: instead of keyframes and timelines, you have math and logic. Want something to fade in? Interpolate. Want complex sequences? Use conditional rendering based on frame ranges. Want to loop something? Modulo arithmetic.
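To make those three moves concrete, here's a minimal sketch of my own (illustrative, not code from the project) that fades in, switches content by frame range, and loops a pulse with modulo:

import React from 'react';
import { useCurrentFrame, interpolate } from 'remotion';

export const ConceptsDemo: React.FC = () => {
  const frame = useCurrentFrame();

  // Fade in: map frames 0-30 to opacity 0-1, clamped so it stays at 1
  const opacity = interpolate(frame, [0, 30], [0, 1], {
    extrapolateRight: 'clamp',
  });

  // Sequencing: conditional rendering based on frame ranges
  const showIntro = frame < 150;

  // Looping: modulo arithmetic restarts a 60-frame pulse forever
  const pulse = (frame % 60) / 60;

  return (
    <div style={{ opacity, transform: `scale(${1 + pulse * 0.05})` }}>
      {showIntro ? 'Intro text' : 'Main content'}
    </div>
  );
};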

My 39-second video was really just five React components (VideoIntro, TimeTransition, ModernDeveloper, IntentOverlay, BenefitsAndLogos) rendered in sequence, each knowing exactly when to appear and disappear based on the current frame number.

The beauty? Everything that works in React works here. CSS-in-JS, component composition, hooks, even npm packages. That VS Code interface in scene 3? It's just divs and CSS. The animated stats counter? A custom hook that increments based on frame progression.
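That counter hook is simple enough to sketch. Something like this works (useCountUp is my illustrative name, not necessarily what the project calls it):

import { useCurrentFrame, interpolate } from 'remotion';

// Counts from `from` to `to` over `durationInFrames`, starting at `startFrame`
const useCountUp = (from: number, to: number, startFrame: number, durationInFrames: number) => {
  const frame = useCurrentFrame();
  return Math.floor(
    interpolate(frame, [startFrame, startFrame + durationInFrames], [from, to], {
      extrapolateLeft: 'clamp',
      extrapolateRight: 'clamp',
    })
  );
};

// Usage: const installs = useCountUp(90795, 100883, 0, 90); // 90% → 100% over 3 seconds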

This is why building a video in 2 hours was possible - I wasn't learning video editing software, I was writing React. And Claude Code? It already knows React.

But here's what really makes Remotion special: it's not just about generating graphics. You can drop in real videos and audio files just like any other React asset. Remember that time tunnel transition? That was a 4-second MP4 file from Google Veo 3:

// Scene 2 - TimeTransition.tsx
<Video
  src={staticFile('time tunnel.mp4')}
  startFrom={60}  // Start 2 seconds into the video
  endAt={180}     // End at 6 seconds (4 seconds total)
/>

The medieval monk? An animated mp4 from Midjourney. The VS Code interface? Pure React components. Paganini's violin concerto? An MP3 file. Remotion doesn't care - it composites everything together, frame by frame.

This hybrid approach is the secret sauce. You get the precision of code for UI elements (perfect for product demos) combined with the richness of pre-rendered media (great for artistic elements). Traditional video editors make you choose between programmatic control or rich media. Remotion gives you both.

In my case, I used AI to generate the artistic assets (Midjourney for characters, Veo 3 for transitions), then used code to choreograph everything with pixel-perfect timing. The result? A video that looks professionally edited but was built like a web app.
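Here's roughly what that hybrid composition looks like in practice (the structure is illustrative; apart from time tunnel.mp4, the file names are placeholders):

import React from 'react';
import { AbsoluteFill, Audio, Sequence, Video, staticFile } from 'remotion';

// Stand-in for the pure-React VS Code interface
const CodeEditorMock: React.FC = () => <div style={{ color: 'white' }}>VS Code UI</div>;

export const HybridSketch: React.FC = () => (
  <AbsoluteFill style={{ backgroundColor: 'black' }}>
    <Sequence from={0} durationInFrames={150}>
      <Video src={staticFile('monk.mp4')} /> {/* pre-rendered AI asset */}
    </Sequence>
    <Sequence from={150} durationInFrames={120}>
      <Video src={staticFile('time tunnel.mp4')} /> {/* Veo 3 transition */}
    </Sequence>
    <Sequence from={270}>
      <CodeEditorMock /> {/* code-built UI */}
    </Sequence>
    {/* Music could be composited here too, though I added mine later in CapCut */}
    <Audio src={staticFile('paganini.mp3')} />
  </AbsoluteFill>
);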

The Development Process

If you've never seen Remotion's dev environment, it looks like a lot of other video editors. But under the hood, it's fundamentally different - you're not manipulating a timeline, you're writing React components that render video frames.

Remotion Studio showing video editing with React DevTools

That's React DevTools... debugging a video. Each component render is a frame. Change a prop, the video updates. Add a console.log, watch it fire 30 times per second. This is what makes Remotion special:

  • Parametric videos - Pass data to your video like props to a component. Want to generate 1000 personalized videos? Just map over your data (see the sketch after this list).
  • Real MP4 output - Not a screen recording or browser capture. Actual video files rendered frame-by-frame at any resolution.
  • Hot reload everything - Change code, see video update instantly. No rendering queue, no waiting. It's like developing a website but for video.
  • Full React ecosystem - Use any npm package, CSS-in-JS, hooks, context, even Redux if you're into that. If it works in React, it works in your video.
  • Deterministic rendering - Frame 420 always looks the same. No timing issues, no async weirdness. Perfect for version control and collaboration.
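The parametric-video point deserves its own sketch (component, props, and file names here are mine, purely illustrative):

import React from 'react';
import { Composition } from 'remotion';

type GreetingProps = { name: string };

const Greeting: React.FC<GreetingProps> = ({ name }) => (
  <h1 style={{ fontSize: 120 }}>Hello, {name}!</h1>
);

export const Root: React.FC = () => (
  <Composition
    id="Greeting"
    component={Greeting}
    durationInFrames={90}
    fps={30}
    width={1920}
    height={1080}
    defaultProps={{ name: 'SpecStory' }}
  />
);

// Then render one video per row of your data:
// npx remotion render Greeting out/ada.mp4 --props='{"name":"Ada"}'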

The workflow becomes addictive: Write component → See video update → Realize it's 2 frames too fast → Adjust interpolation → See video update → Notice logo is too small → Adjust size → Repeat 191 times. Each iteration takes seconds, not minutes.

What really sets it apart is the mental model shift. Traditional video editing is destructive - you cut, you transform, you export. Remotion is declarative - you describe what frame 300 should look like, and it renders it. Every time. Perfectly.

This approach enabled something crucial for my SpecStory video: I could treat animations as data. The stats counter that increments from 90% to 99%? It's just interpolate(frame, [0, 60], [90, 99]). The logo that fades in? opacity: interpolate(frame, [30, 60], [0, 1]). When Claude Code suggested "make it count up from 90%," it was a one-line change.

Creating the Assets

Before I could animate anything, I needed the raw materials. After some initial trial and error, I decided it would be fun to make parts of our website, specstory.com, come alive. I had previously created both the scribe and the woman using gamma.app, so I figured I'd give animating them a whirl!

The Medieval Scribe

The scribe didn't actually take that long to animate. You can see I used a simple prompt, and after two generations I picked the one that achieved the pan in the direction I thought made the most sense.

Midjourney interface showing medieval monk generation attempts

The Modern Developer

Getting a modern developer who didn't look like she time-traveled was harder than expected. The first attempts produced characters that looked distinctly out of place next to the VS Code interface. After switching to Midjourney's character reference feature, I finally got someone who looked like she belonged in 2025.

Midjourney attempts at creating a modern developer character

The Time Tunnel

The original plan called for "swirling time vortex effect morphing medieval to modern." After 15 minutes trying to code particle effects and bezier curves in Remotion, I realized I could use those failed attempts to create something better. I took my code experiments—the particle physics, the bezier curve calculations—and transformed them into a detailed prompt for Google Flow (Veo 3).

"Time Tunnel Effect - 20 concentric rings that move through 3D space, creating a tunnel/vortex effect. Rings alternate between medieval candleglow and modern monitor glow colors, scaling from small to large as they approach the viewer's POV. A perspective clock faces in the center with: Hour markers around the perimeter - Two rotating hands at different speeds (fast hand: 2 full rotations, slow hand: 1 rotation) - Glowing center dot - Medieval and modern colored hands. Light Streaks - 6 angled light beams radiating outward from the center, with gradient opacity creating a streaming effect. Flash Effect - A white radial flash at the midpoint (frames 28-36) for dramatic impact. Animation Sequence: Frames 0-15: Fade in from black - Frames 0-60: Tunnel rings continuously flow outward, clock hands rotate - Frames 28-36: White flash peaks at frame 32 - Frames 45-60: Fade out to black. The overall effect suggests traveling through time, with the clock as the focal point and the tunnel creating a sense of motion and transformation between eras."

Google Veo 3 time tunnel transition generation

My failed code attempts weren't wasted—they became the blueprint for exactly what I needed. Google Flow generated a perfect 4-second transition that would have taken hours to code.

Scene-by-Scene Breakdown

Let me walk you through what each scene actually does and the journey of building it:

Scene 1: VideoIntro (5 seconds)

Scene 1: VideoIntro - Medieval monk with text overlays
Opens on the medieval monk with two lines of text. But getting those lines right took much longer than I expected. Originally they appeared word-by-word. User feedback: "drop vertically not twist." Then: "all text should always be on one line." The final version has each complete line fade in, hold for 2.5 seconds, then drop away.
The logo appears with the second line at exactly 600px width because, yes, I told Claude to make it bigger three separate times. First it was just "bigger logo," then "the specstory logo at the beginning should be a little bigger," and finally "should be as big as it is at the end."
// Text timing after multiple iterations:
// "We've always preserved our intent" - frames 15-90
// "What makes today any different?" - frames 90-150
// Logo at 600px width - frame 85 onward
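Reconstructing that behavior from the timings above, the shape of it looks roughly like this (my sketch of the fade-hold-drop, not the project's exact code):

import { Easing, interpolate, useCurrentFrame } from 'remotion';

// Inside the text line's component
const frame = useCurrentFrame();

// Fade in, hold, then fade as the line drops away (illustrative timings)
const lineOpacity = interpolate(frame, [15, 30, 75, 90], [0, 1, 1, 0], {
  extrapolateLeft: 'clamp',
  extrapolateRight: 'clamp',
});

// "Drop vertically not twist": translateY only, accelerating downward
const dropY = interpolate(frame, [75, 90], [0, 150], {
  extrapolateLeft: 'clamp',
  extrapolateRight: 'clamp',
  easing: Easing.in(Easing.quad),
});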

Scene 2: TimeTransition (4 seconds)

Scene 2: TimeTransition - Time tunnel effect
This was supposed to be a complex coded animation. The original plan called for "swirling time vortex effect morphing medieval to modern." After 15 minutes trying to code particle effects and bezier curves in Remotion, I made the pragmatic choice. Google Veo 3: "ethereal time tunnel transition, 4 seconds." Done. Sometimes the best code is no code.
The reality: <Video src={staticFile('time tunnel.mp4')} />.

Scene 3: ModernDeveloper (7 seconds)

Scene 3: ModernDeveloper - VS Code interface with AI interaction
The meat of the demo. A working VS Code interface built entirely in React. The sequence had to be precise:
  1. Claude panel appears (5-15% progress)
  2. User prompt: "Build an app together"
  3. Accept button appears and gets clicked
  4. Code streams in at 40% progress
  5. SpecStory history appears at 75% progress
  6. Zoom effect highlights the .specstory folder
The zoom effect on the .specstory folder went through multiple iterations - first it was "choppy," then "too fast," then "should be a little longer and not zoom back out." The final version uses a Bezier curve (0.25, 0.1, 0.25, 1) for that smooth, professional feel.
transform: `translate(-50%, -50%) scale(${interpolate(
  frame,
  [110, 170],
  [1, 1.8],
  {
    extrapolateLeft: 'clamp',
    extrapolateRight: 'clamp',
    easing: Easing.bezier(0.25, 0.1, 0.25, 1)
  }
)}) translateX(${interpolate(
  frame,
  [110, 170],
  [0, 400],
  { /* same options */ }
)}px)`
The notification "SpecStory: AI Conversation saved to history" moved around the screen like it was looking for a home. Started centered, then "at the top of the screen," then "a little bit bigger and more towards the bottom." Finally landed at 75% height with 2xl font size.
One crucial detail: "the cursor that appears as the code is generated should be near the gutter" - positioning it at exactly `left: 60px` made the difference between artificial and authentic.
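Under the hood, that sequencing is just progress thresholds. A simplified sketch of the gating logic (threshold values from the list above; variable names are mine):

import { useCurrentFrame, useVideoConfig } from 'remotion';

const frame = useCurrentFrame();
const { durationInFrames } = useVideoConfig();

// Normalize the frame to 0-1 progress through the scene
const progress = frame / durationInFrames;

const showClaudePanel = progress >= 0.05; // panel slides in at 5-15%
const streamCode = progress >= 0.4;       // code starts streaming at 40%
const showHistory = progress >= 0.75;     // .specstory history appears at 75%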

Scene 4: IntentOverlay (5 seconds)

Scene 4: IntentOverlay - Intent is the new source code message
"Intent is the new source code" streams in over modern.mp4 background. The subtitle positioning was crucial - "You don't write prompts. You author intent." needed to be "closer up" to the main text. These 5 seconds establish the entire philosophical premise. Every word matters here.

Scene 5: BenefitsAndLogos (18 seconds)

Scene 5: BenefitsAndLogos - Benefits list, IDE logos, and call to action
The longest scene, packed with information. Benefits appear with spring animations, logos fade in on white backgrounds (learned that one the hard way with transparent PNGs), and those stats counters... well, they have their own story. Originally planned for 13 seconds, it grew to 18 as we kept adding "just one more thing."
The benefits text went from tiny to "the checkmark bullets and text should be centered and bigger." The logos needed white backgrounds because transparent on dark looked amateur. And the final CTA link? "Make the link size bigger than enhance your AI workflow today" - priorities were clear.
The stats counter at the end went through 10 distinct refinements. The "sorry it should count up from 90%" correction was revealing - starting from 10% felt fake, but 90% suggested the numbers were already substantial and growing. The final implementation used proportional rates with subtle randomness:
const stats = [
  { value: 100883, label: 'Installs', rate: 10 },
  { value: 4294574, label: 'Sessions saved', rate: 1 },
  { value: 175916, label: 'Rules generated', rate: 5 }
];

// For each stat (i is its index), count up from 90% of the final value
const startValue = Math.floor(stat.value * 0.9);

// Add randomness for a dynamic counting effect
const randomOffset = Math.sin(frame * 0.3 + i * Math.PI) * 0.001;

// Increment based on value magnitude, nudged by the randomness
const incrementPerFrame = stat.value * 0.02 / 90 * stat.rate;
const displayValue = startValue + Math.floor(frame * (1 + randomOffset) * incrementPerFrame);
After 2 hours of intense iteration, the video was complete. The final render command that brought it all together:
npx remotion render src/index.tsx HeroVideo out/HeroVideo.mp4 \
  --quality=100 \
  --fps=30 \
  --crf=16 \
  --pixel-format=yuv444p
1170 frames of carefully orchestrated animation, compressed into a 39-second journey from medieval scribes to modern developers.

Notable Challenges

Unlike the character-continuity issues I detailed in The Open Window, building programmatically surfaces a different set of issues to work through. Personally, I'd take these any day over the stochastic nature of video gen for complex sequences (despite my burgeoning love for meta prompting).

1. Compilation Errors

Remotion's interpolation system is powerful but unforgiving: interpolate() only works with numbers, so passing color strings throws an error. The fix, as I discovered, is to switch from interpolating color strings to conditional logic, or to interpolate a numeric value like opacity:

// ❌ Wrong - This throws an error
background: interpolate(frame, [0, 60], ['#000000', '#ffffff'])

// ✅ Correct - Use conditional logic instead
background: frame > 60 ? '#ffffff' : '#000000'

// ✅ Or interpolate opacity
background: `rgba(255, 255, 255, ${interpolate(frame, [0, 60], [0, 1])})`
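For completeness: Remotion also ships a dedicated interpolateColors() helper for exactly this case, which I could have reached for instead (check the docs for the version you're on):

// Alternative: Remotion's color-aware interpolation helper
import { interpolateColors, useCurrentFrame } from 'remotion';

const frame = useCurrentFrame();
const background = interpolateColors(frame, [0, 60], ['#000000', '#ffffff']);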

Other technical hurdles included React closing tag mismatches (multiple instances of `</AbsoluteFull>` instead of `</AbsoluteFill>`) and missing imports (Easing wasn't imported initially). The lesson: even AI needs to sweat the syntax details.

2. Visual Hierarchy Battles

"Make it bigger" became my mantra. The feedback was relentless: "Its all cut off and needs to LOOK MUCH BETTER and more modern and the logos should be prominent." One of my favorite moments - changing "why" to "WHY" in "Save the WHY behind AI codegen" - a single capitalization that transformed the message's impact.

3. Animation Timing Perfection

The zoom animation on the VS Code file explorer went through multiple iterations. Ultimately, bezier-curve easing (with Opus 4's help) and careful interpolation timing did the trick; my feedback was that the zoom needed to be "a little longer and not zoom back out." The implementation is the same transform shown in the Scene 3 breakdown above: the Bezier curve (0.25, 0.1, 0.25, 1) creates that smooth, professional feel, and the animation was extended from 30 to 60 frames after the "choppy" feedback.

The Feedback Loop

191 pieces of feedback in 105 minutes. That's one adjustment every 33 seconds. The patterns that emerged tell me that 1) I have a lot of work to do to communicate visual intent better but 2) despite that, you actually can get a lot done with this type of phrasing!

What follows is an analysis of my SpecStory history. I thought it would be fun to "meta analyze" how and what I said to make this video come to life.

Most Frequent Feedback Types

| Category | Example Feedback | Frequency |
| --- | --- | --- |
| Size Adjustments | "bigger logo", "make the link size bigger", "a little bigger" | 23 times |
| Timing/Speed | "the should count a little slower", "a bit faster", "longer zoom" | 18 times |
| Positioning | "closer up", "near the gutter", "at the top of the screen" | 15 times |
| Text Formatting | "WHY should be all caps", "on one line", "drop vertically not twist" | 12 times |
| Course Corrections | "sorry it should count up from 90%", "actually make it..." | 8 times |

The Conversational Arc

The feedback process follows a natural rhythm. The discovery phase begins when the initial implementation reveals what doesn't feel right—like when I saw the design scene and immediately knew: "Its all cut off and needs to LOOK MUCH BETTER". That moment kicks off the conversation.
Then comes the refinement loop, where quick iterations with immediate visual feedback create a dialogue: "bigger" leads to seeing the result, which prompts "a little bigger", until finally reaching "MUCH BETTER". Each exchange builds on the last, creating momentum rather than frustration.
The cascade effect emerges naturally: fixing one issue reveals the next. The zoom animation gets fixed, which makes the text look too small, which when corrected reveals timing issues. The design reveals itself through interaction.

The Language of "A Little"

The word "little" appeared in over eight feedback messages. This isn't indecision—it's precision through iteration. When working visually, "a little" becomes the perfect unit of measurement because it's safe to experiment without breaking anything, acknowledges you're close to the solution, leaves room for overcorrection, and matches natural human visual processing.
The progression reveals itself: "make it bigger" starts too vague, so we move to "make it a little bigger" for testing, then "a little bigger" for refining, until we reach "perfect" or "MUCH BETTER"—success through small steps rather than giant leaps.

One moment captures the humanity in this process: "sorry it should count up from 90%". That "sorry" wasn't for Claude Code—it was me correcting my own thought mid-stream. The difference between 10% and 90% wasn't technical; it was psychological. Starting at 90% made the numbers feel established and growing, not struggling up from zero.

Remotion Patterns That Emerged

Through 191 iterations, certain patterns crystallized. These aren't just code snippets—they're the visual language that emerged from conversational development:

1. The Fade Pattern: Every Scene's Foundation

The most basic yet crucial pattern. Every scene uses opacity interpolation with clamping to prevent values outside 0-1. This appeared in Scene 1 (VideoIntro) for text reveals, Scene 4 (IntentOverlay) for the streaming text, and literally every scene transition.
// The universal fade - used 30+ times across all scenes
const fadeIn = interpolate(frame, [0, 30], [0, 1], {
  extrapolateLeft: 'clamp',
  extrapolateRight: 'clamp'
});

// Complex multi-phase fade in BenefitsAndLogos.tsx
const benefitsOpacity = interpolate(
  frame,
  [BENEFITS_START, BENEFITS_START + 30, LOGOS_START - 30, LOGOS_START],
  [0, 1, 1, 0], // Fade in, hold, fade out
  { extrapolateLeft: 'clamp', extrapolateRight: 'clamp' }
);

2. The Spring Pattern: Making Things Feel Alive

Springs add that "designed" feel. After feedback like "it needs to drop vertically not twist", springs became the go-to for entrance animations. The medieval text in Scene 1 drops with gravity using springs. The logos in Scene 5 bounce in with identical spring configs for consistency.
// Text entrance with gravity (VideoIntro.tsx)
const firstTextSpring = spring({
  fps,
  frame: frame - 15,
  config: { damping: 100, stiffness: 200 }
});

// Logo stagger animation (BenefitsAndLogos.tsx)
logos.map((logo, i) => {
  const logoSpring = spring({
    fps,
    frame: frame - LOGOS_START - 20 - i * 10, // 10 frame stagger
    config: { damping: 100, stiffness: 200 } // Same config = cohesive feel
  });
});

3. The Bezier Zoom: Professional Polish

The zoom on the .specstory folder in Scene 3 (ModernDeveloper) went through multiple iterations. "Choppy" became "too fast" became "should be a little longer and not zoom back out." The final Bezier curve (0.25, 0.1, 0.25, 1) creates that smooth deceleration that screams "professional."
// The zoom that took 6 iterations to perfect (ModernDeveloper.tsx)
transform: `translate(-50%, -50%) scale(${interpolate(
  frame,
  [110, 170], // Extended from 30 to 60 frames after "choppy" feedback
  [1, 1.8],
  {
    extrapolateLeft: 'clamp',
    extrapolateRight: 'clamp',
    easing: Easing.bezier(0.25, 0.1, 0.25, 1) // The magic curve
  }
)}) translateX(${interpolate(
  frame,
  [110, 170],
  [0, 400], // Shift right to center the folder
  { /* same options */ }
)}px)`

4. The Streaming Text Pattern: Building Anticipation

Character-by-character reveals create tension. Used in Scene 4 (IntentOverlay) for "Intent is the new source code" and Scene 3 for code generation. The blinking cursor uses sine waves for organic pulsing.
// Text streaming with cursor (IntentOverlay.tsx)
const textToStream = "Intent is the new source code";
const charsToShow = Math.floor(interpolate(
  frame,
  [20, 80],
  [0, textToStream.length],
  { extrapolateLeft: 'clamp', extrapolateRight: 'clamp' }
));

// Organic blinking cursor
<span style={{
  opacity: Math.sin(frame * 0.3) * 0.5 + 0.5, // Sine wave for smooth pulse
  color: colors.modern.monitorGlow
}}>
  {charsToShow < textToStream.length ? '|' : ''}
</span>

5. The Stagger Pattern: Choreographed Sequences

Nothing appears all at once. Benefits in Scene 5 stagger by 20 frames. Logos stagger by 10. Code lines in Scene 3 stream with calculated delays. This creates visual rhythm.
// Benefits stagger (20 frame delays)
benefits.map((benefit, i) => {
  const benefitProgress = interpolate(
    frame,
    [40 + i * 20, 55 + i * 20], // Each benefit 20 frames after previous
    [0, 1],
    { extrapolateLeft: 'clamp', extrapolateRight: 'clamp' }
  );
});

// IDE code streaming (calculated delays)
const visibleLines = codeLines.filter((_, index) => {
  const lineDelay = index * 3; // 3 frames per line
  return frame > CODE_STREAM_START * duration + lineDelay;
});

6. The Phase Management Pattern: Complex Choreography

Scene 5 (BenefitsAndLogos) is 18 seconds with three distinct phases. Constants define boundaries, making timing adjustments surgical rather than chaotic.
// Scene phase constants (BenefitsAndLogos.tsx)
const BENEFITS_START = 0;      // frames 0-210 (7 seconds)
const LOGOS_START = 210;       // frames 210-330 (4 seconds)  
const CTA_START = 330;         // frames 330-540 (7 seconds)

// Clean phase transitions
const showBenefits = frame >= BENEFITS_START && frame < LOGOS_START;
const showLogos = frame >= LOGOS_START && frame < CTA_START;
const showCTA = frame >= CTA_START;

7. The Counter Pattern: Numbers That Feel Real

The stats counter evolution in Scene 5 embodies the entire feedback process. "Count up from 90%" made them feel established. Rate-based increments with slight randomness prevent mechanical uniformity.
// The stats counter that took 10 iterations (BenefitsAndLogos.tsx)
const stats = [
  { value: 100883, label: 'Installs', rate: 10 },
  { value: 4294574, label: 'Sessions saved', rate: 1 },
  { value: 175916, label: 'Rules generated', rate: 5 }
];

// Start from 90% as per feedback
const startValue = Math.floor(stat.value * 0.9);
const incrementPerFrame = stat.value * 0.02 / 90 * stat.rate;

// Add organic randomness so the counters don't tick in lockstep
const randomOffset = Math.sin(frame * 0.3 + i * Math.PI) * 0.001;
const displayValue = startValue + Math.floor(frame * (1 + randomOffset) * incrementPerFrame);

These patterns didn't emerge from planning—they evolved through feedback. Each "make it bigger" and "a little slower" shaped not just individual animations but the entire visual language. The consistency across scenes? That's what 191 iterations of "MUCH BETTER" looks like in code.

Finishing in CapCut

Remotion outputs a video file. A very nice video file. But it's missing something - soul.

CapCut is where I added Paganini's Violin Concerto No.2 in B minor. Not AI-generated - sometimes you need the classics. The dramatic violin swells perfectly match the journey from medieval to modern. Added subtle fades between scenes, adjusted audio levels so the music doesn't overpower, exported at high quality.

CapCut editing interface showing the SpecStory video timeline

The irony of using ByteDance's consumer video editor to polish a programmatically-generated video isn't lost on me. But CapCut just works. Drag in the video, drag in the audio, line them up, add a few fades, export. Done in 10 minutes.

The music choice mattered. I tried electronic tracks, ambient soundscapes, even generated some with Suno. Nothing worked. Then I remembered Paganini - virtuosic, classical, timeless. It bridged the medieval opening to the modern tech demo perfectly. Sometimes AI isn't the answer.

Reflections on Agentic Development

Building a video through conversation is different. With Remotion and Claude Code, I could:

  • See changes instantly instead of waiting for renders
  • Version control the video like any other code
  • Make precise adjustments with natural language
  • Iterate faster than I could form complete thoughts

What worked:

  • Natural language for incremental adjustments. "Make it bigger" is valid feedback when the system understands context.
  • Spatial descriptions over pixel values. "Near the gutter" meant more than "left: 60px" until we figured out what near the gutter actually meant.
  • Quick corrections mid-stream. That "sorry it should count up from 90%" moment showed how conversational development handles human indecision.

What didn't:

  • Complex animations through words. The time vortex became an MP4 for a reason.
  • Precise timing descriptions. "A little longer" could mean 2 frames or 20. Context helped, but not always.
  • Color descriptions. Easier to just try different values than explain the exact shade of blue.

The surprising part? "Make it bigger" was legitimate feedback. In traditional development, you'd specify sizes upfront. Here, you discover the right size by seeing what's wrong with the current one. Those three attempts to get the logo size right weren't failures - they were the process working as intended.

What's Next?

This video took 2 hours and 191 pieces of feedback. The next one might take 1 hour and 100. Or maybe 4 hours and 400 if it's more complex. The point isn't efficiency - it's possibility.

Remotion keeps improving. Claude Code gets better at understanding visual intent. The tools will evolve. But the fundamental shift has already happened: we can now build videos the way we build websites - iteratively, programmatically, conversationally.

For anyone building products or creating content: try talking to your tools. You might be surprised by what emerges from the conversation.

Want to try it yourself?

The entire codebase is available, and Remotion has excellent documentation. Start with a simple animation, add an AI conversation partner, and see where 2 hours takes you. You might be surprised by what you can build when you stop coding and start conversing.

Embrace the "so dope" energy. Your commit messages don't need to be perfect—they just need to capture the momentum. When you're in flow with an AI partner, ride the wave. You can always clean up the git history later (but why would you want to?).