# Animation & Production Scripts — MHP Data Engineer Masterclass 2026

**One-stop production document**: voiceover scripts + LibTV CLI commands + image/video prompts + asset checklist for all module intro videos.

**Output files**: `workshop-2026-v2/media/modules/mod-XX-*.mp4`

**Delivery spec**: 1920×1080 · 30 fps · H.264 · English voiceover + English burned-in subtitles · **All on-screen text in English only**

**Related docs**:

- [module-shot-plan.md](module-shot-plan.md) — per-shot editorial table (narrative beat, VO speaker, duration, strategy)
- [reflection-prompts.md](../trainer/reflection-prompts.md) — questions to ask immediately after each video
- [animation-style-guide.md](animation-style-guide.md) — brand colors, character specs, delivery requirements
- [TRAINING_MATERIAL_MIGRATION_PLAN.md](../../TRAINING_MATERIAL_MIGRATION_PLAN.md) — story arc and Priya / Power BI thread

---

## 1. Production Setup

### 1.1 Model Selection (LibTV)

All generation uses the **`libtv` CLI**. Model choices are fixed for cost/quality optimization.

| Role | Model Key | Display Name | CLI `-s model=` | Notes |
|------|-----------|--------------|-----------------|-------|
| **Video generation** | `star-video2` | Seedance 2.0 VIP | `-s model=star-video2` | VIP-only, ~2min/clip, 15s audio-video sync, cost-efficient |
| **Storyboard keyframes** | `nebula-ultra` | Lib Navo Pro | `-s model=nebula-ultra` | Best consistency for 16:9 keyframes, 2K/4K support |
| **Script node (archive prompt)** | `aurora-3-prime` | Aurora 3 Prime | `-s model=aurora-3-prime` | Master prompt stored on script node; **production rows** come from `configs/mod-XX.json` (see §2.8), not `--run` by default |
| **Text with English** | `lib-image` | Lib Image | `-s model=lib-image` | Title cards and scenes requiring clear English text rendering |
| **Quick iteration** | `libnavo-2` | LibNavo 2 | `-s model=libnavo-2` | Fast previews, chapter cards |

**Video model parameters** (`star-video2`):

| CLI flag | Value | Description |
|----------|-------|-------------|
| `-s ratio=16:9` | `16:9` | Aspect ratio (all modules) |
| `-s resolution=720p` | `720p` | Resolution |
| `-s duration=N` | `4`–`15` | Duration in seconds (match scene timing) |
| `-s modeType=singleImage2video` | `singleImage2video` | Image-to-video mode (default for storyboard → video) |
| `-s enableSound=off` | `off` | Disable built-in audio (voiceover uploaded separately per module) |

**Image model parameters** (`nebula-ultra`):

| CLI flag | Value | Description |
|----------|-------|-------------|
| `-s ratio=16:9` | `16:9` | Aspect ratio |
| `-s quality=2K` | `2K` | Resolution quality |

### 1.2 Logo Assets

Use the local logos in `shared/assets/images/logos/` (official SVGs). Only the YellowLine NYC custom brand logo needs to be created manually.

| Logo | Local file | Fallback download | Used in modules |
|------|-----------|-------------------|-----------------|
| **Databricks** | `shared/assets/images/logos/logo_databricks_01.svg` | `vectorlogo.zone/logos/databricks/` | 0, 1, 2, 4, 5, 6, 7, 9 |
| **Snowflake** | `shared/assets/images/logos/logo_snowflake_01.svg` | `vectorlogo.zone/logos/snowflake/` | 0, 1, 3, 4, 5, 6, 7, 9 |
| **dbt** | `shared/assets/images/logos/logo_dbt.svg` | `vectorlogo.zone/logos/getdbt/` | 0, 1, 4, 5, 7, 9 |
| **Apache Spark** | `shared/assets/images/logos/logo_apache_spark.svg` | `vectorlogo.zone/logos/apache_spark/` | 2, 9 |
| **Power BI** | `shared/assets/images/logos/logo_powerbi.svg` | `vectorlogo.zone/logos/microsoft_power_bi/` | 0, 1, 2, 3, 4, 7, 8, 9 |
| **Apache Kafka** | `shared/assets/images/logos/logo_apache_kafka.svg` | `vectorlogo.zone/logos/apache_kafka/` | 8 |
| **GitHub Actions** | `shared/assets/images/logos/logo_github.svg` | `vectorlogo.zone/logos/github/` | 5 |
| **Azure** | `shared/assets/images/logos/logo_azure_01.svg` | `vectorlogo.zone/logos/microsoft_azure/` | 2, 3, 5 |
| **Python** | `shared/assets/images/logos/logo_python.svg` | `vectorlogo.zone/logos/python/` | 2, 9 |
| **MLflow** | `shared/assets/images/logos/logo_mlflow.svg` | `vectorlogo.zone/logos/mlflow/` | 9 |
| **AWS** | `shared/assets/images/logos/logo_aws.svg` | `vectorlogo.zone/logos/amazon_aws/` | 0, 1 (eval matrix) |
| **Cloudera** | `shared/assets/images/logos/logo_cloudera.svg` | — | 0, 1 (eval matrix) |

**Custom logos / brand assets**:

| Logo | Local file | Specification | Used in |
|------|-----------|--------------|---------|
| **MHP** | `shared/assets/images/mhp-logo.png` | Navy blue `#01065c`, company brand guidelines | 0, 1, 7 |
| **YellowLine NYC** | `shared/assets/images/logos/png/logo-yellowline.png` | Yellow `#FBBF24`, navy `#01065c`, taxi shield badge | 0, 1, 7 |

**Upload logos to LibTV**:

```bash
# Upload each logo as an image node for reuse across modules
libtv upload "logo-databricks" -f ./shared/assets/images/logos/logo_databricks_01.svg
libtv upload "logo-snowflake" -f ./shared/assets/images/logos/logo_snowflake_01.svg
libtv upload "logo-dbt" -f ./shared/assets/images/logos/logo_dbt.svg
libtv upload "logo-powerbi" -f ./shared/assets/images/logos/logo_powerbi.svg
libtv upload "logo-mhp" -f ./shared/assets/images/mhp-logo.png
libtv upload "logo-spark" -f ./shared/assets/images/logos/logo_apache_spark.svg
libtv upload "logo-azure" -f ./shared/assets/images/logos/logo_azure_01.svg
libtv upload "logo-python" -f ./shared/assets/images/logos/logo_python.svg
libtv upload "logo-kafka" -f ./shared/assets/images/logos/logo_apache_kafka.svg
libtv upload "logo-github" -f ./shared/assets/images/logos/logo_github.svg
libtv upload "logo-mlflow" -f ./shared/assets/images/logos/logo_mlflow.svg
libtv upload "logo-yellowline" -f ./shared/assets/images/logos/png/logo-yellowline.png
```
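
If the upload list is maintained as data rather than as hand-typed commands, the `libtv upload` lines can be generated from it. This is a minimal sketch that only prints the commands (it does not run them); `LOGO_FILES` and `uploadCommands` are hypothetical names, and the manifest holds just three of the entries from the table above.

```javascript
// Node name -> local file, copied from a subset of the logo table above.
const LOGO_FILES = {
  "logo-databricks": "./shared/assets/images/logos/logo_databricks_01.svg",
  "logo-snowflake": "./shared/assets/images/logos/logo_snowflake_01.svg",
  "logo-dbt": "./shared/assets/images/logos/logo_dbt.svg",
};

// Hypothetical helper: build one `libtv upload` line per manifest entry,
// rejecting node names that do not use the `logo-` prefix.
function uploadCommands(manifest) {
  return Object.entries(manifest).map(([nodeName, file]) => {
    if (!nodeName.startsWith("logo-")) {
      throw new Error(`logo node names must start with "logo-": ${nodeName}`);
    }
    return `libtv upload "${nodeName}" -f ${file}`;
  });
}

console.log(uploadCommands(LOGO_FILES).join("\n"));
```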

### 1.3 Character References

Six characters drawn in an **American animation** flat 2D corporate style, set in **MHP blue-white** scenes (see [animation-style-guide.md](animation-style-guide.md)). Garment colors are set per character and are not all navy/yellow. Generate the characters via the LibTV CLI; see [Section 2.3](#23-character-generation-workflow).

| Character | Role | Skin Tone | Modules | Lower-third text | Voice (edge-tts) |
|-----------|------|-----------|---------|-----------------|-----------------|
| **Marcus Chen** | Operations Manager · YellowLine NYC | warm beige | 0, 3–9 | `Marcus Chen · Operations Manager` | `en-US-ChristopherNeural` |
| **Elena Vasquez** | Data Architect · MHP | fair warm | 0–2, 4–9 | `Elena Vasquez · Data Architect` | `en-US-JennyNeural` |
| **Bob Muller** | Junior Data Engineer · MHP | fair | 0–2, 4–9 | `Bob Muller · Junior Data Engineer` | `en-US-GuyNeural` |
| **Sofia Alvarez** | Senior Data Engineer · MHP | fair warm | 0, 2–3, 8–9 | `Sofia Alvarez · Senior Data Engineer` | `en-US-EmmaNeural` |
| **Priya Sharma** | BI Analyst · MHP | light brown | 0–4, 6–9 | `Priya Sharma · BI Analyst` | `en-US-AriaNeural` |
| **James Okonkwo** | Data Analyst · MHP | dark brown | 0–1, 4, 6, 8–9 | `James Okonkwo · Data Analyst` | `en-US-EricNeural` |
| **Narrator** | — | — | All | — | `en-US-AndrewNeural` |

**Voice lock (mandatory)**: All module clips share the same edge-tts voice per character.

- Canonical source: `workshop-2026-v2/scripts/pipeline/configs/characters.json`.
- Pipeline TTS rate: **`rate="+0%"`** (`ttsRate` in `characters.json`).
- Default BGM: `defaultAudio` in the same file; override per module via `module.audio` in `mod-XX.json`.
- LibTV video clips use **`enableSound=off`**. Never use embedded clip audio; compose only through the `produce-module.js` voiceover phase.
- Changing a voice ID requires regenerating the voiceover for **all** modules.

See also: `scripts/pipeline/lib/voice-lock.js`. Master prompts specify `voice=` on each VO line.
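
A voice-lock check can be as small as comparing each voiceover line against the character table. The sketch below is not the contents of `voice-lock.js`; `VOICE_LOCK`, `findVoiceLockViolations`, and the `{ speaker, voice, text }` line shape are assumptions made for illustration, with the voice IDs copied from the table in this section.

```javascript
// Character -> edge-tts voice, copied from the table above.
// The canonical source is characters.json; this inline copy is for illustration only.
const VOICE_LOCK = {
  "Marcus Chen": "en-US-ChristopherNeural",
  "Elena Vasquez": "en-US-JennyNeural",
  "Bob Muller": "en-US-GuyNeural",
  "Sofia Alvarez": "en-US-EmmaNeural",
  "Priya Sharma": "en-US-AriaNeural",
  "James Okonkwo": "en-US-EricNeural",
  "Narrator": "en-US-AndrewNeural",
};

// Hypothetical VO line shape: { speaker, voice, text }.
// Returns every line whose voice differs from the locked voice (unknown speakers included).
function findVoiceLockViolations(voLines) {
  return voLines.filter((line) => VOICE_LOCK[line.speaker] !== line.voice);
}

const sampleLines = [
  { speaker: "Narrator", voice: "en-US-AndrewNeural", text: "Sample line." },
  { speaker: "Bob Muller", voice: "en-US-EricNeural", text: "Sample line." },
];
console.log(findVoiceLockViolations(sampleLines).map((line) => line.speaker));
```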

### 1.4 Language Policy (Mandatory)

| Content type | Language | Rule |
|-------------|----------|------|
| Voiceover / TTS | **English** | See per-module scripts below |
| Burned-in subtitles | **English** | ≤ 70 chars/line; match voiceover script |
| Chapter cards / slogans / lower-thirds | **English** | See "Screen text" in each scene |
| Chart labels / icon labels | **English** | Bronze / Silver / Gold, Cost / Performance / Compliance |
| Power BI / product UI screenshots | **English** | Use real English screenshots from `powerbi/` |
| LibTV CLI prompts | Chinese OK | Internal prompts for operators only |
| This document body | English + Chinese | Production instructions; not viewer-facing |

**Every image prompt must end with the English-text suffix** (see [Section 2.5](#25-mandatory-image-prompt-suffix)).
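
Two of the rules above lend themselves to an automated check: the 70-character subtitle limit and the ban on Chinese characters in viewer-facing text. The sketch below assumes a simple greedy word wrap; `wrapSubtitle` and `containsHan` are hypothetical helpers, and the sample sentence is made up rather than taken from a module script.

```javascript
const MAX_SUBTITLE_CHARS = 70;

// Greedy word wrap: start a new line when adding the next word would exceed the limit.
// A single word longer than the limit is kept on its own line rather than split.
function wrapSubtitle(text, maxChars = MAX_SUBTITLE_CHARS) {
  const lines = [];
  let current = "";
  for (const word of text.trim().split(/\s+/)) {
    const candidate = current ? `${current} ${word}` : word;
    if (candidate.length <= maxChars || current === "") {
      current = candidate;
    } else {
      lines.push(current);
      current = word;
    }
  }
  if (current) lines.push(current);
  return lines;
}

// True if the text contains any Han (Chinese) character.
function containsHan(text) {
  return /\p{Script=Han}/u.test(text);
}

const sampleSubtitle =
  "Marcus runs dispatch for YellowLine NYC, and every night his team rebuilds the same Excel reports by hand.";
console.log(wrapSubtitle(sampleSubtitle));
console.log(containsHan("Bronze / Silver / Gold"));
```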

---

## 2. Global LibTV CLI Setup

### 2.1 Login & Project

```bash
# Login (one-time)
libtv login web

# Create project
libtv project create "MHP-DE-Masterclass-2026"

# Set active project (writes to .libtv/project.json)
libtv project use <PROJECT_UUID>

# Project UUID (existing): 5da20fc3ac8d43b6bece1b842471d08d
```

### 2.2 Production Phases (Read Before Starting)

**Phase 0 — Shared Assets: Characters + Logos (MUST complete before any module production)**:

All 6 character portraits, three-view turnarounds, and all brand logos must be generated/uploaded as project nodes **before** creating any module scene images. Both characters and logos persist as `image` nodes in the canvas project and are referenced via `--left` connections in every scene that features them.

```text
Phase 0: Shared assets   → Generate 6 portraits + turnarounds, upload all logos (Section 2.3, Section 1.2)
Phase 1: Module scenes   → Reference character & logo nodes via --left for each scene image
Phase 2: Video clips     → Image-to-video from scene images
Phase 3: Composition     → Assemble video + audio into final MP4
```

**Why Phase 0 first?** Each module's scene image prompts reference the same character turnaround nodes and logo nodes. If these shared assets are not locked before module production, cross-module visual consistency is impossible. Once approved, they become **read-only references** for all subsequent modules — never regenerate or re-upload.

**Character node naming convention** (persists across all modules):

| Character | Node name | Used in modules |
|-----------|-----------|-----------------|
| Marcus Chen | `char-marcus-turnaround` | 0, 3–9 |
| Elena Vasquez | `char-elena-turnaround` | 0–2, 4–9 |
| Bob Muller | `char-bob-turnaround` | 0–2, 4–9 |
| Sofia Alvarez | `char-sofia-turnaround` | 0, 2–3, 8–9 |
| Priya Sharma | `char-priya-turnaround` | 0–4, 6–9 |
| James Okonkwo | `char-james-turnaround` | 0–1, 4, 6, 8–9 |

**Logo node naming convention** (persists across all modules):

| Logo | Node name | Used in modules |
|------|-----------|-----------------|
| Databricks | `logo-databricks` | 0, 1, 2, 4, 5, 6, 7, 9 |
| Snowflake | `logo-snowflake` | 0, 1, 3, 4, 5, 6, 7, 9 |
| dbt | `logo-dbt` | 0, 1, 4, 5, 7, 9 |
| Power BI | `logo-powerbi` | 0, 1, 2, 3, 4, 7, 8, 9 |
| Apache Spark | `logo-spark` | 2, 9 |
| Apache Kafka | `logo-kafka` | 8 |
| GitHub Actions | `logo-github` | 5 |
| Azure | `logo-azure` | 2, 3, 5 |
| Python | `logo-python` | 2, 9 |
| MLflow | `logo-mlflow` | 9 |
| MHP | `logo-mhp` | 0, 1, 7 |
| YellowLine NYC | `logo-yellowline` | 0, 1, 7 |

Upload all logos once via `libtv upload` (see [Section 1.2](#12-logo-assets) for full file paths):

```bash
libtv upload "logo-databricks" -f ./shared/assets/images/logos/logo_databricks_01.svg
libtv upload "logo-snowflake" -f ./shared/assets/images/logos/logo_snowflake_01.svg
# ... all logos per Section 1.2 ...
```

**How to reference characters in scene images** (Phase 1):

```bash
# Example: Story Scene 1 features Marcus
libtv node create "mod-00-s1-image" -t image \
  --prompt "NYC night skyline, dispatch center, {{Image 1}} at desk, stressed but professional...All visible text on screen must be in English only. No Chinese characters." \
  --left "char-marcus-turnaround" \
  -s model=nebula-ultra -s ratio=16:9
libtv node "mod-00-s1-image" --run

# Example: Story Scene 3 features multiple characters
libtv node create "mod-00-s3-image" -t image \
  --prompt "Uploaded logo {{Image 1}} centered top header; five character cards {{Image 2–6}} in one horizontal row below..." \
  --left "logo-mhp" \
  --left "char-elena-turnaround" \
  --left "char-bob-turnaround" \
  --left "char-sofia-turnaround" \
  --left "char-priya-turnaround" \
  --left "char-james-turnaround" \
  -s model=nebula-ultra -s ratio=16:9
libtv node "mod-00-s3-image" --run
```
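
In both examples above, `{{Image 1}}` lines up with the first `--left` reference. Assuming that `{{Image N}}` always follows the order of the `--left` flags (an inference from these examples, not a documented guarantee), the placeholders can be derived from the reference list instead of counted by hand. `imagePlaceholders` and `leftArgs` are hypothetical helpers.

```javascript
// Assumption: {{Image N}} refers to the N-th --left reference, in order.
function imagePlaceholders(leftNodes) {
  return Object.fromEntries(leftNodes.map((name, i) => [name, `{{Image ${i + 1}}}`]));
}

// Expand the same list into repeated --left arguments.
function leftArgs(leftNodes) {
  return leftNodes.flatMap((name) => ["--left", name]);
}

const sceneRefs = ["logo-mhp", "char-elena-turnaround", "char-bob-turnaround"];
const placeholderFor = imagePlaceholders(sceneRefs);

// The prompt and the --left flags are built from the same list, so they cannot get out of step.
const scenePrompt = `Uploaded logo ${placeholderFor["logo-mhp"]} centered top header; ` +
  `character cards ${placeholderFor["char-elena-turnaround"]} and ${placeholderFor["char-bob-turnaround"]} below.`;

console.log(scenePrompt);
console.log(leftArgs(sceneRefs).join(" "));
```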

**Phase 0 QA gate** (check before proceeding to Phase 1):

- [ ] All 6 character portrait nodes created and approved
- [ ] All 6 three-view turnaround nodes generated via `libtv image shortcut "角色三视图"`
- [ ] Character node names follow `char-{name}-turnaround` convention
- [ ] All brand logos uploaded via `libtv upload` and node names follow `logo-{name}` convention
- [ ] YellowLine NYC custom logo created, saved as `shared/assets/images/logos/png/logo-yellowline.png`, and uploaded (the only logo that must be created manually)
- [ ] Cross-module consistency verified: same node referenced, not regenerated or re-uploaded
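
The naming part of this gate can be checked mechanically. The sketch below derives the 18 expected node names from the two naming tables in this section and reports any that are missing; `existingNames` is a hypothetical array of node names (for example, copied by hand from `libtv node list`), since the CLI's output format is not described here.

```javascript
const CHARACTER_KEYS = ["marcus", "elena", "bob", "sofia", "priya", "james"];
const LOGO_KEYS = [
  "databricks", "snowflake", "dbt", "powerbi", "spark", "kafka",
  "github", "azure", "python", "mlflow", "mhp", "yellowline",
];

// Expected shared-asset nodes, following the char-{name}-turnaround and logo-{name} conventions.
function requiredPhase0Nodes() {
  return [
    ...CHARACTER_KEYS.map((key) => `char-${key}-turnaround`),
    ...LOGO_KEYS.map((key) => `logo-${key}`),
  ];
}

// Returns the required node names that are absent from `existingNames`.
function missingPhase0Nodes(existingNames) {
  const have = new Set(existingNames);
  return requiredPhase0Nodes().filter((name) => !have.has(name));
}

// Example: everything present except the Kafka logo.
const present = requiredPhase0Nodes().filter((name) => name !== "logo-kafka");
console.log(missingPhase0Nodes(present));
```

An empty result covers only the naming items; portrait approval and visual consistency still need a human review.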

### 2.3 Character Generation Workflow

**Canonical source**: Portrait prompts, `visualLock`, and turnaround text live in [`scripts/pipeline/configs/characters.json`](../scripts/pipeline/configs/characters.json). The blocks below mirror that file for operators reading this doc; if they diverge, update **both** places or regenerate from JSON via `produce-module.js --phase fix-character`.

For each of the 6 characters, follow this CLI sequence:

**Step 1 — Initial portrait** (image node):

```bash
libtv node create "char-{name}-portrait" -t image \
  --prompt "<PORTRAIT_PROMPT>" \
  -s model=nebula-ultra -s ratio=16:9 -s quality=2K
libtv node "char-{name}-portrait" --run
```

**Step 2 — Three-view turnaround** (Slash from portrait):

```bash
# List available Slash commands — find the "角色三视图" scene key
libtv image shortcut list

# Run character three-view turnaround (creates new image node to the right of portrait)
libtv image shortcut "角色三视图" -n "char-{name}-portrait"

# Verify the turnaround node was created
libtv node list
```

**Step 3 — Confirm and rename turnaround node**:

```bash
# The shortcut creates a new image node. Rename it to the standard convention:
libtv node "<auto-generated-name>" --name "char-{name}-turnaround"
# The turnaround node now persists in the project for cross-module reuse via --left.
```

**Step 4 — If turnaround quality is insufficient, re-run**:

```bash
# Don't just change the prompt — re-run the shortcut on the same portrait node
libtv image shortcut "角色三视图" -n "char-{name}-portrait"
# Then rename the new result and delete the old one
```

#### Character Portrait Prompts

**Marcus Chen** (skin: warm beige):

```text
Half-body illustrated avatar, Marcus Chen, Operations Manager, age 40s, East Asian or mixed-heritage NYC executive, warm beige skin tone, business casual open-collar shirt, yellow YellowLine English name badge, neat short dark hair, slightly stressed professional, flat 2D corporate training, light gray background, navy #01065c and yellow #FBBF24 accents, NOT photorealistic. All visible text on screen must be in English only. No Chinese characters.
```

**Elena Vasquez** (skin: fair warm):

```text
Half-body avatar, Elena Vasquez, Data Architect, Latina woman age 40s, fair warm skin, DISTINCT wavy chestnut-brown shoulder-length hair LOOSE on shoulders (NOT ponytail), navy MHP structured blazer #01065c over white collared blouse, gold architect lapel pin, dry-erase marker in blazer pocket, calm authoritative, flat 2D corporate e-learning, white background, NOT photorealistic. All visible text on screen must be in English only. No Chinese characters.
```

**Bob Muller** (skin: fair):

```text
Half-body avatar, Bob Muller, Junior Data Engineer, age 25-32, fair skin tone, smart casual shirt, laptop, curious eager smile, flat 2D corporate training, light gray background, NOT photorealistic. All visible text on screen must be in English only. No Chinese characters.
```

**Sofia Alvarez** (skin: fair warm):

```text
Half-body avatar, Sofia Alvarez, Senior Data Engineer, Latina woman early 30s, fair warm skin, DISTINCT black hair in LOW PONYTAIL with yellow hair tie (NOT loose hair), navy MHP zip polo #01065c with yellow collar stripe rolled sleeves (NOT blazer), tablet with notebook, supportive confident engineer, flat 2D corporate e-learning, white background, NOT photorealistic. All visible text on screen must be in English only. No Chinese characters.
```

**Priya Sharma** (skin: light brown):

```text
Half-body avatar, Priya Sharma, BI Analyst, South Asian woman 28-38, light brown skin tone, straight black hair HALF-UP with green #107c10 barrette, charcoal-gray blazer over green #107c10 shell top (NOT navy blazer), small KPI card English labels, presenter pose, flat 2D corporate training, NOT photorealistic. All visible text on screen must be in English only. No Chinese characters.
```

**James Okonkwo** (skin: dark brown):

```text
Half-body avatar, James Okonkwo, Data Analyst, Black man 28-38, dark brown skin tone, business casual, tablet with abstract SQL icons, analytical focused, flat 2D corporate e-learning, light gray background, NOT photorealistic. All visible text on screen must be in English only. No Chinese characters.
```

#### Character Bible (CAST Block)

Paste into script node "character description" field, or as a text node connected to all character-related image nodes.

```text
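【CAST: MHP Data Engineer Masterclass 2026 · one consistent design across all modules】

Marcus Chen · Operations Manager · YellowLine NYC (CLIENT)
- Apparent age around 40; pragmatic, slightly tired but professional; business casual (open-collar shirt, no heavy suit)
- Signature items: yellow YellowLine NYC name badge (English), dispatch center / multi-screen Excel background
- Appears in modules: 0, 3–9; tone pragmatic, business-focused
- Lower-third: Marcus Chen · Operations Manager

Elena Vasquez · Data Architect · MHP (CONSULTANT LEAD)
- Apparent age around 40; calm and authoritative, structured explanations
- **Hair**: wavy chestnut-brown shoulder-length hair (loose, **no ponytail or updo**)
- **Outfit**: navy MHP **blazer + white collared shirt** + gold architect lapel pin (**not a polo or zip top**)
- Signature item: whiteboard marker (in blazer pocket)
- Appears in modules: 0–2, 4–9; mainly whiteboard / architecture-diagram scenes
- Lower-third: Elena Vasquez · Data Architect

Bob Muller · Junior Data Engineer · MHP (TRAINEE PROXY, the learner's stand-in)
- Apparent age 25–32; curious, eager; smart casual + laptop
- Signature items: laptop, raising his hand to speak, lead character of the Databricks module
- Appears in modules: 0–2, 4–9; paired with Sofia
- Lower-third: Bob Muller · Junior Data Engineer

Sofia Alvarez · Senior Data Engineer · MHP (MENTOR)
- Apparent age early 30s; supportive, hands-on engineer
- **Hair**: **black low ponytail + yellow hair tie** (**no loose shoulder-length hair**, to distinguish her from Elena)
- **Outfit**: navy MHP **zip polo / work shirt + yellow collar trim + rolled sleeves** (**no blazer or white shirt**, to distinguish her from Elena)
- Signature items: tablet / notebook (ADLS2, Delta Bronze scenes)
- Appears in modules: 0, 2–3, 8–9; mentoring Bob in Module 2 is the focus
- Lower-third: Sofia Alvarez · Senior Data Engineer

Priya Sharma · BI Analyst · MHP (DASHBOARD OWNER)
- Apparent age 28–38; South Asian, **light brown skin** (distinct from Elena's and Sofia's fair warm skin)
- **Hair**: **black half-up hair + green #107c10 barrette** (not wavy brown hair, not a ponytail)
- **Outfit**: **charcoal-gray blazer + green #107c10 inner top** (**no navy MHP blazer**, to distinguish her from Elena)
- Signature items: Power BI screen, KPI cards, green chart accents
- Appears in modules: 0–4, 6–9; all PBI composite scenes
- Lower-third: Priya Sharma · BI Analyst

James Okonkwo · Data Analyst · MHP (VALIDATION / SQL)
- Apparent age 28–38; rational, focused on data validation; SQL icons on a tablet or monitor
- Signature items: shares the frame with Priya; SQL / validation scenes
- Appears in modules: 0–1, 4, 6, 8–9
- Lower-third: James Okonkwo · Data Analyst

【Cross-character rules】
- The six characters reflect the diversity of an NYC team; respectful corporate illustration, no stereotyped caricature
- Same character: hairstyle, outfit colors, and simplified face shape stay consistent across all modules
- **Mutually exclusive looks for the three women**: Elena loose hair + blazer + marker / Sofia ponytail + polo + tablet / Priya half-up hair + green top + KPI card. No two of them may share "navy blazer + long dark hair"
- The first appearance must show the English lower-third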
```

#### Three-View Turnaround Prompts

Use the approved portrait output as the reference, open the Slash menu (`/`), choose **角色三视图** (character three-view turnaround), and paste the corresponding block below. Layout: front · profile · back; white background.

**Marcus Chen**:

```text
【中文】Marcus Chen 正/侧/背三视图，YellowLine NYC 英文工牌三面可见，商务休闲一致，2D 扁平，白底，无写实

【English】Character turnaround sheet, Marcus Chen Operations Manager, front side back, same outfit and hair, yellow YellowLine NYC English badge all views, flat 2D corporate training, white background, NOT photorealistic
```

**Elena Vasquez**:

```text
【中文】Elena 三视图，海军蓝 MHP 西装+白衬衫领，栗棕波浪披肩发（禁止马尾），马克笔+金胸针，2D 白底

【English】Turnaround sheet, Elena Vasquez Data Architect, front side back, navy MHP blazer over white blouse, LOOSE wavy chestnut-brown shoulder-length hair (never ponytail), marker accessory, gold lapel pin, calm authoritative, flat 2D illustration, white background, NOT photorealistic
```

**Bob Muller**:

```text
【中文】Bob 三视图，休闲商务 + 电脑/背包， curious 一致，2D 白底

【English】Turnaround sheet, Bob Muller Junior Data Engineer, front side back, smart casual, laptop or bag on back view, curious friendly, flat 2D corporate training, white background, NOT photorealistic
```

**Sofia Alvarez**:

```text
【中文】Sofia 三视图，海军蓝 MHP 拉链 Polo+黄领边+卷袖（禁止西装），黑色低马尾+黄发圈，平板 prop，2D 白底

【English】Turnaround sheet, Sofia Alvarez Senior Data Engineer, front side back, navy zip polo with yellow collar stripe (NOT blazer), black hair LOW PONYTAIL with yellow tie, tablet notebook prop, rolled sleeves, supportive confident, flat 2D e-learning, white background, NOT photorealistic
```

**Priya Sharma**:

```text
【中文】Priya 三视图，炭灰西装+绿色内搭，黑色半扎发+绿发夹，KPI 卡片，2D 白底

【English】Turnaround sheet, Priya Sharma BI Analyst, front side back, charcoal blazer green #107c10 shell top, half-up black hair with green barrette, KPI card prop, presenter stance, flat 2D corporate training, white background, NOT photorealistic
```

**James Okonkwo**:

```text
【中文】James 三视图，商务休闲一致，平板 prop， analytical 一致，2D 白底

【English】Turnaround sheet, James Okonkwo Data Analyst, front side back, business casual consistent, tablet prop, analytical focused, flat 2D illustration, white background, NOT photorealistic
```

#### Script Node Binding Template

Paste into the script node together with the uploaded turnaround image for each character.

**Template**:

```text
Character: {NAME} · {ROLE}
Visual lock: Match uploaded turnaround exactly — face, hair, outfit, props.
On-screen lower-third (English, first appearance): {NAME} · {ROLE}
Flat 2D illustration only. Do not change ethnicity or costume between scenes.
```

**Six-character quick-copy**:

```text
Character: Marcus Chen · Operations Manager · yellow YellowLine badge · pragmatic · lower-third Marcus Chen · Operations Manager
Character: Elena Vasquez · Data Architect · LOOSE chestnut wavy hair navy BLAZER white blouse marker pen · NOT ponytail NOT polo · lower-third Elena Vasquez · Data Architect
Character: Bob Muller · Junior Data Engineer · laptop · curious eager · lower-third Bob Muller · Junior Data Engineer
Character: Sofia Alvarez · Senior Data Engineer · LOW PONYTAIL yellow tie navy ZIP POLO rolled sleeves tablet · NOT blazer NOT loose hair · lower-third Sofia Alvarez · Senior Data Engineer
Character: Priya Sharma · BI Analyst · HALF-UP hair green barrette charcoal blazer green top KPI card · NOT navy MHP blazer · lower-third Priya Sharma · BI Analyst
Character: James Okonkwo · Data Analyst · SQL tablet · analytical · lower-third James Okonkwo · Data Analyst
```

#### Expression & Pose Variants

Append to scene image prompts containing the character (uses same turnaround asset):

```text
Same character as reference turnaround. Expression: {neutral | curious | presenting | concerned | approving}. Pose: {standing | seated | pointing at screen | whiteboard gesture}. Minimal change, flat 2D.
```

| Character | Common modules | English phrases to append |
|-----------|---------------|----------------|
| Marcus | 0, 3–9 | `slightly stressed reviewing reports` / `pushback concerned` / `production decision serious` / `live dispatch frustrated` / `tip prediction curious` |
| Elena | 0–2, 4–9 | `teaching at whiteboard, loose chestnut hair blazer marker` / `calm approval nod` |
| Bob | 0–2, 4–9 | `raising hand pitching` / `learning beside Sofia` |
| Sofia | 0, 2–3, 8–9 | `mentoring ponytail zip polo pointing at notebook ADLS2` |
| Priya | 0–4, 6–9 | `presenting dashboard half-up hair green top` / `pointing at KPI` |
| James | 0–1, 4, 6, 8–9 | `validating data thoughtful` |
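
The variant template has two slots with a fixed set of allowed values, so it can be filled in by a small function that rejects anything outside those sets. This is a sketch under that reading; `variantSuffix` is a hypothetical helper, and the allowed values are copied from the template above.

```javascript
const EXPRESSIONS = ["neutral", "curious", "presenting", "concerned", "approving"];
const POSES = ["standing", "seated", "pointing at screen", "whiteboard gesture"];

// Build the suffix appended to a scene image prompt that contains a character.
function variantSuffix(expression, pose) {
  if (!EXPRESSIONS.includes(expression)) {
    throw new Error(`unknown expression: ${expression}`);
  }
  if (!POSES.includes(pose)) {
    throw new Error(`unknown pose: ${pose}`);
  }
  return `Same character as reference turnaround. Expression: ${expression}. Pose: ${pose}. Minimal change, flat 2D.`;
}

console.log(variantSuffix("curious", "seated"));
```

The per-character phrases from the table can be appended after this suffix as free text.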

#### Lower-Third Labels (English)

Create as text nodes for each character's first appearance:

```text
Marcus Chen · Operations Manager
Elena Vasquez · Data Architect
Bob Muller · Junior Data Engineer
Sofia Alvarez · Senior Data Engineer
Priya Sharma · BI Analyst
James Okonkwo · Data Analyst
```

Style: navy `#01065c` bar + white text + optional yellow `#FBBF24` accent line; 3–5 second fade.

#### Character QA Checklist

| Check | Pass criteria |
|-------|--------------|
| Turnaround | Front/side/back consistent; white background; node renamed to `char-{name}-turnaround` |
| Cross-module | All shots reference the **same** turnaround node via `--left` |
| Text | Badges, lower-thirds **English only** |
| Style | 2D flat; not photorealistic |
| Failure | Face drift → re-run `libtv image shortcut "角色三视图" -n "char-{name}-portrait"`, don't just change prompt |

### 2.4 Global Style Prompt

Paste once per module canvas (as a text node connected to all image generators):

```text
STYLE: American animation corporate e-learning motion graphics. NOT cinematic, NOT photorealistic, NOT anime, NOT 3D render.
Flat 2D illustrated characters with clean outlines and solid color blocks.
Scene and UI: MHP blue-white palette — navy #01065c, white #FFFFFF, light gray #F1F5F9 backgrounds; yellow #FBBF24 for YellowLine/taxi accents only.
Character garment colors follow configs/characters.json outfitColors (not all navy).
16:9 1920x1080, stable camera, even bright lighting.
MANDATORY: All visible on-screen text in English only — titles, lower-thirds, labels, UI. No Chinese characters anywhere in frame.
English burned-in subtitles match voiceover script.
```

**Character style lock** (paste alongside the style prompt; mirrors `characters.json` → `style.characterLock`):

```text
【角色画风 — 六人统一】
美式动画 / 企业培训 2D 扁平插画，非写实、非 3D、非动漫大眼。
简化几何面部：小鼻、少阴影、纯色皮肤块、清晰轮廓线。
场景与 UI：MHP 蓝白（海军蓝 #01065c、白 #FFFFFF、浅灰 #F1F5F9）；黄 #FBBF24 仅用于 YellowLine NYC/出租车点缀。
每位角色服装色见 characters.json outfitColors（Marcus 灰衬衫+黄工牌；Elena 海军蓝西装+白衬衫；Sofia 海军蓝 Polo+黄领条；Priya 炭灰西装+绿色上衣；Bob 亮蓝衬衫；James 青绿衬衫）。
三位女性发型/轮廓必须区分（Elena 披肩栗发；Sofia 低马尾；Priya 半扎发+绿色发夹）。
光照均匀明亮；工牌/名牌仅英文。
```

```text
CHARACTER STYLE LOCK: American animation / corporate e-learning illustration (flat 2D, clean outlines, solid color blocks).
NOT photorealistic, NOT 3D render, NOT anime.
Scene and UI use MHP blue-white palette; character OUTFIT colors follow each character outfitColors hex codes in characters.json.
Even bright lighting, no dramatic shadows. English text only on badges and name tags. No Chinese characters.
```

### 2.5 Mandatory Image Prompt Suffix

**Append to every image prompt** to prevent Chinese text in generated images:

```text
All visible text on screen must be in English only: chapter titles, lower-thirds, chart labels, sticky notes, UI text. No Chinese characters. English subtitles will be burned in separately — do not render Chinese in the image.
```
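
Because the suffix has to appear on every image prompt, appending it in code is less error-prone than pasting it by hand. The sketch below is one possible approach, not part of the pipeline: `withEnglishSuffix` is a hypothetical helper that appends the suffix exactly once, and `ENGLISH_SUFFIX` is copied verbatim from the block above.

```javascript
// Copied verbatim from the mandatory suffix above.
const ENGLISH_SUFFIX =
  "All visible text on screen must be in English only: chapter titles, lower-thirds, chart labels, sticky notes, UI text. No Chinese characters. English subtitles will be burned in separately — do not render Chinese in the image.";

// Append the suffix unless the prompt already ends with it (safe to call repeatedly).
function withEnglishSuffix(prompt) {
  const trimmed = prompt.trimEnd();
  if (trimmed === "") return ENGLISH_SUFFIX;
  if (trimmed.endsWith(ENGLISH_SUFFIX)) return trimmed;
  // Close the prompt with a period if it does not already end a sentence.
  const separator = /[.!?]$/.test(trimmed) ? " " : ". ";
  return `${trimmed}${separator}${ENGLISH_SUFFIX}`;
}

console.log(withEnglishSuffix("NYC night skyline, flat 2D"));
```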

### 2.6 Video Prompt Template (Seedance 2.0 VIP)

Default video generation prompt for all scenes:

```text
Subtle slow pan or gentle zoom, completely stable camera, 2D corporate training illustration style, minimal movement, no fast cuts, approximately {DURATION} seconds, maintain readability.
```

Chapter card variant:

```text
English title text gentle fade-in, background geometric shapes slowly moving, no character movement, approximately {DURATION} seconds.
```

Montage variant:

```text
Quick but clean hard cuts (1-2 seconds per card), consistent MHP blue-white palette with yellow #FBBF24 accents only where needed, no blur transitions.
```

**Forbidden**: handheld shake, cinematic depth of field, large character movements, facial close-up animation.
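
The `{DURATION}` slot in these templates should carry the same number that is passed as `-s duration=N`. A minimal sketch of filling it in follows; `VIDEO_PROMPTS` and `videoPrompt` are hypothetical names, and the template strings are copied from the blocks above.

```javascript
const VIDEO_PROMPTS = {
  default:
    "Subtle slow pan or gentle zoom, completely stable camera, 2D corporate training illustration style, minimal movement, no fast cuts, approximately {DURATION} seconds, maintain readability.",
  chapterCard:
    "English title text gentle fade-in, background geometric shapes slowly moving, no character movement, approximately {DURATION} seconds.",
};

// Fill the {DURATION} slot of the chosen template.
function videoPrompt(variant, durationSeconds) {
  if (!Object.prototype.hasOwnProperty.call(VIDEO_PROMPTS, variant)) {
    throw new Error(`unknown variant: ${variant}`);
  }
  return VIDEO_PROMPTS[variant].replace("{DURATION}", String(durationSeconds));
}

console.log(videoPrompt("chapterCard", 5));
```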

### 2.7 Standard Per-Module CLI Workflow

**Prerequisite**: Phase 0 (characters + logos) must be complete. See [Section 2.2](#22-production-phases-read-before-starting).

For each module, follow this sequence:

```bash
# 1. Create module group and bind as default scope
libtv group create "mod-{XX}-{slug}"
libtv group use "mod-{XX}-{slug}"
#    After `group use`, all subsequent node/upload commands default to this group.

# 2. Upload module-specific reference assets (PBI screenshots, etc.)
#    Logos and character turnarounds are already in the project from Phase 0 — do NOT re-upload.
libtv upload "mod-{XX}-pbi-overview" -f ./powerbi/mod-{XX}-overview.png

# 3. Create script node — archive master prompt + inject production rows
#    ⚠️ Prefer produce-module.js --phase setup (§2.8), not manual --run on mod-XX-master.txt:
#    full master prompts often hit LibTV policy (10023); Windows CLI cannot pass full rows JSON (ENAMETOOLONG).
libtv node create "mod-{XX}-script" -t script \
  --prompt "<MODULE_SCRIPT_PROMPT>" \
  -s model=aurora-3-prime
# Rows: from configs/mod-XX.json via pipeline (config-rows-api), not:
#   libtv node "mod-{XX}-script" --run
#   libtv node "mod-{XX}-script" -u rows=@path   # broken — stores path as string

# 4. Generate storyboard image group from script
#    This auto-creates a "storyboard image" group with one image child node per script row.
#    Child nodes are named after the storyboard row titles (check with `libtv node list`).
libtv script storyboard "mod-{XX}-script" \
  -s model=nebula-ultra -s ratio=16:9

# 4a. Verify generated storyboard images
libtv node list
#    Check all storyboard image nodes have results. If any failed, re-run that specific node:
#    libtv node "mod-{XX}-s{N}-image" --run

# 4b. ⚠️ CRITICAL: Switch storyboard nodes to image2image mode BEFORE linking references
#    libtv script storyboard creates nodes in text2image mode by default.
#    Without modeType=image2image, --left character/logo references are IGNORED.
#    For each storyboard image node that contains characters or logos:
libtv node "mod-{XX}-s{N}-image" \
  -s modeType=image2image \
  --left "char-{name}-turnaround" \
  --run
#    Pure scene shots (no characters/logos) can stay text2image — no switch needed.

# 5. Link character turnaround & logo nodes to storyboard scene images (Phase 0 → Phase 1)
#    The storyboard step created the image nodes automatically. Link the character/logo references
#    to those nodes so the prompt can address them as {{Image 1}}, {{Image 2}} on re-generation.
#    Use `libtv node list` to find the auto-created image node names, then link them while keeping
#    modeType=image2image from step 4b (otherwise the --left references are ignored):
#    Example: Module 2 Scene 2 features Sofia mentoring Bob
libtv node "mod-02-s2-image" \
  -s modeType=image2image \
  --left "char-sofia-turnaround" \
  --left "char-bob-turnaround" \
  --run

#    Example: Module 1 Scene 3 features logos on evaluation board
libtv node "mod-01-s3-image" \
  -s modeType=image2image \
  --left "logo-databricks" \
  --left "logo-snowflake" \
  --left "logo-dbt" \
  --run

# 6. Generate video clips from storyboard images (per scene)
libtv node create "mod-{XX}-scene-{N}-video" -t video \
  --prompt "<VIDEO_PROMPT>" \
  -s model=star-video2 \
  -s modeType=singleImage2video \
  -s ratio=16:9 \
  -s resolution=720p \
  -s duration={SECONDS} \
  -s enableSound=off \
  --left "mod-{XX}-s{N}-image"
libtv node "mod-{XX}-scene-{N}-video" --run

# 7. Upload voiceover audio
libtv upload "mod-{XX}-voiceover" -f ./audio/mod-{XX}.wav

# 8. Video composition — connect all video + audio → video-clip node
libtv node create "mod-{XX}-final" -t video-clip
#    Link all scene video nodes and voiceover as inputs:
libtv node "mod-{XX}-final" \
  --left "mod-{XX}-scene-1-video" \
  --left "mod-{XX}-scene-2-video" \
  --left "mod-{XX}-voiceover"
#    Then set clipTimelineData via canvas UI (I/O trimming, clip ordering) and:
#    libtv node "mod-{XX}-final" --run

# 9. Verify all nodes succeeded
libtv node list
#    If any node shows status=3 (failed), re-run only that node:
#    libtv node "<failed-node-name>" --run
```

**Error handling** (do NOT rebuild the entire canvas on failure):

- Use `libtv node list` to identify failed nodes (look for missing result URLs or `status=3`)
- Re-run individual failed nodes: `libtv node "<name>" --run`
- For auth/project binding errors: `libtv project use <UUID>` to rebind
- For batch scripts, prefix with `set -euo pipefail` so failures stop immediately (see §5.4)
- Use `libtv project` for a quick canvas structure overview (nodes + edges)
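
The "re-run only what failed" rule can be expressed as a filter over node records. The sketch below is illustrative only: the `{ name, status, resultUrl }` record shape is an assumption (the output format of `libtv node list` is not described here), and `rerunCommands` is a hypothetical helper that prints commands rather than executing them.

```javascript
// Assumed record shape: { name, status, resultUrl }.
// A node needs a re-run if it failed (status 3) or has no result URL.
function rerunCommands(nodes) {
  return nodes
    .filter((node) => node.status === 3 || !node.resultUrl)
    .map((node) => `libtv node "${node.name}" --run`);
}

// Made-up sample data for illustration.
const sampleNodes = [
  { name: "mod-02-s1-image", status: 2, resultUrl: "https://example.invalid/s1.png" },
  { name: "mod-02-s2-image", status: 3, resultUrl: null },
  { name: "mod-02-s3-image", status: 2, resultUrl: "" },
];

console.log(rerunCommands(sampleNodes).join("\n"));
```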

### 2.8 Automated pipeline (`produce-module.js`)

**Recommended** for Story + Modules 1–7 after Phase 0 (characters + logos). Orchestrates setup → storyboard → binding → videos → voiceover → compose.

```powershell
cd <repo-root>
node workshop-2026-v2/scripts/pipeline/produce-module.js 00 --phase setup
node workshop-2026-v2/scripts/pipeline/produce-module.js 00 --phase storyboard
# Or full run from setup:
node workshop-2026-v2/scripts/pipeline/produce-module.js 00 --from-phase setup
```

| Topic | Behavior |
|-------|----------|
| **Script rows source** | `configs/mod-XX.json` → `scenes[]` (18 shots for mod-00). Set `module.scriptRowsMode`: `config` (default), `ai`, or `auto`. |
| **Master prompt** | `prompts/mod-XX-master.txt` kept on script node as **archive only** when `scriptRowsMode: config`. |
| **Large rows upload** | Payload over Windows CLI limit → HTTP `POST https://api.liblib.tv/api/canvas/nodes/batch` via `lib/script-rows-api.js` (credentials: `~/.libtv/credentials.json`, project: `.libtv/project.json`). |
| **Manual re-upload** | `node workshop-2026-v2/scripts/pipeline/upload-script-rows.js 00` (rows only, no full setup). |
| **Storyboard binding** | Pipeline links script → turnarounds, patches `characterImageUrl`, runs `libtv script storyboard`, then **image2image** + `--left` per shot (`fix-i2i` uses same logic). |
| **Skip duplicate storyboard** | If canvas already has all `分镜 #N` (storyboard shot) nodes, **skips** `script storyboard`. Use `--force-storyboard` only to regenerate the full batch. If only some shots exist, the batch is still skipped; fill the gaps with `--phase fix-i2i`. |
| **CLI exit 1** | Usually **≥1 shot failed** (`status=3`) or **no published image** (even if a line shows `status=2`). Pipeline audits canvas and auto-retries `libtv node <分镜> --run` for empty shots. |
| **Phases** | `setup` → `storyboard` → `discover` → `fix-i2i` → `fix-character` → `videos` → `layout` → `validate` → `voiceover` → `compose` |
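
The difference between `--phase` and `--from-phase` can be sketched as a slice over the phase order in the table. This is an illustration under a stated assumption: that `--from-phase` runs the named phase and every later one in the listed order. The actual logic in `produce-module.js` may differ, and the function name `phasesToRun` is hypothetical.

```javascript
// Phase order copied from the Phases row above.
const PHASES = [
  "setup", "storyboard", "discover", "fix-i2i", "fix-character",
  "videos", "layout", "validate", "voiceover", "compose",
];

// Assumption: --phase runs exactly one phase; --from-phase runs that phase
// and all phases after it.
function phasesToRun({ phase, fromPhase }) {
  if (phase && fromPhase) {
    throw new Error("use either phase or fromPhase, not both");
  }
  const name = phase ?? fromPhase;
  const index = PHASES.indexOf(name);
  if (index === -1) {
    throw new Error(`unknown phase: ${name}`);
  }
  return phase ? [name] : PHASES.slice(index);
}

console.log(phasesToRun({ phase: "storyboard" }));
console.log(phasesToRun({ fromPhase: "videos" }));
```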

**Do not use** `libtv node … -u rows=@file` — LibTV CLI does not expand `@file`; it corrupts `data.rows`.

**Troubleshooting** (also in [libtv-production-guide.zh.md](libtv-production-guide.zh.md) §9):

| Symptom | Cause / fix |
|---------|-------------|
| Error **10023** on `script --run` | Content policy on full master prompt → use `scriptRowsMode: config` or pipeline `auto` fallback |
| **ENAMETOOLONG** on `-u rows=…` | Full JSON exceeds CreateProcess limit → use API upload (pipeline does this automatically) |
| Characters missing on storyboard | `text2image` default — pipeline sets `image2image` + turnaround `--left` after storyboard |
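
The **ENAMETOOLONG** row comes down to a length check before choosing the upload route. The sketch below assumes the decision is based on the serialized JSON length against the Windows `CreateProcess` command-line maximum of 32,767 characters. The `reservedChars` allowance and the function name `chooseRowsUpload` are illustrative, not taken from the pipeline.

```javascript
// Documented maximum command-line length for Windows CreateProcess, in characters.
const CREATE_PROCESS_MAX_CHARS = 32767;

// reservedChars is an assumed allowance for the rest of the command
// (binary path, node name, other flags, quoting overhead).
function chooseRowsUpload(rows, reservedChars = 2048) {
  const payloadChars = JSON.stringify(rows).length;
  return payloadChars + reservedChars < CREATE_PROCESS_MAX_CHARS ? "cli" : "api";
}

// Illustrative rows only; the real row schema lives in configs/mod-XX.json.
const shortRows = [{ shot: 1, prompt: "Title card" }];
const longRows = Array.from({ length: 18 }, (_, i) => ({
  shot: i + 1,
  prompt: "x".repeat(2500), // stand-in for a long per-shot prompt
}));

console.log(chooseRowsUpload(shortRows)); // "cli"
console.log(chooseRowsUpload(longRows)); // "api"
```

Escaping quotes for the command line adds characters beyond the raw JSON length, which is one reason to keep a generous margin or to default to the API route for full modules.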

§2.7 manual CLI steps remain valid for debugging; prefer §2.8 for repeatable module builds.

---

## 3. Module Production Scripts

### 3.0 Narrative beats vs production shots

Each module has **two layers**:

| Layer | What it is | Where defined |
|-------|------------|---------------|
| **Narrative beat** | Story unit for trainers and VO script (typically 5–7 per module) | Voiceover sections below + `narrativeBeat` in configs |
| **Production shot** | One LibTV storyboard row = one clip with its own `videoDuration` | `scripts/pipeline/configs/mod-XX.json` |

**Editorial patterns** (see [module-shot-plan.md](module-shot-plan.md) for per-shot tables):

| Pattern | When | Modules |
|---------|------|---------|
| **Single VO shot** | One speaker, one idea; speech ≤ ~35s | Sofia mentors (mod-02), NYC dataset intro (mod-01) |
| **VO + B-roll pair** | Brief digest after VO (**4s** default) | All modules 01–07 after major beats |
| **Split dialogue** | Two speakers — separate reactions, not one crowded clip | Marcus → B-roll → Elena (mod-03, 04); Marcus → Elena (mod-06, 07) |
| **Montage VO + B-roll + punchline** | Narrator lists steps; **6s** montage B-roll; second speaker closes | Pipeline montage (mod-02), platform patterns (mod-05) |
| **Split long monologue** | One speaker covers multiple visual topics | Priya dashboard tour (mod-07) |
| **Trainer pause cue** | On-screen prompt only (**3s** B-roll); trainer **presses pause** for discussion/writing | mod-07 write + discussion; any VO ending with “Discuss:” |

**Cost rule**: Longer video = higher LibTV cost. **Do not** embed classroom discussion or writing time in the MP4. B-roll is for visual absorption (4–6s), not live facilitator pauses.
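
One way to keep B-roll within the defaults above is a small check over a module's shots. This is a sketch under assumptions: `videoDuration` is the per-shot field named in §3.0, but the `kind` field and its values (`digest`, `montage`, `pauseCue`) are hypothetical labels for the 4s, 6s, and 3s patterns in the table.

```javascript
// Default B-roll lengths from the editorial patterns table, in seconds.
// The keys are hypothetical labels, not config values.
const BROLL_SECONDS = new Map([
  ["digest", 4],
  ["montage", 6],
  ["pauseCue", 3],
]);

// Returns one message per B-roll shot that runs longer than its default.
function flagLongBroll(shots) {
  return shots
    .filter((s) => BROLL_SECONDS.has(s.kind) && s.videoDuration > BROLL_SECONDS.get(s.kind))
    .map((s) => `shot ${s.shot}: ${s.kind} B-roll is ${s.videoDuration}s, default is ${BROLL_SECONDS.get(s.kind)}s`);
}

// Illustrative shots only.
const shots = [
  { shot: 1, kind: "vo", videoDuration: 12 },
  { shot: 2, kind: "digest", videoDuration: 4 },
  { shot: 3, kind: "montage", videoDuration: 9 },
  { shot: 4, kind: "pauseCue", videoDuration: 3 },
];

console.log(flagLongBroll(shots));
```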

**Canonical sources** (regenerate after edits):

```powershell
node workshop-2026-v2/scripts/pipeline/build-module-configs.js
node workshop-2026-v2/scripts/pipeline/sync-config-durations.js
node workshop-2026-v2/scripts/pipeline/build-master-prompts.js --check
```

Outputs: `configs/mod-XX.json`, `prompts/mod-XX-master.txt`, [module-shot-plan.md](module-shot-plan.md).

### Timing Summary

| File | Narrative beats | Production shots | Pipeline duration |
|------|-----------------|------------------|-------------------|
| `mod-00-welcome.mp4` | 7 | 18 | 3:35 |
| `mod-01-fundamentals.mp4` | 5 | 14 | 3:36 |
| `mod-02-databricks.mp4` | 5 | 11 | 2:52 |
| `mod-03-snowflake.mp4` | 5 | 11 | 2:56 |
| `mod-04-dbt.mp4` | 5 | 12 | 3:13 |
| `mod-05-production.mp4` | 3 | 8 | 1:58 |
| `mod-06-ai.mp4` | 3 | 11 | 2:06 |
| `mod-07-wrapup.mp4` | 7 | 18 | 4:08 |
| `mod-08-streaming.mp4` | 6 *(optional)* | 6 *(optional)* | ~2:45 *(optional)* |
| `mod-09-ml.mp4` | 6 *(optional)* | 6 *(optional)* | ~2:45 *(optional)* |

**Note**: Durations include short B-roll digests (4s) and montage holds (6s). **Discussion and writing time is not in the video** — trainers pause at “Discuss:” / “Write your recommendation” cues.

**Optional modules**: Module 8 (streaming) and Module 9 (ML) are **not** part of the main-day eight-video set unless you deliver the advanced sessions.
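
For budgeting, the main-day totals can be derived from the table. The sketch below copies the shot counts and pipeline durations for the eight main-day videos (optional modules excluded) and sums them. The helper names are illustrative.

```javascript
// Shot counts and pipeline durations copied from the Timing Summary table.
const mainDay = [
  { file: "mod-00-welcome.mp4", shots: 18, duration: "3:35" },
  { file: "mod-01-fundamentals.mp4", shots: 14, duration: "3:36" },
  { file: "mod-02-databricks.mp4", shots: 11, duration: "2:52" },
  { file: "mod-03-snowflake.mp4", shots: 11, duration: "2:56" },
  { file: "mod-04-dbt.mp4", shots: 12, duration: "3:13" },
  { file: "mod-05-production.mp4", shots: 8, duration: "1:58" },
  { file: "mod-06-ai.mp4", shots: 11, duration: "2:06" },
  { file: "mod-07-wrapup.mp4", shots: 18, duration: "4:08" },
];

// "m:ss" -> seconds
function toSeconds(clock) {
  const [minutes, seconds] = clock.split(":").map(Number);
  return minutes * 60 + seconds;
}

// seconds -> "m:ss"
function toClock(total) {
  return `${Math.floor(total / 60)}:${String(total % 60).padStart(2, "0")}`;
}

const totalShots = mainDay.reduce((sum, m) => sum + m.shots, 0);
const totalSeconds = mainDay.reduce((sum, m) => sum + toSeconds(m.duration), 0);

console.log(`${totalShots} shots, ${toClock(totalSeconds)}`); // 103 shots, 24:24
```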

---

### Story — `mod-00-welcome.mp4`

**Title card**: *YellowLine NYC — The Analytics Challenge*
**Pipeline length**: ~3:35 (7 narrative beats → **18 production shots**; see [module-shot-plan.md](module-shot-plan.md))

#### Narrative beat → production shot map

| Beat | Production shots | Editorial choice |
|------|------------------|------------------|
| 1 — The problem | 1–3 | Long Marcus intro VO + B-roll close-up + lower-third hold |
| 2 — The guessing game | 4–5 | Narrator VO + clock B-roll |
| 3 — MHP arrives | 6–7 | Elena + Bob dual VO (team card) + assembly B-roll |
| 4 — Priya's questions | 8–10 | Priya VO + James VO + wireframe B-roll |
| 5 — Medallion preview | 11–13 | Elena + narrator VO + two layer B-rolls |
| 6 — Tools ahead | 14–15 | Elena VO + Bob VO (split speakers) |
| 7 — Call to action | 16–18 | Narrator VO + slogan B-roll + title card B-roll |
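
A beat map like this is easy to break when shots are added or removed, so a quick contiguity check can help. The sketch below encodes the ranges from the table and confirms they cover shots 1 to 18 with no gaps or overlaps. The function name and record shape are illustrative and not part of the pipeline.

```javascript
// Shot ranges copied from the beat map above.
const mod00Beats = [
  { beat: 1, first: 1, last: 3 },
  { beat: 2, first: 4, last: 5 },
  { beat: 3, first: 6, last: 7 },
  { beat: 4, first: 8, last: 10 },
  { beat: 5, first: 11, last: 13 },
  { beat: 6, first: 14, last: 15 },
  { beat: 7, first: 16, last: 18 },
];

// Returns a list of problems; an empty list means the ranges are contiguous
// and end exactly at totalShots.
function checkBeatCoverage(beats, totalShots) {
  const problems = [];
  let expected = 1;
  for (const { beat, first, last } of beats) {
    if (first !== expected) {
      problems.push(`beat ${beat} starts at ${first}, expected ${expected}`);
    }
    if (last < first) {
      problems.push(`beat ${beat} has an empty range`);
    }
    expected = last + 1;
  }
  if (expected - 1 !== totalShots) {
    problems.push(`map ends at shot ${expected - 1}, expected ${totalShots}`);
  }
  return problems;
}

console.log(checkBeatCoverage(mod00Beats, 18)); // []
```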

#### Voiceover Script

##### Scene 1 — The problem (0:00–0:40)

**Visual**: Night NYC skyline. Dispatch center. Marcus at a desk with multiple Excel windows. Red delay indicators on a fleet map.

**SFX**: City ambience, keyboard clicks, distant horns.

**Marcus (VO)**:

> I'm Marcus Chen, Operations Manager at YellowLine NYC. We run yellow taxis across all five boroughs — millions of trips every month.
>
> But here's my problem: everyone has a different number. Finance uses one spreadsheet. Dispatch uses another. I can't tell you — with confidence — where we make money, or when we should send drivers where.

##### Scene 2 — The guessing game (0:40–1:10)

**Visual**: Heatmap of NYC — Manhattan bright, outer boroughs dim. Clock spins through 24 hours.

**Marcus (VO)**:

> Should we add cars in Midtown at six p.m.? Pull them from the airport at midnight? We're guessing. And while we guess, ride-hail apps don't.

**Narrator (VO)**:

> YellowLine NYC doesn't need another report. They need an analytics foundation.

##### Scene 3 — MHP arrives (1:10–1:35)

**Visual**: Uploaded **`logo-mhp`** centered at top (not an AI-drawn logo). Five character cards in one horizontal row below (max width for portraits): Elena, Bob, Sofia, Priya, James. Pipeline: `referenceOrder` puts `logo-mhp` as `{{Image 1}}`, turnarounds as `{{Image 2–6}}` via `fix-i2i`.

**Elena (VO)**:

> Marcus asked MHP for help. I'm Elena Vasquez, Data Architect. We won't start with tools — we start with architecture and the questions the business actually needs answered.

**Bob (VO)**:

> I'm Bob. I'll be building the pipelines — and learning which platforms fit Marcus's team.

##### Scene 4 — Priya's questions (1:35–2:05)

**Visual**: Empty Power BI wireframe. Sticky notes animate onto the frame.

**Priya (VO)**:

> I'm Priya, BI Analyst. Before Bob writes a line of code, I need to know what Marcus wants on his dashboard.
>
> When are peak revenue hours? Which pickup zones drive volume? Which routes matter? How does revenue vary by borough? And are bad records — null fares, zero-distance trips — skewing our decisions?

**James (VO)**:

> I'll validate the logic in SQL once the data is clean. Priya connects the Gold KPIs to Power BI.

##### Scene 5 — Medallion preview (2:05–2:40)

**Visual**: Elena draws three layers. Taxi icons flow: messy → clean → star KPIs.

**Elena (VO)**:

> We standardize on **medallion architecture**. Bronze holds raw trip data exactly as it arrives. Silver cleans, enriches, and joins zone lookups. Gold delivers analytics-ready KPI tables Priya can report from.

**Narrator (VO)**:

> Same dataset. Clear layers. One path from source to dashboard.

##### Scene 6 — Tools ahead (2:40–3:10)

**Visual**: Icons appear — Databricks, Snowflake, dbt, AWS, Cloudera — evaluation board, no winner yet.

**Elena (VO)**:

> Then we choose how to build. Lakehouse platforms. Cloud warehouses. Transformation frameworks. Each has strengths. Bob will prototype and we'll decide what Marcus's team can run in production.

**Bob (VO)**:

> Three pipelines. One NYC Taxi dataset. Let's see what works.

##### Scene 7 — Call to action (3:10–3:45)

**Visual**: Screen transitions to classroom silhouette. Text animates: *What would YOU design?*

**Narrator (VO)**:

> Before the first notebook opens — pause and think. You are Bob today.
>
> How would **you** design the solution for YellowLine NYC?

**Title card**: *MHP Data Engineer Masterclass 2026*

**Music**: Resolve and fade.

#### LibTV CLI Production — Story (18 shots)

**Per-shot voiceover** (source: `configs/mod-00.json` → `prompts/mod-00-master.txt`):

| Shot | Time | VO |
|------|------|-----|
| 1 | 0:00–0:26 | Marcus — intro through “Finance / Dispatch spreadsheets” |
| 2 | 0:26–0:36 | Marcus — “where we make money / when to send drivers” (close-up) |
| 3 | 0:36–0:50 | Marcus — Midtown / airport guessing (lower-third) |

**Audio**: TTS per shot + optional low BGM bed (`module.audio.bgmEnabled`). LibTV `preview-vo` muxes scene MP3 for canvas QA; **compose** uses silent raw clips + single `mod-00-welcome.mp3`.

**Module Master prompt** (paste into script node after Global Style):

```text
MODULE: 0 — Welcome & Setup
DURATION: 3:35
SCENES: 18
NARRATIVE_BEATS: 7
TAGLINE: "What would YOU design?"
STORY: Marcus lacks a single source of truth for NYC taxi operations. MHP team assembles. Priya defines KPI questions. Elena introduces medallion Bronze/Silver/Gold. Multiple tools evaluated but no winner chosen. Trainees step into Bob's shoes.
ENDING: Reflection question card + masterclass title
VISUAL MOTIFS: Bronze → Silver → Gold layers, ADLS2 → pipeline → Gold KPI → Power BI, Cost / Performance / Compliance
```
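
Because the header values repeat numbers from the Timing Summary, a small parser can catch drift between the two. This is a sketch, not the pipeline's own check (`build-master-prompts.js --check` is the canonical one). It assumes each header line is `KEY: value` and splits on the first colon only, so `DURATION: 3:35` stays intact.

```javascript
// Splits "KEY: value" lines on the first colon; lines without a colon are skipped.
function parseMasterHeader(text) {
  const fields = {};
  for (const line of text.split("\n")) {
    const cut = line.indexOf(":");
    if (cut === -1) continue;
    fields[line.slice(0, cut).trim()] = line.slice(cut + 1).trim();
  }
  return fields;
}

// Header lines copied from the block above (subset).
const header = parseMasterHeader(
  ["DURATION: 3:35", "SCENES: 18", "NARRATIVE_BEATS: 7"].join("\n")
);

// Values from the Timing Summary row for mod-00-welcome.mp4.
const expected = { duration: "3:35", scenes: 18, beats: 7 };

const mismatches = [];
if (header.DURATION !== expected.duration) mismatches.push("DURATION");
if (Number(header.SCENES) !== expected.scenes) mismatches.push("SCENES");
if (Number(header.NARRATIVE_BEATS) !== expected.beats) mismatches.push("NARRATIVE_BEATS");

console.log(mismatches); // []
```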

##### Scene 1 — Image prompt:

```text
NYC night skyline, corporate training illustration, dispatch center with multiple monitors showing Excel chaos, simplified fleet map with red delay indicators, Marcus Chen 2D illustrated avatar at desk, stressed but professional, lower-third text "Marcus Chen · Operations Manager", navy #01065c and yellow #FBBF24 palette, clean motion graphics, flat 2D, NOT photorealistic, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s model=star-video2 -s modeType=singleImage2video -s ratio=16:9 -s duration=10`
**Video prompt**: `Subtle slow pan across dispatch center, Marcus slight head nod, Excel windows flicker gently, map delay points pulse slowly, stable camera, corporate documentary pace`
**Screen text**: `Marcus Chen · Operations Manager`

##### Scene 2 — Image prompt:

```text
NYC boroughs heatmap infographic, Manhattan bright peak, outer boroughs dim, 24-hour clock, corporate infographic style, NOT photorealistic map, yellow #FBBF24 accent, white background, navy #01065c panels, American animation corporate illustration style, flat 2D, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Heatmap subtle zoom to Manhattan, clock hand slow rotation, borough colors gradient pulse, stable camera`

##### Scene 3 — Image prompt:

```text
Uploaded logo-mhp ({{Image 1}}) centered at top in compact header; five character portrait cards in one wide horizontal row below (Elena, Bob, Sofia, Priya, James — Marcus is the client and does not appear). Do NOT generate a new MHP logo. White background, American animation corporate motion graphics, flat illustration, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Avatar cards slide in left to right with slight bounce, logo gentle fade-in`
**Screen text**: `Elena Vasquez · Data Architect` / `Bob Muller · Junior Data Engineer`

##### Scene 4 — Image prompt:

```text
Empty Power BI wireframe dashboard in rounded device frame, animated sticky notes appearing with KPI questions, Priya Sharma avatar left side, James Okonkwo avatar right side, clean BI mockup, green #107c10 accents, 2D corporate illustration, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `Sticky notes fade in one by one onto wireframe, chart placeholder boxes pulse gently`
**Screen text**: `Priya Sharma · BI Analyst` / `James Okonkwo · Data Analyst`

##### Scene 5 — Image prompt:

```text
Elena at whiteboard drawing Bronze Silver Gold three layers, taxi icons flowing from messy to clean to star KPIs, central medallion stack diagram, American animation layered data visualization, 2D illustration, navy #01065c and yellow #FBBF24, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `Whiteboard layer labels reveal sequentially, pipeline arrows flow left to right, medallion layers subtle highlight`
**Screen text**: `Bronze` / `Silver` / `Gold`

##### Scene 6 — Image prompt:

```text
Evaluation board with tool logos: Databricks, Snowflake, dbt, AWS, Cloudera, no winner highlighted, question mark accent, corporate comparison matrix, flat icons, navy #01065c background, 2D training style, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Logos fade in sequentially, question mark subtle pulse, no selection animation`

##### Scene 7 — Image prompt:

```text
Classroom silhouette audience, large animated text "What would YOU design?", title card "MHP Data Engineer Masterclass 2026", yellow #FBBF24 underline accent, navy #01065c gradient background, MHP blue-white chapter card, American animation style, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `Slogan text fade-in then hold, title card subtle zoom, background geometric shapes slow movement, chapter card motion`
**Screen text**: `What would YOU design?` · `MHP Data Engineer Masterclass 2026`
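
Every image prompt in this module closes with the same aspect ratio and English-only sentence, which makes the tail easy to lint before prompts are pasted into LibTV. The sketch below is a minimal check under that observation. The function name `lintImagePrompt` is hypothetical, and the CJK test only covers the common CJK Unified Ideographs range.

```javascript
// Sentence copied from the image prompts above.
const REQUIRED_TAIL =
  "All visible text on screen must be in English only. No Chinese characters.";

// Common CJK Unified Ideographs block; not an exhaustive script check.
const CJK = /[\u4e00-\u9fff]/;

// Returns a list of problems; an empty list means the prompt passes.
function lintImagePrompt(prompt) {
  const problems = [];
  if (!prompt.includes("16:9")) problems.push("missing 16:9 aspect ratio");
  if (!prompt.trim().endsWith(REQUIRED_TAIL)) problems.push("missing English-only tail");
  if (CJK.test(prompt)) problems.push("contains CJK characters");
  return problems;
}

const good =
  "Evaluation board with tool logos: Databricks, Snowflake, dbt, AWS, Cloudera, " +
  "no winner highlighted, 2D training style, 16:9. " + REQUIRED_TAIL;
const bad = "Chapter card, navy background";

console.log(lintImagePrompt(good)); // []
console.log(lintImagePrompt(bad));
```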

---

### Module 1 — `mod-01-fundamentals.mp4`

**Title card**: *Module 1 — Data Engineering Fundamentals*
**Pipeline length**: ~3:36 · **14 production shots** · [shot plan](module-shot-plan.md#module-01--data-engineering-fundamentals)

| Beat | Shots | Editorial choice |
|------|-------|------------------|
| 1 — Architecture | 1–2 | Long Elena monologue + whiteboard B-roll |
| 2 — ETL vs ELT | 3–6 | Split: Elena diagram → B-roll → narrator dataset intro → hold |
| 3 — Priya waits | 7–9 | Split dialogue: Priya wireframe → B-roll → James schema VO |
| 4 — Tool landscape | 10–12 | Split: Elena matrix VO → B-roll → Bob punchline |
| 5 — Transition | 13–14 | Chapter card VO + hold |

#### Voiceover Script

##### Scene 1 — Architecture first (0:00–0:35)

**Visual**: Elena at whiteboard. Bronze, Silver, Gold layers expand with labels.

**Elena (VO)**:

> Marcus has data. He doesn't have **layers**. In medallion architecture, Bronze is raw ingestion — Parquet from Azure, no filters. Silver is where we enforce quality, standardize columns, and join taxi zone lookups. Gold is business-ready — aggregated KPI tables, one per question Priya needs.

##### Scene 2 — ETL vs ELT (0:35–1:05)

**Visual**: Split animation — transform box before load (ETL) vs transform inside warehouse (ELT).

**Elena (VO)**:

> Classic **ETL** transforms before load — great for legacy systems. Modern cloud platforms favor **ELT**: load raw first, transform in place. That's our approach. Raw stays reproducible; Silver and Gold can be rebuilt anytime.

**Narrator (VO)**:

> Today you'll work with the NYC Taxi dataset — the same public trip records used in data engineering training worldwide.

##### Scene 3 — Priya waits (1:05–1:35)

**Visual**: Priya at empty Power BI wireframe. Placeholder boxes for charts.

**Priya (VO)**:

> My dashboard wireframe is ready. My KPI list is ready. I'm waiting for Bob's Gold tables. Overview charts, maps, revenue pages — they all depend on the pipeline layers Elena just described.

**James (VO)**:

> I'll explore the schema with you before we build — Vendor ID, pickup times, fares, zones — so Silver rules match what Marcus expects.

##### Scene 4 — Tool landscape (1:35–2:15)

**Visual**: Logos on evaluation matrix — no selection highlighted.

**Elena (VO)**:

> Databricks brings Spark and lakehouse scale. Snowflake brings SQL-first analytics. dbt brings transformation as code with tests and lineage. AWS, Cloudera — options in enterprise landscapes. We evaluate against Marcus's skills, cost, and maintainability — not hype.

**Bob (VO)**:

> I won't pick a winner yet. First I need to understand the fundamentals we're about to use all day.

##### Scene 5 — Transition (2:15–2:45)

**Visual**: Module title. Question text: *What belongs in Silver vs Gold?*

**Narrator (VO)**:

> Module 1: fundamentals. After this video — discuss with your partner. What belongs in Silver? What belongs in Gold? Then we go deep on theory and the NYC Taxi schema.

**Title card**: *Module 1 — Think first. Then build.*

#### LibTV CLI Production — Module 1

**Module Master prompt**:

```text
MODULE: 1 — Data Engineering Fundamentals
DURATION: 3:36
SCENES: 14
NARRATIVE_BEATS: 5
TAGLINE: "Module 1 — Think first. Then build."
STORY: Elena explains medallion layering and ELT. Priya waits for Gold tables. Tool landscape evaluation without picking a winner. Transition to Silver vs Gold reflection.
```

##### Scene 1 — Image prompt:

```text
Elena at whiteboard expanding Bronze Silver Gold labels, Bronze with ADLS2 Parquet icon, Silver with quality and join symbols, Gold with KPI table icons, corporate training architecture diagram, 2D illustration, layer-by-layer reveal, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `Three layer labels reveal left to right, each layer subtle highlight 2 seconds, arrows flow downward`
**Screen text**: `Bronze` / `Silver` / `Gold`

##### Scene 2 — Image prompt:

```text
Split screen animation: left side ETL with transform box before database load, right side ELT with raw load first then transform inside warehouse, animated arrows, minimal American animation corporate style, 2D flat illustration, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Left and right arrows animate simultaneously, transform boxes subtle pulse, stable split screen`
**Screen text**: `ETL` / `ELT`

##### Scene 3 — Image prompt:

```text
Priya at empty Power BI wireframe, placeholder chart boxes with gentle pulse, waiting state, reflection blue #2563eb accent, lower-third subtitle space, 2D corporate illustration, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Placeholder boxes slow pulse, Priya slight shrug or check watch, calm mood`

##### Scene 4 — Image prompt:

```text
Logo evaluation matrix: Databricks, Snowflake, dbt, AWS, Cloudera with cost skill maintainability rows, no selection highlighted, corporate infographic, flat icons, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `Matrix rows fade in sequentially, criteria icons subtle appearance`
**Screen text**: `Cost` / `Performance` / `Compliance`

##### Scene 5 — Image prompt:

```text
Chapter card "Module 1 — Data Engineering Fundamentals", question text "What belongs in Silver vs Gold?", navy #01065c gradient background, American animation corporate training style, MHP blue-white palette, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Chapter card fade-in, question text subtle underline animation`
**Screen text**: `Module 1 — Data Engineering Fundamentals` · `What belongs in Silver vs Gold?`

---

### Module 2 — `mod-02-databricks.mp4`

**Title card**: *Module 2 — Databricks Pipeline*
**Pipeline length**: ~2:52 · **11 production shots** · [shot plan](module-shot-plan.md#module-02--databricks-pipeline)

| Beat | Shots | Editorial choice |
|------|-------|------------------|
| 1 — Bob's pitch | 1–2 | Dual VO (Bob + Elena) single clip + meeting B-roll |
| 2 — Sofia mentors | 3–4 | Single VO + notebook B-roll |
| 3 — Pipeline montage | 5–7 | **Split montage**: narrator lists layers → B-roll cards → Bob punchline |
| 4 — Priya Overview | 8–9 | Single VO + chart B-roll |
| 5 — Transition | 10–11 | Chapter card VO + hold |

#### Voiceover Script

##### Scene 1 — Bob's pitch (0:00–0:30)

**Visual**: Tool evaluation room. Bob raises hand. Databricks logo glows.

**Bob (VO)**:

> Elena, I've worked with PySpark before. Databricks is powerful and popular for lakehouse workloads. Can we prototype ingest and transforms there first?

**Elena (VO)**:

> Approved — for scale and flexibility. Sofia will pair with you. Same medallion design: Bronze, Silver, Gold. Same KPIs Priya needs.

##### Scene 2 — Sofia mentors (0:30–0:50)

**Visual**: Sofia and Bob at notebook. ADLS2 path visible.

**Sofia (VO)**:

> Read Parquet directly from Azure ADLS2. Land it in Delta Bronze. Don't skip raw — Marcus may ask us to reprocess December when January looks wrong.

##### Scene 3 — Pipeline montage (0:50–1:40)

**Visual**: Fast cuts — `spark.read.parquet` → Bronze Delta table → Silver filters and joins → Gold aggregations. Unity Catalog labels appear.

**Narrator (VO)**:

> Bronze: external Parquet into Delta Lake. Silver: filter invalid fares, compute trip duration, enrich with borough and zone. Gold: twelve KPI tables — trips by hour, by day, revenue bands, efficiency metrics.

**Bob (VO)**:

> One dataset. One architecture. Databricks executes the first full path.

##### Scene 4 — Priya's Overview page (1:40–2:10)

**Visual**: Power BI Overview page animates — line chart fills (trips by hour), bar chart (by day), KPI cards pulse.

**Priya (VO)**:

> Bob, I just connected `kpi_trips_by_hour` and `kpi_trips_by_day`. My Overview page is alive. Marcus can finally see peak hours — but I still need maps and revenue pages from the next layers.

##### Scene 5 — Transition (2:10–2:30)

**Narrator (VO)**:

> Module 2: your hands-on Databricks lab. Think first: how would you ingest Parquet at scale? What breaks if you skip Bronze? Then open the notebooks and build.

**Title card**: *Module 2 — Databricks: Bronze → Silver → Gold*

#### LibTV CLI Production — Module 2

**Module Master prompt**:

```text
MODULE: 2 — Databricks Pipeline
DURATION: 2:52
SCENES: 11
NARRATIVE_BEATS: 5
TAGLINE: "Module 2 — Databricks: Bronze → Silver → Gold"
STORY: Bob pitches Databricks. Sofia mentors on ADLS2 ingest. Pipeline montage. Priya's Overview page fills. Lab transition.
```

##### Scene 1 — Image prompt:

```text
Meeting room, Bob raising hand, Databricks logo with soft glow, Elena approving nod, 2D illustrated avatars, corporate training scene, navy #01065c and yellow #FBBF24, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Bob raising hand gesture, logo gentle glow pulse, Elena nodding`
**Screen text**: `Bob Muller · Junior Data Engineer`

##### Scene 2 — Image prompt:

```text
Notebook mockup showing ADLS2 path abfss://, Sofia and Bob side by side, Delta Lake Bronze label, UI device frame, 2D training style, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=5` · `Code lines subtle highlight scroll, Bronze label fade-in`
**Screen text**: `Sofia Alvarez · Senior Data Engineer`

##### Scene 3 — Image prompt:

```text
Training montage: spark.read.parquet arrow, Bronze Delta table, Silver filter and join, Gold twelve KPI tables, Unity Catalog label, kinetic but readable motion graphics, 2D flat illustration, navy and yellow palette, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=12` · `Montage hard cuts 3-5 seconds per card, pipeline arrows flow right, consistent color palette`
**Screen text**: `Bronze` → `Silver` → `Gold`

##### Scene 4 — Image prompt:

```text
Composite: real Power BI Overview screenshot in rounded device frame, trips-by-hour line chart filling, trips-by-day bar chart, KPI cards turning green pulse, Priya avatar presenting, do NOT AI-generate dashboard text, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Chart data subtle fill animation (or light pan if static screenshot), Priya gesture`

##### Scene 5 — Image prompt:

```text
Chapter card "Module 2 — Databricks: Bronze → Silver → Gold", pipeline arrow animation, navy #01065c background, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=5` · `Chapter card + arrow flow left to right`
**Screen text**: `Module 2 — Databricks: Bronze → Silver → Gold`

---

### Module 3 — `mod-03-snowflake.mp4`

**Title card**: *Module 3 — Snowflake Pipeline*
**Pipeline length**: ~2:56 · **11 production shots** · [shot plan](module-shot-plan.md#module-03--snowflake-pipeline)
**Mandatory tagline**: *Same architecture. Different implementation philosophy.*

| Beat | Shots | Editorial choice |
|------|-------|------------------|
| 1 — Marcus pushes back | 1–3 | **Split dialogue**: Marcus concern → B-roll → Elena Snowflake path |
| 2 — Same KPIs | 4–5 | Narrator + Bob dual VO + schema B-roll |
| 3 — Snowflake montage | 6–7 | Sofia single VO + worksheet B-roll |
| 4 — Priya Map | 8–9 | Single VO + map B-roll |
| 5 — Transition | 10–11 | Chapter card VO + hold |

#### Voiceover Script

##### Scene 1 — Marcus pushes back (0:00–0:35)

**Visual**: Marcus reviews Databricks dashboard printout. SQL team members in background.

**Marcus (VO)**:

> Bob, this works. I believe the numbers. But my team lives in SQL. They're not going to maintain PySpark notebooks after MHP leaves. Is there a path with less programming — something **they** can extend?

**Elena (VO)**:

> Fair constraint. Bob — rebuild the same medallion pipeline on Snowflake. SQL-first. Snowpark optional for engineers who want Python.

##### Scene 2 — Same KPIs, new engine (0:35–1:05)

**Visual**: Side-by-side — Databricks Gold schema | Snowflake Gold schema — columns align. End-card text fades in: ***Same architecture. Different implementation philosophy.***

**Narrator (VO)**:

> Identical business logic. Identical Gold table names. Different platform — external stage from ADLS2, `COPY INTO` Bronze, SQL transforms for Silver and Gold. **Same architecture. Different implementation philosophy.**

**Bob (VO)**:

> Marcus's analysts can read every transform in a worksheet. That's the maintainability test.

##### Scene 3 — Snowflake montage (1:05–1:35)

**Visual**: Worksheet tabs — stage setup, Bronze load, Silver CTAS, Gold KPI scripts.

**Sofia (VO)** *(brief)*:

> Don't reinvent the rules — port the Silver filters from PySpark line by line. Consistency matters for Priya's dashboard.

##### Scene 4 — Priya's Map page (1:35–2:05)

**Visual**: Power BI Map page — borough filled map colors by revenue; bubble map for top zones.

**Priya (VO)**:

> I pointed the same Power BI report at Snowflake Gold. `kpi_borough_analysis`, `kpi_top_pickup_zones` — the Map page works unchanged. Identical KPIs. Different engine. That's the point.

##### Scene 5 — Transition (2:05–2:30)

**Narrator (VO)**:

> Module 3: Snowflake lab. Discuss: who maintains this after MHP? How do you load Parquet without Spark? Then build Bronze → Silver → Gold in SQL.

**Title card**: *Module 3 — Snowflake: SQL the team can own*

#### LibTV CLI Production — Module 3

**Module Master prompt**:

```text
MODULE: 3 — Snowflake Pipeline
DURATION: 2:56
SCENES: 11
NARRATIVE_BEATS: 5
ENDING TAGLINE: "Same architecture. Different implementation philosophy."
STORY: Marcus demands SQL maintainability. Databricks and Snowflake Gold schemas align side by side. Snowflake SQL montage. Priya's Map page works unchanged. Lab transition.
```

##### Scene 1 — Image prompt:

```text
Marcus reviewing printed dashboard, SQL team silhouettes in background, speech bubble "My team lives in SQL", corporate boardroom 2D illustration, navy #01065c and yellow #FBBF24, NOT photorealistic, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `Marcus flipping paper, speech bubble fade-in, slight zoom in`

##### Scene 2 — Image prompt:

```text
Side by side: Databricks Gold schema and Snowflake Gold schema, column names aligned and highlighted, ending text card "Same architecture. Different implementation philosophy.", Snowflake blue-white training aesthetic, 2D flat, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Two schema columns sync scroll alignment, highlighted columns pulse, ending tagline card fade-in hold 5 seconds`
**Screen text**: `Same architecture. Different implementation philosophy.`

##### Scene 3 — Image prompt:

```text
Snowflake worksheet tabs: external stage ADLS2, COPY INTO Bronze, Silver CTAS, Gold KPI SQL, clean UI mock animation, 2D training style, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Tabs subtle horizontal pan, SQL lines gentle highlight`

##### Scene 4 — Image prompt:

```text
Composite Power BI Map page: borough filled map colored by revenue, bubble map for top zones, labeled kpi_borough_analysis and kpi_top_pickup_zones, Priya presenting, real screenshot in device frame, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Map subtle pan, annotation callouts fade-in`

##### Scene 5 — Image prompt:

```text
Chapter card "Module 3 — Snowflake: SQL the team can own", Snowflake logo subtle not dominant, 2D training style, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=5` · `Chapter card fade-in`
**Screen text**: `Module 3 — Snowflake: SQL the team can own`

---

### Module 4 — `mod-04-dbt.mp4`

**Title card**: *Module 4 — dbt Pipeline*
**Pipeline length**: ~3:13 · **12 production shots** · [shot plan](module-shot-plan.md#module-04--dbt-pipeline)

| Beat | Shots | Editorial choice |
|------|-------|------------------|
| 1 — Audit requirement | 1–3 | **Split dialogue**: Marcus board ask → B-roll → Elena dbt answer |
| 2 — dbt lineage | 4–6 | **Split montage**: Bob DAG VO → B-roll → narrator tagline |
| 3 — Priya revenue | 7–8 | Single VO + lineage B-roll |
| 4 — Portability | 9–10 | Elena single VO + dual-target B-roll |
| 5 — Transition | 11–12 | Chapter card VO + hold |

#### Voiceover Script

##### Scene 1 — Audit requirement (0:00–0:35)

**Visual**: Boardroom. Compliance icon. Marcus with printed dashboard.

**Marcus (VO)**:

> Priya's dashboard is exactly what I wanted. But my board asked a harder question: *Where does each number come from?* We need documentation. Data lineage. And tests — so an analyst's Friday edit doesn't break Monday's revenue tile silently.

**Elena (VO)**:

> We keep Snowflake as the engine. We add **dbt** for transformations — SQL models, automated tests, and generated docs. Bob, dbt is not a replacement warehouse. It's how we govern the transform layer.

##### Scene 2 — dbt lineage (0:35–1:15)

**Visual**: dbt DAG animates — staging → silver → gold. `dbt test` checkmarks. Docs site lineage graph.

**Bob (VO)**:

> Every Gold model references Silver with `ref()`. Tests catch null keys and negative fares. `dbt docs generate` shows Marcus's auditors the full chain — source Parquet to dashboard tile.

**Narrator (VO)**:

> dbt runs on Snowflake. Same KPIs. Added discipline.

##### Scene 3 — Priya's revenue & quality pages (1:15–1:50)

**Visual**: Power BI Revenue page and quality scorecard. Animated lineage arrow from tile → dbt model → Silver table.

**Priya (VO)**:

> Revenue by hour, payment type breakdown — connected. And `kpi_data_quality_metrics` feeds my quality scorecard. When Marcus asks "can we trust this data?", I show him the number **and** the lineage behind it.

##### Scene 4 — Portability note (1:50–2:15)

**Visual**: dbt project with two targets — Snowflake and Databricks badges.

**Elena (VO)**:

> The same dbt models can target Databricks or Snowflake. Today we focus on Snowflake. The pattern travels with you.

##### Scene 5 — Transition (2:15–2:45)

**Narrator (VO)**:

> Module 4: dbt lab. Discuss: how do you prove where a KPI comes from? What tests would you add? Then run models, tests, and generate the docs Marcus's board wants.

**Title card**: *Module 4 — dbt: Transform with lineage*

#### LibTV CLI Production — Module 4

**Module Master prompt**:

```text
MODULE: 4 — dbt Pipeline
DURATION: 3:13
SCENES: 12
NARRATIVE_BEATS: 5
TAGLINE: "Module 4 — dbt: Transform with lineage"
STORY: Board audit requirement. dbt DAG lineage and tests. Priya's Revenue and quality pages. dbt dual-target portability. Lab transition.
```

##### Scene 1 — Image prompt:

```text
Boardroom, Marcus holding printed dashboard, compliance shield icon, Elena introducing dbt layer on top of Snowflake not replacing, corporate training 2D illustration, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `Shield icon fade-in, Elena teaching gesture`

##### Scene 2 — Image prompt:

```text
dbt DAG animation: staging to silver to gold, ref() arrows, dbt test green checkmarks, docs lineage graph, data governance infographic Snowflake style, 2D flat, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `DAG nodes light up sequentially, arrows flow, green checkmarks pop-in`

##### Scene 3 — Image prompt:

```text
Composite Power BI Revenue page and quality scorecard, animated lineage arrow from tile to dbt model to Silver table, trust narrative visual, real screenshots, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `Lineage arrow draw-on animation, tile subtle highlight`

##### Scene 4 — Image prompt:

```text
dbt project diagram with dual targets: Snowflake and Databricks badges, same models feeding two engines, split screen motion, 2D corporate training, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=5` · `Two engine badges symmetric fade-in, center models box pulse`

##### Scene 5 — Image prompt:

```text
Chapter card "Module 4 — dbt: Transform with lineage", navy #01065c gradient, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Chapter card fade-in`
**Screen text**: `Module 4 — dbt: Transform with lineage`

---

### Module 5 — `mod-05-production.mp4`

**Title card**: *Module 5 — Production Patterns*
**Pipeline length**: ~1:58 · **8 production shots** · [shot plan](module-shot-plan.md#module-05--production-patterns)

| Beat | Shots | Editorial choice |
|------|-------|------------------|
| 1 — Nightly question | 1–3 | **Split dialogue**: Elena questions → B-roll → Marcus one-liner |
| 2 — Platform patterns | 4–6 | Narrator list VO → B-roll → Bob punchline |
| 3 — Transition | 7–8 | **Split visual**: narrator chapter card VO → checklist B-roll (no combined clip) |

#### Voiceover Script

##### Scene 1 — The nightly question (0:00–0:35)

**Visual**: Empty office at night. Calendar flips. Pipeline icons idle.

**Elena (VO)**:

> The pipeline works when you're in the room. Marcus needs it to work when you're not. What runs every night? What happens when Silver fails at two a.m.? Who gets paged?

**Marcus (VO)**:

> I don't need heroes. I need schedules, retries, and alerts.

##### Scene 2 — Platform patterns (0:35–1:20)

**Visual**: Databricks Workflow graph → Snowflake Task chain → GitHub Actions running `dbt build`.

**Narrator (VO)**:

> Databricks Workflows and Delta Live Tables. Snowflake Tasks and Streams. dbt in CI/CD with pull request checks. Incremental loads instead of full rewrites. Idempotent jobs — safe to rerun.

**Bob (VO)**:

> Production isn't more code. It's the same code — with reliability wrapped around it.

##### Scene 3 — Transition (1:20–2:15)

**Visual**: Checklist animates — schedule, monitor, alert, version control.

**Narrator (VO)**:

> Module 5: production patterns. Discuss: what breaks in production that never breaks in class? How would you schedule tonight's run on each platform?

**Title card**: *Module 5 — Production: Run without MHP in the room*

#### LibTV CLI Production — Module 5

**Module Master prompt**:

```text
MODULE: 5 — Production Patterns
DURATION: 1:58
SCENES: 8
NARRATIVE_BEATS: 3
TAGLINE: "Module 5 — Production: Run without MHP in the room"
STORY: Nighttime reliability question. Platform scheduling patterns. Checklist transition.
```

##### Scene 1 — Image prompt:

```text
Empty office at night, calendar flipping, pipeline icons idle, Elena and Marcus 2D avatars, calm not scary, clock showing 2am, corporate training illustration, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `Clock hands slow move, calendar flip one page, lights subtle dim`

##### Scene 2 — Image prompt:

```text
Three-column motion graphics: Databricks Workflow diagram, Snowflake Task chain, GitHub Actions running dbt build, incremental load icons, idempotent job badges, 2D infographic, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=12` · `Three columns reveal sequentially, workflow nodes subtle pulse`

##### Scene 3 — Image prompt:

```text
Animated checklist: schedule, monitor, alert, version control, chapter card "Module 5 — Production: Run without MHP in the room", navy #01065c and yellow #FBBF24, 2D flat, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=15` · `Checklist items check off sequentially with fade-in, chapter card appears last`
**Screen text**: `Module 5 — Production: Run without MHP in the room`

---

### Module 6 — `mod-06-ai.mp4`

**Title card**: *Module 6 — AI Features*
**Pipeline length**: ~2:06 · **11 production shots** · [shot plan](module-shot-plan.md#module-06--ai-features)

| Beat | Shots | Editorial choice |
|------|-------|------------------|
| 1 — Marcus AI question | 1–3 | **Split dialogue**: Marcus ask → B-roll → Elena governed-AI response |
| 2 — AI demos | 4–9 | **Triple split**: Priya → James → narrator, each VO + card B-roll |
| 3 — Transition | 10–11 | Chapter card VO + medallion B-roll |

#### Voiceover Script

##### Scene 1 — Marcus's AI question (0:00–0:30)

**Visual**: Marcus watches Priya filter a dashboard. Speech bubble with question mark.

**Marcus (VO)**:

> Priya answers my questions in minutes now. But my analysts ask dozens of questions a day. Can AI help them explore data faster — without breaking what Bob built?

**Elena (VO)**:

> AI can accelerate exploration — not replace your medallion pipeline or Priya's KPI definitions.

##### Scene 2 — AI demos montage (0:30–1:20)

**Visual**: Natural language query → suggested SQL. Copilot-style assist in worksheet. Genie-style dashboard hint.

**Priya (VO)**:

> I still own the dashboard. AI might help James draft SQL against Silver — under human review.

**James (VO)**:

> If AI writes the query, I still validate against Gold. Garbage in, garbage out — AI doesn't fix bad Bronze.

**Narrator (VO)**:

> Cortex, Genie, Copilot — assistants on top of a governed stack. Phase 1.5, not a shortcut past Bronze, Silver, and Gold.

##### Scene 3 — Transition (1:20–2:15)

**Visual**: AI icon sits above medallion diagram — not replacing it.

**Narrator (VO)**:

> Module 6: AI features. Discuss: where is AI useful in this project? Where would you not trust it? Then try the exercises — always anchored to the pipeline you built.

**Title card**: *Module 6 — AI: Augment, don't replace*

#### LibTV CLI Production — Module 6

**Module Master prompt**:

```text
MODULE: 6 — AI Features
DURATION: 2:06
SCENES: 11
NARRATIVE_BEATS: 3
TAGLINE: "Module 6 — AI: Augment, don't replace"
STORY: Marcus asks about AI for analysts. Demo montage within governed stack. AI sits above medallion, not replacing it.
```

##### Scene 1 — Image prompt:

```text
Marcus watching Priya filter dashboard, speech bubble with question mark, corporate office 2D illustration, navy #01065c and yellow #FBBF24, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Marcus pointing at screen, question bubble pop-in`

##### Scene 2 — Image prompt:

```text
Three-card montage: natural language query to SQL suggestion, Copilot-style worksheet assist, Genie-style dashboard hint, all within governed medallion frame, purple #7c3aed AI accent, 2D training style, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=12` · `Montage three cards hard cut, AI accent purple pulse`

##### Scene 3 — Image prompt:

```text
AI icon floating above Bronze Silver Gold medallion stack, NOT replacing it, chapter card "Module 6 — AI: Augment, don't replace", 2D flat, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=15` · `AI icon gentle float, medallion layers static, chapter card fade-in`
**Screen text**: `Module 6 — AI: Augment, don't replace`

---

### Module 7 — `mod-07-wrapup.mp4`

**Title card**: *Module 7 — Dashboard Payoff & Tool Choice*
**Pipeline length**: ~4:08 · **18 production shots** · [shot plan](module-shot-plan.md#module-07--dashboard-payoff--tool-choice)
**Opening**: *Six months later — YellowLine NYC HQ.*
**Closing**: *Technology is a decision. Architecture is responsibility.*
**Trainer pause**: After “Write your recommendation” VO (~3s on-screen cue) and before open discussion — **pause playback**; do not extend video for writing or debate.

| Beat | Shots | Editorial choice |
|------|-------|------------------|
| 0 — Time jump | 0–1 | Narrator VO + skyline B-roll |
| 1 — Full dashboard | 2–6 | **Split monologue**: Priya Overview/Map → pages B-roll → Priya Time/Revenue/Efficiency → B-roll → Marcus reaction |
| 2 — Production question | 7–9 | **Split dialogue**: Marcus ask → B-roll → Elena defer-to-trainees |
| 3 — Recap montage | 10–11 | Narrator VO + flashback B-roll |
| 4 — Silent prompt | 12–13 | Narrator cue VO + **3s** on-screen text (`trainerCue`); trainer pauses for writing |
| 5 — Open discussion | 14–15 | Narrator cue VO + **3s** whiteboard (`trainerCue`); trainer pauses for debate |
| 6 — Closing | 16–17 | Elena + narrator dual VO + title-card B-roll |

#### Voiceover Script

##### Scene 0 — Time jump (0:00–0:15)

**Visual**: Fade-up on a desk calendar flipping pages. Caption text: ***Six months later — YellowLine NYC HQ.*** NYC skyline through window. The MHP team has gone; Marcus and his SQL analysts run the platform.

**Narrator (VO)** *(quiet, reflective)*:

> Six months later. MHP's engagement is over. YellowLine NYC runs the platform now — SQL analysts on rotation, dashboards refreshing nightly. Let's see what stuck.

##### Scene 1 — Full dashboard (0:15–1:10)

**Visual**: Priya presents to Marcus and Elena (recorded recap; framed as the hand-off she gave six months ago). Screen cycles through five Power BI pages — all green, all live.

**Priya (VO)**:

> Marcus — your operations dashboard. Overview: trips and revenue by hour and day. Map: borough performance and top pickup zones. Time analysis: rush hours and heatmaps. Revenue and payments: fare breakdown and card versus cash. Efficiency: distance bands, speed, revenue per minute.
>
> Twelve Gold KPI tables. One schema. Built by Bob across three tool paths — consumed here in Power BI.

**Marcus (VO)**:

> This is what I asked for on day one. Now I need the harder answer.

##### Scene 2 — The production question (1:10–1:45)

**Visual**: Three platform icons — Databricks, Snowflake, dbt — on evaluation board.

**Marcus (VO)**:

> You proved they all work. What should YellowLine NYC **run in production**? What can my SQL team maintain? What would **you** choose — and why?

**Elena (VO)**:

> There isn't one universal answer. Platform, transform layer, and dashboard are separate decisions. Today **you** recommend — not me.

##### Scene 3 — Recap montage (1:45–2:30)

**Visual**: Quick flashbacks — Databricks notebook, Snowflake worksheet, dbt lineage, Priya's charts.

**Narrator (VO)**:

> One NYC Taxi dataset. Three implementations. Medallion architecture throughout. Priya's Power BI at the finish line. You built as Bob. Now think as the architect.

##### Scene 4 — Silent prompt (2:30–2:50)

**Visual**: Classroom. Text: *Write your recommendation.*

**Narrator (VO)** *(quiet, direct)*:

> Take a moment. Which stack would you recommend for YellowLine NYC? One sentence why. You'll discuss with the room next.

##### Scene 5 — Open discussion (2:50–3:20)

**Visual**: Diverse trainee silhouettes in discussion. Whiteboard with comparison columns.

**Narrator (VO)**:

> Compare Databricks, Snowflake, and dbt. Defend your choice. Challenge your classmates. Revisit what you designed this morning — what would you change now?

##### Scene 6 — Closing (3:20–3:45)

**Visual**: Slow zoom on a single title card.

**Title card sequence** (3 beats, ~5 s each):

1. *MHP Data Engineer Masterclass 2026 — Choose with evidence.*
2. *Three constraints — Cost · Performance · Compliance.*
3. ***Technology is a decision. Architecture is responsibility.***

**Elena (VO)** *(brief, over card 1)*:

> The best engineers don't pick tools from hype. They pick from constraints, skills, and proof.

**Narrator (VO)** *(over final card, deliberate)*:

> Technology is a decision. Architecture is responsibility.

**Music**: Fade out.

#### LibTV CLI Production — Module 7

**Module Master prompt**:

```text
MODULE: 7 — Dashboard Payoff & Tool Choice
DURATION: 4:08
SCENES: 18 (including Scene 0 time jump)
NARRATIVE_BEATS: 7
OPENING: "Six months later — YellowLine NYC HQ."
CLOSING: "Technology is a decision. Architecture is responsibility."
STORY: Time jump. Priya's full dashboard handoff. Marcus's production selection question. Previous module flashbacks. Silent writing. Discussion. Three-beat closing cards.
SPECIAL: Scene 4 (2:30-2:50) NO background music
```

##### Scene 0 — Image prompt:

```text
Desk calendar pages flipping rapidly, caption text "Six months later — YellowLine NYC HQ", NYC skyline through window, MHP team gone, quiet reflective tone, 2D illustration, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=5` · `Calendar rapid flip 3-4 pages then stop, caption fade-in hold`
**Screen text**: `Six months later — YellowLine NYC HQ`

##### Scene 1 — Image prompt:

```text
Composite montage: five Power BI pages Overview Map Time Revenue Efficiency all green live, Priya presenting to Marcus handoff moment, real screenshots in device frames, optional flashback border, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=15` · `Five pages hard cut 10-12 seconds each, charts subtle pulse`

##### Scene 2 — Image prompt:

```text
Evaluation board with three platform icons: Databricks, Snowflake, dbt, Marcus asking production selection, Elena deferring to trainees, 2D boardroom illustration, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `Three icons highlight pulse sequentially, Marcus gesture`

##### Scene 3 — Image prompt:

```text
Quick flashback montage: Databricks notebook, Snowflake worksheet, dbt lineage graph, Priya's charts, split-second cuts, navy and yellow palette consistent, previous module visual callbacks, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `Montage 1-2 seconds per card hard cut`

##### Scene 4 — NO MUSIC — Image prompt:

```text
Classroom scene with large text "Write your recommendation", minimal composition, no character movement, 2D training style, generous whitespace, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=5` · `Text static hold, almost no motion, only subtle fade-in, NO background music`
**Screen text**: `Write your recommendation`

##### Scene 5 — Image prompt:

```text
Trainee silhouettes in discussion, whiteboard with comparison columns Databricks Snowflake dbt, collaborative training scene, 2D illustration, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Whiteboard columns fade-in, silhouettes subtle nodding`

##### Scene 6 — Image prompt:

```text
Three sequential title cards each 5 seconds: (1) "Choose with evidence" (2) "Three constraints: Cost Performance Compliance" (3) "Technology is a decision. Architecture is responsibility.", slow zoom, music fade out, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Three cards hard cut each hold 5 seconds, slight slow zoom, final card hold to end`
**Screen text** (three cards):
1. `MHP Data Engineer Masterclass 2026 — Choose with evidence.`
2. `Three constraints — Cost · Performance · Compliance.`
3. `Technology is a decision. Architecture is responsibility.`

---

### Module 8 — `mod-08-streaming.mp4` *(Optional)*

**Title card**: *Module 8 — Streaming: When Minutes Matter*
**Target length**: ~2:45
**Delivery**: Advanced session after main day — **Modules 2–3 required**, Module **4 recommended**

#### Voiceover Script

##### Scene 1 — Marcus escalates (0:00–0:35)

**Visual**: Dispatch map with stale timestamp. Batch job icon shows "last run: 6 hours ago."

**Marcus (VO)**:

> The dashboard Bob and Priya built is perfect — for **yesterday**. But when a concert lets out in Brooklyn, I need to know **now**, not at midnight. Can we see demand as it happens?

**Elena (VO)**:

> Phase 2: streaming. Different latency, different cost, different failure modes. We only build it if the business truly needs sub-hour answers.

##### Scene 2 — Batch vs streaming (0:35–1:05)

**Visual**: Split clock — batch calendar vs continuous event stream.

**Narrator (VO)**:

> Batch reads a snapshot on a schedule. Streaming processes events as they arrive — seconds to minutes of latency. Streaming adds checkpoints, watermarks, and always-on compute. Most teams still need batch. Some need both.

**James (VO)**:

> Rule of thumb: if Marcus can wait an hour, keep batch. If empty taxis in the wrong zone cost money within ten minutes, consider streaming.

##### Scene 3 — Kafka intro (1:05–1:30)

**Visual**: Aiven Kafka topic with two partitions; events flow as Avro messages.

**Sofia (VO)**:

> Events land on a Kafka topic — partitioned for parallel consumers. Offsets track progress. Databricks Structured Streaming reads directly. Snowflake often uses a relay to files, then Snowpipe — same medallion idea, different ingest path.

##### Scene 4 — Training proxy dataset (1:30–1:50)

**Visual**: Label: *Training: Aiven User Activity* overlaid on YellowLine NYC map ghosted out.

**Elena (VO)**:

> For this workshop we stream simulated user-activity events from Aiven — not live taxi GPS. The architecture patterns are the same: Bronze append, Silver clean, Gold windowed aggregates.

**Bob (VO)**:

> I'll build streaming Bronze, Silver, and Gold on Databricks and Snowflake — and dbt dynamic tables on Snowflake.

##### Scene 5 — Priya's live dashboard (1:50–2:15)

**Visual**: Power BI DirectQuery page — bar chart by country updates; "Last refreshed" card ticks.

**Priya (VO)**:

> Import mode is for batch. For streaming Gold, I use DirectQuery with one-minute page refresh — aligned to Snowflake Dynamic Table lag. When events stop, the chart flatlines. When they resume, Marcus sees it within minutes.

##### Scene 6 — Transition (2:15–2:45)

**Narrator (VO)**:

> Module 8: optional advanced lab. Think first: does Marcus really need streaming? Then Kafka, Structured Streaming, Dynamic Tables, and live Power BI.

**Title card**: *Module 8 — Streaming (Optional)*

#### LibTV CLI Production — Module 8

**Module Master prompt**:

```text
MODULE: 8 — Streaming (Optional)
DURATION: 2:45
SCENES: 6
TAGLINE: "Module 8 — Streaming (Optional)"
STORY: Marcus needs live dispatch. Batch vs stream comparison. Kafka intro. Aiven training proxy honestly labeled. Priya DirectQuery. Lab transition.
```

##### Scene 1 — Image prompt:

```text
Dispatch map with stale timestamp, batch job icon "last run 6 hours ago", Marcus frustrated, live demand narrative, 2D training illustration, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `Timestamp red pulse, Marcus frown gesture`

##### Scene 2 — Image prompt:

```text
Split screen clock: batch calendar vs continuous event stream, comparison infographic, corporate training diagram, 2D flat, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Left right comparison arrows animate, stream side continuous flow`

##### Scene 3 — Image prompt:

```text
Kafka topic with two partitions, events flowing as Avro messages, simplified technical diagram not cluttered, 2D training style, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=5` · `Event dots flow along partition lines`

##### Scene 4 — Image prompt:

```text
Label overlay "Training: Aiven User Activity" on ghosted YellowLine NYC map, honesty badge workshop dataset, 2D flat, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=5` · `Label fade-in hold, map ghosted static`
**Screen text**: `Training: Aiven User Activity`

##### Scene 5 — Image prompt:

```text
Power BI DirectQuery page: bar chart by country updating, "Last refreshed" card ticking every one minute, real or mock screenshot, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=5` · `Refresh time tick animation, bar subtle change`

##### Scene 6 — Image prompt:

```text
Chapter card "Module 8 — Streaming (Optional)", navy #01065c gradient, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Chapter card fade-in`
**Screen text**: `Module 8 — Streaming (Optional)`

---

### Module 9 — `mod-09-ml.mp4` *(Optional)*

**Title card**: *Module 9 — Machine Learning: Predict the Tip*
**Target length**: ~2:45
**Delivery**: Advanced session — **Modules 2–3 required**, Module **4 recommended**. Deliver **after Module 8** in story order.

#### Voiceover Script

##### Scene 1 — Marcus's new ask (0:00–0:35)

**Visual**: Priya's Revenue dashboard. Marcus highlights credit-card tip variance by hour.

**Marcus (VO)**:

> Priya showed me when we earn — but not **why** some trips tip well and others don't. Can we predict tips on card trips? Maybe adjust incentives before the evening rush?

**Elena (VO)**:

> Phase 2: machine learning. Same NYC Taxi data you already cleaned in Silver — new question, not a new dataset. Different skills, different tools, same medallion discipline.

##### Scene 2 — Where ML sits (0:35–1:05)

**Visual**: Lifecycle diagram — Silver branches to Gold KPIs and to ML feature table → model → predictions Gold → Power BI.

**Narrator (VO)**:

> Data engineers own Bronze through Silver — and often the **feature table**. Data scientists train models. BI consumes predictions from Gold, just like KPIs. Today you play both roles.

**James (VO)**:

> The target is `tip_amount` on credit-card trips. But watch **leakage** — if your features include the answer, the model looks brilliant and fails in production.

##### Scene 3 — Leakage warning (1:05–1:25)

**Visual**: Red X on `total_amount` and `tip_percentage`. Green check on `fare_amount`, `trip_distance`, `pickup_hour`, borough.

**Sofia (VO)**:

> Never use `total_amount` — it already contains the tip. Filter to credit card only; cash trips record zero tip digitally and would fool the model.

##### Scene 4 — Three tool paths (1:25–1:55)

**Visual**: Split screen — Databricks notebook + MLflow, Snowflake SQL with `ML.FORECAST`, Snowpark Python, dbt feature model in Git.

**Bob (VO)**:

> Databricks: sklearn and MLflow — full flexibility. Snowflake Cortex: ML in pure SQL for analysts. Snowpark ML: Python training without moving data out of the warehouse.

**Elena (VO)**:

> And dbt? It does not train models. It defines and **tests** the feature table both platforms read. Always separate **features** from **training**.

##### Scene 5 — Priya's prediction view (1:55–2:15)

**Visual**: Power BI scatter — actual vs predicted tip; feature importance bar chart optional.

**Priya (VO)**:

> I'll add predicted tips beside actuals on a new page — fed from a Gold scoring table Bob batch-writes after training. Same connector, new table, same governance story.

##### Scene 6 — Transition (2:15–2:45)

**Narrator (VO)**:

> Module 9: optional ML lab. Think first: what is leakage? Who owns features? Then train on three paths and compare RMSE, effort, and who can maintain each approach.

**Title card**: *Module 9 — Machine Learning (Optional)*

#### LibTV CLI Production — Module 9

**Module Master prompt**:

```text
MODULE: 9 — Machine Learning (Optional)
DURATION: 2:45
SCENES: 6
TAGLINE: "Module 9 — Machine Learning (Optional)"
STORY: Tip prediction need. ML lifecycle within medallion. Leakage warning. Three tool paths. Priya's prediction view. Lab transition.
```

##### Scene 1 — Image prompt:

```text
Priya Revenue dashboard, Marcus highlighting tip variance by hour, prediction use case, corporate training 2D illustration, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=10` · `Hour bar chart highlight pulse, Marcus pointing`

##### Scene 2 — Image prompt:

```text
Lifecycle diagram: Silver branching to Gold KPIs and ML feature table to model to predictions Gold to Power BI, lifecycle infographic, 2D flat, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Branch arrows flow left to right, nodes light up sequentially`

##### Scene 3 — Image prompt:

```text
Warning card: red X on total_amount and tip_percentage, green checkmarks on fare_amount trip_distance pickup_hour borough, 2D training style, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=5` · `Red X pop-in, green checkmarks appear sequentially`

##### Scene 4 — Image prompt:

```text
Four-panel: Databricks MLflow, Snowflake Cortex ML SQL, Snowpark Python, dbt feature model Git, four-panel training graphic, 2D flat, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Four panels reveal sequentially, logos subtle pulse`

##### Scene 5 — Image prompt:

```text
Power BI scatter plot: actual vs predicted tip, optional feature importance bars, Gold scoring table callout, screenshot in device frame, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=5` · `Scatter dots subtle fade-in, callout fade-in`

##### Scene 6 — Image prompt:

```text
Chapter card "Module 9 — Machine Learning (Optional)", navy #01065c gradient, 16:9. All visible text on screen must be in English only. No Chinese characters.
```

**Video**: `-s duration=8` · `Chapter card fade-in`
**Screen text**: `Module 9 — Machine Learning (Optional)`

---

## 4. Audio & Subtitles

### 4.1 Voiceover Recording / TTS

**Primary path**: `node workshop-2026-v2/scripts/pipeline/produce-module.js XX --from-phase voiceover`

All modules use the **same voice lock** from `configs/characters.json` (see §1.3). Do not assign different voices per module or per clip.

| Method | Tool | Output | Priority |
|--------|------|--------|----------|
| **edge-tts** (recommended) | `produce-module.js` voiceover phase | Per-segment MP3 → padded per shot + SRT | ★ Primary |
| **Human recording** | External DAW | Per-module WAV, English | Alternative |
| **ElevenLabs** | ElevenLabs Studio | Per-module WAV export | Alternative |

**Do not use** LibTV / Minimax embedded clip audio for character dialogue — it will not match across modules.

**Minimax TTS prompt prefix** (paste before English voiceover text):

```text
Speak as a corporate e-learning narrator: clear, neutral American English, moderate pace, professional training video tone, no drama, suitable for data engineering workshop audience. Pronounce clearly: YellowLine NYC, Databricks, Snowflake, dbt, ADLS2, Power BI, Unity Catalog.
```

**Character voice prefixes** (separate audio nodes):

```text
Character voice — Marcus Chen, Operations Manager: pragmatic, slightly stressed but professional American English, short sentences, not theatrical.
```

```text
Character voice — Elena Vasquez, Data Architect: calm, authoritative, teaching tone, American English, measured pace.
```

**CLI upload**:

```bash
libtv upload "mod-{XX}-voiceover" -f ./audio/mod-{XX}.wav
```

### 4.2 Subtitle Rules

- **English only** — match voiceover script exactly
- Max **70 characters per line**, max 2 lines
- Preserve product names: YellowLine NYC, Power BI, Snowflake, Databricks, dbt, ADLS2
- Module 7 Scene 4 (2:30–2:50): **no background music**, subtitles only
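
The line-length rules above can be checked mechanically before subtitles are burned in. The following is a minimal sketch, not part of `produce-module.js`: the function names `split_into_cues` and `violations` are hypothetical, and it assumes cues are formed by simple word wrapping rather than by the timing logic the real pipeline uses.

```python
import textwrap

MAX_CHARS = 70  # per-line limit from the subtitle rules
MAX_LINES = 2   # per-cue limit from the subtitle rules


def split_into_cues(text: str) -> list[list[str]]:
    """Wrap voiceover text to 70-character lines and group them two per cue.

    Hypothetical helper: words are never split, so product names such as
    "Databricks" or "ADLS2" stay intact.
    """
    lines = textwrap.wrap(
        text,
        width=MAX_CHARS,
        break_long_words=False,
        break_on_hyphens=False,
    )
    return [lines[i:i + MAX_LINES] for i in range(0, len(lines), MAX_LINES)]


def violations(cue_lines: list[str]) -> list[str]:
    """Return human-readable rule violations for one subtitle cue."""
    problems = []
    if len(cue_lines) > MAX_LINES:
        problems.append(f"{len(cue_lines)} lines (max {MAX_LINES})")
    for line in cue_lines:
        if len(line) > MAX_CHARS:
            problems.append(f"{len(line)} characters (max {MAX_CHARS}): {line!r}")
    return problems
```

Running `violations` over the cues of an existing `.srt` file flags hand-edited lines that drifted past the limits; `split_into_cues` is only a starting point when a voiceover line has to be re-split.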

### 4.3 Music Rules

| Segment | Music |
|---------|-------|
| Standard scenes | Light corporate underscore, duck under voiceover (~−12 dB) |
| Module 7 Scene 4 (2:30–2:50) | **NO MUSIC** — silence for reflection |
| Module 7 Scene 6 closing | Music fade out with final title card |

### 4.4 edge-tts Voiceover Pipeline

**Tool**: Python `edge_tts` via `produce-module.js` (or fallback `prompts/generate-voiceover-srt.py` for mod-00 only).

**Canonical voice map** — `workshop-2026-v2/scripts/pipeline/configs/characters.json`:

| Key | Character | Voice ID |
|-----|-----------|----------|
| marcus | Marcus Chen | `en-US-ChristopherNeural` |
| elena | Elena Vasquez | `en-US-JennyNeural` |
| bob | Bob Muller | `en-US-GuyNeural` |
| sofia | Sofia Alvarez | `en-US-EmmaNeural` |
| priya | Priya Sharma | `en-US-AriaNeural` |
| james | James Okonkwo | `en-US-EricNeural` |
| narrator | Narrator | `en-US-AndrewNeural` |

**TTS settings (all modules, all segments)**:
- `rate="+0%"` (from `characters.json` → `ttsRate`)
- Same voice ID for each key in mod-00 through mod-07
- Per-shot padding: speech must fit inside `videoDuration`; the rest of the clip is padded with silence after the last word
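
A minimal sketch of the voice lock, assuming the map is available as a plain dictionary (copied here from the table above; the real pipeline reads it from `characters.json`, whose exact JSON layout is not shown in this document). `voice_for` and `synthesize` are hypothetical names. The `edge_tts.Communicate(...).save(...)` call is the package's documented Python entry point and needs network access.

```python
# Voice map copied from the table above (assumption: flat key -> voice ID).
VOICES = {
    "marcus": "en-US-ChristopherNeural",
    "elena": "en-US-JennyNeural",
    "bob": "en-US-GuyNeural",
    "sofia": "en-US-EmmaNeural",
    "priya": "en-US-AriaNeural",
    "james": "en-US-EricNeural",
    "narrator": "en-US-AndrewNeural",
}
TTS_RATE = "+0%"  # same rate for all modules and segments


def voice_for(key: str) -> str:
    """Look up the locked voice ID; unknown keys fail loudly, with no fallback."""
    try:
        return VOICES[key]
    except KeyError:
        raise KeyError(f"no voice locked for character key {key!r}") from None


async def synthesize(key: str, text: str, out_path: str) -> None:
    """Render one voiceover line to MP3 with edge-tts."""
    import edge_tts  # lazy import so the lookup works without the package

    await edge_tts.Communicate(text, voice_for(key), rate=TTS_RATE).save(out_path)
```

A one-off check could be run with `asyncio.run(synthesize("narrator", "Welcome to YellowLine NYC.", "narrator-test.mp3"))`. Failing on unknown keys, rather than defaulting to the narrator voice, keeps a typo in a config from silently changing a character's voice.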

**Pipeline steps** (`produce-module.js`):
1. Load `VOICES` from `characters.json` (voice lock — no per-module overrides)
2. For each scene, generate one MP3 per `voiceover[]` line with the mapped voice
3. Pad each scene track to `videoDuration`; concat scenes → `audio/mod-XX-*.mp3`
4. Write `mod-XX-*.srt` from the same lines and timings
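
Steps 3 and 4 can be sketched as follows. This is an illustration of the timing logic only, not the code in `produce-module.js`: the scene dictionaries, the `lines` field, and the per-line speech durations are hypothetical inputs (in practice the durations would be measured from the generated MP3s).

```python
def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def build_cues(scenes):
    """scenes: dicts with 'videoDuration' and 'lines' as (text, speech_seconds)."""
    cues, scene_start = [], 0.0
    for scene in scenes:
        cursor = scene_start
        for text, speech in scene["lines"]:
            cues.append((cursor, cursor + speech, text))
            cursor += speech
        spoken = cursor - scene_start
        if spoken > scene["videoDuration"]:
            raise ValueError(f"speech ({spoken:.1f}s) exceeds videoDuration")
        # Padding: the next scene starts after the full clip, not after the speech.
        scene_start += scene["videoDuration"]
    return cues


def to_srt(cues) -> str:
    """Serialize (start, end, text) cues as SRT text."""
    return "\n".join(
        f"{i}\n{srt_time(a)} --> {srt_time(b)}\n{text}\n"
        for i, (a, b, text) in enumerate(cues, start=1)
    )


scenes = [
    {"videoDuration": 10, "lines": [("First line.", 3.0), ("Second line.", 4.0)]},
    {"videoDuration": 8, "lines": [("Third line.", 2.5)]},
]
print(to_srt(build_cues(scenes)))
```

Because the scene offset advances by `videoDuration` rather than by the spoken length, subtitle timings stay aligned with the padded audio track and the concatenated video.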

**Output**: `workshop-2026-v2/audio/mod-XX-*.mp3` + `.srt`

**Notes**:
- Verify voices: `edge-tts --list-voices | grep en-US`
- After changing any voice in `characters.json`, re-run voiceover for **every** module
- Master prompts show `voice=` on each line — use for QA only; configs are source of truth

---

## 5. Export, QA & Batch Production

### 5.1 Export

**Recommended: ffmpeg composition** (supports burned-in subtitles, precise encoding control):

```bash
# 1. Download all scene video clips from LibTV (get URLs from libtv node <id>)
# 2. Create concat list
# concat_list.txt:
# file 'shot01.mp4'
# file 'shot02.mp4'
# ...

# 3. Concatenate video clips in shot order
ffmpeg -f concat -safe 0 -i concat_list.txt -c copy mod-{XX}-concat.mp4

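# 4. Overlay voiceover, scale to the 1920×1080 / 30 fps delivery spec, burn subtitles
#    (audio and .srt come from the §4.4 pipeline; adjust the two file names to its output)
ffmpeg -i mod-{XX}-concat.mp4 \
  -i audio/mod-{XX}-voiceover.mp3 \
  -vf "scale=1920:1080:flags=lanczos,subtitles=audio/mod-{XX}-subtitles.srt:force_style='FontName=Arial,FontSize=22,PrimaryColour=&H00FFFFFF,OutlineColour=&H00000000,Outline=2,Alignment=2'" \
  -map 0:v:0 -map 1:a:0 \
  -r 30 -pix_fmt yuv420p \
  -c:v libx264 -preset medium -crf 18 \
  -c:a aac -b:a 192k \
  -shortest \
  mod-{XX}-final.mp4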
```
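
The concat list from step 2 can be generated rather than typed by hand. A minimal sketch, assuming the clips follow the zero-padded `shotNN.mp4` naming used in the example above; `build_concat_list` and `write_concat_list` are hypothetical helpers.

```python
from pathlib import Path


def build_concat_list(clip_names):
    """Return ffmpeg concat-demuxer list text, one `file '...'` line per clip."""
    lines = []
    for name in sorted(clip_names):  # zero-padded names sort in shot order
        # Inside single quotes, a literal ' is written as '\''
        escaped = name.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def write_concat_list(shot_dir, out_name="concat_list.txt"):
    """Collect shot*.mp4 in shot_dir and write the list next to the clips."""
    shot_dir = Path(shot_dir)
    names = [p.name for p in shot_dir.glob("shot*.mp4")]
    if not names:
        raise FileNotFoundError(f"no shot*.mp4 clips in {shot_dir}")
    out_path = shot_dir / out_name
    out_path.write_text(build_concat_list(names), encoding="utf-8")
    return out_path
```

Sorting is lexicographic, so `shot2.mp4` would sort after `shot10.mp4`; keep the two-digit padding for every shot.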

**Alternative: LibTV canvas video-clip node** (quick preview, no burned-in subtitles):

```bash
# Connect all scene video nodes + voiceover audio → video-clip composition
# (Use canvas UI for timeline editing and final render)
```

Save to: `workshop-2026-v2/media/modules/mod-{XX}-slug.mp4`

Run the scaffold script to verify that the video embeds correctly in the Quarto module page:

```bash
python workshop-2026-v2/scripts/_scaffold_generate.py
```

### 5.2 QA Checklist

- [ ] 1920×1080, 30 fps, H.264
- [ ] Duration within ±5s of voiceover script target
- [ ] Subtitles and on-screen text **English only** (chapter cards, lower-thirds, chart labels, UI — **no Chinese characters**)
- [ ] Subtitle proofreading (Marcus Chen, YellowLine NYC, product name spellings)
- [ ] Six character turnaround assets created; every scene references the same turnaround node for a given character
- [ ] All brand logos uploaded and referenced via `logo-{name}` nodes (not re-uploaded per module)
- [ ] Character first appearance: lower-third in English
- [ ] No photorealistic faces (2D flat illustration)
- [ ] Video nodes have `enableSound=off` (voiceover is a separate audio track)
- [ ] `libtv node list` verified — no failed nodes (status=3) in any module group
- [ ] Storyboard nodes with characters have `modeType=image2image` set before linking references
- [ ] Stale reference check: all video nodes reference current (latest) storyboard images
- [ ] Skin tone descriptions present in all character-containing image prompts
- [ ] Per-shot video prompts are unique (not generic templates)
- [ ] Module 3 ending card: *Same architecture. Different implementation philosophy.*
- [ ] Module 7 opening: *Six months later...*; closing: *Technology is a decision. Architecture is responsibility.*
- [ ] Module 7 Scene 4: no background music
- [ ] Priya screenshots match current `powerbi/` exports
- [ ] File path: `media/modules/mod-XX-slug.mp4`
- [ ] Ran `scripts/_scaffold_generate.py`; preview module page embeds correctly
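
The first checklist item can be verified with `ffprobe` instead of by eye. A minimal sketch, assuming `ffprobe` is on `PATH`; `probe` and `spec_problems` are hypothetical helper names, and only the first video stream is inspected.

```python
import json
import subprocess
from fractions import Fraction


def probe(path: str) -> dict:
    """Return ffprobe's description of the first video stream."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=codec_name,width,height,r_frame_rate",
         "-of", "json", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)["streams"][0]


def spec_problems(stream: dict) -> list[str]:
    """Compare a stream description against the 1920x1080 / 30 fps / H.264 spec."""
    problems = []
    if (stream["width"], stream["height"]) != (1920, 1080):
        problems.append(f"resolution is {stream['width']}x{stream['height']}")
    if Fraction(stream["r_frame_rate"]) != 30:  # ffprobe reports e.g. "30/1"
        problems.append(f"frame rate is {stream['r_frame_rate']}")
    if stream["codec_name"] != "h264":
        problems.append(f"codec is {stream['codec_name']}")
    return problems
```

Usage: `spec_problems(probe("media/modules/mod-XX-slug.mp4"))` returns an empty list when the export matches the delivery spec.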

### 5.3 Batch Production Order

| Order | Module | Reusable assets |
|-------|--------|----------------|
| 1 | mod-00 | Character bible, medallion diagram, empty PBI wireframe, all logos |
| 2 | mod-01 | Medallion, characters, tool matrix |
| 3 | mod-02 | Pipeline diagram, Overview PBI screenshot |
| 4 | mod-03 | Gold schema comparison, Map PBI screenshot |
| 5 | mod-04 | dbt lineage DAG |
| 6 | mod-05 | Scheduling icons |
| 7 | mod-06 | Medallion + AI overlay |
| 8 | mod-07 | All PBI pages, previous module flashbacks |
| 9 | mod-08 | *(optional)* Kafka diagram |
| 10 | mod-09 | *(optional)* ML lifecycle diagram |

### 5.4 CLI Batch Example (Story end-to-end)

**Preferred**: §2.8 `produce-module.js` (config rows + API upload + storyboard bind).

**Manual CLI** (reference / debugging):

```bash
#!/usr/bin/env bash
set -euo pipefail  # Stop on error, undefined var, or pipe failure

# Prerequisites: login + project use
libtv login web
libtv project use <PROJECT_UUID>   # see .libtv/project.json

# Create module group and bind as default scope
libtv group create "mod-00-welcome"
libtv group use "mod-00-welcome"

# Upload reference assets
libtv upload "mod-00-style-ref" -f ./refs/mhp-style-guide.png
libtv upload "mod-00-pbi-wireframe" -f ./powerbi/wireframe-empty.png

# Script node: archive prompt + inject rows from config (do NOT --run master on Windows)
# node workshop-2026-v2/scripts/pipeline/upload-script-rows.js 00
libtv node create "mod-00-script" -t script \
  --prompt "$(cat workshop-2026-v2/prompts/mod-00-master.txt)" \
  -s model=aurora-3-prime

# Generate storyboard images (uses nebula-ultra)
# Auto-creates one image node per script row
libtv script storyboard "mod-00-script" \
  -s model=nebula-ultra -s ratio=16:9 -s quality=2K

# Verify storyboard images generated
libtv node list

# Generate video clips per scene (uses star-video2 / Seedance 2.0 VIP)
# Scene 1
libtv node create "mod-00-s1-video" -t video \
  --prompt "Subtle slow pan across dispatch center, stable camera, corporate documentary pace" \
  -s model=star-video2 -s modeType=singleImage2video \
  -s ratio=16:9 -s resolution=720p -s duration=10 -s enableSound=off \
  --left "mod-00-s1-image"
libtv node "mod-00-s1-video" --run

# Scene 2
libtv node create "mod-00-s2-video" -t video \
  --prompt "Heatmap subtle zoom to Manhattan, clock rotation, stable camera" \
  -s model=star-video2 -s modeType=singleImage2video \
  -s ratio=16:9 -s resolution=720p -s duration=8 -s enableSound=off \
  --left "mod-00-s2-image"
libtv node "mod-00-s2-video" --run

# ... repeat for scenes 3-7

# Upload voiceover (the §4.4 pipeline writes audio/mod-XX-*.mp3)
libtv upload "mod-00-voiceover" -f ./audio/mod-00-welcome.mp3

# Final composition: connect all video + audio → video-clip node
libtv node create "mod-00-final" -t video-clip
libtv node "mod-00-final" \
  --left "mod-00-s1-video" --left "mod-00-s2-video" \
  --left "mod-00-voiceover"
# Set clipTimelineData via canvas UI (clip ordering, I/O trims), then:
# libtv node "mod-00-final" --run

# Final verification
libtv node list
```

**Batch error handling notes**:
- `set -euo pipefail` at the top ensures the script stops on any failure
- If a `libtv` command exits with a non-zero status, the script halts at that line; run `libtv node list` to identify which node failed (status=3)
- Re-run only the failed node: `libtv node "<failed-node>" --run`
- Never rebuild the entire canvas for a single node failure
- To capture stderr for debugging: append `2>> pipeline.log` to the batch script invocation
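
Re-running only the failed nodes can be scripted. A minimal sketch under two assumptions: the failed node names are collected by hand from `libtv node list` (its output format is not parsed here), and `libtv` exits non-zero when a run fails. `rerun_nodes` is a hypothetical helper; the command it issues is the `libtv node "<failed-node>" --run` form from the notes above.

```python
import subprocess


def rerun_nodes(failed_nodes, run=subprocess.run, log_path="pipeline.log"):
    """Re-run each named node once; return the names that still fail."""
    still_failing = []
    # Append stderr to the log so earlier debugging output is preserved.
    with open(log_path, "a", encoding="utf-8") as log:
        for name in failed_nodes:
            result = run(["libtv", "node", name, "--run"], stderr=log)
            if result.returncode != 0:  # assumption: non-zero exit means failure
                still_failing.append(name)
    return still_failing
```

Usage: `rerun_nodes(["mod-00-s2-video"])`. The `run` parameter exists so the function can be exercised with a stub instead of the real CLI.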

---

## Appendix — Subtitle & Localization Notes

- Keep subtitles **≤ 42 characters per line** where possible; the hard limit is 70 characters per line (§4.2).
- German dub: translate Marcus's constraints formally (*"Mein Team arbeitet in SQL"*); keep product names in English (Databricks, Snowflake, dbt, Power BI, MLflow).
- On-screen text (*What would YOU design?*) must match spoken language per track.

---

## Appendix — B-Roll & Asset Checklist

| Asset | Used in modules | Source |
|-------|-----------------|--------|
| NYC skyline / dispatch center | 0 | AI-generated keyframe |
| Excel chaos screens | 0 | Screenshot or mock |
| Character avatar cards (×6) | 0, 1 | LibTV turnaround (Section 2.3) |
| Medallion layer diagram | 0, 1 | AI-generated keyframe |
| Empty Power BI wireframe | 0, 1 | `powerbi/` export |
| Power BI Overview (partial) | 2 | `powerbi/` screenshot |
| Power BI Map page | 3 | `powerbi/` screenshot |
| Power BI Revenue + quality | 4 | `powerbi/` screenshot |
| Power BI full five pages | 7 | `powerbi/` screenshot |
| Databricks notebook UI mock | 2, 9 | Screenshot or mock |
| Snowflake worksheet mock | 3, 9 | Screenshot or mock |
| dbt lineage graph | 4, 9 | `dbt docs generate` output |
| Workflow / Task / CI icons | 5 | AI-generated or logo set |
| AI assistant UI mock | 6 | AI-generated keyframe |
| Kafka topic animation | 8 | AI-generated keyframe |
| Power BI DirectQuery refresh | 8 | `powerbi/` screenshot |
| ML lifecycle diagram | 9 | AI-generated keyframe |
| MLflow experiment UI mock | 9 | Screenshot or mock |
| Feature importance chart | 9 | AI-generated keyframe |
| Actual vs predicted scatter | 9 | AI-generated keyframe |
| **Logos** (Databricks, Snowflake, dbt, etc.) | All | [vectorlogo.zone](https://www.vectorlogo.zone/) |

Capture dashboard screenshots from [`powerbi/`](../media/powerbi/) after Gold tables are populated in a reference environment.

---

## Document History

| Date | Change |
|------|--------|
| 2026-05-23 | Initial voiceover scripts for Modules 0–7 |
| 2026-05-23 | Added optional Module 8 streaming voiceover script |
| 2026-05-23 | Added optional Module 9 ML voiceover script; fixed appendix structure |
| 2026-05-23 | Aligned optional prerequisites (2–3 required, 4 recommended) |
| 2026-05-30 | Merged with libtv-production-guide.zh.md: added LibTV CLI commands, model selection (Seedance 2.0 VIP + nebula-ultra), logo sources (vectorlogo.zone), per-scene production prompts |
| 2026-05-30 | Replaced manual web UI workflow with `libtv` CLI commands throughout |
| 2026-05-30 | Renamed from `animation-voiceover-scripts.md` to `animation-production-scripts.md`; merged character bible (CAST), three-view turnaround prompts, script binding template, expression/pose variants, lower-third labels, character QA checklist, and Chinese style prompt from `libtv-production-guide.zh.md` §4.2 |
| 2026-05-31 | Harmonized character module ranges with voiceover scripts (Elena/Bob/Priya extended to 8–9; Sofia added 3, 8–9; James corrected to 0–1, 4, 6, 8–9; Marcus corrected to 0, 3–8); updated Expression & Pose Variants table |
| 2026-05-31 | Female cast differentiation: Elena loose chestnut hair + blazer; Sofia ponytail + zip polo; Priya half-up hair + green top; negativeVisual in characters.json |
| 2026-05-31 | Editorial shot plan: split dialogue/monologue beats per module; B-roll holds after every major VO; `narrativeBeat` + `shotStrategy` in pipeline configs; [module-shot-plan.md](module-shot-plan.md) auto-generated; §3.0 narrative vs production shots |
| 2026-05-31 | §2.8 `produce-module.js`: config script rows, HTTP batch API upload (bypass ENAMETOOLONG), storyboard image2image bind; §2.7/§5.4 warn against `script --run` (10023) and `rows=@file`; model table clarifies archive vs production rows |
| 2026-05-31 | §2.4 + scene prompts: retire “Snowflake training” as *visual* style; align with MHP blue-white + American animation + per-character outfitColors (`characters.json`, `training-style.js`) |
