Use Case — YellowLine NYC
Fictional yellow-taxi operator · NYC TLC-inspired data
Story Animation
Client
YellowLine NYC — a fictional NYC yellow-taxi operator (inspired by TLC public trip data, not a real company). They operate across all five boroughs and compete with ride-hail on price, wait time, and driver earnings.
Business challenge
Trip volume is stable, but revenue per mile and driver utilization vary by hour, borough, and route. Leadership suspects they are:
- Over-supplying taxis in low-demand zones at the wrong times
- Under-pricing or mis-allocating fleet during peak windows
- Losing visibility because reports are built manually from spreadsheets
They need an analytics platform their SQL-heavy internal team can maintain after MHP leaves.
Success criteria
| # | Criterion | How the training proves it |
|---|---|---|
| 1 | Ingest TLC trip data reliably | Bronze layer in all three tool tracks |
| 2 | Clean, enriched trip records | Silver with quality rules + zone lookup |
| 3 | Answer 12 operational KPI questions | Gold layer (identical KPIs across platforms) |
| 4 | Executive dashboard | Power BI on Gold tables |
| 5 | Team can extend pipelines | Snowflake SQL + dbt maintainability story |
| 6 | Documented lineage and quality | dbt docs, tests, lineage graph |
| 7 | Informed platform choice | Module 7 open discussion |
Data source
Azure ADLS2: mhpdeworkshopsa / nyc-taxi-data
├── raw/trips/ → yellow_tripdata_*.parquet
└── raw/lookup/ → taxi_zone_lookup.csv
Technical detail: Architecture · Data model
Priya’s five KPI questions
- When are our peak revenue hours?
- Which pickup zones drive the most trips?
- Which routes are most popular?
- How does revenue vary by borough and distance band?
- Are we losing money on bad data (null fares, zero-distance trips)?
All five map to the 12 Gold tables in the data model reference.
The Three Constraints
Marcus will judge every recommendation against three orthogonal dimensions. Keep them in your head from the story introduction to Module 7 — Elena revisits them in the wrap-up.
| Constraint | Marcus’s question | Where it lands |
|---|---|---|
| Cost | Can YellowLine NYC afford this in year 3? | Module 7 Round 2 Card B |
| Performance | Fast enough for live dispatch later? | Module 8 streaming |
| Compliance | Audit-ready lineage by Q3? | Module 4 dbt; Module 7 Card D |
By end of day you should answer all three — not just “which tool is best?”