MHP

Databricks Personas – Spark vs. SQL Users

Advanced User (Spark-centric)
Primary Language: PySpark / Scala
Typical Work: Complex pipelines, streaming, ML
Output: Reusable libraries, high-scale data flows
Best fit: Flexibility and performance for complex needs

Basic User (SQL-centric)
Primary Language: SQL
Typical Work: Clean tables for reporting, aggregations
Output: BI-ready datasets and data marts
Best fit: Repeatable, governed SQL transformations

dbt bridges the gap: it gives SQL users the same engineering rigor as Spark users (version control, testing, modular code)
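
As a rough illustration of what "engineering rigor" can mean for a SQL-centric workflow, the sketch below pairs a SQL transformation with automated checks that return failing rows, which is the general idea behind dbt's `not_null` and `unique` tests. It is not dbt code: it uses Python's built-in `sqlite3` as a stand-in for a warehouse, and the table names (`raw_orders`, `revenue_by_region`), the sample rows, and the `failing_rows` helper are all hypothetical.

```python
import sqlite3

# In-memory database as a stand-in for a warehouse (assumption, not Databricks).
conn = sqlite3.connect(":memory:")

# Hypothetical raw data; table and column names are illustrative only.
conn.execute("CREATE TABLE raw_orders (order_id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "EU", 80.0), (3, "US", 200.0)],
)

# The "model": a SQL transformation that builds a BI-ready reporting table.
conn.execute(
    """
    CREATE TABLE revenue_by_region AS
    SELECT region,
           SUM(amount) AS total_revenue,
           COUNT(*)    AS order_count
    FROM raw_orders
    GROUP BY region
    """
)


def failing_rows(connection, sql):
    """Run a check query; an empty result means the check passes."""
    return connection.execute(sql).fetchall()


# Checks in the spirit of not_null and unique tests on the `region` column.
not_null_failures = failing_rows(
    conn, "SELECT * FROM revenue_by_region WHERE region IS NULL"
)
unique_failures = failing_rows(
    conn,
    "SELECT region FROM revenue_by_region GROUP BY region HAVING COUNT(*) > 1",
)

result = conn.execute(
    "SELECT region, total_revenue, order_count "
    "FROM revenue_by_region ORDER BY region"
).fetchall()

print(result)
print("not_null failures:", not_null_failures)
print("unique failures:", unique_failures)
```

In an actual dbt project the transformation would typically live in a SQL model file and the checks would be declared in YAML rather than written by hand; the point here is only that the SQL output is verified automatically instead of by inspection.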