Lab · Snowflake (Codebase, optional)
Module 3 stretch — Snowflake CLI + Snowpark from local terminal
title: “Lab · Snowflake (Codebase, optional)” subtitle: “Module 3 stretch — Snowflake CLI + Snowpark from local terminal” —
The required lab is Exercise: Snowflake: A.1 (Snowsight SQL) and B.1 (Workspaces Notebook) — both run entirely inside Snowsight.
This optional page covers the codebase alternatives that run from a local terminal (Codespaces or your machine). These paths are common in production CI/CD environments.
Module 3 does not require this page. Try it if you have extra time after completing the main lab.
Module: 3 — Snowflake Pipeline · Required lab: Exercise: Snowflake · Setup: Snowflake
When to use each path
| A.1 / B.1 — Snowsight ✅ | A.2 / B.2 — Codebase | |
|---|---|---|
| Where code runs | Inside Snowflake (browser) | Your machine / Codespaces |
| How you connect | Git workspace + Snowflake session | .env credentials |
| Setup needed | Git sync only | .env + pip install (or snow CLI pre-installed) |
| Best for | Interactive development, workshop labs | CI/CD pipelines, automation, terminal-first workflows |
| When in practice | Module 3 labs, debugging, ad-hoc queries | Production deployments, GitHub Actions, scheduled jobs |
- A.1 (Snowsight SQL) runs on a warehouse — same SQL engine as A.2.
- B.1 (Workspaces Notebook) runs on Container Runtime (SPCS) via a compute pool — the notebook kernel executes inside Snowflake.
- A.2 (Snowflake CLI) sends SQL to a warehouse via the
snowCLI from your terminal. - B.2 (Snowpark Local) uses the Snowpark client library, which translates DataFrame operations to SQL and sends them to a warehouse. No compute pool needed.
Key insight: Snowsight paths (A.1, B.1) are managed by Snowflake — no credentials on your machine. Codebase paths (A.2, B.2) require .env with your Snowflake credentials — more setup, but full control for automation.
Prerequisites
- Completed the main Snowflake lab (Bronze, Silver, Gold tables exist)
- Role
DE_WORKSHOP_ROLE, warehouseDE_WORKSHOP_WHStarted
Configure .env
Both codebase paths need a local .env. Snowsight paths (A.1, B.1) do not — credentials are managed by Snowflake.
Create
.envfrom the template:cp .env.template .envcopy .env.template .envcp .env.template .envEdit
.env— in Codespaces: click.envin the Explorer sidebar (left) to open it in the built-in editor — fill in these values:Variable Value Where to find it ATTENDEE_IDe.g. de_01_aliceAssigned by trainer — see My Workshop SNOWFLAKE_ACCOUNTe.g. el30551.west-europe.azureSnowsight URL: …/west-europe.azure/el30551/…→ copy the two segments (nohttps://, no.snowflakecomputing.com)SNOWFLAKE_USERyour Snowflake username Trial signup email SNOWFLAKE_PASSWORDyour Snowflake password Trial signup Leave defaults for:
SNOWFLAKE_WAREHOUSE=DE_WORKSHOP_WH,SNOWFLAKE_DATABASE=DE_MASTERCLASS,SNOWFLAKE_ROLE=DE_WORKSHOP_ROLE.Load into your terminal:
source .envGet-Content .env | ForEach-Object { if ($_ -match '^([^#][^=]+)=(.*)$') { [System.Environment]::SetEnvironmentVariable($matches[1], $matches[2]) } }
You can skip Databricks variables (
DATABRICKS_HOST,DATABRICKS_TOKEN,DATABRICKS_HTTP_PATH) — they are only needed for Module 4 (dbt).
.env powers each codebase path
- A.2 (Snowflake CLI):
setup-environment.shreads.envand generates~/.snowflake/config.toml— the CLI picks up connection details automatically. - B.2 (Snowpark):
_snowpark_bootstrap.pyreads.envdirectly to create a Snowpark session.
A.2 — Snowflake CLI
The devcontainer has the Snowflake CLI (snow) pre-installed. After configuring .env, run each .sql file from the terminal:
snow sql -f snowflake/sql/bronze/01_ingest_trips.sql -c default
snow sql -f snowflake/sql/silver/01_create_cleaned.sql -c default
snow sql -f snowflake/sql/silver/02_create_enriched.sql -c default
snow sql -f snowflake/sql/gold/01_create_kpis.sql -c defaultSet attendee_id in the SQL file’s SET statement before running (replace de_XX_yourname with your actual ID), or pass it as a CLI variable:
snow sql -f snowflake/sql/bronze/01_ingest_trips.sql -c default -D "attendee_id=$ATTENDEE_ID"B.2 — Snowpark (Local Terminal)
The Python process runs locally; DataFrame operations are pushed down to the Snowflake warehouse as SQL. Connection details come from snowflake/snowpark/_snowpark_bootstrap.py via .env environment variables (setup).
All packages are pre-installed from pyproject.toml. Skip to running the scripts:
source .env
python snowflake/snowpark/00_snowpark_setup.py # optional — verify connectivity
python snowflake/snowpark/01_bronze_ingestion.py
python snowflake/snowpark/02_silver_cleaning.py
python snowflake/snowpark/03_gold_kpis.pyInstall packages first, then run:
uv pip install --system .
# Load .env (see step 3 above)
python snowflake/snowpark/00_snowpark_setup.py # optional — verify connectivity
python snowflake/snowpark/01_bronze_ingestion.py
python snowflake/snowpark/02_silver_cleaning.py
python snowflake/snowpark/03_gold_kpis.pyuv pip install --system .
source .env
python snowflake/snowpark/00_snowpark_setup.py # optional — verify connectivity
python snowflake/snowpark/01_bronze_ingestion.py
python snowflake/snowpark/02_silver_cleaning.py
python snowflake/snowpark/03_gold_kpis.py- Run
01_bronze_ingestion.py— look for[PASS]on all Bronze verification lines - Run
02_silver_cleaning.py - Run
03_gold_kpis.py— confirmkpi_top_pickup_zonesshows 20 rows and all 12kpi_*tables exist
Expected Results
Same as the main lab — Bronze ~3M rows, Silver ~2.5-2.8M, Gold KPIs identical.
Cleanup
Suspend your warehouse when finished:
ALTER WAREHOUSE DE_WORKSHOP_WH SUSPEND;No compute pool cleanup needed — codebase paths (A.2, B.2) use the warehouse only, not SPCS.