Running Models¶
There are three runners that drive dbt-cerebro models in production, plus the plain dbt run command for ad-hoc work. This page walks through each one with concrete examples.
If you haven't read Incremental Strategies yet, do so first — it explains why the same model behaves differently across these runners.
TL;DR¶
| Goal | Command |
|---|---|
| Run a single model right now (default daily-incremental behavior) | docker exec dbt dbt run --select <model> --project-dir /app --profiles-dir /app |
| Bring an annotated incremental model up to today via per-day slices | docker exec dbt python /app/scripts/refresh/dbt_incremental_runner.py --select <model> |
| Backfill a model month-by-month from its declared start date | docker exec dbt python /app/scripts/full_refresh/refresh.py --select <model> |
| Recover after the prices source skipped a day | docker exec dbt /app/scripts/maintenance/refill_after_price_gap.sh --from-date YYYY-MM-DD |
Mode 1 — Daily cron (run_dbt_observability.sh)¶
The production cron runs scripts/run_dbt_observability.sh, which delegates each batch to scripts/refresh/dbt_incremental_runner.py. For models with no microbatch annotation, the runner falls back to plain dbt run --select <batch> with no vars set — that triggers the daily incremental mode (delete+insert over the macro's lookback).
You don't normally invoke this manually. To reproduce a single batch the cron would run:
docker exec dbt dbt run \
--select int_execution_blocks_daily \
--project-dir /app --profiles-dir /app
What ClickHouse sees:
INSERT INTO dbt.int_execution_blocks_daily SELECT ...
WHERE toStartOfMonth(toDate(block_timestamp)) >= ( -- start of the month containing max(date)
SELECT toStartOfMonth(addDays(max(toDate(x1.date)), 0)) FROM dbt.int_execution_blocks_daily AS x1
)
AND toDate(block_timestamp) >= (
SELECT addDays(max(toDate(x2.date)), 0) FROM dbt.int_execution_blocks_daily AS x2
);
ALTER TABLE dbt.int_execution_blocks_daily DELETE WHERE date IN (...inserted dates...);
The mutation deletes the previous version of the rewritten dates so that ReplacingMergeTree (RMT) doesn't have to dedupe. This is cheap on small daily windows but expensive on large ones, which is why the next two modes exist.
Mode 2 — Microbatch runner (dbt_incremental_runner.py)¶
For models tagged microbatch (and configured in meta.full_refresh.incremental in their schema.yml), the runner slices the gap between max(target_date) and today into per-day windows.
docker exec dbt python /app/scripts/refresh/dbt_incremental_runner.py \
--select int_consensus_validators_income_daily \
--project-dir /app --profiles-dir /app
Internally the runner emits one dbt run per slice, with incremental_end_date set:
dbt run --select int_consensus_validators_income_daily \
--vars '{"incremental_end_date": "2026-04-21", "validator_index_start": 0, "validator_index_end": 100000}'
dbt run --select int_consensus_validators_income_daily \
--vars '{"incremental_end_date": "2026-04-22", "validator_index_start": 0, "validator_index_end": 100000}'
...
Two important effects:
- Strategy flips to append because the strategy expression is ('append' if (start_month or incremental_end_date) else 'delete+insert'). No mutation per slice.
- apply_monthly_incremental_filter takes its no-overlap branch: WHERE date > max(target_date) AND date <= incremental_end_date. Re-running the same slice writes nothing (idempotent).
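The slicing and the strategy switch described above can be sketched in a few lines. This is an illustrative sketch, not the runner's actual code: the function names `pick_strategy` and `daily_slices` are hypothetical, and it assumes the last slice ends on today (inclusive).

```python
from datetime import date, timedelta
from typing import Iterator, Optional


def pick_strategy(start_month: Optional[str], incremental_end_date: Optional[str]) -> str:
    """Mirror the strategy expression quoted in the text."""
    return "append" if (start_month or incremental_end_date) else "delete+insert"


def daily_slices(max_target_date: date, today: date) -> Iterator[str]:
    """Yield one incremental_end_date per day in (max_target_date, today].

    Assumption: today itself is included as the final slice.
    """
    day = max_target_date + timedelta(days=1)
    while day <= today:
        yield day.isoformat()
        day += timedelta(days=1)


if __name__ == "__main__":
    for end_date in daily_slices(date(2026, 4, 20), date(2026, 4, 22)):
        print(end_date, pick_strategy(None, end_date))
```

Because each slice only selects rows with date > max(target_date), a slice whose end date is already covered selects nothing, which is where the idempotence comes from.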
Common runner flags¶
| Flag | Effect |
|---|---|
| --select <selector> | Same syntax as dbt run; supports +, tag:, path: |
| --max-end-date YYYY-MM-DD | Cap slicing at this date (otherwise stops at today) |
| --max-slices-per-stage N | Refuse if the gap is longer than N days (default 30); large gaps should go through Mode 3 |
| --dry-run | Print the slice plan, no DB writes |
| --resume | Skip slices already completed in target/incremental_microbatch_state.json |
When the runner refuses¶
The runner refuses to start when the gap is longer than --max-slices-per-stage days (default 30). The microbatch path is for daily catch-up, not historical backfill. For long gaps, run Mode 3 once to fill the hole; microbatch then resumes naturally on the next cron tick.
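As a rough illustration of that guard, the check amounts to comparing the gap in days against the slice limit. The function name `check_gap` and the error message are hypothetical; only the 30-day default comes from the flag table above.

```python
from datetime import date


def check_gap(max_target_date: date, end_date: date, max_slices: int = 30) -> int:
    """Return the number of daily slices needed, or raise if the gap is too long."""
    gap_days = max((end_date - max_target_date).days, 0)
    if gap_days > max_slices:
        # Hypothetical message; the real runner's wording may differ.
        raise ValueError(
            f"gap of {gap_days} days exceeds --max-slices-per-stage={max_slices}; "
            "use the full-refresh runner (Mode 3) instead"
        )
    return gap_days


if __name__ == "__main__":
    print(check_gap(date(2026, 4, 20), date(2026, 4, 22)))
```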
Mode 3 — Full-refresh batched (full_refresh/refresh.py)¶
For historical backfill, monthly batches are the right granularity. Each model declares its history under meta.full_refresh in schema.yml:
- name: int_consensus_validators_income_daily
  meta:
    full_refresh:
      start_date: "2021-12-01"
      batch_months: 1
      stages:
        - name: validators_0_100k
          start_date: "2021-12-01"
          vars: { validator_index_start: 0, validator_index_end: 100000 }
        - name: validators_100k_200k
          ...
Run the whole annotated history:
docker exec dbt python /app/scripts/full_refresh/refresh.py \
--select int_consensus_validators_income_daily \
--project-dir /app --profiles-dir /app
The runner iterates (stage × month) and emits one dbt run per batch with start_month / end_month and the stage's vars. Strategy flips to append.
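The (stage × month) iteration can be pictured as a nested loop over the annotated stages. The sketch below is an assumption-laden illustration rather than the runner's code: `month_starts` and `plan_batches` are hypothetical names, batch_months is taken to be 1, and the loop order (stage outer, month inner) is a guess.

```python
from datetime import date
from typing import Dict, Iterator, List


def month_starts(start: date, end: date) -> Iterator[date]:
    """Yield the first day of each month from start's month through end's month."""
    cur = date(start.year, start.month, 1)
    while cur <= end:
        yield cur
        # Roll over to January of the next year after December.
        cur = date(cur.year + 1, 1, 1) if cur.month == 12 else date(cur.year, cur.month + 1, 1)


def plan_batches(stages: List[dict], end: date) -> List[Dict[str, object]]:
    """Return one entry per (stage, month), assuming batch_months == 1."""
    plan = []
    for stage in stages:  # assumed order: stage outer, month inner
        first = date.fromisoformat(stage["start_date"])
        for month in month_starts(first, end):
            batch_vars = {"start_month": month.isoformat(), "end_month": month.isoformat()}
            batch_vars.update(stage.get("vars", {}))  # the stage's own vars ride along
            plan.append({"stage": stage["name"], "vars": batch_vars})
    return plan


if __name__ == "__main__":
    stages = [{
        "name": "validators_0_100k",
        "start_date": "2021-12-01",
        "vars": {"validator_index_start": 0, "validator_index_end": 100000},
    }]
    for batch in plan_batches(stages, date(2022, 2, 15)):
        print(batch)
```

Each entry's vars would be passed to one dbt run via --vars; since start_month is set, the strategy expression resolves to append.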
Resume on failure¶
Each completed batch is written to target/full_refresh_state.json. Re-invoking the same command picks up where it left off.
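One plausible way to implement this kind of resume behavior is a small JSON file of completed batch keys. The layout below (a "completed" list of "stage:month" strings) and the function names are hypothetical; the real schema of target/full_refresh_state.json may differ.

```python
import json
from pathlib import Path
from typing import Set


def load_completed(state_path: Path) -> Set[str]:
    """Return the set of batch keys already recorded, or an empty set."""
    if not state_path.exists():
        return set()
    return set(json.loads(state_path.read_text())["completed"])


def mark_completed(state_path: Path, batch_key: str) -> None:
    """Record one finished batch, writing via a temp file and rename."""
    done = load_completed(state_path)
    done.add(batch_key)
    state_path.parent.mkdir(parents=True, exist_ok=True)
    tmp = state_path.with_name(state_path.name + ".tmp")
    tmp.write_text(json.dumps({"completed": sorted(done)}, indent=2))
    tmp.replace(state_path)  # rename so a crash mid-write keeps the old file


def pending(all_keys, state_path: Path):
    """Filter a batch plan down to the keys not yet completed."""
    done = load_completed(state_path)
    return [k for k in all_keys if k not in done]
```

Writing the state after every batch, rather than once at the end, is what lets a re-invocation skip straight to the first unfinished batch.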
Transient retries¶
ClickHouse errors with Code: 241, 159, 209, or 210, as well as MEMORY_LIMIT_EXCEEDED and OvercommitTracker errors, are retried automatically with exponential backoff (30s → 60s → 120s → 240s → 480s, max 5 attempts). Logs show [transient] retry n/5 in Ns between attempts.
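The retry policy can be sketched as follows. The names are hypothetical and transient errors are detected by simple substring matching, which is an assumption about how the runner classifies them. It also reads "max 5 attempts" as one initial try plus up to five retries, matching the five listed delays; the real runner may count differently.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")

# Markers taken from the text above; substring matching is an assumption.
TRANSIENT_MARKERS = (
    "Code: 241", "Code: 159", "Code: 209", "Code: 210",
    "MEMORY_LIMIT_EXCEEDED", "OvercommitTracker",
)


def is_transient(message: str) -> bool:
    return any(marker in message for marker in TRANSIENT_MARKERS)


def backoff_delays(base: int = 30, retries: int = 5) -> List[int]:
    """Doubling delays in seconds, starting at base."""
    return [base * 2 ** i for i in range(retries)]


def run_with_retry(fn: Callable[[], T], sleep: Callable[[float], None] = time.sleep,
                   base: int = 30, retries: int = 5) -> T:
    """Call fn, retrying transient RuntimeErrors with exponential backoff."""
    for n, delay in enumerate(backoff_delays(base, retries), start=1):
        try:
            return fn()
        except RuntimeError as exc:
            if not is_transient(str(exc)):
                raise  # non-transient errors fail immediately
            print(f"[transient] retry {n}/{retries} in {delay}s")
            sleep(delay)
    return fn()  # last try; any error now propagates to the caller
```

Passing sleep as a parameter keeps the policy testable without actually waiting.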
Restricting to a subset¶
# A single month
--start-date 2024-04-01 --end-date 2024-04-30
# A single stage
--stages validators_0_100k
# A single (stage, month)
--stages validators_0_100k --start-date 2024-04-01 --end-date 2024-04-30
Mode 4 — Refill recovery (refill_after_price_gap.sh)¶
Use this mode for incidents: a Dune prices skip, an upstream backfill, or a model that drifted out of sync. The script is two-phase by design:
| Phase | What | How |
|---|---|---|
| 1 | For each affected month, append-rewrite every tag:refill_append model in DAG order, then OPTIMIZE PARTITION '<month>' FINAL DEDUPLICATE each one | dbt run --select tag:refill_append --vars '{"start_month":"<m>","end_month":"<m>"}', then a dbt run-operation optimize_partition_final per model |
| 2 | Re-pull every prices descendant not in Phase 1 with the wider lookback | dbt run --select int_execution_token_prices_daily+ --exclude tag:refill_append --vars '{"price_lookback_days": 12}' |
See Recovering from a Prices Gap for the full walk-through.
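To make the ordering of the two phases concrete, the sketch below builds the sequence of dbt invocations from the table above as argument lists. It is not the shell script's implementation: `refill_plan` is a hypothetical helper, the list of refill_append models is passed in by hand, and the partition value is assumed to be the month string.

```python
import json
from typing import List


def refill_plan(months: List[str], refill_models: List[str],
                price_lookback_days: int = 12) -> List[List[str]]:
    """Build the Phase 1 and Phase 2 commands in execution order."""
    plan: List[List[str]] = []
    # Phase 1: per month, append-rewrite the tagged models, then optimize each one.
    for month in months:
        month_vars = json.dumps({"start_month": month, "end_month": month})
        plan.append(["dbt", "run", "--select", "tag:refill_append", "--vars", month_vars])
        for model in refill_models:
            args = json.dumps({"database": "dbt", "table_name": model, "partition": month})
            plan.append(["dbt", "run-operation", "optimize_partition_final", "--args", args])
    # Phase 2: re-pull the remaining prices descendants with the wider lookback.
    plan.append([
        "dbt", "run",
        "--select", "int_execution_token_prices_daily+",
        "--exclude", "tag:refill_append",
        "--vars", json.dumps({"price_lookback_days": price_lookback_days}),
    ])
    return plan


if __name__ == "__main__":
    for cmd in refill_plan(["2026-04-01"], ["int_execution_tokens_balances_daily"]):
        print(" ".join(cmd))
```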
Plain dbt run recipes¶
For ad-hoc work outside the runners.
A single model with default daily behavior¶
docker exec dbt dbt run \
--select int_execution_blocks_daily \
--project-dir /app --profiles-dir /app
A model and everything downstream of it¶
docker exec dbt dbt run \
--select int_execution_token_prices_daily+ \
--project-dir /app --profiles-dir /app
Full-refresh of a single incremental model (drops & rebuilds)¶
Avoid for large incremental tables — prefer Mode 3, which batches the rebuild into months.
docker exec dbt dbt run \
--select int_execution_blocks_daily --full-refresh \
--project-dir /app --profiles-dir /app
Compile only (debug the rendered SQL without writing)¶
docker exec dbt dbt compile \
--select int_execution_tokens_balances_daily \
--vars '{"price_lookback_days": 12}' \
--project-dir /app --profiles-dir /app
# Inspect target/compiled/.../<model>.sql
Append-mode rewrite of a single month (the Phase-1 primitive)¶
docker exec dbt dbt run \
--select int_execution_tokens_balances_daily \
--vars '{"start_month": "2026-04-01", "end_month": "2026-04-01"}' \
--project-dir /app --profiles-dir /app
docker exec dbt dbt run-operation optimize_partition_final \
--args '{database: dbt, table_name: int_execution_tokens_balances_daily, partition: "2026-04-01"}' \
--project-dir /app --profiles-dir /app
The OPTIMIZE collapses any duplicates RMT would otherwise merge lazily. Skip it if you don't need immediate convergence — background merges will eventually do the same work.