This file is the shared source of truth for AI agents working in Spellbook. Claude Code reads it through CLAUDE.md; opencode and Cursor read it natively.
Spellbook is Dune Analytics' open-source interpretation layer for blockchain data. It is a dbt monorepo that transforms raw blockchain data into clean, usable datasets using SQL and Jinja on DuneSQL, which is Trino-based.
The repo is organized as independent dbt sub-projects under dbt_subprojects/:
daily_spellbook- Default location for new spells, refreshed dailyhourly_spellbook- Promoted higher-frequency spells, Dune-team manageddex- DEX and DEX aggregator data, includingdex.tradesnft- NFT-related modelssolana- Solana-specific modelstokens- Token metadata, transfers, balances, and prices
Shared resources live at the repo root:
sources/- Shared source YAML files for raw table definitionsdbt_macros/- Shared macros such asexpose_spells,optimize_spell,mark_as_spell, andincremental_predicatescripts/- Development utilities, including query runners and source generatorsdocs/- Documentation for models, tests, seeds, macros, CI, and best practices.claude/skills/- Shared task-specific skills for Claude Code, opencode, and Cursor
Start with these docs before changing models:
docs/general/repo_navigation.md- GitHub issues, PRs, discussions, and Actionsdocs/general/best_practices.md- Development workflow, performance, and incremental model guidancedocs/models/model_overview.md- Model layout, naming, materialization, and schema YAML expectationsdocs/models/model_config_block.md- Required dbt model config patternsdocs/tests/test_overview.md- Required model, seed, and generic test patternsdocs/ci_test/ci_test_overview.md- How Spellbook PR CI runs and where to find test tablesdocs/general/faq_and_common_issues.md- Common contributor and CI problems
Set up the locked Python environment from the repo root:
uv sync --lockedRun dbt commands through uv run. Commands that operate on a sub-project must target or run from that sub-project:
uv run dbt deps --project-dir dbt_subprojects/<subproject>/
uv run dbt compile --project-dir dbt_subprojects/<subproject>/
uv run dbt compile --select <model_name> --project-dir dbt_subprojects/<subproject>/
uv run dbt test --select <model_name> --project-dir dbt_subprojects/<subproject>/To compile and run a model against Dune API, use the query helper from the repo root:
uv run python scripts/dune_query.py "@model_name" --limit 100
uv run python scripts/dune_query.py "SELECT * FROM dex.trades LIMIT 10"
uv run python scripts/dune_query.py --sql-file query.sqlscripts/dune_query.py requires DUNE_API_KEY in .env and auto-detects the sub-project for @model_name syntax.
- Start with a working query in the Dune app.
- Convert the query to a dbt model with the correct materialization.
- Compile locally with dbt and inspect or test the compiled SQL on Dune.
- Use short hardcoded date filters for initial development and CI iteration on large sources.
- Submit a PR and use CI tables for QA.
- Revert temporary hardcoded date filters before merge; production models should use normal incremental logic.
Always use source() and ref() for table references. Do not hardcode upstream table names in model SQL.
Model paths depend on spell type:
- Dedicated sector sub-project:
dbt_subprojects/<project>/models/<alias>/<blockchain>/platforms/ - Sector-level in daily/hourly:
dbt_subprojects/<project>/models/_sector/<sector>/<alias>/<blockchain>/ - Project-level:
dbt_subprojects/<project>/models/_project/<project_name>/<blockchain>/
Model files are named with the pattern <project_version_blockchain_alias.sql>, for example uniswap_v3_arbitrum_base_trades.sql. The file name is the dbt model name.
Every model SQL file must have a corresponding _schema.yml entry in the same directory.
All models must explicitly declare:
schema- Dune app schema namealias- Dune app table or view namematerialized-view,table, orincremental
Additional materialization requirements:
tablerequiresfile_format='delta'.incrementalrequiresfile_format='delta',incremental_strategy, andunique_key.- If a model is partitioned, include partition columns in
unique_key. - Use
incremental_predicate()forincremental_predicates; do not hardcode equivalent predicates. - Use
incremental_predicate()only for time-series data. Omit it when full-history lookups are required, such as pool creation events.
Config-only changes that affect physical tables may not trigger CI because CI selects with state:modified.body and state:modified.macros. If the SQL body otherwise does not change, add or bump a body stamp comment for changes to materialized, partition_by, incremental_strategy, file_format, schema, or alias:
-- ci-stamp: 1- Unique key columns must not contain NULL values. Trino NULL merge lookups fail silently and can create duplicates.
- Use
coalesce()ordbt_utils.generate_surrogate_key()when key columns can be NULL. - Keep unique key columns consistent across the model config,
_schema.ymltests, and seed match columns. - Use trailing Jinja whitespace control only, for example
{% if is_incremental() -%},{% else -%},{% endif -%}. - Apply incremental filters to all time-series sources in the query.
- The non-incremental path should filter from the earliest date of activity.
- Use
varbinaryfor addresses and hashes. Hex literals should be unquoted, for example0x039e..., not'0x039e...'. - Use
uint256andint256for large numbers. block_dateisDATE;block_timeisTIMESTAMP.- Prefer native bytearray and numeric functions over casts.
- Never use
select *on large tables. - Filter partition columns such as
block_date,block_time, andblockchainwhere applicable. - Put larger tables on the left side of joins when manually tuning performance.
- Include time filters in both
onandwhereclauses when joining on partition columns. - Use
union allunless deduplication is required. - Do not use
order bywithoutlimiton large result sets. - Only partition tables when each partition is expected to hold roughly 1M+ rows.
The standard sector lineage pattern is:
- Platform base spells per project, version, and chain. These should contain raw data and are commonly incremental.
- Chain-level union models that union platform spells on one chain. Materialize as tables to isolate chains.
- Cross-chain union models that union chain-level spells. Materialize as views.
- Final sector spells that enrich the cross-chain base with metadata.
Use macros for repeated logic across chains or forked protocols. Keep upstream base models simple and save metadata enrichment for downstream models.
Use dbt 1.10+ generic test syntax. For tests with inputs, put inputs under arguments: and configs under config:.
Required for materialized models:
dbt_utils.unique_combination_of_columnson the exact unique key columns from the model config.not_nullon each unique key column.- Model descriptions and key column descriptions.
Use data_tests: for new or edited model-level tests. Do not use deprecated test argument syntax, and do not introduce misspelled keys such as cdata_tests.
Example:
data_tests:
- dbt_utils.unique_combination_of_columns:
arguments:
combination_of_columns:
- block_date
- tx_hash
- evt_indexSeeds are required for sector-level spells and encouraged for standalone project spells.
- Register seeds in the directory
_schema.ymlwith proper column data types. - Include all unique key columns and the fields being tested.
- Keep seeds small; a handful of representative rows is enough.
- Seed unique key columns must exactly match the model
unique_key. - Seed column types must match the DuneSQL types produced by the model.
- CI runs
dbt seedfor modified seeds before comparing model output duringdbt test.
PRs trigger per-sub-project GitHub Actions such as daily_spellbook.yml, dex.yml, hourly_spellbook.yml, nft.yml, solana.yml, and tokens.yml. These call the shared reusable workflow in .github/workflows/dbt_run.yml.
CI builds and tests only modified models where possible. Test tables are visible on Dune for about 24 hours. Current CI schemas use the pattern:
dune.dune_spellbook_ci__tmp_pr<PR>_<run_id>_<attempt>.<model_name>Older CI runs may use legacy tables like:
dune.test_schema.git_dunesql_<GIT_HASH>_<schema>_<alias>A green CI check is necessary but not sufficient. Query CI tables to verify data quality, especially for model logic changes.
Use .claude/skills/<name>/SKILL.md for shared task-specific workflows. This path is read by Claude Code, opencode, and Cursor. Do not add shared instructions to .cursor/rules; promote always-on conventions to this file and procedural workflows to skills.