docs: add custom data loader guide #194

Merged: warren618 merged 1 commit into HKUDS:main from mvanhorn:docs/178-custom-data-loader-guide on Jun 9, 2026
mvanhorn (Contributor) commented Jun 9, 2026

Summary

Adds agent/backtest/loaders/README.md, a guide for registering a custom
historical OHLCV data loader, and a one-line pointer to it from
CONTRIBUTING.md. It documents the loader contract end to end: the
DataLoaderProtocol shape, the @register decorator, the registry / config
wiring, and how to select the loader via source="<your loader>". This helps
contributors who want to feed the backtest data layer from a data source the
project ships no connector for.

Why

Issue #178 asks how to feed a custom data source into the backtest data layer.
The registration recipe currently lives only in a maintainer reply, not in the
repo — so a new contributor has no in-tree path to follow. This guide closes
that gap and also documents two steps that are easy to miss: adding the module
to the lazy import list so @register fires, and adding the source name to the
config schema's allowlist so config.json validation accepts it.
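
To illustrate why the lazy import list matters, here is a minimal sketch, not the project's code. It writes two throwaway modules (`demo_registry` and `demo_my_loader`, both hypothetical names) to a temp directory and shows that a class decorated with a stand-in `register` only lands in the registry once something imports its module. The `_loader_modules` and `_ensure_registered` names mirror the ones mentioned above, but their bodies here are assumptions.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Everything below is a hypothetical stand-in: two throwaway modules written
# to a temp directory so the import-time behaviour can be observed in isolation.
workdir = Path(tempfile.mkdtemp())

(workdir / "demo_registry.py").write_text(textwrap.dedent('''
    import importlib

    REGISTRY = {}
    _loader_modules = ["demo_my_loader"]  # omit this entry and nothing registers
    _done = False

    def register(cls):
        REGISTRY[cls.name] = cls
        return cls

    def _ensure_registered():
        global _done
        if not _done:
            for module_name in _loader_modules:
                importlib.import_module(module_name)  # runs the @register lines
            _done = True
'''))

(workdir / "demo_my_loader.py").write_text(textwrap.dedent('''
    from demo_registry import register

    @register
    class MyLoader:
        name = "my_source"
'''))

sys.path.insert(0, str(workdir))
demo_registry = importlib.import_module("demo_registry")

before = sorted(demo_registry.REGISTRY)  # loader module not imported yet
demo_registry._ensure_registered()       # imports every listed module
after = sorted(demo_registry.REGISTRY)

print(before, after)  # [] ['my_source']
```

The decorator is ordinary module-level code, so it runs at import time and not before. A loader file that exists on disk but is missing from the import list is never registered.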

Changes

  • New agent/backtest/loaders/README.md:
    • The loader contract (name / markets / requires_auth / is_available()
      / fetch() returning {symbol: DataFrame} with a trade_date index and
      open/high/low/close/volume columns).
    • A 5-step quickstart: create the module, add it to _loader_modules in
      registry.py, add the name to _VALID_SOURCES in runner.py, optionally
      add to FALLBACK_CHAINS, and select via source=.
    • An explicit "out of scope: real-time data" section pointing at the broker
      connectors, plus a checklist.
  • Edit CONTRIBUTING.md: a short "Adding a Custom Data Loader" section
    linking to the new README.

Deliberately out of scope: real-time tick/depth ingestion. There is no
first-class interface for pushing a custom real-time stream today (real-time
flows through the broker connectors), so the guide documents the historical path
only and points readers there.

Test Plan

Documentation only — no code paths change. Every symbol, signature, and path in
the guide was verified against the current source:

  • agent/backtest/loaders/base.py: DataLoaderProtocol and the fetch
    signature / OHLCV return contract; check_budget, retry_with_budget,
    cached_loader_fetch, loader_cache_get/put, validate_date_range.
  • agent/backtest/loaders/registry.py: register, the _loader_modules list
    in _ensure_registered(), FALLBACK_CHAINS, get_loader_cls_with_fallback.
  • agent/backtest/loaders/yfinance_loader.py, okx.py, akshare_loader.py:
    the concrete loader pattern the examples mirror.
  • agent/backtest/runner.py: BacktestConfigSchema._VALID_SOURCES /
    valid_source (the config-validation step) and _get_loader.
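
For readers unfamiliar with fallback chains, the idea behind FALLBACK_CHAINS can be sketched roughly as below. This is an illustration under assumptions, not the behaviour of get_loader_cls_with_fallback: the keying of the chain (source name to ordered alternates), the resolution order, and the `pick_loader`, `PrimaryLoader`, and `BackupLoader` names are all hypothetical. Consult registry.py for the actual semantics.

```python
from typing import Dict, List


class PrimaryLoader:
    name = "primary"

    def is_available(self) -> bool:
        return False  # e.g. credentials missing in this environment


class BackupLoader:
    name = "backup"

    def is_available(self) -> bool:
        return True


# Hypothetical registry and chain shape: source name -> ordered alternates.
REGISTRY: Dict[str, type] = {cls.name: cls for cls in (PrimaryLoader, BackupLoader)}
FALLBACK_CHAINS: Dict[str, List[str]] = {"primary": ["backup"]}


def pick_loader(source: str) -> type:
    """Return the first registered, available loader for `source` or its alternates."""
    for candidate in [source, *FALLBACK_CHAINS.get(source, [])]:
        cls = REGISTRY.get(candidate)
        if cls is not None and cls().is_available():
            return cls
    raise LookupError(f"no available loader for source {source!r}")


chosen = pick_loader("primary")
print(chosen.name)  # backup
```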

No live, broker, MCP, network, secret, or deployment surface is touched.
Rollback: revert the single commit with git revert.

Checklist

  • Docs-only change; no code paths modified.
  • Every symbol/path in the guide verified against the current source files.
  • Commit signed off (git commit -s) per the DCO requirement.

Fixes #178

Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
warren618 force-pushed the docs/178-custom-data-loader-guide branch from 9124a39 to 21a48c1 on June 9, 2026 11:06

warren618 (Collaborator) commented

Thanks @mvanhorn — really clean PR, and I appreciate you verifying every symbol against the source.

One change on the way in: instead of a standalone agent/backtest/loaders/README.md, we folded this into the Detailed Capabilities → Custom Data Source section of the READMEs (all 5 locales), so the people who actually hit this — users wiring up their own data source — find it right next to the existing data-source docs instead of in a separate file. I pushed that reshape to your branch and kept your authorship and sign-off intact. Substance is unchanged; I only:

  • trimmed the parts that restated DataLoaderProtocol (it already lives in base.py, now pointed to as the source of truth),
  • switched the example to the duck-typed shape (@register on a class that satisfies the protocol — no base class to inherit), and
  • noted that the steps edit package source, so this is a from-clone (pip install -e .) workflow.

Merging now — thanks for closing the gap on #178! 🙏

warren618 merged commit 51b317d into HKUDS:main on Jun 9, 2026 (1 check passed)
mvanhorn (Contributor, Author) commented

Thanks for merging the loader guide too, @warren618. Custom data loaders are a lot more approachable now that they are documented.

Development

Linked issue: How do I feed in real-time ticks and depth data from a custom source?