[Bug] Swarm workers can bypass market data loader and misroute continuation runs #198

Description

@BillDin

What happened:

While running the Web UI swarm flow for an NVDA investment committee request, I saw two related issues:

  1. The requested investment_committee swarm completed, but some workers generated their own ad-hoc yfinance scripts instead of using the existing market data loader / get_market_data path. One worker hit a dirty latest daily bar from Yahoo/yfinance: the row had Volume but empty Open/High/Low/Close, which propagated into NaN metrics and non-strict JSON output.
  2. After the committee run completed, the agent launched an additional continuation swarm with a prompt like "Continue and finish report...". Because that continuation prompt did not contain the original preset or ticker context, preset matching fell back to equity_research_team. That run then failed on task-stock with "output contract not met: plan-only stub (no executed analysis / conclusion)", and the aggregate task was blocked.

What I expected:

  • Swarm workers that need OHLCV data should prefer the repository's normalized loader / get_market_data interface instead of generating raw yf.Ticker(...).history(...) / yf.download(...) code for core price bars.
  • Dirty or partial latest bars should be dropped or repaired before metrics/JSON serialization.
  • Follow-up/continuation swarm calls should preserve the original preset and key variables, or avoid launching a new inferred-preset swarm when the prior swarm already completed.

Steps to Reproduce

  1. Start the Web UI/backend with vibe-trading serve.

  2. Submit a prompt similar to:

    [Swarm Team Mode] Use the investment_committee preset to evaluate whether to go long or short on NVDA given current market conditions
    
  3. Inspect the generated swarm runs/artifacts.

  4. In my local run:

    • First swarm: investment_committee, status completed.
    • A worker-generated script used raw yfinance and selected the latest row without validating OHLC fields.
    • The generated CSV contained a latest row like 2026-06-09 ... ,,,,,179602708,... where OHLC fields were empty but volume existed.
    • A second swarm was launched with preset=equity_research_team from a continuation prompt, and failed with plan-only stub on the stock task.

Error Logs

failed continuation swarm:
  preset: equity_research_team
  task-stock: failed
  error: output contract not met: plan-only stub (no executed analysis / conclusion)
  task-aggregate: blocked
  error: Blocked: upstream not completed (task-stock=failed)

raw yfinance artifact symptom:
  latest daily bar had Volume but missing Open/High/Low/Close
  downstream metrics included NaN / null-like values
  JSON was written with Python json.dump default allow_nan=True
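
The last symptom can be reproduced with the standard library alone. The sketch below uses an illustrative metrics dict (the key names are assumptions, not the worker's actual output) to show that the default allow_nan=True writes a bare NaN token, which strict JSON parsers reject, while allow_nan=False fails loudly instead:

```python
import json

# Illustrative metrics; "last_close" stands in for any value derived from an empty OHLC row.
metrics = {"last_close": float("nan"), "volume": 179602708}

# Default (allow_nan=True): the output contains the bare token NaN, which is not valid JSON.
lenient = json.dumps(metrics)

# Strict mode: allow_nan=False raises ValueError instead of writing NaN.
try:
    json.dumps(metrics, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)

print(lenient)
print("strict serialization failed:", strict_error is not None)
```

Failing at write time is arguably preferable here, because it surfaces the dirty bar inside the worker rather than in whatever consumes the artifact.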

Interface

Web UI (vibe-trading serve)

LLM Provider

Configured a local provider via the app; the provider is probably not central to the data-loading issue.

Version

Local checkout: 8b8024a

yfinance: 1.4.1

Environment

Windows / Python 3.13.13

Suspected Root Cause

There appear to be a few gaps working together:

  • The project already has safer loader behavior for OHLCV normalization. For example, agent/backtest/loaders/yfinance_loader.py drops rows missing open/high/low/close, and the loader cache has a staleness guard for ranges ending today.
  • However, swarm workers are prompted to write focused Python scripts and load the yfinance skill. The yfinance skill's quick examples mostly show raw yf.download / yf.Ticker, so workers can bypass the safer loader path.
  • Some presets with the yfinance skill do not expose a local get_market_data tool to workers. get_market_data exists through the MCP server, but it does not seem to be available as a normal swarm worker BaseTool in those preset tool whitelists.
  • The grounding instructions mention calling get_market_data, but grounding only matched suffixed symbols in this case and the worker may not have had the tool anyway. Bare NVDA did not appear to trigger pre-fetched grounded data.
  • Preset auto-routing currently falls back to equity_research_team when no keyword/preset score is found. That is risky for continuation prompts that have lost the original context.
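
To illustrate the routing gap in the last point, here is a minimal sketch of preset-sticky resolution. Everything in it is hypothetical (SwarmContext, PRESET_KEYWORDS, resolve_preset, AmbiguousPresetError); it does not reflect the project's actual router, only the behavior I would expect: reuse the parent run's preset and variables for a continuation, and refuse to guess when there is no context at all.

```python
from dataclasses import dataclass, field
from typing import Optional


class AmbiguousPresetError(ValueError):
    """Raised when no preset can be chosen without guessing."""


@dataclass
class SwarmContext:
    # Hypothetical container for what a continuation needs to inherit.
    preset: str
    variables: dict = field(default_factory=dict)


# Hypothetical keyword table; the real matcher presumably scores more than this.
PRESET_KEYWORDS = {
    "investment_committee": ("investment_committee", "investment committee"),
    "equity_research_team": ("equity_research_team", "equity research"),
}


def resolve_preset(prompt: str, parent: Optional[SwarmContext] = None) -> SwarmContext:
    """Pick a preset from the prompt, else inherit from the parent run, else fail."""
    lowered = prompt.lower()
    for preset, keywords in PRESET_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            inherited = dict(parent.variables) if parent else {}
            return SwarmContext(preset, inherited)
    if parent is not None:
        # Continuation prompt without keywords: stay on the original preset.
        return SwarmContext(parent.preset, dict(parent.variables))
    raise AmbiguousPresetError(f"no preset matched and no parent run given: {prompt!r}")


first = resolve_preset("Use the investment_committee preset to evaluate NVDA")
first.variables["ticker"] = "NVDA"
followup = resolve_preset("Continue and finish report", parent=first)
print(followup.preset, followup.variables)
```

With this shape, the continuation prompt from my run would have stayed on investment_committee with the NVDA variable, and a context-free call would raise instead of silently landing on equity_research_team.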

Possible Fix Directions

I am not sure which direction maintainers prefer, but these seem worth considering:

  1. Add a local get_market_data swarm tool that reuses the same normalized loader logic as the MCP implementation, then include it in market-data-heavy presets such as investment_committee, equity_research_team, global allocation/equities, macro/rates/fx, and earnings desks.
  2. Update the yfinance skill so OHLCV examples are loader-first / get_market_data-first, with raw yfinance examples limited to fundamentals/options/holders or clearly marked as requiring OHLC cleaning.
  3. Add a shared helper/contract for generated scripts that drops rows missing OHLC, sorts the index, and serializes with allow_nan=False.
  4. Make continuation swarm calls preset-sticky by passing the original preset/variables/run id explicitly, or reject ambiguous continuation prompts instead of falling back to equity_research_team.
  5. Consider extending grounding for common bare US tickers when the prompt/preset context clearly indicates US equities, or make the UI/preset variable builder normalize NVDA to the expected market-data symbol.
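
As a rough sketch of direction 3, the helper below drops bars with any missing OHLC field, sorts by date, and serializes with allow_nan=False. It uses only the standard library so it runs anywhere; the function names (clean_bars, dump_strict), the column names, and the sample prices are all assumptions for illustration, and only the volume on the dirty row comes from my run.

```python
import csv
import io
import json
import math

OHLC_FIELDS = ("Open", "High", "Low", "Close")  # assumed column names


def _to_float(value):
    """Return a finite float, or None for empty, NaN, or unparseable cells."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return number if math.isfinite(number) else None


def clean_bars(rows):
    """Drop bars missing any OHLC field and return the rest sorted by date."""
    cleaned = []
    for row in rows:
        prices = {name: _to_float(row.get(name)) for name in OHLC_FIELDS}
        if any(value is None for value in prices.values()):
            continue  # partial bar, e.g. Volume present but OHLC empty
        cleaned.append({"Date": row["Date"], **prices, "Volume": _to_float(row.get("Volume"))})
    return sorted(cleaned, key=lambda bar: bar["Date"])  # ISO dates sort as strings


def dump_strict(payload):
    """Serialize to strict JSON; raises ValueError if a NaN slipped through."""
    return json.dumps(payload, allow_nan=False)


# Illustrative rows: out of order, with one dirty latest bar (OHLC empty, volume set).
SAMPLE_CSV = """Date,Open,High,Low,Close,Volume
2026-06-08,10.0,11.0,9.5,10.5,1000
2026-06-09,,,,,179602708
2026-06-05,9.0,10.0,8.5,9.5,900
"""

bars = clean_bars(csv.DictReader(io.StringIO(SAMPLE_CSV)))
latest = bars[-1]
report = dump_strict({"latest_date": latest["Date"], "latest_close": latest["Close"]})
print(report)
```

The point of the contract is that "latest bar" means the latest complete bar, so the dirty 2026-06-09 row never reaches the metrics or the JSON writer.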

This is partly a prompt/tool-affordance issue rather than only a yfinance transient failure: retrying the same raw-yfinance worker may still produce fragile or invalid analysis unless the worker is guided toward the existing loader path.
