perf: optimize context extraction pipeline (~2x speedup)#1920

Merged: KRRT7 merged 3 commits into main from cf-cpu-context-extraction on Mar 27, 2026

@KRRT7 KRRT7 commented Mar 27, 2026


Summary

  • Pre-compute base_defs once per file in the second loop of extract_all_contexts_from_files, passing them to remove_unused_definitions_by_function_names to skip redundant CST traversals (1.83s → 0.01s)
  • Skip FutureAliasedImportTransformer via a fast _has_aliased_future_imports check when no aliased __future__ imports exist (140ms → 0.02ms/call)
  • Replace MetadataWrapper + ParentNodeProvider in DependencyCollector with lightweight visit_Attribute/leave_Attribute id-tracking, eliminating expensive metadata computation (5.7x per-call speedup)

Details

extract_all_contexts_from_files runs two loops over helper files. The first loop already pre-computes base_defs and passes defs_with_usages to avoid re-traversal, but the second loop (HoH-only files) was calling remove_unused_definitions_by_function_names without passing that pre-computed data, forcing 5 redundant MetadataWrapper + DependencyCollector traversals.

DependencyCollector required MetadataWrapper solely for ParentNodeProvider in one place: checking if a Name is the .attr part of an Attribute inside class bodies. An id-based set populated by visit_Attribute/leave_Attribute replaces this with zero metadata overhead.
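The id-tracking idea can be illustrated with a toy tree and a hand-written walker. This is a sketch under stated assumptions, not the DependencyCollector implementation: Name, Attribute, and NameCollector are simplified stand-ins defined here, and the walk method only imitates the visit/leave ordering of a depth-first visitor.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Name:
    value: str


@dataclass(eq=False)
class Attribute:
    value: object  # a Name or a nested Attribute
    attr: Name


class NameCollector:
    """Toy visitor: records names that are not the .attr part of an Attribute."""

    def __init__(self) -> None:
        self.attr_name_ids: set = set()
        self.references: list = []

    def walk(self, node) -> None:
        # Depth-first: visit the parent, then its children, then leave the parent.
        if isinstance(node, Attribute):
            self.visit_Attribute(node)
            self.walk(node.value)
            self.walk(node.attr)
            self.leave_Attribute(node)
        else:
            self.visit_Name(node)

    def visit_Attribute(self, node: Attribute) -> None:
        # Remember the child before it is visited, instead of asking for its parent later.
        self.attr_name_ids.add(id(node.attr))

    def leave_Attribute(self, original_node: Attribute) -> None:
        self.attr_name_ids.discard(id(original_node.attr))

    def visit_Name(self, node: Name) -> None:
        if id(node) in self.attr_name_ids:
            return  # this Name is the .attr part, not a standalone reference
        self.references.append(node.value)


# Toy tree for the expression: self.config.timeout
tree = Attribute(Attribute(Name("self"), Name("config")), Name("timeout"))
collector = NameCollector()
collector.walk(tree)
```

Here only "self" ends up in collector.references, and the id set is empty again once the walk finishes. The approach relies on the tree staying alive during traversal, since id() values are only unique among live objects.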

FutureAliasedImportTransformer traversed the full CST on every call even when no aliased __future__ imports existed (the common case). A fast O(imports) check short-circuits the traversal.
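A guard of this kind might look like the sketch below. It is written against the standard library ast module as a stand-in for libcst, scans top-level statements instead of a pre-collected import list, and all names (has_aliased_future_imports, FutureAliasStripper, strip_future_aliases) are hypothetical; the PR's actual helpers may differ.

```python
import ast


def has_aliased_future_imports(tree: ast.Module) -> bool:
    """Cheap scan of top-level statements only; no full traversal."""
    return any(
        isinstance(stmt, ast.ImportFrom)
        and stmt.module == "__future__"
        and any(alias.asname is not None for alias in stmt.names)
        for stmt in tree.body
    )


class FutureAliasStripper(ast.NodeTransformer):
    """Full-tree rewrite: drops `as` aliases from __future__ imports."""

    def visit_ImportFrom(self, node: ast.ImportFrom) -> ast.ImportFrom:
        if node.module == "__future__":
            for alias in node.names:
                alias.asname = None
        return node


def strip_future_aliases(tree: ast.Module) -> ast.Module:
    if not has_aliased_future_imports(tree):
        return tree  # common case: nothing to rewrite, skip the traversal
    return FutureAliasStripper().visit(tree)


plain = ast.parse("from __future__ import annotations\nimport os\n")
aliased = ast.parse("from __future__ import annotations as ann\nimport os\n")
```

In the common case the input tree is returned unchanged and the transformer is never constructed; only a module that actually contains an aliased __future__ import pays for the full traversal.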

Benchmark

Profiled via cProfile on test_benchmark_extract (Python 3.13.7, macOS):

| Function | Before | After |
| --- | --- | --- |
| remove_unused_definitions_by_function_names | 1.83s (14.7%) | 0.01s |
| collect_top_level_defs_with_dependencies | 38.9% cumtime | 10.3% cumtime |
| gather_source_imports | 10.7% cumtime | 2.5% cumtime |

Remaining cost dominated by libcst transform_module (43%) and Jedi inference (22%) — external library internals.
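For readers who want to collect comparable cumulative-time figures, one possible way to do so with cProfile is sketched below. The workload function is a hypothetical stand-in for the benchmarked call; the exact profiling setup used for the numbers above is not shown in this PR.

```python
import cProfile
import io
import pstats


def profile_cumulative(func, *args, top: int = 5):
    """Run func under cProfile and return (result, text report sorted by cumtime)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def workload(n: int) -> int:
    """Hypothetical stand-in for the benchmarked extraction call."""
    return sum(i * i for i in range(n))


result, report = profile_cumulative(workload, 1000)
print(report)
```

Sorting by cumulative time surfaces functions whose cost includes their callees, which is the view the cumtime percentages in the table reflect.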

Test plan

  • 226 context extraction + static analysis tests pass
  • Micro-benchmark verified DependencyCollector produces identical results across 6 project files (352 definitions)
  • uv run prek passes clean

Eliminate redundant CST traversals in code context extraction by caching
dependency data, skipping unnecessary transforms, and removing MetadataWrapper.

claude Bot commented Mar 27, 2026


Claude finished @KRRT7's task in 3m 27s.


PR Review Summary

Prek Checks

Clean — uv run prek run --from-ref origin/main passes with no issues.

Mypy errors in unused_definition_remover.py and code_extractor.py are pre-existing (outside the changed lines and present on origin/main).

Code Review

Size: SMALL (44 additions / 19 deletions across 3 files)

No bugs, security issues, or breaking API changes found. The three optimizations are all correct:

1. DependencyCollector — id-based attribute tracking (unused_definition_remover.py:284-312)

The visit_Attribute/leave_Attribute id-tracking replacement for ParentNodeProvider is sound. libcst traverses depth-first, so visit_Attribute fires before children are visited, which means id(node.attr) is in attr_name_ids by the time visit_Name is called for the .attr child. leave_Attribute then correctly discards it after the subtree is done.

One minor note: leave_Attribute receives original_node, and since DependencyCollector is a CSTVisitor (not a transformer), original_node is the same object as was visited, so id(original_node.attr) == id(node.attr) holds. No issue here.

2. _has_aliased_future_imports fast-path (code_extractor.py:429-447)

The check correctly short-circuits FutureAliasedImportTransformer when no aliased __future__ imports exist. The condition aligns with what the transformer actually modifies.

Minor: isinstance(s.names, (list, tuple)) works in practice since libcst uses tuples, but the check "not isinstance(s.names, cst.ImportStar)" would be more semantically precise. Not a bug.

3. Pre-compute base_defs in the second loop (code_context_extractor.py:399-401)

Consistent with the pattern already used in the first loop (lines 273–274). mark_defs_for_functions creates a fresh copy so base_defs is never mutated between calls. Correct.
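The reason sharing base_defs across calls is safe can be shown with a small sketch. Definition and mark_defs are hypothetical names made up for this example; the real mark_defs_for_functions is not reproduced here, only the copy-on-mark property the review relies on.

```python
from dataclasses import dataclass, replace


@dataclass
class Definition:
    name: str
    used: bool = False


def mark_defs(base_defs: dict, wanted: set) -> dict:
    """Return marked copies; the shared base_defs mapping is left untouched."""
    return {
        name: replace(definition, used=name in wanted)
        for name, definition in base_defs.items()
    }


base_defs = {"a": Definition("a"), "b": Definition("b")}
first = mark_defs(base_defs, {"a"})
second = mark_defs(base_defs, {"b"})
```

Because each call builds new Definition objects, the marks from the first call cannot leak into the second, and the pre-computed mapping can be reused for every function in the file.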

Duplicate Detection

No duplicates detected. The new helper functions (_has_aliased_future_imports, _strip_future_aliases) are module-private and have no counterparts elsewhere.

Test Coverage

Skipped for SMALL PRs. PR author reports 226 context extraction + static analysis tests pass.


Open codeflash-ai[bot] PRs

Three open bot PRs (#1890, #1891, #1895) all target PR #1887's branch, not main. All have the same broad CI failures (unit-tests across all Python versions, js/tracer tests). This pattern indicates pre-existing failures on the base branch, not regressions from the bot PRs themselves. Leaving open.


Last updated: 2026-03-27T21:00 UTC

KRRT7 added 2 commits March 27, 2026 15:56
cst.Attribute branch was dead code since __future__ imports always use
a plain Name node.
@KRRT7 KRRT7 merged commit fb1381e into main Mar 27, 2026
26 of 27 checks passed
@KRRT7 KRRT7 deleted the cf-cpu-context-extraction branch March 27, 2026 22:37