Skip to content

fix: detect and log Java compilation failures explicitly#1394

Merged
mashraf-222 merged 3 commits into
omni-javafrom
fix/detect-java-compilation-failures
Feb 10, 2026
Merged

fix: detect and log Java compilation failures explicitly#1394
mashraf-222 merged 3 commits into
omni-javafrom
fix/detect-java-compilation-failures

Conversation

@mashraf-222

Copy link
Copy Markdown

Problem Fixed

When Maven test execution fails, the error could be due to either:

  1. Compilation errors - invalid Java syntax (e.g., using reserved keywords)
  2. Test failures - runtime issues in valid code

Currently, both scenarios produce the same generic error message, making it difficult to diagnose the root cause.

Root Cause

_run_maven_tests combines compilation and test execution in a single Maven command. When Maven returns a non-zero exit code, the code doesn't distinguish between compilation vs runtime failures.

This is particularly problematic when AI-generated tests contain syntax errors (like using "provides" as a class name, which is a Java reserved keyword). The tests fail to compile, but the error appears as a generic "tests failed to run" message.

Solution Implemented

Added explicit detection of Maven compilation errors by checking output for compilation-specific indicators:

  • [ERROR] COMPILATION ERROR
  • [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin
  • compilation failure
  • cannot find symbol
  • package .* does not exist

When compilation errors are detected:

  1. Logs ERROR-level message: "Maven compilation failed for {mode} tests"
  2. Suggests checking that generated test code is syntactically valid
  3. Includes first 50 lines of Maven output for diagnosis

Code Changes

File: codeflash/languages/java/test_runner.py

  • Lines 1046-1073: Added compilation error detection logic after subprocess.run
  • Checks both stdout and stderr for compilation error indicators
  • Logs detailed error context when compilation fails

Testing

This change improves error reporting for:

  • AI-generated tests with invalid Java syntax
  • Tests using reserved keywords as identifiers
  • Tests with missing imports or undefined symbols
  • Any Maven compilation failure scenario

Example improved output:

ERROR Maven compilation failed for behavior tests. Check that generated test code is syntactically valid Java. Return code: 1
ERROR Maven compilation error output:
[ERROR] COMPILATION ERROR
[ERROR] /path/to/Test.java:[143,50] cannot find symbol
  symbol:   class provides
  location: package com.example

Impact

This is a detection and logging enhancement with no behavioral changes to test execution. Benefits:

  • Faster diagnosis - immediately identify compilation vs runtime issues
  • Better debugging - see actual compilation errors in logs
  • Clearer errors - understand why tests failed to run

Related to E2E testing where AI-generated tests used Java reserved keywords, causing silent compilation failures.

🤖 Generated with Claude Code

mohamedashrraf222 and others added 2 commits February 5, 2026 16:23
…ile_path

For AI-generated tests, original_file_path is intentionally None.
When tests fail to run, the log now shows instrumented_behavior_file_path
(the actual path being executed) instead of original_file_path. This makes
debugging test execution failures much clearer.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
When Maven fails during test execution, it's not immediately clear if the
failure is due to compilation errors (invalid Java code) or test failures
(runtime issues).

This change adds explicit detection of compilation errors by checking Maven's
output for compilation error indicators (e.g., "COMPILATION ERROR", "cannot
find symbol", "package does not exist").

When compilation errors are detected:
- Logs ERROR-level message indicating compilation failure
- Suggests checking that generated test code is syntactically valid
- Includes first 50 lines of Maven output for diagnosis

This makes it immediately obvious when AI-generated tests contain syntax
errors (like using Java reserved keywords as class names), rather than
appearing as silent test execution failures.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@mashraf-222

Copy link
Copy Markdown
Author

Regression Risk: Empty Test Filter Circuit Breaker Removed

This PR adds useful compilation error detection, but it also removes the empty test filter circuit breaker that was introduced in PR #1345 and validated during E2E testing.

What was removed

On omni-java, _run_maven_tests() in codeflash/languages/java/test_runner.py (lines 1050-1061) has:

    else:
        # CRITICAL: Empty test filter means Maven will run ALL tests
        # This is almost always a bug - tests should be filtered to relevant ones
        error_msg = (
            f"Test filter is EMPTY for mode={mode}! "
            f"Maven will run ALL tests instead of the specified tests. "
            f"This indicates a problem with test file instrumentation or path resolution."
        )
        logger.error(error_msg)
        # Raise exception to prevent running all tests unintentionally
        # This helps catch bugs early rather than silently running wrong tests
        raise ValueError(error_msg)

On this PR branch, the entire else clause is gone. The if test_filter: block at line 1039 no longer has the safety fallback. Additionally, the logger.debug(f"Added -Dtest={validated_filter} to Maven command") line was also removed.

Why this matters

Without the circuit breaker, if test_filter is empty (due to a bug in test file instrumentation or path resolution), Maven will silently run all tests in the project instead of the targeted subset. This was the exact bug that PR #1345 fixed -- it was discovered during E2E testing where an empty filter caused Maven to execute the entire test suite, producing misleading results and wasting significant time.

Recommendation

The compilation error detection added by this PR is valuable and can coexist with the empty filter protection. Consider preserving the else: raise ValueError(...) block alongside the new compilation error checking logic:

  1. Keep the else clause on if test_filter: that raises ValueError on empty filter
  2. Keep the logger.debug line for validated filter logging
  3. The compilation error detection after subprocess.run works independently and does not conflict

This way, the PR gains compilation error visibility without losing the safety mechanism that prevents Maven from running all tests when the filter is unexpectedly empty.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@mashraf-222 mashraf-222 merged commit fd6be1e into omni-java Feb 10, 2026
15 of 28 checks passed
@mashraf-222 mashraf-222 deleted the fix/detect-java-compilation-failures branch February 10, 2026 17:33
mashraf-222 pushed a commit that referenced this pull request Feb 10, 2026
Merge latest changes from base branch including:
- Java compilation error detection (PR #1394)
- Java formatter detection via google-java-format (PR #1400)
- Enhanced test coverage for comparator logic

Conflict resolution:
- tests/test_languages/test_java/test_comparison_decision.py: Used PR version
  that enforces strict correctness (no pass_fail_only fallback tests)
  to align with PR 1401's goal of removing pass_fail_only mode entirely.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants