feat(guardrails): add EncodedPayloadScanner for obfuscated injection detection #51

Open
u7k4rs6 wants to merge 1 commit into
future-agi:mainfrom
u7k4rs6:feat/encoded-payload-scanner

u7k4rs6 commented Jun 26, 2026

Closes #50.

Summary

  • Adds EncodedPayloadScanner — a local, dependency-free scanner that detects base64, hex, percent-encoded, and unicode/hex-escape blobs, decodes them, then rescans the decoded text for prompt-injection markers.
  • Only blobs that decode to injection content (confidence 0.9) cross the block threshold (default 0.6); benign encoded data — hashes, tokens, image fragments, file attachments — passes cleanly because it decodes to non-injection content.
  • Registered as "encoded_payload" via @register_scanner, exported from __init__.py and added to __all__, wired into create_default_pipeline(encoded_payload=False) — off by default, same policy as urls and invisible_chars.
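The decode-then-rescan flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in encoded_payload.py: the regexes, the marker pattern, and the names decode_candidates and scan are all hypothetical. Only the 0.9 confidence and the 0.6 default threshold come from this PR.

```python
import base64
import binascii
import re
from urllib.parse import unquote

# Hypothetical marker pattern; the real scanner's marker list is not shown in this PR.
INJECTION_MARKERS = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)

# Hypothetical blob detectors (assumed minimum lengths, chosen for illustration).
BASE64_BLOB = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
HEX_BLOB = re.compile(r"\b(?:[0-9a-fA-F]{2}){8,}\b")
PERCENT_ESCAPE = re.compile(r"%[0-9a-fA-F]{2}")

INJECTION_CONFIDENCE = 0.9  # value stated in the PR summary
BLOCK_THRESHOLD = 0.6       # default threshold stated in the PR summary


def decode_candidates(text):
    """Yield decoded text for every blob that decodes to valid UTF-8."""
    for match in BASE64_BLOB.finditer(text):
        try:
            yield base64.b64decode(match.group(), validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64 or not text: treat as benign data
    for match in HEX_BLOB.finditer(text):
        try:
            yield bytes.fromhex(match.group()).decode("utf-8")
        except ValueError:  # also covers UnicodeDecodeError
            continue
    if PERCENT_ESCAPE.search(text):
        yield unquote(text)


def scan(text, threshold=BLOCK_THRESHOLD):
    """Return (score, blocked): score is 0.9 only if a decoded blob holds a marker."""
    score = 0.0
    for decoded in decode_candidates(text):
        if INJECTION_MARKERS.search(decoded):
            score = INJECTION_CONFIDENCE
            break
    return score, score >= threshold
```

Under these assumptions, scan("run aWdub3JlIGFsbCBwcmV2aW91cyBpbnN0cnVjdGlvbnM=") is blocked, while a SHA-256 hex digest or base64-encoded benign text scores 0.0 because nothing in the decoded output matches a marker.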

Motivation

Keyword/regex-based scanners all operate on raw text. An adversary can bypass every one of them by base64-encoding an injection string:

aWdub3JlIGFsbCBwcmV2aW91cyBpbnN0cnVjdGlvbnM=

decodes to "ignore all previous instructions", yet the encoded form passes JailbreakScanner, CodeInjectionScanner, and SecretsScanner undetected. EncodedPayloadScanner closes this class of bypass.
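The bypass can be reproduced with the standard library alone. In the snippet below, RAW_PATTERN is a hypothetical stand-in for a keyword/regex scanner's pattern, not the pattern used by any of the scanners named above.

```python
import base64
import re

# Hypothetical stand-in for a raw-text keyword/regex check.
RAW_PATTERN = re.compile(r"ignore all previous instructions", re.IGNORECASE)

encoded = "aWdub3JlIGFsbCBwcmV2aW91cyBpbnN0cnVjdGlvbnM="
decoded = base64.b64decode(encoded).decode("utf-8")

# The raw-text check sees only base64 characters and finds nothing.
raw_hit = RAW_PATTERN.search(encoded) is not None

# The same check applied to the decoded text finds the injection string.
decoded_hit = RAW_PATTERN.search(decoded) is not None

print(decoded)      # ignore all previous instructions
print(raw_hit)      # False
print(decoded_hit)  # True
```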

Changes

| File | Change |
| --- | --- |
| python/fi/evals/guardrails/scanners/encoded_payload.py | New scanner |
| python/fi/evals/guardrails/scanners/__init__.py | Import, __all__, create_default_pipeline param |
| python/tests/sdk/test_guardrails_scanners.py | TestEncodedPayloadScanner — 6 cases |

Test plan

cd python
uv sync --dev
uv run pytest tests/sdk/test_guardrails_scanners.py::TestEncodedPayloadScanner -v

Expected:

PASSED test_detects_base64_encoded_injection
PASSED test_detects_hex_encoded_injection
PASSED test_detects_percent_encoded_injection
PASSED test_benign_base64_passes
PASSED test_hex_hash_passes
PASSED test_clean_text_passes
6 passed in 1.48s

…detection

Closes future-agi#50. Adds a new scanner that detects base64, hex, percent-encoded,
and unicode/hex-escape blobs, decodes them, and rescans the decoded text
for prompt-injection markers. Only blobs that decode to injection content
(confidence 0.9) cross the block threshold; benign encoded data (hashes,
tokens, image fragments) passes cleanly.

- New file: python/fi/evals/guardrails/scanners/encoded_payload.py
- Registered as "encoded_payload" via @register_scanner
- Exported from __init__.py, added to __all__
- Wired into create_default_pipeline(encoded_payload=False) (off by default,
  same policy as urls and invisible_chars)
- 6 new tests in TestEncodedPayloadScanner covering b64/hex/percent injection
  detection and benign b64, hex hash, and clean-text passes (all green)

Development

Successfully merging this pull request may close these issues.

feat: add EncodedPayloadScanner for obfuscated injection detection