
Commit 2df04fb

Merge pull request #210 from codelion/feat-pyproject-setup

move to pyproject

2 parents 74bcbdc + 8c35bb3 · commit 2df04fb

File tree: 6 files changed, +208 −70 lines


.gitignore

Lines changed: 1 addition & 0 deletions
```diff
@@ -169,3 +169,4 @@ cython_debug/
 .vscode/

 scripts/results/
+results/
```

CLAUDE.md

Lines changed: 133 additions & 0 deletions
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

OptILLM is an OpenAI API compatible optimizing inference proxy that implements state-of-the-art techniques to improve the accuracy and performance of LLMs. It focuses on reasoning improvements for coding, logical, and mathematical queries through inference-time compute optimization.

## Core Architecture

### Main Components

1. **Entry Points**:
   - `optillm.py` - Main Flask server with inference routing
   - `optillm/inference.py` - Local inference engine with transformer models
   - Packaging via `pyproject.toml` with console script `optillm = optillm:main`

2. **Optimization Techniques** (`optillm/`):
   - **Reasoning**: `cot_reflection.py`, `plansearch.py`, `leap.py`, `reread.py`
   - **Sampling**: `bon.py` (Best of N), `moa.py` (Mixture of Agents), `self_consistency.py`
   - **Search**: `mcts.py` (Monte Carlo Tree Search), `rstar.py` (R* Algorithm)
   - **Verification**: `pvg.py` (Prover-Verifier Game), `z3_solver.py`
   - **Advanced**: `cepo/` (Cerebras Planning & Optimization), `rto.py` (Round Trip)

3. **Decoding Techniques**:
   - `cot_decoding.py` - Chain-of-thought decoding without explicit prompting
   - `entropy_decoding.py` - Adaptive sampling based on token uncertainty
   - `thinkdeeper.py` - Reasoning effort scaling
   - `autothink/` - Query complexity classification with steering vectors

4. **Plugin System** (`optillm/plugins/`):
   - `spl/` - System Prompt Learning (third paradigm learning)
   - `deepthink/` - Gemini-like deep thinking with inference scaling
   - `longcepo/` - Long-context processing with divide-and-conquer
   - `mcp_plugin.py` - Model Context Protocol client
   - `memory_plugin.py` - Short-term memory for unbounded context
   - `privacy_plugin.py` - PII anonymization/deanonymization
   - `executecode_plugin.py` - Code interpreter integration
   - `json_plugin.py` - Structured outputs via the outlines library

## Development Commands

### Installation & Setup
```bash
# Development setup
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Package installation
pip install optillm
```

### Running the Server
```bash
# Basic server (auto approach detection)
python optillm.py

# With a specific approach
python optillm.py --approach moa --model gpt-4o-mini

# With an external endpoint
python optillm.py --base_url http://localhost:8080/v1

# Docker
docker compose up -d
```

### Testing
```bash
# Run all approach tests
python test.py

# Test specific approaches
python test.py --approaches moa bon mcts

# Test with a specific model/endpoint
python test.py --model gpt-4o-mini --base-url http://localhost:8080/v1

# Single test case
python test.py --single-test "specific_test_name"
```

### Evaluation Scripts
```bash
# Math benchmark evaluation
python scripts/eval_math500_benchmark.py

# AIME benchmark
python scripts/eval_aime_benchmark.py

# Arena Hard Auto evaluation
python scripts/eval_arena_hard_auto_rtc.py

# FRAMES benchmark
python scripts/eval_frames_benchmark.py

# OptILLM benchmark generation/evaluation
python scripts/gen_optillmbench.py
python scripts/eval_optillmbench.py
```

## Usage Patterns

### Approach Selection (Priority Order)
1. **Model prefix**: `moa-gpt-4o-mini` (approach slug + model name)
2. **`extra_body` field**: `{"optillm_approach": "bon|moa|mcts"}`
3. **Prompt tags**: `<optillm_approach>re2</optillm_approach>` in the system or user prompt

### Approach Combinations
- **Pipeline** (`&`): `cot_reflection&moa` - sequential processing
- **Parallel** (`|`): `bon|moa|mcts` - multiple responses returned as a list

### Local Inference
- Set `OPTILLM_API_KEY=optillm` to enable the built-in transformer inference engine
- Supports HuggingFace models with LoRA adapters: `model+lora1+lora2`
- Advanced decoding via request parameters: `{"decoding": "cot_decoding", "k": 10}`

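A local-inference request might be assembled as below. This is a sketch, not optillm's actual parsing code; the model and adapter names are hypothetical, and only the `+`-joined model convention and the `extra_body` decoding parameters come from the notes above.

```python
# Hypothetical request for the built-in inference engine (OPTILLM_API_KEY=optillm).
request = {
    # Base HuggingFace model plus two LoRA adapters, joined with "+"
    "model": "meta-llama/Llama-3.2-1B-Instruct+my-lora-1+my-lora-2",
    "messages": [{"role": "user", "content": "Solve: 12 * 13"}],
    # Advanced decoding parameters travel in extra_body
    "extra_body": {"decoding": "cot_decoding", "k": 10},
}

# Splitting on "+" separates the base model from its adapters.
base_model, *lora_adapters = request["model"].split("+")
```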
### Plugin Configuration
- MCP: `~/.optillm/mcp_config.json` configures Model Context Protocol servers
- SPL: Built-in system prompt learning for problem-solving strategies
- Memory: Automatic unbounded context via chunking and retrieval

## Key Concepts

### Inference Optimization
The proxy intercepts OpenAI API calls and applies optimization techniques before forwarding requests to LLM providers (OpenAI, Cerebras, Azure, LiteLLM). Each technique implements a specific reasoning or sampling improvement.

### Plugin Architecture
Plugins extend functionality through standardized interfaces. They can modify requests, process responses, add tools, or provide entirely new capabilities such as code execution or structured outputs.

### Multi-Provider Support
The proxy automatically detects and routes to the appropriate LLM provider based on environment variables (`OPENAI_API_KEY`, `CEREBRAS_API_KEY`, etc.), falling back to LiteLLM for broader model support.
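The detect-then-fallback routing described above can be sketched as a simple ordered lookup. This is illustrative only, not optillm's actual logic; the Azure variable name and the precedence order are assumptions for the example.

```python
def detect_provider(env: dict) -> str:
    """Pick a provider from well-known API-key environment variables,
    falling back to LiteLLM for broader model support."""
    order = [
        ("OPENAI_API_KEY", "openai"),
        ("CEREBRAS_API_KEY", "cerebras"),
        ("AZURE_OPENAI_API_KEY", "azure"),  # variable name assumed for illustration
    ]
    for var, provider in order:
        if env.get(var):  # set and non-empty
            return provider
    return "litellm"
```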

MANIFEST.in

Lines changed: 2 additions & 0 deletions
```diff
@@ -1 +1,3 @@
 include optillm/plugins/*.py
+include optillm/cepo/*.py
+include optillm/cepo/configs/*.yaml
```

optillm/__init__.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -2,7 +2,7 @@
 import os

 # Version information
-__version__ = "0.1.20"
+__version__ = "0.1.21"

 # Get the path to the root optillm.py
 spec = util.spec_from_file_location(
```
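With this commit the version string lives in two places, `optillm/__init__.py` and `pyproject.toml`, so a release bump must touch both. A minimal consistency check (an illustrative release-hygiene sketch, not part of this repository) could extract and compare the two:

```python
import re

def extract_versions(init_src: str, pyproject_src: str) -> tuple[str, str]:
    """Pull the version strings out of __init__.py and pyproject.toml source
    text; a release script could assert the two are equal."""
    init_v = re.search(r'__version__\s*=\s*"([^"]+)"', init_src).group(1)
    toml_v = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_src, re.M).group(1)
    return init_v, toml_v
```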

pyproject.toml

Lines changed: 71 additions & 0 deletions
```toml
[build-system]
requires = ["setuptools>=64", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "optillm"
version = "0.1.21"
description = "An optimizing inference proxy for LLMs."
readme = "README.md"
license = "Apache-2.0"
authors = [
    {name = "codelion", email = "[email protected]"}
]
requires-python = ">=3.10"
classifiers = [
    "Programming Language :: Python :: 3",
    "Operating System :: OS Independent",
]
dependencies = [
    "numpy",
    "networkx",
    "openai",
    "z3-solver",
    "aiohttp",
    "flask",
    "torch",
    "transformers",
    "azure-identity",
    "tiktoken",
    "scikit-learn",
    "litellm",
    "requests",
    "beautifulsoup4",
    "lxml",
    "presidio_analyzer",
    "presidio_anonymizer",
    "nbconvert",
    "nbformat",
    "ipython",
    "ipykernel",
    "peft",
    "bitsandbytes",
    "gradio<5.16.0",
    # Constrain spacy version to avoid blis build issues on ARM64
    "spacy<3.8.0",
    "cerebras_cloud_sdk",
    "outlines[transformers]",
    "sentencepiece",
    "mcp",
    "adaptive-classifier",
    # MLX support for Apple Silicon optimization
    'mlx-lm>=0.24.0; platform_machine=="arm64" and sys_platform=="darwin"',
]

[project.urls]
Homepage = "https://github.com/codelion/optillm"
Repository = "https://github.com/codelion/optillm"
Issues = "https://github.com/codelion/optillm/issues"

[project.scripts]
optillm = "optillm:main"

[tool.setuptools.packages.find]
include = ["optillm*"]

[tool.setuptools.package-data]
optillm = [
    "plugins/*.py",
    "cepo/*.py",
    "cepo/configs/*.yaml",
]
```
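The `mlx-lm` dependency is gated by a PEP 508 environment marker so it only installs on Apple Silicon macOS. How such a marker evaluates can be checked with the `packaging` library (assumed available; it ships with most Python tooling):

```python
from packaging.markers import Marker

# The same marker used for the mlx-lm dependency above.
marker = Marker('platform_machine == "arm64" and sys_platform == "darwin"')

# evaluate() accepts an explicit environment dict, overriding the host's values,
# so the gate can be tested for any target platform.
on_apple_silicon = marker.evaluate({"platform_machine": "arm64", "sys_platform": "darwin"})
on_linux_x86 = marker.evaluate({"platform_machine": "x86_64", "sys_platform": "linux"})
```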

setup.py

Lines changed: 0 additions & 69 deletions
This file was deleted.

0 commit comments
