Pull requests: vercel-labs/agent-eval
Pull requests list
[judge] Agentic LLM judge: judge codebases and transcripts
#161 opened Jun 26, 2026 by gaojude (Collaborator)
[agents] Plugin model: per-agent definition + in-sandbox runner
#160 opened Jun 26, 2026 by gaojude (Collaborator)
fix(agent-eval): delegate demultiplexing to docker-modem's docker demuxStream()
#158 opened Jun 26, 2026 by huang-julien
fix: reuse cached results for single-experiment runs
#153 opened Jun 11, 2026 by benjamincanac
7 tasks done
[Agents] Add Mistral Vibe CLI agent with direct API and streaming-json transcript support
#137 opened May 21, 2026 by joeyspagnoli
Skip npm install when package.json is absent
#125 opened May 6, 2026 by allenzhou101 (Contributor)