feat(kagent-adk): remove litellm as dependency from kagent-adk#1540
Signed-off-by: JM Huibonhoa <jm.huibonhoa@solo.io>
Minor — stale LiteLLM references in docstrings
A few docstrings in _memory_service.py still reference LiteLLM after this change:
- Line 27 (class docstring): "Generates embeddings using LiteLLM"
- Lines 60, 71 (`add_session_to_memory` / `_add_session_to_memory_background` docstrings): "Optional ADK model object (e.g., LiteLlm, OpenAI)"
- Line 447 (`_summarize_session_content_async` docstring): same
These are cosmetic but worth updating for accuracy.
Comment left by Claude on behalf of @iplay88keys
Pull request overview
This PR removes the litellm dependency from kagent-adk by replacing LiteLLM-based model/embedding usage with provider-specific SDK implementations (Anthropic, Ollama, Bedrock, and OpenAI SDK calls), and adds unit tests to validate the new dispatch behavior.
Changes:
- Drop `litellm` from `kagent-adk` dependencies and lockfile.
- Replace LiteLLM model creation with native provider model classes (Anthropic/Ollama/Bedrock) and update model dispatch in `types.py`.
- Rework embedding generation to call provider SDKs directly, and add new unit tests for embeddings and the new model adapters.
Reviewed changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| python/uv.lock | Removes litellm (and its transitive deps like fastuuid) from the workspace lock. |
| python/packages/kagent-adk/pyproject.toml | Removes litellm dependency; retains/uses provider SDK deps (openai/anthropic/boto3/ollama/numpy). |
| python/packages/kagent-adk/src/kagent/adk/_memory_service.py | Replaces LiteLLM embedding calls with provider-specific SDK embedding dispatch. |
| python/packages/kagent-adk/src/kagent/adk/types.py | Updates _create_llm_from_model_config to instantiate native Anthropic/Ollama/Bedrock implementations. |
| python/packages/kagent-adk/src/kagent/adk/models/_anthropic.py | Adds KAgentAnthropicLlm with base_url/headers and API key passthrough support. |
| python/packages/kagent-adk/src/kagent/adk/models/_bedrock.py | Adds KAgentBedrockLlm using Bedrock Converse / ConverseStream APIs via boto3. |
| python/packages/kagent-adk/src/kagent/adk/models/_ollama.py | Adds KAgentOllamaLlm using the native Ollama SDK and tool/function-call conversions. |
| `python/packages/kagent-adk/src/kagent/adk/models/__init__.py` | Exports new model classes instead of the removed LiteLLM wrapper. |
| python/packages/kagent-adk/src/kagent/adk/models/_litellm.py | Deletes the LiteLLM wrapper model class. |
| python/packages/kagent-adk/tests/unittests/test_embedding.py | Adds unit tests for embedding dispatch/truncation/normalization without LiteLLM. |
| python/packages/kagent-adk/tests/unittests/models/test_anthropic.py | Adds unit tests for the Anthropic adapter behavior. |
| python/packages/kagent-adk/tests/unittests/models/test_bedrock.py | Adds unit tests for Bedrock adapter + client region selection. |
| python/packages/kagent-adk/tests/unittests/models/test_ollama.py | Adds unit tests for Ollama adapter + option/header forwarding. |
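
The provider dispatch described for `types.py` can be pictured roughly as follows. This is only a sketch under assumptions: `ModelConfig`, `StubLlm`, `_FACTORIES`, and `create_llm` are hypothetical names invented for illustration, and the stubs stand in for the real `KAgentAnthropicLlm`, `KAgentOllamaLlm`, and `KAgentBedrockLlm` classes, whose constructors are not shown in this conversation.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ModelConfig:
    """Hypothetical stand-in for the real model config type."""
    provider: str
    model: str


@dataclass
class StubLlm:
    """Placeholder for a native adapter such as KAgentAnthropicLlm."""
    adapter: str
    model: str


# Map each provider name to a factory. Real code would construct the
# native adapter classes here instead of StubLlm (assumption).
_FACTORIES: Dict[str, Callable[[ModelConfig], StubLlm]] = {
    "anthropic": lambda cfg: StubLlm("KAgentAnthropicLlm", cfg.model),
    "ollama": lambda cfg: StubLlm("KAgentOllamaLlm", cfg.model),
    "bedrock": lambda cfg: StubLlm("KAgentBedrockLlm", cfg.model),
}


def create_llm(cfg: ModelConfig) -> StubLlm:
    """Pick the adapter for cfg.provider; fail loudly on unknown providers."""
    try:
        factory = _FACTORIES[cfg.provider.lower()]
    except KeyError:
        raise ValueError(f"unsupported provider: {cfg.provider!r}") from None
    return factory(cfg)
```

A table of factories keeps each provider's construction logic in one place and makes an unsupported provider an explicit error rather than a silent fallback.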
I threw a claude team at this just to double check, lmk what you think. I'm happy to merge without everything fixed to get rid of Review:
Quick comments based on my context:
We already set the AWS_REGION env var during translation, so it should work fine
Some models do not allow configuring embedding dimensions (they return a fixed-size vector longer than 768); that is the purpose of truncation and re-normalization. According to prior research, this works fine in most cases, as long as the call to the model returns a vector longer than 768.
Probably out of scope, but we might want to rework the embedding interface in the future for wider support.
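
For readers unfamiliar with the truncation and re-normalization step mentioned above, here is a minimal sketch in plain Python. The function name `truncate_and_normalize` and the constant `TARGET_DIM` are illustrative assumptions, not the names used in `_memory_service.py`; only the 768 target comes from the comment.

```python
import math
from typing import List, Sequence

TARGET_DIM = 768  # target size mentioned in the comment above


def truncate_and_normalize(vec: Sequence[float], dim: int = TARGET_DIM) -> List[float]:
    """Keep the first `dim` components, then rescale to unit L2 norm."""
    if len(vec) < dim:
        # A shorter vector cannot be truncated to `dim`; surface it early.
        raise ValueError(f"embedding has {len(vec)} dims, need at least {dim}")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero vector has no direction; return it unchanged.
        return head
    return [x / norm for x in head]
```

Re-normalizing after truncation matters because dropping components shrinks the vector's length, which would otherwise skew similarity scores that assume unit-length embeddings.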
## Motivation

This will allow most users to switch directly from `runtime: python` to `runtime: go` without needing to worry about existing LLM provider configs, since everything will be supported on the Go side. This facilitates adoption of the new Go runtime.

## Summary

Closes most of the gap between Python and Go identified in #1643:

- TLS and API key passthrough for LLM providers
- Support Ollama and Bedrock using the client SDKs, as we did earlier in #1540
- Use the Bedrock client instead of the messages API for Anthropic on Bedrock, to support all Bedrock runtime models
- Tighten tool config conversion for Anthropic + Bedrock and fix issues like #1645, #1683
- Sanitize ToolName for Bedrock LLMs (#1473), see [bedrock API docs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ToolSpecification.html)
- Refactor embedding models in Python to be separate from the memory service
- Strip approval confirmation synthetic tool calls from LLM requests. These messages are persisted in the task / events store and used by ADK internally, but sending them to the model wastes tokens and confuses the model. If the session is long and has many HITL events, these internal tool messages consume many unnecessary tokens.

## Testing Plan

- [x] All new unit tests in the Go ADK pass, and all old unit tests in Python pass
- [x] Test for no regression with OpenAI and Gemini models in the Go runtime
- [x] Validate with a wide range of use cases such as: builtin tools (ask user, save memory), ADK built-in tools (load memory), MCP tools, Remote A2A (subagent) tools, HITL tools (approvals)
- [x] Bedrock LLM and embedding model in the Go runtime
- [x] OpenAI API key passthrough with the A2A `--token` option
- [x] Ollama LLM and embedding in the Go runtime (local models, Gemma 4 + embedding Gemma)
- [x] Ollama with TLS (local https server with self-signed certs)

---------

Signed-off-by: Jet Chiang <pokyuen.jetchiang-ext@solo.io>
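
The ToolName sanitization item in the summary can be illustrated with a short Python sketch (Python being the language of kagent-adk; the Go runtime would need an equivalent). It assumes the constraint on the linked ToolSpecification page is a name matching `[a-zA-Z0-9_-]+` with at most 64 characters. The function name `sanitize_tool_name` and the `"tool"` fallback are hypothetical and not taken from the PR.

```python
import re

# Assumed Bedrock constraints on tool names: [a-zA-Z0-9_-]+, length 1..64.
_MAX_LEN = 64
_INVALID = re.compile(r"[^a-zA-Z0-9_-]")


def sanitize_tool_name(name: str) -> str:
    """Replace disallowed characters with '_' and cap the length."""
    cleaned = _INVALID.sub("_", name)[:_MAX_LEN]
    # An empty name would violate the minimum length; use a fallback.
    return cleaned or "tool"
```

Two distinct tool names can collapse to the same sanitized form, so a real implementation would likely also keep a mapping from sanitized names back to the originals in order to route tool calls correctly.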
…t-dev#1540)

Removes `litellm` as a dependency from kagent. `litellm` is now replaced with provider-specific SDKs.

### Testing

**ollama**

1. Deploy Ollama

```bash
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
  namespace: kagent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
        - name: ollama
          image: ollama/ollama:latest
          ports:
            - containerPort: 11434
          resources:
            requests:
              memory: "2Gi"
            limits:
              memory: "4Gi"
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: kagent
spec:
  selector:
    app: ollama
  ports:
    - port: 11434
      targetPort: 11434
EOF
```

2. Pull a small model into the Ollama deployment
   - `kubectl -n kagent exec -it deploy/ollama -- ollama pull llama3.2:1b`

3. Create model config and agent

```bash
kubectl apply -f - <<EOF
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: ollama-test-config
  namespace: kagent
spec:
  provider: Ollama
  model: llama3.2:1b
  ollama:
    host: "http://ollama:11434"
    options:
      num_ctx: "2048"
      temperature: "0.7"
      top_k: "40"
---
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: ollama-test
  namespace: kagent
spec:
  type: Declarative
  description: "Ollama native SDK test"
  declarative:
    modelConfig: ollama-test-config
    systemMessage: "You are a helpful assistant. Answer concisely."
EOF
```

4. Port forward UI and test agent. `kubectl port-forward -n kagent svc/kagent-ui 3000:8080`

<img width="1722" height="692" alt="Screenshot 2026-03-25 at 7 21 20 PM" src="https://github.com/user-attachments/assets/ca061c4c-0202-4a05-8968-92da4b9bf0a1" />

5. Test memory recall by pulling an embedding model: `kubectl -n kagent exec -it deploy/ollama -- ollama pull nomic-embed-text`

6. Create the embedding model config and an Ollama memory test agent

```bash
kubectl apply -f - <<EOF
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: ollama-embedding-config
  namespace: kagent
spec:
  provider: Ollama
  model: nomic-embed-text
  ollama:
    host: "http://ollama:11434"
---
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: memory-ollama-test
  namespace: kagent
spec:
  type: Declarative
  description: "Memory with Ollama embedding"
  declarative:
    modelConfig: ollama-test-config # chat model from Step 3
    systemMessage: "You are a helpful assistant with memory."
    memory:
      modelConfig: ollama-embedding-config
EOF
```

7. Test in the UI.

<img width="1719" height="981" alt="Screenshot 2026-03-26 at 2 54 04 PM" src="https://github.com/user-attachments/assets/d91306ec-f381-40bb-a3ea-4c4d4e5d631a" />

**embedding - google**

1. Create secret and model config

```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: gemini-api-key-secret
  namespace: kagent
type: Opaque
data:
  GOOGLE_API_KEY: <your_api_key> # values under `data` must be base64-encoded
---
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: gemini-2-flash-config
  namespace: kagent
spec:
  model: gemini-2.0-flash
  provider: Gemini
  apiKeySecret: gemini-api-key-secret
  apiKeySecretKey: GOOGLE_API_KEY
  gemini: {}
---
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: gemini-embedding-config
  namespace: kagent
spec:
  model: gemini-embedding-001
  provider: Gemini
  apiKeySecret: gemini-api-key-secret
  apiKeySecretKey: GOOGLE_API_KEY
  gemini: {}
EOF
```

2. Create Agent

```bash
kubectl apply -f - <<EOF
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: memory-openai-test
  namespace: kagent
spec:
  type: Declarative
  description: "Memory with Gemini embedding"
  declarative:
    modelConfig: gemini-2-flash-config
    systemMessage: "You are a helpful assistant with memory."
    memory:
      modelConfig: gemini-embedding-config
EOF
```

3. Port forward UI and test agent. `kubectl port-forward -n kagent svc/kagent-ui 3000:8080`

<img width="1727" height="700" alt="Screenshot 2026-03-25 at 7 20 18 PM" src="https://github.com/user-attachments/assets/f4f45a05-da6c-4593-9c8b-f369bfae51af" />

**bedrock**

1. Create aws-credentials

```bash
kubectl -n kagent create secret generic aws-credentials \
  --from-literal=AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
  --from-literal=AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \
  --from-literal=AWS_DEFAULT_REGION="<your_region>" \
  --from-literal=AWS_SESSION_TOKEN="$AWS_SESSION_TOKEN" \
  --dry-run=client -o yaml | kubectl apply -f -
```

2. Create model config and agent

```yaml
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: bedrock-model-config
  namespace: kagent
spec:
  model: us.anthropic.claude-haiku-4-5-20251001-v1:0
  provider: Bedrock
  bedrock:
    region: us-east-1
---
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: bedrock-test
  namespace: kagent
spec:
  type: Declarative
  description: "Bedrock Converse API test"
  declarative:
    systemMessage: "You are a helpful assistant. Answer concisely."
    modelConfig: bedrock-model-config
    deployment:
      env:
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: aws-credentials
              key: AWS_ACCESS_KEY_ID
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: aws-credentials
              key: AWS_SECRET_ACCESS_KEY
        - name: AWS_DEFAULT_REGION
          valueFrom:
            secretKeyRef:
              name: aws-credentials
              key: AWS_DEFAULT_REGION
        - name: AWS_SESSION_TOKEN
          valueFrom:
            secretKeyRef:
              name: aws-credentials
              key: AWS_SESSION_TOKEN
```

<img width="1727" height="987" alt="Screenshot 2026-03-26 at 2 20 21 PM" src="https://github.com/user-attachments/assets/cda35fb3-a732-4b98-8519-709f766a6bf8" />

---------

Signed-off-by: JM Huibonhoa <jm.huibonhoa@solo.io>
Co-authored-by: Eitan Yarmush <eitan.yarmush@solo.io>
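
The Ollama `ModelConfig` in the testing steps passes `options` as strings (`num_ctx: "2048"`, `temperature: "0.7"`), so an adapter presumably has to convert them to numbers before handing them to the Ollama SDK. The sketch below shows one way that conversion might look. `coerce_option` and `coerce_options` are hypothetical helpers, and whether `KAgentOllamaLlm` does this the same way is an assumption.

```python
from typing import Dict, Union

OptionValue = Union[bool, int, float, str]


def coerce_option(value: str) -> OptionValue:
    """Convert a string option to bool, int, or float when it parses as one."""
    lowered = value.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        # Not numeric: leave the original string untouched.
        return value


def coerce_options(options: Dict[str, str]) -> Dict[str, OptionValue]:
    """Apply coerce_option to every value in an options mapping."""
    return {key: coerce_option(val) for key, val in options.items()}
```

With the options from step 3, `coerce_options({"num_ctx": "2048", "temperature": "0.7", "top_k": "40"})` yields integers for `num_ctx` and `top_k` and a float for `temperature`.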