feat(model): add support for multiple chat models #4228
base: main
Conversation
- Introduce `ModelConfigVariant` to handle single or multiple model configurations
- Update validation logic to check for duplicate models in configurations
- Update chat model handling to support multiple HTTP models
- Add `MultiChatStream` to manage multiple chat streams and handle model selection
- Modify health check and model loading to work with the new configuration structure
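For orientation, here is a hedged sketch of what the `ModelConfigVariant` described above could look like. Only the type name and the `get_model_configs` accessor (which appears in the review snippets further down) come from this PR; the serde layout and the import path for the existing `ModelConfig` type are assumptions.

```rust
use serde::{Deserialize, Serialize};
use tabby_common::config::ModelConfig; // existing single-model config type (assumed path)

/// Accepts either the legacy single config or a list of configs.
#[derive(Serialize, Deserialize, Clone)]
#[serde(untagged)]
pub enum ModelConfigVariant {
    Single(ModelConfig),
    Multiple(Vec<ModelConfig>),
}

impl ModelConfigVariant {
    /// Normalize both forms into a list so callers can iterate uniformly.
    pub fn get_model_configs(&self) -> Vec<ModelConfig> {
        match self {
            ModelConfigVariant::Single(config) => vec![config.clone()],
            ModelConfigVariant::Multiple(configs) => configs.clone(),
        }
    }
}
```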
Temporarily use the first model config in the list for chat model health conversion.
Hi @lkk214, thanks for the contribution. This is a really big change; may I ask what your scenario is for adding multiple HTTP models? We depend on the models in many places, and it's not safe to do such a big refactor in a short time.
Hi @zwpaper, thanks for your review! Here's the rationale for multi-model support:
Moreover, we already support multiple HTTP model providers, such as DeepSeek, OpenRouter, etc. It would be a worthwhile improvement to support multiple model configurations; at a minimum, we could use them at the same time.
Thank you for the comprehensive reply. However, as models become increasingly powerful and versatile, it seems less essential to employ distinct models for various applications. Tabby relies heavily on the model throughout its features; we may need to discuss further before implementing such a significant change. May I inquire if this is utilized in your business scenario?
@zwpaper Thanks for your reply. I use multiple models in my work. Although a single model can solve most problems, I prefer to find satisfactory answers from multiple models. This is more like an enhancement than a core change. It won't affect previous configurations; it simply provides more possibilities for model configuration. I understand your concerns. Compared to this change, which might impact the core functionality, maintaining code stability is far more important; it's absolutely critical. This PR may not be a high priority for Tabby. If the team decides not to merge it, I can close it or leave it as-is for future reference. This can be closed anytime if needed.
PR Summary: This PR introduces support for managing multiple chat models by adding a MultiChatStream that aggregates multiple HTTP chat streams, and it updates the configuration model to accept multiple HTTP model configs for chat. Key changes include:
- New functions to create a composite chat stream (`create_multiple`, `add_engine`).
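As a rough, non-authoritative sketch of how such a composite stream could be organized: only the names `MultiChatStream`, `create_multiple`, and `add_engine` come from the summary above, while the field layout and the name-based selection helper are assumptions for illustration.

```rust
use std::collections::HashMap;
use std::sync::Arc;

use tabby_common::config::HttpModelConfig;   // existing HTTP model config (assumed path)
use tabby_inference::ChatCompletionStream;   // existing chat engine trait (assumed path)

pub struct MultiChatStream {
    /// Maps a model's display name to its underlying HTTP chat engine.
    engines: HashMap<String, Arc<dyn ChatCompletionStream>>,
}

impl MultiChatStream {
    /// Build one composite stream from several HTTP model configs.
    pub fn create_multiple(configs: &[HttpModelConfig]) -> Self {
        let mut stream = Self { engines: HashMap::new() };
        for config in configs {
            stream.add_engine(config);
        }
        stream
    }

    /// Register a single engine under its model name.
    /// `create_engine` is the per-config constructor from
    /// crates/http-api-bindings/src/chat/mod.rs, quoted later in this thread.
    pub fn add_engine(&mut self, config: &HttpModelConfig) {
        let name = config.model_name.clone().unwrap_or_default();
        self.engines.insert(name, Arc::from(create_engine(config)));
    }

    /// Pick the engine matching the requested model, if one is registered.
    pub fn engine_for(&self, model: &str) -> Option<Arc<dyn ChatCompletionStream>> {
        self.engines.get(model).cloned()
    }
}
```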
```diff
@@ -222,7 +222,12 @@ async fn load_model(config: &Config) {
         download_model_if_needed(&model.model_id, ModelKind::Completion).await;
     }

-    if let Some(ModelConfig::Local(ref model)) = config.model.chat {
+    if let Some(model) = config
```
Consider logging a warning or error when a chat model is not a local configuration in load_model, if that case is unexpected per business requirements.
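A minimal sketch of what that guard could look like, assuming the `tracing` macros used elsewhere in the crate and a `ModelKind::Chat` variant; the surrounding `load_model` code is abbreviated:

```rust
if let Some(chat) = &config.model.chat {
    for model in chat.get_model_configs() {
        if let ModelConfig::Local(ref local) = model {
            download_model_if_needed(&local.model_id, ModelKind::Chat).await;
        } else {
            // HTTP chat models need no local download; log it so the skip is visible.
            tracing::warn!("chat model is not a local configuration; skipping model download");
        }
    }
}
```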
Reviewed up to commit: cdea1a5fb3afab9155bf03281e03e71dd6510036

Additional Suggestions

`crates/tabby/src/routes/models.rs`, lines 30-34: Verify that the aggregation of chat model titles (combining the primary title and supported models) does not introduce duplicates or ambiguities in display names.

`crates/tabby/src/serve.rs`, lines 398-404: Validate that using `unwrap_or` for `chat_device` is adequate; consider explicitly checking for misconfiguration to avoid silently falling back to a default device.

`crates/tabby/src/services/health.rs`, lines 90-96: Ensure `get_model_configs()` returns a non-empty array before accessing index 0 to avoid a potential panic.

```rust
let chat = model_config
    .chat
    .as_ref()
    .and_then(|m| m.get_model_configs().get(0).cloned())
    .as_ref()
    .map(ModelHealth::from);
```

`crates/http-api-bindings/src/chat/mod.rs`, line 80: Consider returning a `Result` or using graceful error handling for unsupported model kinds instead of panicking, so that the error can be managed upstream.

```rust
fn create_engine(model: &HttpModelConfig) -> Box<dyn ChatCompletionStream> {
    let api_endpoint = model
        .api_endpoint
        .as_deref()
        .expect("api_endpoint is required");
    match model.kind.as_str() {
        "azure/chat" => {
            let config = async_openai_alt::config::AzureConfig::new()
                .with_api_base(api_endpoint)
                .with_api_key(model.api_key.clone().unwrap_or_default())
                .with_api_version(AZURE_API_VERSION.to_string())
                .with_deployment_id(model.model_name.as_deref().expect("Model name is required"));
            Box::new(
                async_openai_alt::Client::with_config(config)
                    .with_http_client(create_reqwest_client(api_endpoint)),
            )
        }
        "openai/chat" | "mistral/chat" => {
            let config = OpenAIConfig::default()
                .with_api_base(api_endpoint)
                .with_api_key(model.api_key.clone().unwrap_or_default());
            let mut builder = ExtendedOpenAIConfig::builder();
            builder
                .base(config)
                .kind(model.kind.clone())
                .supported_models(model.supported_models.clone())
                .model_name(model.model_name.as_deref().expect("Model name is required"));
            Box::new(
                async_openai_alt::Client::with_config(
                    builder.build().expect("Failed to build config"),
                )
                .with_http_client(create_reqwest_client(api_endpoint)),
            )
        }
        _ => {
            // Instead of panicking, return a dummy implementation or handle gracefully.
            // For illustration, returning a Box::new(UnsupportedModelKind) or similar.
            // You may define an UnsupportedModelKind struct implementing ChatCompletionStream.
            Box::new(crate::unsupported::UnsupportedModelKind::new(model.kind.clone()))
        }
    }
}
```

Or refactor to return a `Result` and propagate error handling upstream.
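A hedged sketch of that `Result`-returning alternative, assuming `anyhow` for the error type; the provider-specific arms are elided rather than reproduced:

```rust
fn create_engine(model: &HttpModelConfig) -> anyhow::Result<Box<dyn ChatCompletionStream>> {
    // A missing endpoint becomes an error instead of a panic.
    let api_endpoint = model
        .api_endpoint
        .as_deref()
        .ok_or_else(|| anyhow::anyhow!("api_endpoint is required"))?;

    match model.kind.as_str() {
        "azure/chat" | "openai/chat" | "mistral/chat" => {
            // Build the client exactly as in the arms above, then wrap it in Ok(...).
            todo!("construct the provider client as shown above")
        }
        // Unknown kinds surface as a recoverable error the caller can report.
        other => anyhow::bail!("unsupported chat model kind: {other} (endpoint: {api_endpoint})"),
    }
}
```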
Panto AI has reviewed this pull request and provided key insights to improve your code.
Description
- Add an optional `title` field (defaults to `model_name` if empty) to `HttpModelConfig` to customize model name display and avoid conflicts between identical names from different providers
- Maintain backward compatibility with the single chat model configuration, while adding support for multiple models, each with independent settings (e.g., `rate_limit`, `prompt_template`)
Configuration Examples
Single Model (Legacy Format)
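The original snippet did not survive extraction; a minimal sketch of the existing single-model layout in Tabby's `config.toml` (endpoint and key values are placeholders) would be roughly:

```toml
[model.chat.http]
kind = "openai/chat"
model_name = "gpt-4o-mini"
api_endpoint = "https://api.openai.com/v1"
api_key = "your-api-key"
```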
Multiple Models (New Format)
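The new-format example is likewise missing here. Assuming the new variant accepts an array of HTTP configs (TOML array-of-tables) plus the optional `title` field described above, a hypothetical configuration might look like the following; the exact table layout is an assumption, not the merged format:

```toml
# Hypothetical multi-model configuration; values are placeholders.
[[model.chat.http]]
kind = "openai/chat"
title = "DeepSeek Chat"
model_name = "deepseek-chat"
api_endpoint = "https://api.deepseek.com/v1"
api_key = "your-deepseek-key"

[[model.chat.http]]
kind = "openai/chat"
title = "OpenRouter Claude"
model_name = "anthropic/claude-3.5-sonnet"
api_endpoint = "https://openrouter.ai/api/v1"
api_key = "your-openrouter-key"
rate_limit = { request_per_minute = 600 }
```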