Whatever is in your llama-swap config is crashing on startup.
I would like to share my hacky setup.
I wanted to keep llama-swap fully containerized, but I also wanted to use it with Docker containers for TTS and ASR. I was already using Komodo, which has an API to start or stop stacks (among other functions, see https://docs.rs/komodo_client/latest/komodo_client/api/execute/index.html), so I figured I could simply integrate llama-swap with Komodo instead of implementing a redundant Docker-in-Docker setup.
Here is my llama-swap config:
Using the `curl` command without the `&& sleep 1`, llama-swap would successfully start the appropriate stack, but it wouldn't recognize that the model was loaded and reported an error. My understanding is that `curl` returns immediately while llama-swap expects a long-running process, which is why I use `&& sleep 1` to work around that limitation.

I do not add `&& sleep 1` after the `curl` command in `cmdStop`, however, because with it the model doesn't switch from `stopping` to `stopped` until the `healthCheckTimeout` value is reached. My understanding is that without `&& sleep 1`, llama-swap "skips the graceful stop" (which is handled by Komodo anyway), so the model is recognized as stopped at the expected time.

I'm sorry for the numerous edits to this post, and I hope this can help some of you!