fix(tokens,usage): in case of redis hiccup#8112

Open
rickbijkerk wants to merge 1 commit into
graphql-hive:mainfrom
rickbijkerk:fix/tokens-usage-redis-command-timeout

@rickbijkerk

Background

If Redis becomes briefly unavailable for any reason, the tokens service can get stuck returning "token not found" for valid tokens until it is manually restarted. The usage service then returns 401s and usage ingestion stops. Neither service recovers on its own.

Description

Adds commandTimeout to the ioredis client in two services:

  • packages/services/tokens
  • packages/services/usage

This PR does not structurally fix the core issue, which is that a Redis connection can end up in a broken state. It does, however, make a command that gets no response fail with an error, which triggers the fallback path to Postgres (in the case of the tokens service). That should fix the problem functionally, but it leaves the Redis connection broken.
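The fallback behavior described above can be sketched without a real Redis instance. This is a minimal illustration, not the code in this PR: `hungRedisGet`, `readTokenFromPostgres`, `withTimeout`, and `readToken` are hypothetical names, and `withTimeout` only imitates what ioredis's `commandTimeout` option does at the client level (rejecting a command that gets no reply in time).

```typescript
// Stand-in for redis.get on a connection that reports "ready" but never answers.
function hungRedisGet(_key: string): Promise<string | null> {
  return new Promise<string | null>(() => {
    // Intentionally never resolves or rejects.
  });
}

// Hypothetical Postgres lookup used as the fallback path.
async function readTokenFromPostgres(key: string): Promise<string | null> {
  return `pg:${key}`;
}

// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error('Command timed out')), ms);
    work.then(
      value => {
        clearTimeout(timer);
        resolve(value);
      },
      error => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Try Redis first; on any error (including a timeout) fall back to Postgres.
async function readToken(key: string, timeoutMs: number): Promise<string | null> {
  try {
    return await withTimeout(hungRedisGet(key), timeoutMs);
  } catch {
    return readTokenFromPostgres(key);
  }
}
```

Without the timeout, the `await` on the hung read would never return and the `catch` branch would never run, which matches the hang described in this PR.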

These are the only two services whose Redis client is constructed without any timeout/retry options (the others set retryStrategy/reconnectOnError through createRedisClient). I chose not to migrate usage/tokens to that mechanism because I couldn't fully oversee the consequences.

Longer term, it may be worth moving tokens/usage onto the shared createRedisClient for consistency. That alone would not fully resolve this, though, since createRedisClient also does not set commandTimeout/keepAlive.
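For reference, a unified set of client options might look roughly like the sketch below. The option names (`commandTimeout`, `keepAlive`, `retryStrategy`, `reconnectOnError`) are real ioredis options, but the local `RedisTimeoutOptions` interface is hypothetical, and every value except the 5,000 ms command timeout is an assumption. What createRedisClient actually passes is not shown in this PR.

```typescript
// Hypothetical local type covering only the ioredis options discussed here.
interface RedisTimeoutOptions {
  commandTimeout: number; // ms before a pending command is rejected
  keepAlive: number; // ms of idle time before TCP keep-alive probes start
  retryStrategy: (times: number) => number | null; // delay in ms before the next reconnect attempt
  reconnectOnError: (error: Error) => boolean; // whether to reconnect after a command error
}

const redisOptions: RedisTimeoutOptions = {
  commandTimeout: 5000, // value used in this PR
  keepAlive: 10000, // assumed value
  retryStrategy: times => Math.min(times * 500, 2000), // assumed backoff, capped at 2 s
  reconnectOnError: error => error.message.includes('READONLY'), // assumed condition
};

// Intended use (requires the ioredis package): new Redis({ host, port, ...redisOptions })
```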

Checklist

  • Error handling and logging
  • System configuration

…k to Postgres

After a Redis restart the ioredis client can report status "ready" on a
connection that no longer responds. Without commandTimeout, redis.get never
settles, so reads hang indefinitely instead of falling back to Postgres, and
valid tokens are reported as not found until the service is manually restarted.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request configures a commandTimeout of 5,000ms for the Redis clients in the tokens and usage services to prevent commands from hanging indefinitely on half-open sockets. No issues were identified, and there is no feedback to provide.
