Describe the bug
Long-running queries issued through the Azure MCP PostgreSQL tools (postgres database query, postgres table schema get) fail with HTTP 500 Internal Server Error. The underlying Npgsql exception is:
Exception while reading from stream
Two factors combine to cause this:
- Default command timeout is too low. The command timeout falls back to Npgsql's built-in default of 30 seconds, so queries such as an exact
COUNT(*) over a large table, or any analytical query taking longer than 30s, time out and surface as a 500.
- No keepalive on the connection. While a single command runs, no application data flows over the TCP socket. An intermediary in Azure (Load Balancer / NAT gateway, ~4 minute idle timeout) silently drops the idle connection. Npgsql then surfaces this as
Exception while reading from stream → HTTP 500, even when the server itself is still processing the query.
This was hit by a customer running queries in the ~1–5 minute range and reproduced consistently.
Expected behavior
- Reasonable long-running read queries (up to a few minutes) complete successfully without a 500.
- The command timeout is configurable per call and server-wide.
- The connection is not dropped by an intermediary idle timeout while a query is actively running.
Actual behavior
- Queries longer than ~30s (or that stay idle on the socket past the load balancer idle timeout) fail with
500 Internal Server Error / Exception while reading from stream.
- There is no way to raise the timeout for an individual query.
Affected tools:
postgres database query
postgres table schema get
Reproduction Steps
- Configure the Azure MCP PostgreSQL tools against an Azure Database for PostgreSQL Flexible Server.
- Run a query that takes longer than 30 seconds, e.g.:
or an exact count over a large table:
SELECT COUNT(*) FROM <large_table>;
- Observe the request fail with
500 Internal Server Error and Exception while reading from stream.
Proposed fix (validated)
The following changes resolved the issue for the customer:
- Add an explicit, configurable command timeout with this precedence:
- explicit per-call
command-timeout option,
AZURE_MCP_POSTGRES_COMMAND_TIMEOUT environment variable (server-wide default),
- default of 300 seconds (5 minutes),
0 = wait indefinitely.
- Enable keepalives on the connection string so the socket stays active during long-running queries:
KeepAlive = 30 (Npgsql application-level keepalive every 30 idle seconds),
TcpKeepAlive = true.
With both changes, the 5-minute pg_sleep and large COUNT(*) queries complete successfully instead of returning a 500.
Environment
- Tool: Azure MCP Server — PostgreSQL tools (
Azure.Mcp.Tools.Postgres)
- Driver: Npgsql
- Backend: Azure Database for PostgreSQL Flexible Server
Describe the bug
Long-running queries issued through the Azure MCP PostgreSQL tools (
postgres database query,postgres table schema get) fail with HTTP500 Internal Server Error. The underlying Npgsql exception is:Two factors combine to cause this:
COUNT(*)over a large table, or any analytical query taking longer than 30s, time out and surface as a 500.Exception while reading from stream→ HTTP 500, even when the server itself is still processing the query.This was hit by a customer running queries in the ~1–5 minute range and reproduced consistently.
Expected behavior
Actual behavior
500 Internal Server Error/Exception while reading from stream.Affected tools:
postgres database querypostgres table schema getReproduction Steps
500 Internal Server ErrorandException while reading from stream.Proposed fix (validated)
The following changes resolved the issue for the customer:
command-timeoutoption,AZURE_MCP_POSTGRES_COMMAND_TIMEOUTenvironment variable (server-wide default),0= wait indefinitely.KeepAlive = 30(Npgsql application-level keepalive every 30 idle seconds),TcpKeepAlive = true.With both changes, the 5-minute
pg_sleepand largeCOUNT(*)queries complete successfully instead of returning a 500.Environment
Azure.Mcp.Tools.Postgres)