Skip to content

Increase default max-buffer-size to 10m for non-blocking containers #4524

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 7, 2025

Conversation

sparrc
Copy link
Contributor

@sparrc sparrc commented Mar 5, 2025

Summary

This changes the default max-buffer-size setting in the LogConfiguration struct of the task definition from 1m to 10m.

This change brings the ECS agent default value closer to our recommended value of 25m. ECS recommends 25m in docs and blog posts.

While 10m is obviously lower than the recommended value of 25m, it still largely mitigates the risk of log loss due to high through-put (see blog post linked earlier for benchmarking details). 10m was chosen to avoid too drastic of a change in the default value and memory usage while still mitigating most risk of log loss due to high through-put with non-blocking mode.

Testing

New tests cover the changes: yes

Manual tests were run to verify that blocking containers do not set this value, non-blocking containers with a max-buffer-size in the task def do not set this value, and non-blocking containers without max-buffer-size in the task def do have the value set to 10m.

Description for the changelog

Enhancement: Increase LogConfig default max-buffer-size from 1m to 10m.

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@sparrc sparrc requested a review from a team as a code owner March 5, 2025 22:42
@sparrc sparrc added the bot/test label Mar 5, 2025
@sparrc sparrc added the bot/test label Mar 5, 2025
// unset, then default to 10m.
if mode, ok := hostConfig.LogConfig.Config[logDriverMode]; ok {
_, hasBufferSize := hostConfig.LogConfig.Config[logDriverBufferSize]
if mode == string(dockercontainer.LogModeNonBlock) && !hasBufferSize {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will be the buffer size for blocking mode?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words, what does the logging driver use as buffer size if no buffer size is provided?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking mode does not have a buffer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So every call is actually blocking? That's interesting. I'd be curious about its performance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, if the log driver has issues sending logs to the destination, the application faces backpressure: https://aws.amazon.com/blogs/containers/choosing-container-logging-options-to-avoid-backpressure/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no configuration parameter for "blocking" mode, but I believe the log driver uses a buffered channel of a few hundred items, where each item is a separate log line.

@@ -1924,6 +1927,15 @@ func (engine *DockerTaskEngine) createContainer(task *apitask.Task, container *a
}
}

// If the container has mode=non-blocking set, and the max-buffer-size field is
// unset, then default to 10m.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe we should not say 10m here so that the possibility of someone changing the constant in the future and forgetting to update this comment is removed. We can add a comment for the constant definition?

Comment on lines +3126 to +3127
client.EXPECT().APIVersion().Return(defaultDockerClientAPIVersion, nil).AnyTimes()
client.EXPECT().CreateContainer(gomock.Any(), gomock.Any(), gomock.Any(), gomock.Any(), gomock.Any()).Do(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: These mocks and all the additional setup won't be needed if we create a helper function that takes hostConfig as a parameter and sets the buffer size if needed and just unit test that function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True that would be cleaner...the amount of special cases already handled in here would make it quite a large refactor though so I'll have to save it for another day

amogh09
amogh09 previously approved these changes Mar 6, 2025
singholt
singholt previously approved these changes Mar 6, 2025
@sparrc sparrc dismissed stale reviews from singholt and amogh09 via 5568a75 March 6, 2025 16:39
@sparrc sparrc force-pushed the max-buffer-size branch from aefe4d9 to 5568a75 Compare March 6, 2025 16:39
@sparrc sparrc added the bot/test label Mar 7, 2025
@sparrc sparrc changed the title Default max-buffer-size to 10m for non-blocking containers Increase default max-buffer-size to 10m for non-blocking containers Mar 7, 2025
@sparrc sparrc enabled auto-merge (squash) March 7, 2025 19:15
@sparrc sparrc merged commit ade2808 into aws:dev Mar 7, 2025
40 checks passed
@mye956 mye956 mentioned this pull request Mar 11, 2025
prateekchaudhry added a commit to prateekchaudhry/amazon-ecs-agent that referenced this pull request Mar 20, 2025
xxx0624 pushed a commit to xxx0624/amazon-ecs-agent that referenced this pull request Apr 17, 2025
timj-hh pushed a commit to timj-hh/amazon-ecs-agent that referenced this pull request Jul 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants