Fix local IP address depletion by cleaning up network bridges for aws… #4741

Merged: 1 commit, Aug 6, 2025

Conversation

tshan2001 (Contributor) commented Aug 1, 2025

Summary

Scope down IP address depletion fix to managed linux platform only.

In #4717 we fixed the IP address depletion issue for the managed linux platform. However, because that change was made in common_linux, it introduced a bug on existing platforms. This change scopes the fix down so that it applies only to the managed linux platform.

Implementation details

Instead of invoking the CNI clean-up logic in the common_linux code path, this change adds methods configureRegularENI and configureBranchENI to the managed_linux platform. These methods contain the logic specific to the managed linux platform, so the common code path no longer runs the clean-up.
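
The scoping described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the agent's actual code: the `protocol` type, the `setUp` helper, the step strings, and the position of the clean-up step are hypothetical. Only the method names configureRegularENI and configureBranchENI come from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// protocol is a hypothetical stand-in for the kind of interface being configured.
type protocol string

const (
	regularENI protocol = "regular-eni"
	branchENI  protocol = "branch-eni"
)

// common models the code path shared by every Linux platform.
// It only performs set-up; no CNI clean-up lives here.
type common struct{}

func (c *common) setUp(p protocol) []string {
	return []string{"set up " + string(p)}
}

// managedLinux holds the shared helper and adds its own ENI methods,
// so the clean-up step is never reached from other platforms.
type managedLinux struct {
	common common
}

// configureInterface dispatches to the managed-linux-specific methods.
func (m *managedLinux) configureInterface(p protocol) ([]string, error) {
	switch p {
	case regularENI:
		return m.configureRegularENI(), nil
	case branchENI:
		return m.configureBranchENI(), nil
	default:
		return nil, fmt.Errorf("unsupported protocol %q", p)
	}
}

// configureRegularENI adds a (hypothetical) clean-up step to the shared set-up.
func (m *managedLinux) configureRegularENI() []string {
	steps := []string{"clean up stale CNI state"}
	return append(steps, m.common.setUp(regularENI)...)
}

// configureBranchENI mirrors configureRegularENI for branch ENIs.
func (m *managedLinux) configureBranchENI() []string {
	steps := []string{"clean up stale CNI state"}
	return append(steps, m.common.setUp(branchENI)...)
}

func main() {
	m := &managedLinux{}
	steps, err := m.configureInterface(branchENI)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(strings.Join(steps, "; "))
}
```

The point of the sketch is the placement: callers that use `common` directly see only the set-up step, while the clean-up is reachable only through the `managedLinux` methods.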

Testing

Unit Test
New tests cover the changes: YES
Introduced new unit tests TestManagedLinux_TestConfigureInterface/regular-eni and TestManagedLinux_TestConfigureInterface/branch-eni

% go test -tags unit -v -run TestManagedLinux_TestConfigureInterface /workplace/tianzes/amazon-ecs-agent/ecs-agent/netlib/platform
=== RUN   TestManagedLinux_TestConfigureInterface
=== RUN   TestManagedLinux_TestConfigureInterface/regular-eni
1754427458660162039 [Info] logger=structured msg="Configuring regular ENI" ENIName="" NetNSPath="ns1"
1754427458660242370 [Info] logger=structured msg="Configuring regular ENI" ENIName="" NetNSPath="ns1"
1754427458660426941 [Info] logger=structured msg="Configuring regular ENI" NetNSPath="ns1" ENIName=""
=== RUN   TestManagedLinux_TestConfigureInterface/branch-eni
1754427458660969428 [Info] logger=structured msg="Configuring branch ENI" ENIName="" NetNSPath="ns1"
1754427458661023942 [Info] logger=structured msg="Configuring branch ENI" ENIName="" NetNSPath="ns1"
1754427458661049387 [Info] logger=structured msg="Configuring branch ENI" ENIName="" NetNSPath="ns1"
--- PASS: TestManagedLinux_TestConfigureInterface (0.00s)
    --- PASS: TestManagedLinux_TestConfigureInterface/regular-eni (0.00s)
    --- PASS: TestManagedLinux_TestConfigureInterface/branch-eni (0.00s)
PASS
ok      github.com/aws/amazon-ecs-agent/ecs-agent/netlib/platform       (cached)

E2E Testing
Launched and stopped multiple tasks on the same host, then examined the CNI plugin log.
For task 8d919a758df041ba9f251e033275c961, saw the following:

% sudo cat /var/log/ecs/ecs-cni-warmpool.log | grep 8d919a758df041ba9f251e033275c961
2025-08-05T21:30:41Z [INFO] msg="Creating the bridge" netns=/var/run/netns/8d919a758df041ba9f251e033275c961-024c06c4784d ifname=eth0 containerID=container-id bridgeName=fargate-bridge ipamType=ecs-ipam
2025-08-05T21:30:41Z [INFO] msg="Creating veth pair for namespace" netns=/var/run/netns/8d919a758df041ba9f251e033275c961-024c06c4784d ifname=eth0 containerID=container-id bridgeName=fargate-bridge ipamType=ecs-ipam
2025-08-05T21:30:41Z [INFO] msg="Attaching veth pair to bridge" netns=/var/run/netns/8d919a758df041ba9f251e033275c961-024c06c4784d ifname=eth0 containerID=container-id bridgeName=fargate-bridge ipamType=ecs-ipam hostVethName=veth42de7dc3
2025-08-05T21:30:41Z [INFO] msg="Running IPAM plugin ADD" netns=/var/run/netns/8d919a758df041ba9f251e033275c961-024c06c4784d ifname=eth0 containerID=container-id bridgeName=fargate-bridge ipamType=ecs-ipam hostVethName=veth42de7dc3
2025-08-05T21:30:41Z [INFO] msg="Configuring container's interface" netns=/var/run/netns/8d919a758df041ba9f251e033275c961-024c06c4784d ifname=eth0 containerID=container-id bridgeName=fargate-bridge ipamType=ecs-ipam hostVethName=veth42de7dc3
2025-08-05T21:30:41Z [INFO] msg="Configuring bridge" netns=/var/run/netns/8d919a758df041ba9f251e033275c961-024c06c4784d ifname=eth0 containerID=container-id bridgeName=fargate-bridge ipamType=ecs-ipam hostVethName=veth42de7dc3
2025-08-05T21:30:41Z [INFO] msg="Deleting veth interface" netns=/var/run/netns/8d919a758df041ba9f251e033275c961-024c06c4784d ifname=eth0 containerID=container-id bridgeName=fargate-bridge ipamType=ecs-ipam
2025-08-05T21:30:41Z [INFO] msg="Running IPAM plugin DEL" netns=/var/run/netns/8d919a758df041ba9f251e033275c961-024c06c4784d ifname=eth0 containerID=container-id bridgeName=fargate-bridge ipamType=ecs-ipam
2025-08-05T21:30:41Z [INFO] msg="Deleting container interface" netns=/var/run/netns/8d919a758df041ba9f251e033275c961-024c06c4784d ifname=eth0 containerID=container-id bridgeName=fargate-bridge ipamType=ecs-ipam

Specifically, the last two lines show that the netns clean-up fix took effect:

2025-08-05T21:30:41Z [INFO] msg="Running IPAM plugin DEL" netns=/var/run/netns/8d919a758df041ba9f251e033275c961-024c06c4784d ifname=eth0 containerID=container-id bridgeName=fargate-bridge ipamType=ecs-ipam
2025-08-05T21:30:41Z [INFO] msg="Deleting container interface" netns=/var/run/netns/8d919a758df041ba9f251e033275c961-024c06c4784d ifname=eth0 containerID=container-id bridgeName=fargate-bridge ipamType=ecs-ipam
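
The manual grep above could also be automated. The following is a rough sketch, assuming the log format shown in this PR; the helper name `cleanupLogged` is hypothetical and is not part of the agent or its test suite. It reports whether both clean-up messages appear on lines that mention a given task ID.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// cleanupLogged reports whether both clean-up messages appear in log
// on lines that mention taskID. The message strings are copied from the
// CNI plugin log output quoted in this PR.
func cleanupLogged(log, taskID string) bool {
	seen := map[string]bool{
		`msg="Running IPAM plugin DEL"`:      false,
		`msg="Deleting container interface"`: false,
	}
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := sc.Text()
		if !strings.Contains(line, taskID) {
			continue // line belongs to a different task
		}
		for msg := range seen {
			if strings.Contains(line, msg) {
				seen[msg] = true
			}
		}
	}
	for _, ok := range seen {
		if !ok {
			return false
		}
	}
	return true
}

func main() {
	const taskID = "8d919a758df041ba9f251e033275c961"
	sample := `2025-08-05T21:30:41Z [INFO] msg="Running IPAM plugin DEL" netns=/var/run/netns/` + taskID + `-024c06c4784d ifname=eth0
2025-08-05T21:30:41Z [INFO] msg="Deleting container interface" netns=/var/run/netns/` + taskID + `-024c06c4784d ifname=eth0`
	fmt.Println(cleanupLogged(sample, taskID))
}
```

In practice the log string would be read from /var/log/ecs/ecs-cni-warmpool.log rather than from an inline sample.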

Description for the changelog

Scope down IP address depletion fix to managed linux platform only.

Additional Information

Does this PR include breaking model changes? If so, have you added transformation functions?
NO

Does this PR include the addition of new environment variables in the README?
NO

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@tshan2001 tshan2001 requested a review from a team as a code owner August 1, 2025 18:57
@JoseVillalta JoseVillalta self-assigned this Aug 1, 2025
JoseVillalta

This comment was marked as outdated.

@tshan2001 tshan2001 changed the title Fix local IP address depletion by cleaning up network bridges for aws… [DO NOT REVIEW WIP] Fix local IP address depletion by cleaning up network bridges for aws… Aug 1, 2025
@tshan2001 tshan2001 force-pushed the master branch 2 times, most recently from 04ab1b2 to 5d61c72 Compare August 5, 2025 20:59
@tshan2001 tshan2001 changed the base branch from master to dev August 5, 2025 21:06
@tshan2001 tshan2001 changed the title [DO NOT REVIEW WIP] Fix local IP address depletion by cleaning up network bridges for aws… Fix local IP address depletion by cleaning up network bridges for aws… Aug 5, 2025
require.NoError(t, err)
}

func testManaedLinuxBranchENIConfiguration(t *testing.T) {
Contributor

nit: Typo

Contributor Author

Pushed a new rev to fix this

@@ -36,6 +42,88 @@ const (
macAddress = "0a:1b:2c:3d:4e:5f"
)

func TestManagedLinux_TestConfigureInterface(t *testing.T) {
t.Run("regular-eni", testManagedLinuxRegularENIConfiguration)
t.Run("branch-eni", testManaedLinuxBranchENIConfiguration)
Contributor

nit: Typo

Contributor Author

Pushed a new rev to fix this

JoseVillalta (Contributor) left a comment

Other than these minor comments, the changes LGTM. Will approve contingent on the tests passing and a fix for the typo in the test name.

case networkinterface.V2NInterfaceAssociationProtocol:
err = m.common.configureGENEVEInterface(ctx, netNSPath, iface, netDAO)
case networkinterface.VETHInterfaceAssociationProtocol:
// Do nothing.
Contributor

nit: can we have a comment about why?

JoseVillalta (Contributor) left a comment

LGTM

@tshan2001 tshan2001 merged commit a03b20b into aws:dev Aug 6, 2025
40 checks passed
@harishxr harishxr mentioned this pull request Aug 7, 2025
4 participants