
Migrating awserr ResponseError to smithy ResponseError in GetRequestFailureStatusCode #4516

Merged
merged 1 commit into from
Feb 28, 2025

Conversation

mye956
Contributor

@mye956 mye956 commented Feb 25, 2025

Summary

This PR fixes an issue where the agent type-asserts errors from EC2 IMDS GetMetadata calls to the awserr.RequestFailure type from AWS SDK Go V1, even though those calls now use AWS SDK Go V2. Because V2 returns different error types, the assertion fails and no status code can be extracted. As a result, when the agent hits any error while polling for the container instance to join an ASG, it exits instead of continuing. The agent checks for specific status codes so it can keep polling through transient issues, and it currently cannot obtain any status code at all.

This issue can only happen when ECS_WARM_POOLS_CHECK is set to true.

https://docs.aws.amazon.com/sdk-for-go/v2/developer-guide/handle-errors.html

Related Github Issue: #4487

Implementation details

  • GetRequestFailureStatusCode -> GetResponseErrorStatusCode
    • Changed awserr.RequestFailure to smithy.ResponseError to obtain the status code

Testing

  • Updated the test cases to use smithy.OperationError in place of awserr.RequestFailure.

Manual testing:

  • Created a custom AMI with the changes and launched a container instance with ECS_WARM_POOLS_CHECK set to true. The expected behavior is for the agent to continue polling rather than exit with a non-zero code.

Standalone instance:

level=info time=2025-02-27T19:29:23Z msg="Event stream ContainerChange start listening..." module=eventstream.go
level=info time=2025-02-27T19:29:23Z msg="Loading state!" module=state_manager.go
level=info time=2025-02-27T19:29:23Z msg="Waiting for instance to go InService" module=agent.go
level=debug time=2025-02-27T19:29:23Z msg="Error when getting intended lifecycle state: operation error ec2imds: GetMetadata, http response error StatusCode: 404, request to EC2 IMDS failed" module=agent.go
level=debug time=2025-02-27T19:29:25Z msg="Error when getting intended lifecycle state:
[ec2-user@ip-172-31-17-238 ~]$ docker ps
CONTAINER ID   IMAGE                            COMMAND    CREATED         STATUS                     PORTS     NAMES
ce1a28ad4010   amazon/amazon-ecs-agent:latest   "/agent"   7 minutes ago   Up 7 minutes (unhealthy)             ecs-agent

Warm pool instance that eventually joins an ASG:

level=info time=2025-02-27T19:38:29Z msg="Event stream ContainerChange start listening..." module=eventstream.go
level=info time=2025-02-27T19:38:29Z msg="Loading state!" module=state_manager.go
level=info time=2025-02-27T19:38:29Z msg="Waiting for instance to go InService" module=agent.go
level=debug time=2025-02-27T19:38:29Z msg="Error when getting intended lifecycle state: operation error ec2imds: GetMetadata, http response error StatusCode: 404, request to EC2 IMDS failed" module=agent.go
level=debug time=2025-02-27T19:38:30Z msg="Error when getting intended lifecycle state: operation error ec2imds: GetMetadata, http response error StatusCode: 404, request to EC2 IMDS failed" module=agent.go
level=debug time=2025-02-27T19:38:31Z msg="Error when getting intended lifecycle state: operation error ec2imds: GetMetadata, http response error StatusCode: 404, request to EC2 IMDS failed" module=agent.go
level=debug time=2025-02-27T19:38:33Z msg="Target lifecycle state of instance: " module=agent.go
level=debug time=2025-02-27T19:39:33Z msg="Error when getting intended lifecycle state: operation error ec2imds: GetMetadata, http response error StatusCode: 404, request to EC2 IMDS failed" module=agent.go
level=debug time=2025-02-27T19:39:34Z msg="Error when getting intended lifecycle state: operation error ec2imds: GetMetadata, http response error StatusCode: 404, request to EC2 IMDS failed" module=agent.go
level=debug time=2025-02-27T19:39:36Z msg="Error when getting intended lifecycle state: operation error ec2imds: GetMetadata, http response error StatusCode: 404, request to EC2 IMDS failed" module=agent.go
level=debug time=2025-02-27T19:39:38Z msg="Target lifecycle state of instance: " module=agent.go
level=debug time=2025-02-27T19:40:38Z msg="Target lifecycle state of instance: InService" module=agent.go
level=debug time=2025-02-27T19:40:38Z msg="Loading pause container tarball:" image="/images/amazon-ecs-pause.tar"
level=debug time=2025-02-27T19:40:38Z msg="Inspecting container image: " image="amazon/amazon-ecs-pause:0.1.0"
level=debug time=2025-02-27T19:40:38Z msg="Setting up ENI Watcher" module=agent_unix.go
level=info time=2025-02-27T19:40:38Z msg="eni watcher has been initialized" module=watcher_linux.go
level=info time=2025-02-27T19:40:38Z msg="Successfully got ECS instance credentials from provider: EC2RoleProvider"
level=debug time=2025-02-27T19:40:38Z msg="Extended supported docker API versions" supportedAPIVersions=[1.17 1.18 1.19 1.20 1.21 1.22 1.23 1.24 1.25 1.26 1.27 1.28 1.29 1.30 1.31 1.32 1.33 1.34 1.35 1.36 1.37 1.38 1.39 1.40 1.41 1.42 1.43 1.44]
level=debug time=2025-02-27T19:40:38Z msg="Inspecting container image: " image="amazon/amazon-ecs-pause:0.1.0"
level=debug time=2025-02-27T19:40:38Z msg="Inspecting container image: " image="ecs-service-connect-agent:interface-"
level=debug time=2025-02-27T19:40:38Z msg="Loading Appnet agent container tarball: /managed-agents/serviceconnect/ecs-service-connect-agent.interface-v1.tar"
level=info time=2025-02-27T19:40:58Z msg="Successfully loaded Appnet agent container tarball: /managed-agents/serviceconnect/ecs-service-connect-agent.interface-v1.tar" image="ecs-service-connect-agent:interface-v1"
level=debug time=2025-02-27T19:40:58Z msg="Inspecting container image: " image="ecs-service-connect-agent:interface-v1"
level=debug time=2025-02-27T19:40:58Z msg="Found route" Route={Ifindex: 2 Dst: <nil> Src: 172.31.19.51 Gw: 172.31.16.1 Flags: [] Table: 254 Realm: 0}
level=debug time=2025-02-27T19:40:58Z msg="Found the associated network interface by the index" LinkName="ens5" LinkIndex=2
level=debug time=2025-02-27T19:40:59Z msg="Fault injection capability is enabled." module=agent_capability.go
level=info time=2025-02-27T19:40:59Z msg="Registering Instance with ECS"
level=debug time=2025-02-27T19:40:59Z msg="Attempting to get Instance Identity Document"
level=debug time=2025-02-27T19:40:59Z msg="Successfully retrieved Instance Identity Document"
level=debug time=2025-02-27T19:40:59Z msg="Attempting to get Instance Identity Signature"
level=debug time=2025-02-27T19:40:59Z msg="Successfully retrieved Instance Identity Signature"
level=info time=2025-02-27T19:40:59Z msg="Remaining memory" remainingMemory=3785
level=info time=2025-02-27T19:40:59Z msg="Registered container instance with cluster!"
level=info time=2025-02-27T19:40:59Z msg="Instance registration completed successfully" instanceArn="arn:aws:ecs:us-west-2:113424923516:container-instance/default/8f0f52f80d164d76a421a030738c293b" cluster="default"
[ec2-user@ip-172-31-19-51 ~]$ docker ps
CONTAINER ID   IMAGE                            COMMAND    CREATED          STATUS                    PORTS     NAMES
4e14792aad5b   amazon/amazon-ecs-agent:latest   "/agent"   13 minutes ago   Up 13 minutes (healthy)             ecs-agent

Description for the changelog

Bugfix: Migrate over to smithy ResponseError for obtaining status code of IMDS GetMetadata calls

Additional Information

Does this PR include breaking model changes? If so, have you added transformation functions?

Does this PR include the addition of new environment variables in the README?

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@mye956 mye956 force-pushed the asg-statuscode-aws-sdk-go-v2 branch 2 times, most recently from b609bc9 to 2a1b5a4 Compare February 25, 2025 23:08
@mye956 mye956 marked this pull request as ready for review February 25, 2025 23:20
@mye956 mye956 requested a review from a team as a code owner February 25, 2025 23:20
@mye956 mye956 force-pushed the asg-statuscode-aws-sdk-go-v2 branch 3 times, most recently from 3fa1a1d to 798f606 Compare February 26, 2025 22:10
@mye956 mye956 force-pushed the asg-statuscode-aws-sdk-go-v2 branch from 798f606 to cc04290 Compare February 26, 2025 22:57
@mye956 mye956 force-pushed the asg-statuscode-aws-sdk-go-v2 branch 2 times, most recently from 968d358 to 47d19ae Compare February 27, 2025 00:51
@mye956 mye956 changed the title [WIP] Migrating awserr ResponseError to smithy ResponseError in GetRequestFailureStatusCode Migrating awserr ResponseError to smithy ResponseError in GetRequestFailureStatusCode Feb 27, 2025
@mye956 mye956 force-pushed the asg-statuscode-aws-sdk-go-v2 branch from 47d19ae to b746fcf Compare February 27, 2025 20:51
@mye956 mye956 force-pushed the asg-statuscode-aws-sdk-go-v2 branch from b746fcf to 96fd579 Compare February 27, 2025 23:12
@mye956 mye956 merged commit d4e8c56 into aws:dev Feb 28, 2025
40 checks passed
@mye956 mye956 mentioned this pull request Mar 11, 2025
@mye956 mye956 deleted the asg-statuscode-aws-sdk-go-v2 branch April 28, 2025 15:53
4 participants