Support for S3 AWS IAM auth #8079
Add opt-in AWS IAM authentication for S3 connections. When IAM is enabled, services authenticate to S3 using short-lived SigV4 pre-signed tokens instead of static credentials, with a new token generated for each request.
New environment variables:
- S3_AWS_IAM_AUTH_ENABLED: enable IAM authentication for S3
- S3_MIRROR_AWS_IAM_AUTH_ENABLED: enable IAM authentication for S3 Mirror
- S3_AUDIT_LOG_AWS_IAM_AUTH_ENABLED: enable IAM authentication for S3 Audit Log
Code Review
This pull request introduces opt-in AWS IAM authentication for S3 connections across services, allowing dynamic credential resolution (such as EKS Pod Identity) via the AWS SDK credential chain alongside static fallbacks. It adds new environment variables, updates environment validation schemas, refactors AwsClient to use an AwsCredentialProvider, and introduces comprehensive unit tests. A TypeScript compilation issue was identified in aws.ts where the sdkProvider type definition lacks the expiration property, which is subsequently accessed when resolving credentials.
```typescript
let sdkProvider: () => Promise<{
  accessKeyId: string;
  secretAccessKey: string;
  sessionToken?: string;
}>;
```
The sdkProvider variable is typed without the expiration property, which causes a TypeScript compilation error when attempting to access creds.expiration on line 107. Since fromNodeProviderChain returns credentials that may include an expiration date, updating the type of sdkProvider to include expiration?: Date resolves this type mismatch cleanly and safely.
Suggested change:
```diff
 let sdkProvider: () => Promise<{
   accessKeyId: string;
   secretAccessKey: string;
   sessionToken?: string;
+  expiration?: Date;
 }>;
```
Background
Self-hosters running Hive on AWS with AWS S3 currently have no way to use IAM-based authentication for S3 connections, which forces them to manage and rotate static access keys (S3_ACCESS_KEY_ID/S3_SECRET_ACCESS_KEY).
This PR adds opt-in AWS IAM authentication support for S3 connections. When IAM auth is enabled, services authenticate to S3 using the AWS SDK default credential chain (IRSA, EKS Pod Identity, instance profile, etc.) instead of static credentials.
This PR is part of the following issue. We will open a separate PR for each IAM integration to keep the scope of each PR small.
Description
Since S3 connections are HTTP requests, there is no need for a token refresher. Instead, tokens are generated per request via a credential provider.
Additionally, since the console already has a custom AWS S3 client (aws4fetch) which already supports a session token, we'll build on top of that and use a credential provider interface to avoid deeply modifying the underlying pre-existing S3 client.
We created a new createDefaultCredentialProvider() function in api/src/modules/cdn/providers/aws.ts, which creates an AwsCredentialProvider that uses fromNodeProviderChain() from @aws-sdk/credential-providers (the SDK handles credential caching and automatic refresh before expiry).
The AwsClient class is updated to accept a credentialProvider instead of raw credential strings. On every sign() call, it resolves fresh credentials from the provider.
The cdn-worker package targets Cloudflare Workers and cannot import from the AWS SDK. To maintain compatibility, the AwsCredentialProvider interface is duplicated in cdn-worker/src/aws.ts. As a result, S3 IAM auth is supported on the CDN when the server service has CDN_API=1.
Deploying the CDN to AWS Lambda does not support passwordless connections to S3 at this time.
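To make the provider-based design above more concrete, here is a minimal sketch of how a client might resolve credentials on each sign() call. It is not the code from this PR: the getCredentials() member, createStaticCredentialProvider, createChainCredentialProvider, and SketchAwsClient are hypothetical names chosen for illustration, and the SigV4 signing itself is left out. The SDK provider is passed in as a plain function so the sketch runs without the AWS SDK installed; in a Node service that function would presumably be the one returned by fromNodeProviderChain().

```typescript
// Credentials as an SDK-style provider would return them.
interface AwsCredentials {
  accessKeyId: string;
  secretAccessKey: string;
  sessionToken?: string;
  expiration?: Date;
}

// Hypothetical shape: the PR names AwsCredentialProvider but its members are assumed here.
interface AwsCredentialProvider {
  getCredentials(): Promise<AwsCredentials>;
}

// Static fallback: always hands back the same configured key pair.
function createStaticCredentialProvider(
  accessKeyId: string,
  secretAccessKey: string,
  sessionToken?: string,
): AwsCredentialProvider {
  return {
    getCredentials: async () => ({ accessKeyId, secretAccessKey, sessionToken }),
  };
}

// Wraps an SDK-style provider function. Caching and refresh are left to
// that function, so the wrapper simply delegates on every call.
function createChainCredentialProvider(
  sdkProvider: () => Promise<AwsCredentials>,
): AwsCredentialProvider {
  return { getCredentials: () => sdkProvider() };
}

interface SignedRequestInfo {
  url: string;
  accessKeyId: string;
  usesSessionToken: boolean;
}

// Illustrative client: asks the provider for credentials on each sign() call
// instead of storing raw strings at construction time.
class SketchAwsClient {
  private readonly credentialProvider: AwsCredentialProvider;

  constructor(credentialProvider: AwsCredentialProvider) {
    this.credentialProvider = credentialProvider;
  }

  async sign(url: string): Promise<SignedRequestInfo> {
    const creds = await this.credentialProvider.getCredentials();
    // Real SigV4 signing is omitted; only the resolved identity is reported.
    return {
      url,
      accessKeyId: creds.accessKeyId,
      usesSessionToken: creds.sessionToken !== undefined,
    };
  }
}
```

Because the client depends only on the interface, the same class works with static keys or with a rotating provider, and a Workers build can define the interface locally without importing the SDK.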
This PR also fixes a bug where the audit log S3 client was incorrectly using the primary S3 bucket's endpoint and bucket name instead of its own. The AuditLogS3Config now correctly uses s3AuditLogs.endpoint and s3AuditLogs.bucketName. Additionally, the audit log CSV upload now includes aws: { signQuery: true } to ensure the S3 PUT request is signed correctly.
New environment variables
- S3_AWS_IAM_AUTH_ENABLED: set to 1 to enable IAM authentication for the primary S3 bucket.
- S3_MIRROR_AWS_IAM_AUTH_ENABLED: set to 1 to enable IAM authentication for the S3 mirror bucket.
- S3_AUDIT_LOG_AWS_IAM_AUTH_ENABLED: set to 1 to enable IAM authentication for the S3 audit log bucket.

Environment Variable Validation
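The rule this section describes, that static keys are required only while IAM auth is off for a given bucket, can be pictured with a small check. This is a sketch under assumptions, not the project's validation schema: validateS3Credentials and the S3Prefix type are hypothetical, and the variable names are built from the S3*_ pattern used in this description.

```typescript
// Prefixes for the three buckets covered by this PR.
type S3Prefix = "S3" | "S3_MIRROR" | "S3_AUDIT_LOG";

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns a list of validation errors for one bucket.
// An empty list means the configuration is acceptable.
function validateS3Credentials(env: Env, prefix: S3Prefix): string[] {
  const iamFlag = `${prefix}_AWS_IAM_AUTH_ENABLED`;
  const iamEnabled = env[iamFlag] === "1";
  if (iamEnabled) {
    // Credentials come from the SDK chain, so static keys are optional.
    return [];
  }
  const errors: string[] = [];
  for (const name of [`${prefix}_ACCESS_KEY_ID`, `${prefix}_SECRET_ACCESS_KEY`]) {
    if (!env[name]) {
      errors.push(`${name} is required unless ${iamFlag} is set to 1`);
    }
  }
  return errors;
}
```

Each bucket is checked on its own, so IAM auth can be enabled for the primary bucket while the mirror or audit log bucket still uses static keys.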
When IAM auth is not enabled for a given S3 bucket, the environment validation enforces that static credentials (S3*_ACCESS_KEY_ID/S3*_SECRET_ACCESS_KEY) are provided. Previously these were always required; now they are only required when IAM is disabled.
When IAM auth is enabled:
- S3_ENDPOINT should be set to the AWS S3 regional endpoint (e.g. https://s3.us-east-1.amazonaws.com)
- S3_BUCKET_NAME must be set to the AWS S3 bucket name

Pnpm-lock file generation
Like MSK, the CI pipelines will fail because of the pnpm-lock file. We're downgrading dependencies and not building a few packages, due to constraints in our environment. Is it possible for a code maintainer to help us generate the pnpm-lock file?
Checklist