Support for MSK AWS IAM auth #8049
Code Review
This PR implements AWS IAM authentication for MSK across the usage and usage-ingestor services. Key changes include a new MSK token provider utility, updated Zod environment schemas, and support for comma-separated Kafka broker lists. Feedback identifies a typo in the changeset file and recommends trimming whitespace when parsing the broker list to prevent potential connection errors.
- Add environment validation: require KAFKA_SSL=1 and AWS region when IAM is enabled
- Add non-happy-path tests for createMskIamTokenProvider (error propagation, undefined token, token refresh)
- Remove unnecessary type cast in createMskIamTokenProvider
- Upgrade aws-msk-iam-sasl-signer-js from 1.0.0 to 1.0.3
- Trim whitespace in comma-separated KAFKA_BROKER addresses
- Improve changeset documentation
Also, the CI pipelines fail because of the pnpm-lock file. We're downgrading dependencies and not building a few packages due to constraints in our environment. Is it possible for a code maintainer to help us generate the pnpm-lock file?
n1ru4l
left a comment
I added a few comments that should be addressed. CI and linting are also failing currently. Can you please address these?
n1ru4l
left a comment
Legit! Thank you for helping out!
Background
Self-hosters running Hive on AWS with Amazon MSK (Managed Streaming for Apache Kafka) currently have no way to use IAM-based authentication for Kafka connections. This forces them to use static credentials.
This PR adds opt-in MSK AWS IAM authentication support for the services that communicate with Kafka/MSK: usage and usage-ingestor.
In addition, KAFKA_BROKER previously accepted only a single broker address. This PR updates it to accept a comma-separated list of brokers, which the KafkaJS library already supports.
This PR is part of the following issue. We will open a separate PR for each IAM integration to keep the scope of each PR small.
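As a minimal sketch of the comma-separated broker handling (the PR's actual parsing code is not shown here, so `parseBrokerList` and the hostnames are hypothetical), splitting and trimming could look like this:

```typescript
// Hypothetical helper, not the PR's actual code.
// Splits a comma-separated KAFKA_BROKER value into individual broker addresses,
// trimming surrounding whitespace and dropping empty entries.
function parseBrokerList(raw: string): string[] {
  return raw
    .split(',')
    .map(broker => broker.trim())
    .filter(broker => broker.length > 0);
}

// Example input with stray whitespace around the separators (made-up hostnames).
const brokers = parseBrokerList(
  'b-1.example.com:9098, b-2.example.com:9098 ,b-3.example.com:9098',
);
console.log(brokers);
```

Trimming matters because an address such as " b-2.example.com:9098" with a leading space would otherwise be passed to the Kafka client as-is and could fail to resolve.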
Description
The MSK token generation logic lives in the createMskIamTokenProvider() function in service-common/src/iam-msk.ts. It generates a short-lived SigV4 token using aws-msk-iam-sasl-signer-js. When IAM auth is enabled, the services use this provider to authenticate to Kafka via the OAUTHBEARER SASL mechanism. KafkaJS accepts an oauthBearerProvider function that supplies the token dynamically, so it handles token refreshes.
New environment variables introduced
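| Variable | Description |
| --- | --- |
| AWS_REGION | AWS region; must be set when IAM authentication is enabled |
| KAFKA_AWS_IAM_AUTH_ENABLED | Set to 1 to enable IAM authentication |
| KAFKA_AWS_REGION | Region used for the MSK token (falls back to AWS_REGION) |
Credential Flow for MSK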
---
title: MSK / Kafka (Long-lived Connection)
---
flowchart TD
    A[Resolve Kafka SASL] --> B{AWS_REGION set?}
    B -->|Yes| C{KAFKA_AWS_IAM_AUTH_ENABLED = true?}
    C -->|Yes| D[✅ OAUTHBEARER + SigV4 token<br/>Region: KAFKA_AWS_REGION ?? AWS_REGION]
    C -->|No / not set| E{Explicit SASL mechanism?<br/>plain / scram-sha-*}
    B -->|No| E
    E -->|Yes| F[✅ Use username + password]
    E -->|No| G[✅ No SASL]
Checklist
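The credential flow for MSK described above could be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `resolveKafkaSasl`, `KafkaAuthEnv`, `TokenSigner`, and the `KAFKA_SASL_*` variable names are hypothetical, and the signer is injected so the sketch stays self-contained. In a real service it would presumably be `generateAuthToken` from aws-msk-iam-sasl-signer-js.

```typescript
// All names below are hypothetical stand-ins, not the PR's actual code.
type OauthBearerToken = { value: string };

type SaslConfig =
  | { mechanism: 'oauthbearer'; oauthBearerProvider: () => Promise<OauthBearerToken> }
  | { mechanism: 'plain' | 'scram-sha-256' | 'scram-sha-512'; username: string; password: string }
  | undefined;

// Stand-in for a SigV4 signer such as generateAuthToken from aws-msk-iam-sasl-signer-js.
type TokenSigner = (options: { region: string }) => Promise<{ token: string }>;

interface KafkaAuthEnv {
  AWS_REGION?: string;
  KAFKA_AWS_REGION?: string;
  KAFKA_AWS_IAM_AUTH_ENABLED?: string;
  // Assumed names for the explicit SASL settings; the source does not list them.
  KAFKA_SASL_MECHANISM?: 'plain' | 'scram-sha-256' | 'scram-sha-512';
  KAFKA_SASL_USERNAME?: string;
  KAFKA_SASL_PASSWORD?: string;
}

function resolveKafkaSasl(env: KafkaAuthEnv, signToken: TokenSigner): SaslConfig {
  // IAM branch: AWS_REGION is set and IAM auth is switched on with "1".
  if (env.AWS_REGION && env.KAFKA_AWS_IAM_AUTH_ENABLED === '1') {
    const region = env.KAFKA_AWS_REGION ?? env.AWS_REGION;
    return {
      mechanism: 'oauthbearer',
      // Invoked whenever a token is requested, so each call signs a fresh one.
      oauthBearerProvider: async () => {
        const { token } = await signToken({ region });
        return { value: token };
      },
    };
  }

  // Explicit mechanism branch: static username and password.
  if (env.KAFKA_SASL_MECHANISM) {
    return {
      mechanism: env.KAFKA_SASL_MECHANISM,
      username: env.KAFKA_SASL_USERNAME ?? '',
      password: env.KAFKA_SASL_PASSWORD ?? '',
    };
  }

  // Neither IAM nor an explicit mechanism: connect without SASL.
  return undefined;
}
```

The returned object has the shape KafkaJS expects for its `sasl` client option, so it could be passed along with the broker list and `ssl: true`. Note that, following the flow literally, enabling IAM without AWS_REGION falls through to the non-IAM branches; the environment validation mentioned in the PR would be expected to reject that configuration earlier.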