-
Notifications
You must be signed in to change notification settings - Fork 9
Release test for AWS IAM Postgres passwordless #192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add database_passwordless_aws_use_iam and database_passwordless_aws_region variables - Add DATABASE_AUTH_USE_AWS_IAM and DATABASE_AUTH_AWS_DB_REGION environment variables - Update both runtime_container_engine_config and tfe_init modules - Follow Redis passwordless authentication pattern for PostgreSQL
- Add Database.Passwordless.AWSUseInstanceProfile structure to tfe_init template variables - Add database_aws_iam_auth_enabled and database_aws_iam_region variables to settings module - This provides the correct nested structure that TFE templates expect for IAM authentication - Templates look for .Database.Passwordless.AWSUseInstanceProfile to set DATABASE_AUTH_USE_AWS_IAM Fixes: PostgreSQL passwordless authentication was failing because DATABASE_AUTH_USE_AWS_IAM was not being set correctly
- Added database_iam_username variable to tfe_init module - Updated template to use proper database connection variables - Template now automatically creates PostgreSQL IAM user with proper permissions - Fixed variable reference from database_user to database_iam_username - Added comprehensive error handling and database readiness checks This completes the automation requirements for PostgreSQL passwordless authentication release tests.
Set AWS_DEFAULT_REGION environment variable before AWS CLI commands:
- Export AWS_DEFAULT_REGION=${database_aws_iam_region}
- Ensures AWS CLI commands in PostgreSQL IAM user creation have proper region
- Resolves 'You must specify a region' AWS CLI errors
This should fix the user_data script failures and allow proper PostgreSQL IAM user creation.
TFE's auth.go code requires DATABASE_URL environment variable for IAM authentication to work: - Added DATABASE_URL construction using database connection parameters - Format: postgresql://user:password@host:5432/dbname?parameters - Critical for TFE to properly use AWS IAM database authentication This should resolve the 'database password must be set' configuration error.
Handle null password properly in DATABASE_URL construction: - Only include password part if database_password is not null - Format: postgresql://user@host:5432/dbname (no password part for IAM auth) - Ensures valid PostgreSQL connection string for IAM authentication This should fix DATABASE_URL format for pgmultiauth IAM authentication.
TFE requires a password in the DATABASE_URL even when IAM authentication is enabled. Added 'aws-iam-auth' placeholder password to ensure proper DATABASE_URL format: postgresql://user:aws-iam-auth@host:5432/db?params
- Set DATABASE_URL to null for IAM auth (generated at runtime) - Add RDS IAM token generation function in tfe.sh.tpl - Generate authentication tokens using 'aws rds generate-db-auth-token' - Inject DATABASE_URL with actual IAM token into Docker Compose at runtime - Use IAM username and token-based authentication - Ensure proper URL encoding for special characters in tokens
Based on Atlas app implementation and TFE auth.go analysis: - Set DATABASE_URL to base connection string WITHOUT password for IAM auth - Remove runtime token generation (pgmultiauth library handles this internally) - Let TFE's pgmultiauth library generate IAM tokens automatically - DATABASE_URL format: postgresql://user@host:5432/db?sslmode=require - DATABASE_AUTH_USE_AWS_IAM=true triggers IAM authentication in pgmultiauth
database_host already includes ':5432' port from RDS endpoint, so don't add ':5432' again. This was causing malformed DATABASE_URL which resulted in 'DATABASE_URL: null' in compose.yaml. Correct format: postgresql://user@host:5432/db?params
- Set TFE_DATABASE_PASSWORD to null when IAM auth is enabled - Add TFE_DATABASE_USE_INSTANCE_PROFILE for AWS instance profile - Filter out null values so TFE_DATABASE_PASSWORD doesn't appear in compose at all - This prevents TFE config validation from checking for password when using IAM
…nt variables - Add TFE_DATABASE_PASSWORDLESS_AWS_USE_INSTANCE_PROFILE mapping to fix TFE config validation - Add TFE_DATABASE_PASSWORDLESS_AWS_REGION for completeness - These environment variables map to Database.Passwordless struct fields in TFE config - Resolves 502 error by ensuring TFE validation recognizes IAM auth is enabled
- Set TFE_DATABASE_PASSWORD to empty string for IAM auth (required by TFE) - Add DATABASE_AUTH_AWS_SERVICE_NAME=rds-db for Atlas IAM provider - Fix DATABASE_URL format with empty password placeholder (:@) - Remove duplicate port from DATABASE_URL (host already includes port) These fixes address the 502 errors by ensuring Atlas can properly: 1. Initialize AWS IAM token provider with correct service name 2. Parse database connection parameters from properly formatted URL 3. Generate IAM tokens for PostgreSQL RDS connections
Remove DATABASE_AUTH_* and DATABASE_URL variables as they should not be required for PostgreSQL IAM authentication. The TFE core system should handle IAM authentication using only the TFE_DATABASE_* variables.
- Add SSM document execution in user_data template - Execute SSM command during TFE FDO startup to create IAM user automatically - Add postgres_iam_setup_ssm_document variable to tfe_init module - Wait for command completion and log results - Prevents 502 errors caused by missing IAM user in PostgreSQL database
- Compress PostgreSQL IAM setup script significantly - Reduce verbose logging to minimize user_data size - Keep essential functionality: SSM execution and status monitoring - Use shorter variable names and compact loops
- Add IMDSv2 token support with IMDSv1 fallback for metadata access - Fix bash syntax errors in conditional statements and loops - Add proper error handling and validation for SSM commands - Use while loops instead of bash ranges for compatibility - Add metadata debugging output for troubleshooting
- Remove complex SSM command execution that was causing failures - Create IAM user directly in user_data script using psql - Install PostgreSQL client and handle database connection inline - Use template variables for database connection parameters - Eliminate dependency on SSM document and metadata service - Simpler, more reliable approach for IAM user creation
- Compress entire IAM setup into 4 concise lines - Remove all verbose logging and error handling to save space - Use compact for loop with seq instead of while loops - Combine all SQL grants into single psql command - Essential functionality preserved in minimal footprint
- Add detailed logging for each step of the PostgreSQL setup - Add retry logic for PostgreSQL client installation - Add proper connectivity testing with version check - Add user verification after creation - Add explicit error handling with exit 1 on failure - Add more descriptive log messages to debug cloud-init issues
PostgreSQL IAM user creation is now handled by null_resource in terraform-aws-terraform-enterprise database module instead of user_data script. This provides better error handling and ensures the user is created before TFE container starts.
- Add robust PostgreSQL IAM user creation back to EC2 user_data - Include proper error handling and logging - EC2 instances have network access to RDS unlike CI machines - Compact but complete implementation for AWS size limits
- Reduced from ~50 lines to ~15 lines of PostgreSQL setup - Install specific postgresql-client-16 package - Compact single-line SQL command for IAM user creation - Reduced logging but kept essential status messages - File now 10KB vs previous ~12KB to ensure AWS compliance - Maintains all required functionality for IAM user creation
- Remove unused variable from tfe_init module - Update user_data template condition to use database_iam_username - Clean up SSM document approach in favor of direct automation
- Fix terraform formatting issues - Code follows best practices and naming conventions
- Remove unused Redis passwordless AWS variables: - redis_passwordless_aws_use_iam - redis_passwordless_aws_region - redis_passwordless_aws_host_name - Remove unused Redis Sidekiq variables: - redis_sidekiq_host - redis_sidekiq_user - redis_sidekiq_password - redis_sidekiq_use_tls - redis_sidekiq_use_auth All variables were declared but never referenced in the codebase. Fixes terraform_unused_declarations warnings from tflint.
- Remove invalid TFE_DATABASE_USE_INSTANCE_PROFILE variable - Keep only the correct TFE_DATABASE_PASSWORDLESS_AWS_USE_INSTANCE_PROFILE - TFE_DATABASE_USE_INSTANCE_PROFILE is not a valid TFE environment variable - Eliminates duplicate and invalid configuration Line 19: TFE_DATABASE_USE_INSTANCE_PROFILE (invalid) → removed Line 21: TFE_DATABASE_PASSWORDLESS_AWS_USE_INSTANCE_PROFILE (valid) → kept
- Add back explorer_database_passwordless_azure_use_msi variable - Add back explorer_database_passwordless_azure_client_id variable - These were incorrectly removed as they are for Explorer database, not primary database - Our scope is limited to primary database changes only Reverts scope creep - Explorer database variables should remain untouched.
| database = { | ||
| TFE_DATABASE_USER = var.database_user | ||
| TFE_DATABASE_PASSWORD = var.database_password | ||
| TFE_DATABASE_USER = var.database_user |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please run terraform fmt before pushing commits.
| TFE_DATABASE_PASSWORDLESS_AZURE_CLIENT_ID = var.database_passwordless_azure_client_id | ||
| # AWS IAM authentication for PostgreSQL passwordless | ||
| TFE_DATABASE_PASSWORDLESS_AWS_USE_INSTANCE_PROFILE = var.database_passwordless_aws_use_iam | ||
| TFE_DATABASE_PASSWORDLESS_AWS_REGION = var.database_passwordless_aws_region |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can do it in a separate PR, but I am thinking the variables database_passwordless_aws_region and s3_region have the same values. Does it makes sense to merge them into one?
|
|
||
| %{ if database_iam_username != null && database_iam_username != "" ~} | ||
| echo "[$(date +"%FT%T")] Setting up PostgreSQL IAM user" | tee -a $log_pathname | ||
| sudo apt-get update -qq && sudo apt-get install -y postgresql-client-16 >/dev/null 2>&1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks a little fragile, what happens when we use pg 15 or 17 as in compatibility tests?
| psql -h "${database_host}" -U "${admin_database_username}" -d "${database_name}" -c "DO \$\$ BEGIN IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '${database_iam_username}') THEN CREATE USER \"${database_iam_username}\" WITH LOGIN; GRANT rds_iam TO \"${database_iam_username}\"; GRANT CONNECT ON DATABASE \"${database_name}\" TO \"${database_iam_username}\"; GRANT USAGE, CREATE ON SCHEMA public TO \"${database_iam_username}\"; GRANT ALL ON ALL TABLES IN SCHEMA public TO \"${database_iam_username}\"; GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO \"${database_iam_username}\"; ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO \"${database_iam_username}\"; ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO \"${database_iam_username}\"; RAISE NOTICE 'IAM user created'; ELSE RAISE NOTICE 'IAM user exists'; END IF; END \$\$;" >/dev/null 2>&1 | ||
| echo "IAM user ${database_iam_username} ready" | tee -a $log_pathname |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this what we suggest to our clients as well? To give admin privileges to the IAM user on all schemas? The reason I ask - release tests should be similar in deployment to what we suggest our clients, if the privileges we suggest change some time in the future the test should break.
Also, have we tried doing this via IAM roles and policies? If not, could we first try setting up that manually and if it works then incorporate it here?
| PGPASSWORD="$admin_password" psql -h "$db_endpoint" -U "$admin_user" -d "$db_name" -v ON_ERROR_STOP=1 << EOF | ||
| DO \$\$ | ||
| BEGIN | ||
| -- Check if user exists | ||
| IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '$iam_user') THEN | ||
| -- Create the IAM user | ||
| CREATE USER "$iam_user"; | ||
| -- Grant rds_iam role (this role exists automatically in RDS PostgreSQL with IAM auth enabled) | ||
| GRANT rds_iam TO "$iam_user"; | ||
| -- Grant necessary database permissions | ||
| GRANT CONNECT ON DATABASE "$db_name" TO "$iam_user"; | ||
| GRANT USAGE ON SCHEMA public TO "$iam_user"; | ||
| GRANT CREATE ON SCHEMA public TO "$iam_user"; | ||
| -- Grant table permissions | ||
| GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO "$iam_user"; | ||
| GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO "$iam_user"; | ||
| -- Grant default privileges for future objects | ||
| ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO "$iam_user"; | ||
| ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO "$iam_user"; | ||
| RAISE NOTICE 'Successfully created IAM user: $iam_user'; | ||
| ELSE | ||
| RAISE NOTICE 'IAM user already exists: $iam_user'; | ||
| END IF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between the things we are doing here and in modules/tfe_init/templates/aws.ubuntu.docker.tfe.sh.tpl?
ajmera-naman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests are failing
| TFE_DATABASE_PASSWORD = var.database_password | ||
| TFE_DATABASE_USER = var.database_user | ||
| # For IAM authentication, set empty password but ensure the variable exists | ||
| TFE_DATABASE_PASSWORD = var.database_passwordless_aws_use_iam ? "" : var.database_password |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can directly send "" for var.database_password
| } | ||
| database_configuration = local.disk ? {} : local.database | ||
| # Filter out null values so they don't appear in the compose file at all | ||
| database_configuration = local.disk ? {} : { for k, v in local.database : k => v if v != null } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed
| TFE_EXPLORER_DATABASE_USER = var.explorer_database_user | ||
| TFE_EXPLORER_DATABASE_PASSWORD = var.explorer_database_password | ||
| TFE_EXPLORER_DATABASE_PARAMETERS = var.explorer_database_parameters | ||
| TFE_EXPLORER_DATABASE_PASSWORDLESS_AZURE_USE_MSI = var.explorer_database_passwordless_azure_use_msi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we removing this
| TFE_EXPLORER_DATABASE_PASSWORD = var.explorer_database_password | ||
| TFE_EXPLORER_DATABASE_PARAMETERS = var.explorer_database_parameters | ||
| TFE_EXPLORER_DATABASE_PASSWORDLESS_AZURE_USE_MSI = var.explorer_database_passwordless_azure_use_msi | ||
| TFE_EXPLORER_DATABASE_PASSWORDLESS_AZURE_CLIENT_ID = var.explorer_database_passwordless_azure_client_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
| redis_bootstrap_ca_pathname = local.redis_bootstrap_ca_pathname | ||
|
|
||
| # Database configuration for templates | ||
| Database = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we adding this
Fix multiple issues based on code review comments: 1. Set TFE_DATABASE_PASSWORD to empty string directly (ajmera-naman feedback) 2. Remove unnecessary null filtering logic - not needed for passwordless setup 3. Restore explorer database variables that were accidentally removed 4. Remove duplicate TFE_EXPLORER_DATABASE_PARAMETERS line 5. Run terraform fmt to fix formatting issues (kkavish feedback) 6. Simplify database_configuration assignment This addresses all PR review comments while focusing the module on PostgreSQL IAM passwordless authentication and maintaining explorer database functionality.
c5e133b to
33b6172
Compare
|
Closing this as this repo is archived. New PR in FDO repo: https://github.com/hashicorp/terraform-terraform-enterprise-fdo/pull/68 |
Background
This release test tests the passwordless authentication to AWS postgres using IAM based token.
Relates OR Closes #(https://hashicorp.atlassian.net/browse/IND-4009)
JIRA: https://hashicorp.atlassian.net/browse/IND-4009
Related PRs
terraform-enterprise: https://github.com/hashicorp/terraform-enterprise/pull/3397
terraform-aws-terraform-enterprise: https://github.com/hashicorp/terraform-aws-terraform-enterprise
ptfedev-infra: https://github.com/hashicorp/ptfedev-infra/pull/887
How has this been tested?
The release test ran on Github:https://github.com/hashicorp/terraform-enterprise/actions/runs/19146818577/job/54726521821
Checked the TFE logs, database connection was success.
Did you add a new setting?
No
CI/CD: https://github.com/hashicorp/terraform-enterprise/actions/runs/19464088212/job/55694964359
Screenshot: