Skip to content

Conversation

@raviharshicorp
Copy link

@raviharshicorp raviharshicorp commented Nov 11, 2025

Background

This release test tests the passwordless authentication to AWS postgres using IAM based token.

Relates OR Closes #(https://hashicorp.atlassian.net/browse/IND-4009)

JIRA: https://hashicorp.atlassian.net/browse/IND-4009

Related PRs
terraform-enterprise: https://github.com/hashicorp/terraform-enterprise/pull/3397
terraform-aws-terraform-enterprise: https://github.com/hashicorp/terraform-aws-terraform-enterprise
ptfedev-infra: https://github.com/hashicorp/ptfedev-infra/pull/887

How has this been tested?

The release test ran on Github:https://github.com/hashicorp/terraform-enterprise/actions/runs/19146818577/job/54726521821
Checked the TFE logs, database connection was success.

Did you add a new setting?

No

CI/CD: https://github.com/hashicorp/terraform-enterprise/actions/runs/19464088212/job/55694964359

Screenshot:

Screenshot 2025-11-20 at 19 49 04 Screenshot 2025-11-20 at 19 50 23 Screenshot 2025-11-20 at 19 51 17 Screenshot 2025-11-20 at 19 54 56 Screenshot 2025-11-20 at 19 55 24 Screenshot 2025-11-20 at 19 55 47

- Add database_passwordless_aws_use_iam and database_passwordless_aws_region variables
- Add DATABASE_AUTH_USE_AWS_IAM and DATABASE_AUTH_AWS_DB_REGION environment variables
- Update both runtime_container_engine_config and tfe_init modules
- Follow Redis passwordless authentication pattern for PostgreSQL
- Add Database.Passwordless.AWSUseInstanceProfile structure to tfe_init template variables
- Add database_aws_iam_auth_enabled and database_aws_iam_region variables to settings module
- This provides the correct nested structure that TFE templates expect for IAM authentication
- Templates look for .Database.Passwordless.AWSUseInstanceProfile to set DATABASE_AUTH_USE_AWS_IAM

Fixes: PostgreSQL passwordless authentication was failing because DATABASE_AUTH_USE_AWS_IAM was not being set correctly
- Added database_iam_username variable to tfe_init module
- Updated template to use proper database connection variables
- Template now automatically creates PostgreSQL IAM user with proper permissions
- Fixed variable reference from database_user to database_iam_username
- Added comprehensive error handling and database readiness checks

This completes the automation requirements for PostgreSQL passwordless authentication release tests.
Set AWS_DEFAULT_REGION environment variable before AWS CLI commands:
- Export AWS_DEFAULT_REGION=${database_aws_iam_region}
- Ensures AWS CLI commands in PostgreSQL IAM user creation have proper region
- Resolves 'You must specify a region' AWS CLI errors

This should fix the user_data script failures and allow proper PostgreSQL IAM user creation.
TFE's auth.go code requires DATABASE_URL environment variable for IAM authentication to work:
- Added DATABASE_URL construction using database connection parameters
- Format: postgresql://user:password@host:5432/dbname?parameters
- Critical for TFE to properly use AWS IAM database authentication

This should resolve the 'database password must be set' configuration error.
Handle null password properly in DATABASE_URL construction:
- Only include password part if database_password is not null
- Format: postgresql://user@host:5432/dbname (no password part for IAM auth)
- Ensures valid PostgreSQL connection string for IAM authentication

This should fix DATABASE_URL format for pgmultiauth IAM authentication.
TFE requires a password in the DATABASE_URL even when IAM authentication
is enabled. Added 'aws-iam-auth' placeholder password to ensure proper
DATABASE_URL format: postgresql://user:aws-iam-auth@host:5432/db?params
- Set DATABASE_URL to null for IAM auth (generated at runtime)
- Add RDS IAM token generation function in tfe.sh.tpl
- Generate authentication tokens using 'aws rds generate-db-auth-token'
- Inject DATABASE_URL with actual IAM token into Docker Compose at runtime
- Use IAM username and token-based authentication
- Ensure proper URL encoding for special characters in tokens
Based on Atlas app implementation and TFE auth.go analysis:
- Set DATABASE_URL to base connection string WITHOUT password for IAM auth
- Remove runtime token generation (pgmultiauth library handles this internally)
- Let TFE's pgmultiauth library generate IAM tokens automatically
- DATABASE_URL format: postgresql://user@host:5432/db?sslmode=require
- DATABASE_AUTH_USE_AWS_IAM=true triggers IAM authentication in pgmultiauth
database_host already includes ':5432' port from RDS endpoint, so don't
add ':5432' again. This was causing malformed DATABASE_URL which resulted
in 'DATABASE_URL: null' in compose.yaml.

Correct format: postgresql://user@host:5432/db?params
- Set TFE_DATABASE_PASSWORD to null when IAM auth is enabled
- Add TFE_DATABASE_USE_INSTANCE_PROFILE for AWS instance profile
- Filter out null values so TFE_DATABASE_PASSWORD doesn't appear in compose at all
- This prevents TFE config validation from checking for password when using IAM
…nt variables

- Add TFE_DATABASE_PASSWORDLESS_AWS_USE_INSTANCE_PROFILE mapping to fix TFE config validation
- Add TFE_DATABASE_PASSWORDLESS_AWS_REGION for completeness
- These environment variables map to Database.Passwordless struct fields in TFE config
- Resolves 502 error by ensuring TFE validation recognizes IAM auth is enabled
- Set TFE_DATABASE_PASSWORD to empty string for IAM auth (required by TFE)
- Add DATABASE_AUTH_AWS_SERVICE_NAME=rds-db for Atlas IAM provider
- Fix DATABASE_URL format with empty password placeholder (:@)
- Remove duplicate port from DATABASE_URL (host already includes port)

These fixes address the 502 errors by ensuring Atlas can properly:
1. Initialize AWS IAM token provider with correct service name
2. Parse database connection parameters from properly formatted URL
3. Generate IAM tokens for PostgreSQL RDS connections
Remove DATABASE_AUTH_* and DATABASE_URL variables as they should not be
required for PostgreSQL IAM authentication. The TFE core system should
handle IAM authentication using only the TFE_DATABASE_* variables.
- Add SSM document execution in user_data template
- Execute SSM command during TFE FDO startup to create IAM user automatically
- Add postgres_iam_setup_ssm_document variable to tfe_init module
- Wait for command completion and log results
- Prevents 502 errors caused by missing IAM user in PostgreSQL database
- Compress PostgreSQL IAM setup script significantly
- Reduce verbose logging to minimize user_data size
- Keep essential functionality: SSM execution and status monitoring
- Use shorter variable names and compact loops
- Add IMDSv2 token support with IMDSv1 fallback for metadata access
- Fix bash syntax errors in conditional statements and loops
- Add proper error handling and validation for SSM commands
- Use while loops instead of bash ranges for compatibility
- Add metadata debugging output for troubleshooting
- Remove complex SSM command execution that was causing failures
- Create IAM user directly in user_data script using psql
- Install PostgreSQL client and handle database connection inline
- Use template variables for database connection parameters
- Eliminate dependency on SSM document and metadata service
- Simpler, more reliable approach for IAM user creation
- Compress entire IAM setup into 4 concise lines
- Remove all verbose logging and error handling to save space
- Use compact for loop with seq instead of while loops
- Combine all SQL grants into single psql command
- Essential functionality preserved in minimal footprint
- Add detailed logging for each step of the PostgreSQL setup
- Add retry logic for PostgreSQL client installation
- Add proper connectivity testing with version check
- Add user verification after creation
- Add explicit error handling with exit 1 on failure
- Add more descriptive log messages to debug cloud-init issues
PostgreSQL IAM user creation is now handled by null_resource
in terraform-aws-terraform-enterprise database module instead
of user_data script. This provides better error handling and
ensures the user is created before TFE container starts.
- Add robust PostgreSQL IAM user creation back to EC2 user_data
- Include proper error handling and logging
- EC2 instances have network access to RDS unlike CI machines
- Compact but complete implementation for AWS size limits
- Reduced from ~50 lines to ~15 lines of PostgreSQL setup
- Install specific postgresql-client-16 package
- Compact single-line SQL command for IAM user creation
- Reduced logging but kept essential status messages
- File now 10KB vs previous ~12KB to ensure AWS compliance
- Maintains all required functionality for IAM user creation
- Remove unused variable from tfe_init module
- Update user_data template condition to use database_iam_username
- Clean up SSM document approach in favor of direct automation
- Fix terraform formatting issues
- Code follows best practices and naming conventions
- Remove unused Redis passwordless AWS variables:
  - redis_passwordless_aws_use_iam
  - redis_passwordless_aws_region
  - redis_passwordless_aws_host_name
- Remove unused Redis Sidekiq variables:
  - redis_sidekiq_host
  - redis_sidekiq_user
  - redis_sidekiq_password
  - redis_sidekiq_use_tls
  - redis_sidekiq_use_auth

All variables were declared but never referenced in the codebase.
Fixes terraform_unused_declarations warnings from tflint.
@raviharshicorp raviharshicorp marked this pull request as ready for review November 20, 2025 20:58
@raviharshicorp raviharshicorp requested a review from a team as a code owner November 20, 2025 20:58
- Remove invalid TFE_DATABASE_USE_INSTANCE_PROFILE variable
- Keep only the correct TFE_DATABASE_PASSWORDLESS_AWS_USE_INSTANCE_PROFILE
- TFE_DATABASE_USE_INSTANCE_PROFILE is not a valid TFE environment variable
- Eliminates duplicate and invalid configuration

Line 19: TFE_DATABASE_USE_INSTANCE_PROFILE (invalid) → removed
Line 21: TFE_DATABASE_PASSWORDLESS_AWS_USE_INSTANCE_PROFILE (valid) → kept
- Add back explorer_database_passwordless_azure_use_msi variable
- Add back explorer_database_passwordless_azure_client_id variable
- These were incorrectly removed as they are for Explorer database, not primary database
- Our scope is limited to primary database changes only

Reverts scope creep - Explorer database variables should remain untouched.
database = {
TFE_DATABASE_USER = var.database_user
TFE_DATABASE_PASSWORD = var.database_password
TFE_DATABASE_USER = var.database_user
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please run terraform fmt before pushing commits.

TFE_DATABASE_PASSWORDLESS_AZURE_CLIENT_ID = var.database_passwordless_azure_client_id
# AWS IAM authentication for PostgreSQL passwordless
TFE_DATABASE_PASSWORDLESS_AWS_USE_INSTANCE_PROFILE = var.database_passwordless_aws_use_iam
TFE_DATABASE_PASSWORDLESS_AWS_REGION = var.database_passwordless_aws_region
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can do it in a separate PR, but I am thinking the variables database_passwordless_aws_region and s3_region have the same values. Does it makes sense to merge them into one?


%{ if database_iam_username != null && database_iam_username != "" ~}
echo "[$(date +"%FT%T")] Setting up PostgreSQL IAM user" | tee -a $log_pathname
sudo apt-get update -qq && sudo apt-get install -y postgresql-client-16 >/dev/null 2>&1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks a little fragile, what happens when we use pg 15 or 17 as in compatibility tests?

Comment on lines +193 to +194
psql -h "${database_host}" -U "${admin_database_username}" -d "${database_name}" -c "DO \$\$ BEGIN IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '${database_iam_username}') THEN CREATE USER \"${database_iam_username}\" WITH LOGIN; GRANT rds_iam TO \"${database_iam_username}\"; GRANT CONNECT ON DATABASE \"${database_name}\" TO \"${database_iam_username}\"; GRANT USAGE, CREATE ON SCHEMA public TO \"${database_iam_username}\"; GRANT ALL ON ALL TABLES IN SCHEMA public TO \"${database_iam_username}\"; GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO \"${database_iam_username}\"; ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO \"${database_iam_username}\"; ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO \"${database_iam_username}\"; RAISE NOTICE 'IAM user created'; ELSE RAISE NOTICE 'IAM user exists'; END IF; END \$\$;" >/dev/null 2>&1
echo "IAM user ${database_iam_username} ready" | tee -a $log_pathname
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what we suggest to our clients as well? To give admin privileges to the IAM user on all schemas? The reason I ask - release tests should be similar in deployment to what we suggest our clients, if the privileges we suggest change some time in the future the test should break.

Also, have we tried doing this via IAM roles and policies? If not, could we first try setting up that manually and if it works then incorporate it here?

Comment on lines +240 to +262
PGPASSWORD="$admin_password" psql -h "$db_endpoint" -U "$admin_user" -d "$db_name" -v ON_ERROR_STOP=1 << EOF
DO \$\$
BEGIN
-- Check if user exists
IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '$iam_user') THEN
-- Create the IAM user
CREATE USER "$iam_user";
-- Grant rds_iam role (this role exists automatically in RDS PostgreSQL with IAM auth enabled)
GRANT rds_iam TO "$iam_user";
-- Grant necessary database permissions
GRANT CONNECT ON DATABASE "$db_name" TO "$iam_user";
GRANT USAGE ON SCHEMA public TO "$iam_user";
GRANT CREATE ON SCHEMA public TO "$iam_user";
-- Grant table permissions
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO "$iam_user";
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO "$iam_user";
-- Grant default privileges for future objects
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO "$iam_user";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO "$iam_user";
RAISE NOTICE 'Successfully created IAM user: $iam_user';
ELSE
RAISE NOTICE 'IAM user already exists: $iam_user';
END IF;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between the things we are doing here and in modules/tfe_init/templates/aws.ubuntu.docker.tfe.sh.tpl?

Copy link
Contributor

@ajmera-naman ajmera-naman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are failing

TFE_DATABASE_PASSWORD = var.database_password
TFE_DATABASE_USER = var.database_user
# For IAM authentication, set empty password but ensure the variable exists
TFE_DATABASE_PASSWORD = var.database_passwordless_aws_use_iam ? "" : var.database_password
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can directly send "" for var.database_password

}
database_configuration = local.disk ? {} : local.database
# Filter out null values so they don't appear in the compose file at all
database_configuration = local.disk ? {} : { for k, v in local.database : k => v if v != null }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed

TFE_EXPLORER_DATABASE_USER = var.explorer_database_user
TFE_EXPLORER_DATABASE_PASSWORD = var.explorer_database_password
TFE_EXPLORER_DATABASE_PARAMETERS = var.explorer_database_parameters
TFE_EXPLORER_DATABASE_PASSWORDLESS_AZURE_USE_MSI = var.explorer_database_passwordless_azure_use_msi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we removing this

TFE_EXPLORER_DATABASE_PASSWORD = var.explorer_database_password
TFE_EXPLORER_DATABASE_PARAMETERS = var.explorer_database_parameters
TFE_EXPLORER_DATABASE_PASSWORDLESS_AZURE_USE_MSI = var.explorer_database_passwordless_azure_use_msi
TFE_EXPLORER_DATABASE_PASSWORDLESS_AZURE_CLIENT_ID = var.explorer_database_passwordless_azure_client_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

redis_bootstrap_ca_pathname = local.redis_bootstrap_ca_pathname

# Database configuration for templates
Database = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we adding this

Fix multiple issues based on code review comments:

1. Set TFE_DATABASE_PASSWORD to empty string directly (ajmera-naman feedback)
2. Remove unnecessary null filtering logic - not needed for passwordless setup
3. Restore explorer database variables that were accidentally removed
4. Remove duplicate TFE_EXPLORER_DATABASE_PARAMETERS line
5. Run terraform fmt to fix formatting issues (kkavish feedback)
6. Simplify database_configuration assignment

This addresses all PR review comments while focusing the module on PostgreSQL
IAM passwordless authentication and maintaining explorer database functionality.
@raviharshicorp raviharshicorp force-pushed the pravi/IND-5776-release-test branch from c5e133b to 33b6172 Compare December 1, 2025 15:31
@raviharshicorp
Copy link
Author

Closing this as this repo is archived. New PR in FDO repo: https://github.com/hashicorp/terraform-terraform-enterprise-fdo/pull/68

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants