
Explanation: Understanding S3 Infrastructure

Introduction

This document explains the architecture, design decisions, and policy guardrails behind Norton's self-service S3 infrastructure. It covers how the Terraform module works, how the CI/CD pipeline validates and applies changes, what OPA policies enforce, and the rationale behind the choices made. Read the how-to guide first if you need practical step-by-step instructions for creating or modifying buckets.

The Problem: Compliance and Auditability

Creating an S3 bucket in AWS is straightforward — but doing it via the AWS Console bypasses the auditable, policy-enforced change trail required for SOC2 compliance. Every infrastructure change must be tracked, reviewed, and validated against organizational policies. Manual Console changes leave no reviewable merge request, no OPA policy evaluation, and no consistent Terraform state.

What Self-Service Solves

The self-service model ensures all S3 changes flow through Terraform and the CI/CD pipeline, providing a complete audit trail while giving development teams direct ownership of their configurations.

Developers own the configuration. OPA policies provide automated guardrails. The Platform team provides oversight without being a bottleneck.


Architecture Overview

Module Structure

The S3 Terraform module lives at aws/s3/ in the Infrastructure repository:

aws/s3/
├── s3.tf # Bucket resources, versioning, encryption, lifecycle, policies
├── variables.tf # Input variable definitions with types and defaults
└── policies/ # Optional JSON bucket policies (e.g., {bucket-key}-policy.json)

How Buckets Are Configured

Each environment has its own variables file at accounts/{environment}/s3/terraform.tfvars. These files contain a map called s3_buckets where each key is a logical name for a bucket and the value is an object with configuration attributes.

The module iterates over this map and creates the corresponding AWS resources for each entry.

Unlike the RDS module (which has ~30 required/optional fields), S3 bucket creation is minimal by design — an empty object {} creates a fully functional encrypted bucket. Every optional feature is additive.
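
As a rough illustration of that iteration pattern, the sketch below uses for_each over s3_buckets. It is an assumption about how such a module could be written, not a copy of s3.tf: the environment variable, the reduced attribute list, and the versioning resource label are hypothetical. Only the aws_s3_bucket.this address, the default {key}-{environment} name, and force_destroy = true are taken from this document.

```hcl
# Hypothetical sketch; the real aws/s3/s3.tf may be structured differently.
variable "environment" {
  type = string # assumed variable name, e.g. "dev"
}

variable "s3_buckets" {
  # Every attribute is optional, so an empty object {} is a valid entry.
  # optional() with a default requires Terraform 1.3 or later.
  type = map(object({
    versioning_enabled = optional(bool, false)
  }))
  default = {}
}

# One bucket per map entry, addressed as aws_s3_bucket.this["<key>"].
resource "aws_s3_bucket" "this" {
  for_each = var.s3_buckets

  bucket        = "${each.key}-${var.environment}" # default naming only
  force_destroy = true
}

# Additive feature: created only for entries that opt in.
resource "aws_s3_bucket_versioning" "this" {
  for_each = {
    for key, cfg in var.s3_buckets : key => cfg
    if cfg.versioning_enabled
  }

  bucket = aws_s3_bucket.this[each.key].id

  versioning_configuration {
    status = "Enabled"
  }
}
```

The filtered for_each on the second resource is one common way to make a feature additive: entries that omit the flag produce no versioning resource at all.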


Bucket Naming

Where the "key" comes from

In the tfvars file, s3_buckets is a Terraform map — each top-level entry has a key (the logical name on the left) and a value (the configuration object on the right):

s3_buckets = {
  my-app-data = {            # <-- "my-app-data" is the key
    versioning_enabled = true
  }
}

The key is chosen by you, the developer, when you add the entry. It is not auto-generated and not created anywhere else — the module simply reads whatever string you put there. The module then uses that key as the starting point for the actual AWS bucket name.

Naming strategies

The module supports two naming strategies:

Default naming (custom_name = false or omitted):

  • Bucket name: {key}-{environment}
  • Example: key my-app-data in the dev environment becomes my-app-data-dev

Custom naming (custom_name = true):

  • Bucket name: exactly the map key
  • Example: key wwnorton-kubecost-federated becomes wwnorton-kubecost-federated

Bucket names must be globally unique across all of AWS. If any AWS account anywhere in the world already owns a bucket with that name, your Terraform apply will fail. The default naming strategy (with environment suffix) reduces the chance of collisions. If you set custom_name = true, prefix the key with wwnorton- to avoid conflicts with other organizations.
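
The two strategies can be expressed as a single conditional. The following is a minimal, provider-free sketch, assuming the environment is available as a variable named environment (a hypothetical name) and trimming the bucket object down to the custom_name attribute. It is not the module's actual code.

```hcl
# Hypothetical sketch of the naming rule; needs no AWS provider to evaluate.
variable "environment" {
  type    = string
  default = "dev" # assumed value, matching the example above
}

variable "s3_buckets" {
  type = map(object({
    custom_name = optional(bool, false) # Terraform 1.3 or later
  }))
  default = {
    my-app-data                 = {}
    wwnorton-kubecost-federated = { custom_name = true }
  }
}

locals {
  # custom_name = true  -> the key is used verbatim
  # custom_name = false -> "{key}-{environment}"
  bucket_names = {
    for key, cfg in var.s3_buckets :
    key => cfg.custom_name ? key : "${key}-${var.environment}"
  }
}

output "bucket_names" {
  value = local.bucket_names
}
```

With these defaults, the output maps my-app-data to my-app-data-dev and leaves wwnorton-kubecost-federated unchanged.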


Encryption

Every bucket gets server-side encryption automatically. There is no way to create an unencrypted bucket through this module, and the OPA policy enforces that encryption configuration exists for every new bucket.

How Encryption Is Applied

  1. The module creates an aws_s3_bucket_server_side_encryption_configuration for every bucket in the map — no exceptions
  2. The encryption algorithm defaults to aws:kms (AWS Key Management Service)
  3. The KMS key defaults to the shared account-level key defined in default_kms_key_id in the tfvars
  4. Per-bucket overrides are available via sse_algorithm and kms_key_id attributes
  5. bucket_key_enabled = true is set for KMS to reduce per-request encryption costs

Encryption Options

| Algorithm | How It Works | When to Use |
|---|---|---|
| aws:kms (default) | Encrypts with a KMS key; supports key rotation and CloudTrail audit logging | Most use cases; recommended for all production buckets |
| AES256 | AWS-managed encryption (SSE-S3); no KMS key needed, no audit trail | When KMS is not required or for cost reduction on high-volume, low-sensitivity buckets |
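
A tfvars sketch of the three cases follows: the inherited default, a per-bucket key, and an SSE-S3 override. The attribute names come from this document. The bucket keys other than my-app-data, the account ID, and the key ARNs are placeholders, and whether kms_key_id expects an ARN or a bare key ID is an assumption to check against variables.tf.

```hcl
# Illustrative entries for accounts/{environment}/s3/terraform.tfvars.
# The ARNs below are placeholders, not real keys.
default_kms_key_id = "arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"

s3_buckets = {
  # No encryption attributes: aws:kms with default_kms_key_id.
  my-app-data = {}

  # Same algorithm, but a dedicated key for this bucket (hypothetical key).
  payments-exports = {
    kms_key_id = "arn:aws:kms:us-east-1:111122223333:key/0987dcba-09fe-87dc-65ba-ab0987654321"
  }

  # SSE-S3 for high-volume, low-sensitivity data; no KMS key involved.
  cdn-request-logs = {
    sse_algorithm = "AES256"
  }
}
```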

Versioning

When versioning_enabled = true, S3 keeps every version of an object and records each delete as a delete marker instead of removing data. This is useful for:

  • Accidental deletion recovery: "undelete" objects by removing the delete marker
  • Change history: access previous versions of an object by version ID
  • Compliance: retain prior versions as a record of all object changes

Cost implication: Versioning stores every object version in full, which increases storage costs over time. A 10 MB file updated 100 times results in about 1 GB of stored data. Consider combining versioning with lifecycle rules to transition old versions to Glacier or expire them after a retention period.


Lifecycle Rules (Glacier and Deep Archive)

When glacier_enabled = true, the module adds a lifecycle rule that automatically transitions objects to cheaper storage tiers over time. This is the primary cost optimization mechanism for S3 buckets with large volumes of infrequently accessed data.

Storage Class Transition Flow
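
The flow runs in one direction: S3 Standard, then Glacier, then Deep Archive. Below is a minimal tfvars sketch, assuming deep_archive_days is set alongside glacier_transition_days in the same bucket object. Both attribute names appear in the OPA checks later in this document. The bucket key and day values are illustrative only.

```hcl
# Illustrative entry; confirm the allowed day ranges in
# policies/data/s3/allowlist.json before using real values.
s3_buckets = {
  audit-archive = {
    glacier_enabled         = true
    glacier_transition_days = 90  # S3 Standard -> Glacier after 90 days
    deep_archive_days       = 180 # Glacier -> Deep Archive after 180 days
  }
}
```

Because deep_archive_days counts from object creation in the same way as glacier_transition_days, it has to be the larger of the two numbers.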

How It Affects Your Application

| Storage Class | Access Speed | Cost Reduction | Application Impact |
|---|---|---|---|
| S3 Standard | Milliseconds | Baseline | No impact; normal read/write operations work as expected |
| Glacier | 1 minute to 12 hours | ~80% cheaper than Standard | Direct reads fail with InvalidObjectState; must use the RestoreObject API first, then wait for the restore to complete |
| Deep Archive | Up to 12 hours (standard retrieval) or 48 hours (bulk retrieval) | ~95% cheaper than Standard | Same as Glacier but with longer retrieval times; designed for data accessed less than once per year |

Glacier transitions are one-way for existing objects. Once an object transitions to Glacier, it stays in that storage class; RestoreObject only makes a temporary readable copy available. The lifecycle rule applies to all objects in the bucket (empty prefix filter, no directory scoping). If your application needs to read objects older than the transition threshold, choose one of the following:

  • Move those objects to a separate non-Glacier bucket, or
  • Implement RestoreObject handling in your application code, or
  • Increase glacier_transition_days to cover your application's access window

For the steps to distinguish an archived object from a genuinely deleted one, see How-To → Check whether a missing object is archived vs deleted.


ELB Access Logs

When elb_logs_enabled = true, the module attaches a bucket policy that allows the AWS Elastic Load Balancing service to write access logs to the bucket. This is a two-part setup: the module creates the bucket and policy, but you must separately configure the load balancer to send logs to it.

How It Works

  1. The module creates a bucket policy granting s3:PutObject to the ELB service account
  2. The ELB service account ID is region-specific — for us-east-1 it's 127311923021 (configured in the tfvars as elb_logs_account_id)
  3. Logs are written to {bucket}/AWSLogs/{account-id}/ and {bucket}/*/AWSLogs/{account-id}/
  4. After the bucket exists, you must enable access logging on the load balancer itself — the bucket alone doesn't trigger log delivery
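
If the load balancer is also managed with Terraform, step 4 might look like the sketch below. The load balancer name, the subnet variable, and the bucket name my-alb-logs-dev are all hypothetical. Only the access_logs block of the AWS provider's aws_lb resource is the point here.

```hcl
# Hypothetical load balancer definition; names and subnets are placeholders.
variable "public_subnet_ids" {
  type = list(string)
}

resource "aws_lb" "app" {
  name               = "my-app"
  load_balancer_type = "application"
  subnets            = var.public_subnet_ids

  # Without this block the bucket stays empty: the bucket policy only
  # permits log delivery, it does not start it.
  access_logs {
    bucket  = "my-alb-logs-dev" # assumed name of a bucket created with elb_logs_enabled = true
    enabled = true
  }
}
```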

What's in ELB Access Logs

Each log entry includes detailed request-level data:

  • Timing: Request timestamp, processing times (request, target, response latency)
  • Networking: Client IP and port, target IP, ELB status code and target status code
  • Request: Full URL, HTTP method, user agent string
  • Payload: Bytes sent and received
  • Security: SSL/TLS cipher, protocol version, certificate details

These logs are invaluable for debugging latency spikes, identifying error patterns, analyzing traffic distribution, and investigating security incidents.


Bucket Policies

The module supports three types of bucket policies, applied in this order:

1. ELB Logs Policy (Automatic)

Attached when elb_logs_enabled = true. Grants the AWS ELB service account permission to write access log files to the bucket.

2. Public Read Policy (Blocked by Default)

Setting public_read = true removes the S3 public access block and attaches a policy allowing unauthenticated s3:GetObject on every object in the bucket. It is intended only for intentionally public content (e.g., static website assets).

public_read = true is blocked by default. MRs that set public_read = true will not be approved through the standard review flow — the configuration is rejected unless your team has coordinated an exception with the Platform team ahead of time.

If your team has a legitimate need for a public bucket, reach out to Platform before opening your MR with:

  • What content will be served and why it must be publicly readable (vs signed URLs, CloudFront with origin access control (OAC, the successor to OAI), or an authenticated endpoint)
  • The expected traffic volume
  • Who owns the bucket contents and is accountable for keeping it free of sensitive data

If Platform agrees the use case is valid, they'll sign off on the MR and help you structure the bucket safely — typically as a dedicated bucket containing only public content, never mixed with private data.

Never use public_read = true for buckets containing user data, application secrets, logs, or anything else that shouldn't be on the public internet.

3. Custom Policies (File-Based)

If a JSON file exists at aws/s3/policies/{bucket-key}-policy.json, the module attaches it as the bucket policy. This allows fine-grained resource policies for specific services, roles, or cross-account access.

The most common reason teams reach for a custom policy is to grant a specific application access to a bucket, typically a Lambda function, an EKS pod (via EKS Pod Identity or IRSA), or an AWS service principal like CloudFront, AWS Config, or AWS Backup. The shape of the policy is the same in every case: one or more statements, each with an Effect, a Principal, a list of Actions, and the bucket Resources they apply to.

The differences between platforms (Lambda vs EKS pod vs cross-account vs service principal) are entirely in the Principal field; the rest of the policy structure is identical. The how-to guide has copy-pasteable examples for each platform — see How-To → Granting Application Access via a Bucket Policy.
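
To make that shared shape concrete, the sketch below builds an equivalent policy document with jsonencode and prints it. The real policy file is plain JSON under aws/s3/policies/; HCL is used here only to keep the examples in one language. The role ARN, account ID, Sid, and action list are hypothetical.

```hcl
# Provider-free sketch; prints a JSON document with the common policy shape.
locals {
  bucket_arn = "arn:aws:s3:::my-app-data-dev"

  example_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Sid    = "AllowAppReadWrite"
      Effect = "Allow"
      # Only this field changes between Lambda, EKS, cross-account,
      # and service-principal cases (placeholder role ARN).
      Principal = { AWS = "arn:aws:iam::111122223333:role/my-app-role" }
      Action    = ["s3:GetObject", "s3:PutObject"]
      Resource  = "${local.bucket_arn}/*" # object-level actions target objects
    }]
  })
}

output "example_policy_json" {
  value = local.example_policy
}
```

For an AWS service principal, the Principal value would instead take the form { Service = "..." }; the other fields keep the same structure.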

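Bucket policies and IAM identity policies are evaluated together. For cross-account access, both sides must allow the action: granting a Lambda's role access via a bucket policy is necessary but not sufficient, and the Lambda's execution role also needs an identity-side IAM policy allowing the same S3 actions on the same bucket ARN. Within a single account, AWS can grant access from either policy alone as long as nothing explicitly denies it, so do not assume that omitting one side blocks access. Norton's convention is to keep bucket-side permissions in this repo and identity-side permissions wherever the role is defined.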

Policy precedence: If both ELB logs and a custom policy are defined for the same bucket, the custom policy file becomes the final policy document. A bucket holds a single policy, and the module uses depends_on to apply the custom policy after the ELB policy, so the custom policy overwrites it. If you need both ELB log delivery and a custom policy, include the ELB log permissions in the custom policy JSON.

The filename has to match the bucket key exactly. The module looks for aws/s3/policies/{bucket-key}-policy.json — a typo or a mismatched separator (_ vs -) means the policy silently does not attach. The plan output will look fine; the bucket will simply have no custom policy. Always verify the policy is attached after apply via the AWS Console (Bucket → Permissions → Bucket policy) or aws s3api get-bucket-policy --bucket {name}.


OPA Policy Guardrails

The CI pipeline evaluates every S3 change against Open Policy Agent (OPA) policies before Terraform can apply. These policies enforce organizational standards and prevent misconfigurations.

How OPA Evaluation Works

  1. The CI pipeline runs terraform plan and converts the output to JSON
  2. OPA evaluates the plan JSON against policy rules in policies/accounts/{environment}/s3/policy.rego
  3. If any deny rules match, the pipeline fails and posts the violation messages as MR comments
  4. The developer fixes the violations and pushes updated commits — the pipeline re-runs automatically

What the Policies Enforce

| Check | What's Validated | How It Works |
|---|---|---|
| Encryption | All new buckets must have encryption | Verifies aws_s3_bucket_server_side_encryption_configuration exists for each new bucket |
| Glacier min days | Transition not too soon | glacier_transition_days must meet the minimum from allowlist |
| Glacier max days | Transition not too late | glacier_transition_days must not exceed the maximum from allowlist |
| Deep Archive ordering | Must be after Glacier | deep_archive_days must be greater than glacier_transition_days |
| Deep Archive min days | Not too soon | deep_archive_days must meet the minimum from allowlist |
| Bucket deletion (prod only) | Blocked | Production buckets cannot be deleted via Terraform; contact Platform team |

The allowlist values change over time. Always check the current allowlist at policies/data/s3/allowlist.json before submitting your MR rather than relying on values printed in this document.

Known Policy Gaps

The following areas are documented for transparency and are tracked for future improvement:

  1. Public read is review-gated, not OPA-gated today: public_read = true is blocked by default via the Platform review process described in Public Read Policy, not via an OPA deny rule. Any MR that sets it without a pre-coordinated exception should be declined during review.
  2. No tag enforcement: The S3 module does not currently support bucket-level tags in the configuration object, so there is no tag enforcement at the OPA level.
  3. No force_destroy control: All buckets are created with force_destroy = true, meaning Terraform can delete them even when they contain objects. This is convenient for development but risky for production — the production OPA policy compensates by blocking all bucket deletions entirely.

Environment Differences

Development vs Production

The table below shows the per-environment values the OPA policy enforces. All rows except force_destroy are OPA-enforced ranges/rules — the pipeline will reject your MR if your configuration falls outside them. The force_destroy row is a fixed module behavior — not something developers set in tfvars. Glacier/Deep Archive day values in tfvars are free-form within the enforced range; outside the range, OPA blocks the plan.

| Aspect | Development | Production | Enforced how |
|---|---|---|---|
| Glacier min days | 1 day | 7 days | OPA range check; developer-configurable |
| Glacier max days | 365 days | 180 days | OPA range check; developer-configurable |
| Deep Archive min days | 30 days | 30 days | OPA range check; developer-configurable |
| Bucket deletion | Allowed | Blocked by OPA | OPA deny rule in prod |
| force_destroy | Enabled (module-fixed) | Enabled (module-fixed; deletion still blocked by OPA) | Hardcoded in module, not a tfvars attribute |

These values live in policies/data/s3/allowlist.json. Treat that file as the source of truth — this documentation is a snapshot and may drift.
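
The production column can be read as three numeric conditions. The sketch below restates them as Terraform variable validation purely for illustration. The real enforcement is OPA evaluating the plan JSON, the variable name is hypothetical, and the thresholds are the snapshot values from the table above, which may have drifted from allowlist.json.

```hcl
# Illustration only: this is NOT how the pipeline enforces the rules.
variable "prod_bucket_example" {
  type = object({
    glacier_transition_days = number
    deep_archive_days       = number
  })

  # Example values that satisfy all three production conditions.
  default = {
    glacier_transition_days = 30
    deep_archive_days       = 90
  }

  validation {
    condition = (
      var.prod_bucket_example.glacier_transition_days >= 7 &&
      var.prod_bucket_example.glacier_transition_days <= 180
    )
    error_message = "Production requires glacier_transition_days between 7 and 180."
  }

  validation {
    condition     = var.prod_bucket_example.deep_archive_days >= 30
    error_message = "Set deep_archive_days to at least 30."
  }

  validation {
    condition = (
      var.prod_bucket_example.deep_archive_days >
      var.prod_bucket_example.glacier_transition_days
    )
    error_message = "Set deep_archive_days greater than glacier_transition_days."
  }
}
```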

Account Structure

Norton's AWS accounts map to environments as follows:

  • Development account (637244866643): Hosts dev, QA, and staging buckets
  • Production account (100478842646): Hosts production buckets
  • ELB logs account (127311923021): AWS-managed account for us-east-1 ELB log delivery (this is not a Norton account — it's an AWS service account)

Each account has its own KMS encryption keys, IAM roles, and Terraform state.


CI/CD Pipeline Flow

The S3 pipeline follows the same flow as all Infrastructure repository resources.

What Happens on a Merge Request

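On every merge request, the pipeline runs terraform plan, converts the plan to JSON, and evaluates it against the OPA policies. If OPA finds violations, the pipeline fails and posts the specific issues as comments on the MR. Fix the violations and push again; the pipeline re-runs automatically.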

What Happens on Merge to Main

Changes are applied to the specific environment based on the directory path:

  • Changes in accounts/development/s3/ are applied to the Development AWS account
  • Changes in accounts/production/s3/ are applied to the Production AWS account

Design Rationale

Why Minimal Configuration

Unlike RDS (which has ~30 required/optional fields), S3 buckets are designed to be created with an empty {}. This is intentional:

  1. Low barrier to entry: creating a bucket should be as easy as adding one line to a file
  2. Secure defaults: encryption is always on and public access is off unless Platform approves public_read = true, with no configuration needed
  3. Additive features: versioning, Glacier, ELB logs, and folders are all opt-in when you need them
  4. Less room for error: fewer required fields means fewer opportunities for misconfiguration

Why force_destroy Is Enabled

All buckets have force_destroy = true, which allows Terraform to delete non-empty buckets. This was a deliberate decision:

  1. Development needs to iterate quickly — recreating buckets during testing shouldn't require manual emptying of objects
  2. Production is protected — the OPA policy blocks all bucket deletions in the production account, regardless of force_destroy
  3. The combination gives flexibility in dev while maintaining safety in prod

How ownership is protected against accidental or malicious deletion

force_destroy = true at the AWS level means the bucket can be deleted non-empty — it does not mean anyone can delete it. The layers that gate deletion in development today are:

  1. GitLab MR review — every deletion requires editing the tfvars, which only happens through a merge request.
  2. CODEOWNERS — the Infrastructure repository uses CODEOWNERS to require review from the Platform team on changes to these tfvars files, so no other team can silently delete a bucket owned by the team that created it.
  3. Terraform plan visibility — any deletion shows up in the plan comment as aws_s3_bucket.this["my-app-data"] will be destroyed, which is hard to miss during review.
  4. Production is fully blocked — the production OPA policy rejects any plan that destroys an existing bucket, regardless of who authored the MR.

Known gap: in development, the current OPA policy does not enforce that the bucket owner's team be the one proposing the deletion — that check is entirely human. Adding an owner-tag-based OPA rule (so only the team that owns a bucket can modify/destroy it) is tracked as future work. Until then: if you notice a deletion of a bucket your team owns in someone else's MR, block review and tag @platform.

Why Encryption Is Always On

The module applies server-side encryption to every bucket (KMS by default, AES256 as a per-bucket override) with no opt-out mechanism. This aligns with:

  1. Norton's security requirements for data-at-rest encryption across all storage
  2. AWS best practices: encryption adds no noticeable latency and is free for S3-managed keys
  3. Compliance requirements that mandate encryption for all data storage, regardless of sensitivity classification
