Cloud Architecture
16 Feb 2026

SaaS on Top of Your Customers’ AWS Accounts: How We Use Cross-Account IAM Roles to Build a Multi-Tenant Product

What if your SaaS never stored customer data? Here's how we built a multi-tenant product on AWS that leaves the data in our customers' accounts, using cross-account IAM roles for isolation and control.

Sam Williams
|
Sr. Product Engineer

Most articles about multi-tenant SaaS architecture follow the same pattern: one database, a tenant ID on every row, careful query scoping to make sure Customer A can't see Customer B's data. It's a well-understood problem with well-understood solutions.

We don't have that problem. We have a different one.

At Pronetx, we build CX Portal - an administration platform for Amazon Connect contact centres. Amazon Connect instances live in the customer's AWS account, not ours. Our product's entire job is to manage infrastructure we don't own, in accounts we don't control, on behalf of customers who have their own security teams, their own compliance requirements, and their own strong opinions about who gets access to what.

That changes almost everything about how you architect a multi-tenant system. This article is about what we learned.

The Architecture at a Glance

Before getting into the details, it helps to understand the basic shape of the system.

CX Portal is a single SaaS application hosted in our AWS account. We have one frontend, one set of backend services. Standard stuff. What's unusual is that we store almost none of our customers' data. The data lives in their account. Our application reads it, writes it, and manages it, but it is never stored outside of their AWS account.

To make this work, we use IAM cross-account roles. The customer deploys an IAM role into their account with a trust policy that allows our application to assume it. When a user makes a request in CX Portal, our backend assumes that role using STS, gets temporary credentials scoped to that customer's account, and uses those credentials to interact with their Connect instance, their DynamoDB tables, their EventBridge rules - whatever the feature needs.

The high-level flow: a user makes a request in CX Portal, our backend assumes the role deployed in that customer's account via STS, receives short-lived credentials, and uses them to call AWS services in the customer's account.

Simple in concept. Complicated in practice.

Setting Up Cross-Account Access

The Trust Policy

Everything starts with the IAM role in the customer's account. The trust policy on that role determines who is allowed to assume it. In our case, that's a specific IAM role in our AWS account - the role attached to our Lambda authoriser.

A minimal trust policy looks like this:

{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Effect": "Allow",
     "Principal": {
       "AWS": "arn:aws:iam::YOUR_PRONETX_ACCOUNT_ID:role/CXPortalAuthoriserRole"
     },
     "Action": "sts:AssumeRole"
   }
 ]
}

The customer deploys this via CloudFormation or CDK - we provide the stack. The Principal points to the role in our account. That's the only thing we're listing. More on why that matters shortly.
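As a sketch, the customer-deployed stack might look like the following in CDK. This is illustrative, not our actual stack: the account ID, role names, and the specific Connect actions are placeholders, and the real stack carries the per-module permission policies described below.

```typescript
// Illustrative CDK sketch of the role a customer deploys - not our shipped stack.
// PRONETX_ACCOUNT_ID and the role/action names are placeholders.
import { Stack, StackProps } from "aws-cdk-lib";
import { Role, ArnPrincipal, PolicyStatement } from "aws-cdk-lib/aws-iam";
import { Construct } from "constructs";

export class CXPortalModuleRoleStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const moduleRole = new Role(this, "CXPortalModuleRole", {
      // Trust policy: only the vendor's authoriser role may assume this role.
      assumedBy: new ArnPrincipal(
        "arn:aws:iam::PRONETX_ACCOUNT_ID:role/CXPortalAuthoriserRole"
      ),
    });

    // Module-scoped permissions - only what this module needs.
    moduleRole.addToPolicy(
      new PolicyStatement({
        actions: ["connect:DescribeInstance", "connect:ListQueues"],
        resources: ["*"], // narrowed to specific instance ARNs in practice
      })
    );
  }
}
```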

Calling AssumeRole

At runtime, once we've resolved which role ARN to use for a given customer, we call STS to get temporary credentials:

import { STSClient, AssumeRoleCommand } from "@aws-sdk/client-sts";

const stsClient = new STSClient({});

const credentials = await stsClient.send(
 new AssumeRoleCommand({
   RoleArn: roleArn,
   RoleSessionName: `cx-portal-${tenantId}-${Date.now()}`,
   DurationSeconds: 900,
 })
);

Those temporary credentials can then be passed into any AWS SDK client, and every call made with them will be scoped to the customer's account with the permissions defined on their role.
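One small wrinkle worth noting: STS returns credential fields in PascalCase (AccessKeyId, SecretAccessKey, SessionToken), while SDK v3 clients expect camelCase keys. A helper along these lines bridges the two - `toClientCredentials` is our own illustrative name, not an SDK API:

```typescript
// STS returns credential fields in PascalCase; SDK v3 clients expect camelCase.
// `toClientCredentials` is an illustrative helper, not an AWS SDK API.
interface StsCredentials {
  AccessKeyId?: string;
  SecretAccessKey?: string;
  SessionToken?: string;
}

function toClientCredentials(c: StsCredentials | undefined) {
  if (!c?.AccessKeyId || !c?.SecretAccessKey || !c?.SessionToken) {
    throw new Error("AssumeRole returned incomplete credentials");
  }
  return {
    accessKeyId: c.AccessKeyId,
    secretAccessKey: c.SecretAccessKey,
    sessionToken: c.SessionToken,
  };
}

// Usage (sketch): any SDK v3 client accepts this shape directly, e.g.
// const connect = new ConnectClient({
//   region: "eu-west-2",
//   credentials: toClientCredentials(credentials.Credentials),
// });
```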

The Module-Per-Role Pattern

CX Portal is designed as a suite of tools, and we call each of those tools a module.

One design decision we made early: each module in CX Portal gets its own IAM role. We don't have one big CXPortalRole that covers everything - we have separate roles for each product module with only the permissions that module actually needs.

This matters for two reasons:

  • Least privilege: if there's a bug or a breach in one module, the blast radius is limited to the permissions that module holds.
  • It's cleaner for customers. When they review what they're granting us, they're reviewing a set of focused, purposeful permissions rather than one sprawling policy that does everything.

Customers deploy a CloudFormation stack per module they subscribe to. Each stack creates the appropriate role. Each role has the minimum permissions needed for that module to function.

The Lambda Authoriser Problem — And Why It Matters

One nuance that isn’t immediately obvious from the AWS documentation: the trust policy on a cross-account IAM role has to explicitly list the AWS principal that's allowed to assume it. That means every Lambda function (specifically, every Lambda execution role) that calls AssumeRole on that role needs to be listed.

If you're not careful, you end up in a situation where every new API endpoint you add to your product requires a new Lambda function, which requires a new execution role, which requires every customer to update their IAM deployment to add it to the trust policy.

That coordination burden scales badly: every new endpoint becomes a change request in every customer's account.

The solution we landed on is a single Lambda authoriser that sits in front of all API Gateway requests. It's the only principal listed in every customer trust policy. When a new endpoint is added, no customer changes are required - the authoriser already has the trust it needs.

The authoriser does a few things:

  1. validates the incoming user token
  2. resolves which customer the request is for
  3. checks that the user has permission to call the endpoint, using our role-based access control (RBAC) system
  4. looks up the appropriate role ARN, assumes it, and attaches the resulting credentials to the request context for the downstream Lambda handler to use

If there are any issues (invalid user token, insufficient permissions, or a missing client IAM role), the business logic Lambda is never invoked and the user gets a 403 response.
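Pulled together, the authoriser's output follows the standard API Gateway Lambda authoriser response shape: a policy document plus a context object carrying the assumed credentials. A simplified sketch - the token validation, RBAC check, and role lookup are assumed to have already happened, and the function name is ours, not an AWS API:

```typescript
// Simplified sketch of the authoriser's response construction.
// API Gateway passes `context` values (strings only) to the downstream handler.
interface AssumedCredentials {
  accessKeyId: string;
  secretAccessKey: string;
  sessionToken: string;
}

function buildAuthoriserResponse(
  allowed: boolean,
  methodArn: string,
  userId: string,
  creds?: AssumedCredentials
) {
  return {
    principalId: userId,
    policyDocument: {
      Version: "2012-10-17",
      Statement: [
        {
          Action: "execute-api:Invoke",
          Effect: allowed ? "Allow" : "Deny",
          Resource: methodArn,
        },
      ],
    },
    // Only string values survive the authoriser context.
    context:
      allowed && creds
        ? {
            accessKeyId: creds.accessKeyId,
            secretAccessKey: creds.secretAccessKey,
            sessionToken: creds.sessionToken,
          }
        : {},
  };
}
```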

Authoriser Caching

One knock-on implication of routing everything through a Lambda authoriser is that you're potentially calling STS on every single API request. AssumeRole isn't slow, but invoking the Lambda adds cost and latency, and you don't need a new set of temporary credentials on every call - they're valid for 15 minutes.

API Gateway's built-in authoriser caching solves this cleanly. You configure a cache TTL and a cache key, and API Gateway will reuse the authoriser's response for subsequent requests that match the same key - no authoriser invocation, no STS call.
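The one constraint to watch: cached responses carry STS credentials, so the cache TTL must be comfortably shorter than the credentials' lifetime, or a downstream handler could be handed credentials that expire mid-execution. The numbers below are illustrative, not our production values:

```typescript
// Cached authoriser responses carry STS credentials, so the cache must
// expire well before the credentials do. Numbers are illustrative.
const CREDENTIAL_DURATION_S = 900; // DurationSeconds passed to AssumeRole
const MAX_HANDLER_RUNTIME_S = 120; // headroom for the slowest downstream handler
const DESIRED_CACHE_TTL_S = 300;

const cacheTtlSeconds = Math.min(
  DESIRED_CACHE_TTL_S,
  CREDENTIAL_DURATION_S - MAX_HANDLER_RUNTIME_S
);
// With these numbers the desired TTL wins: 300s, well inside the credentials' lifetime.
```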

Role Resolution

The last piece of the puzzle: given an incoming request, how do we know which role ARN to assume?

We maintain a mapping table in our own DynamoDB that lets us resolve a tenant, AWS account, and module combination to the specific role ARN to assume. When a customer onboards a new module or adds a new Connect instance, they run our CloudFormation stack, and we write the resulting role ARN into this table. Resolution at request time is a single DynamoDB lookup.
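In outline, resolution is one keyed lookup. The sketch below uses an in-memory Map to show the composite-key shape; in the real system this is a single DynamoDB GetItem against the mapping table, and the key format, tenant names, and ARNs here are illustrative:

```typescript
// Illustrative role-resolution lookup. In production this is a single
// DynamoDB GetItem; the composite-key shape is the point, not the store.
const roleMappings = new Map<string, string>([
  // key: `${tenantId}#${awsAccountId}#${module}` -> role ARN to assume
  [
    "acme#111111111111#contact-flows",
    "arn:aws:iam::111111111111:role/CXPortalContactFlowsRole",
  ],
]);

function resolveRoleArn(tenantId: string, accountId: string, module: string): string {
  const arn = roleMappings.get(`${tenantId}#${accountId}#${module}`);
  if (!arn) {
    // Missing mapping -> the authoriser rejects the request (403).
    throw new Error(`No role mapping for ${tenantId}/${module}`);
  }
  return arn;
}
```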

Multi-Tenancy Without a Shared Database

Traditional multi-tenancy involves a shared database with tenant IDs everywhere. You add WHERE tenant_id = ? to every query, hope nobody forgets, and spend a lot of time auditing your data access layer.

Our model is different. Tenant isolation is enforced by IAM, not by query scoping.

When our Lambda handler runs with credentials assumed from Customer A's role, those credentials only have access to Customer A's resources. There's no scenario where a bug causes us to accidentally read Customer B's data - the credentials don't have permission to touch Customer B's account. The isolation is structural, not just convention.

This also removes complexity and security concerns for developers: they simply use the credentials passed to the Lambda, knowing they're scoped to the correct AWS account.

Our own DynamoDB (the config store, not the data store) does hold tenant-scoped data - the role mappings, subscription details, user records - but the actual customer operational data never leaves their account.

This is a meaningful security story, especially for enterprise customers who've been around the block a few times. "Your data never leaves your account" is easy for a security team to verify. "We use tenant ID scoping in our database and we've audited it thoroughly" requires significantly more trust.

The Complications Nobody Tells You About

Customers Control Their Own Roles

This is the one that will catch you off guard if you're not expecting it.

In a normal SaaS product, you own the database, you own the roles, you own the infrastructure. If something breaks, it's your fault and you can fix it. In this model, the customer owns the IAM roles. They can modify them, delete them, or accidentally misconfigure them as part of an unrelated bit of IAM housekeeping.

We've had customers delete a module role while cleaning up "unused" IAM resources. We've had customers narrow the permissions on a role without realising it would break anything. We've had CloudFormation stacks drift because someone made a manual change in the console.

Role Permission Drift

This one is subtler, and it's entirely on us.

As CX Portal adds new features, those features often need additional permissions. A new integration might require a Connect API call that wasn't in the original policy. A new module feature might need to read from a DynamoDB table that didn't exist when the customer first deployed. A new operational pattern might require changes to their EventBridge rules.

Every time this happens, every customer needs to redeploy their CloudFormation stack to pick up the updated IAM policies.

This is a genuine operational reality you have to design your release process around. Features that require permission changes need to be clearly communicated, the deployment instructions need to be simple and well-tested, and you have to be prepared for the fact that some customers will be on an older stack for a while. We've learned to version our CloudFormation stacks and to make new permissions additive where possible - keeping old permissions in place while adding new ones - so that customers on older stacks degrade gracefully rather than breaking entirely.
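As an illustration of what "additive" means in practice (the actions here are placeholders, not our real policies): a stack update appends new statements rather than rewriting existing ones, so a customer who hasn't yet redeployed keeps a working, if smaller, permission set:

```typescript
// Illustrative: versioned, additive policy statements for a module role.
// Older stacks simply lack the later statements; nothing existing breaks.
const policyStatementsV1 = [
  { Effect: "Allow", Action: ["connect:ListQueues"], Resource: "*" },
];

const policyStatementsV2 = [
  ...policyStatementsV1, // keep everything v1 granted
  { Effect: "Allow", Action: ["connect:ListRoutingProfiles"], Resource: "*" }, // new feature
];
```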

It also changes how you think about feature releases. A feature that requires a new permission is a more complex release than one that doesn't. That's worth factoring into planning.

When Does This Pattern Make Sense?

This architecture isn't for every SaaS product. It's specifically well-suited to a particular type of customer and a particular type of product.

The ideal fit is customers who already have their own AWS account and are looking for augmentation rather than replacement. They're not looking for you to host their data - they want a product that sits on top of infrastructure they already own and operate. The value you're providing is the tooling and the expertise, not the hosting.

This is especially common in highly regulated industries. Healthcare organisations bound by HIPAA, financial services firms under FCA or SOC 2 obligations, and public sector organisations with strict data residency requirements often cannot allow operational data to leave their AWS environment, regardless of how good your security story is. "The data doesn't leave your account" isn't a nice-to-have for these customers - it's a procurement requirement.

These organisations have also typically made significant investments in their own AWS security posture: VPCs, SCPs, GuardDuty, Security Hub, the works. They don't want a SaaS vendor to bypass all of that by pulling their data into a third-party account. The cross-account role model respects their existing security boundary rather than working around it.

When does it not make sense? If your customers don't have meaningful AWS infrastructure, this pattern creates more problems than it solves - you're asking customers to manage IAM deployments they're not equipped to maintain. And if you need to own the data model to build the product (analytics, ML, aggregation across customers), keeping data in your own account is the right call.

The Honest Summary

This model is genuinely powerful. For the right product and the right customer, it's a significantly better story than traditional SaaS - better security, cleaner isolation, simpler compliance conversations.

But customer-controlled IAM is a real operational complexity that you need to plan for, not hope you don't encounter. Your error handling needs to be specific and actionable. Your release process needs to account for the fact that some features require customer deployments. Your onboarding needs to be clear enough that customers can successfully deploy and maintain their IAM stacks.

None of those are blockers. They're just the tradeoffs that come with this approach - and in our experience, for the customers we're building for, they're very much worth it.