As a start-up, growing your infrastructure at pace with your needs is not easy, but it can be made significantly easier by making some smart architectural choices early on.
At CV Partner, we make heavy use of Amazon Web Services to deliver our CV management platform to customers around the world. In this series of blog posts, I will introduce some useful architectural patterns which have enabled us to strike a good balance between agility and security. One of these patterns involves making use of multiple AWS accounts, connected by AWS Organizations.
Everyone begins developing on AWS in the same way: open an account, add a credit card, and start building something. In the early stages, there's little point in investing time in over-engineering your infrastructure and adding overhead where it's not strictly required. However, the multiple account model I will describe here is not expensive, and doesn't take up a great deal of your time. It's quicker to adopt this way of thinking from the start, rather than try to retro-fit later on. It will set you on the right path for compliance with various security frameworks, such as ISO27001 or the PCI-DSS, and will keep you in line with AWS-recommended security models.
The official AWS design guidelines prescribe a number of principles you should keep in mind while building your infrastructure, organised under five 'pillars'. The pillars we are concerned with today are security and cost optimization.
When you first start out, perhaps you have a handful of people on your team, each with their own IAM user in a single account. Everyone probably has full administrator privileges. This makes sense to begin with, because you're likely mostly developers, unsure of which services you're going to be using, and you might not even be dealing with real customer data yet. Eventually though, you're going to need to start getting serious about security, permission boundaries and accountability.
The first thing you ought to stop doing is logging in as your account's root user. This is a powerful administrative user which can bypass all other security restrictions, and it's only required for certain specific actions, such as creating CloudFront key pairs, changing support plans or making other account-level modifications. Shared credentials such as these are difficult to manage, because you may need to rotate them each time a developer leaves, and in the case of the root account, if you lose control of it, you may have to go through a costly legal process to regain access to your account. So, we're going to use IAM users to grant our colleagues access to our AWS account.
Once everyone is logging in with IAM, we need to secure away our root account credentials. Firstly, as with any credential, I would strongly recommend using a password manager. Secondly, it can make investors nervous when a single person is able to log in as root and bypass policy and security restrictions. I would suggest enabling MFA on the root user, and giving the OTP secret to a different user than the person responsible for keeping the password. This way you have a two-party requirement for any access beyond what is normally expected. Keeping a log of the reason for this access might be a good idea for auditing purposes, however using CloudTrail and other auditing features provided by AWS can reduce this burden.
When creating applications in the cloud, we can apply long-settled security conventions in most areas of our projects. Network- and application-level security paradigms translate well from on-premises or self-hosted implementations to the cloud. The tricky part can be making the right decisions when it comes to the administrative layer, the capabilities of which vary wildly between cloud providers. Just as we will design our VPCs and security group rules to isolate resources and minimise access on the network layer, we should take the same approach in the administrative domain by isolating resources logically. Part of this process is finding natural partitions in your infrastructure, and moving resources to new accounts along those lines.
A good solution for most companies is simply to have one account for their production environment(s), and another account for everything else. Unlike VPCs, accounts are not something AWS recommends creating temporarily as part of your development or deployment pipeline. While it's easy enough to create them, extra manual steps are required for deletion, which makes the process poorly suited to automation. So, you should resist the temptation to open a new account for every sub-project, customer or development environment you spin up. Too much separation here will likely just slow your developers down without adding much value.
At CV Partner, as an ISO27001 compliant organisation, we must go to great lengths to ensure the security and integrity of our customers' data. So, we found that a natural way to partition our resources was by the type of data they were handling. We decided to create new accounts based on this, so that all our development environments which typically handle random test data are separated from our production environments which deal with real customer information. This has allowed us to introduce tight controls in production, while allowing more freedom in development through the use of account-based policies and IAM controls.
Aside from regular infrastructure, perhaps you have some special activity in your organisation which could be well served by having its own account. AWS Organizations allows you to connect accounts hierarchically, applying policies layer by layer. If it made sense for you to do so, you could have an account dedicated to managing S3 buckets, with a policy restricting that account from any activity not involving S3 buckets. If part of your business involved multiple users accessing S3 data as a matter of course, this might be a sensible option for you. This way, if an account was compromised, the possible 'blast radius' of any data breach would be easy to measure.
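To make that concrete: service control policies (SCPs) in AWS Organizations act as allow-lists, so attaching a policy like the following to the S3-only account (in place of the default `FullAWSAccess` policy) would cap everything in it at S3, whatever IAM permissions exist inside the account. The policy document is standard SCP syntax; assembling it in Python here is purely for illustration.

```python
import json

# Hypothetical SCP permitting only S3 actions in the attached account.
# SCPs are allow-lists: anything not permitted here is denied, even for
# administrators inside the account.
S3_ONLY_SCP = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowS3Only",
            "Effect": "Allow",
            "Action": ["s3:*"],
            "Resource": "*",
        }
    ],
}

def scp_json() -> str:
    """Render the policy as the JSON string the Organizations API expects."""
    return json.dumps(S3_ONLY_SCP, indent=2)
```

Whether a restriction this blunt fits your workload is a judgement call, but it illustrates how an account boundary plus an SCP bounds the blast radius.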
Once you've decided on which accounts you need, it's time to create a master account to bind them all together. You have a couple of options here; it's tempting to make your current account the 'master', and create new leaf accounts attached to it. However, this will probably cause you the largest amount of work. It's good practice to use the master account only for resources which are truly shared between other accounts, which probably doesn't include everything that's currently in your existing account. Purists will tell you that the master account should handle only authentication, and nothing else, however if you have some central infrastructure such as a VPN server, it wouldn't be unreasonable to suggest that the master account would be a good home for it. Either way, create yourself a new master account from scratch, using the regular AWS sign-up procedure. From the AWS Organizations console panel, you can then invite your existing account to become a member of the organisation, as well as create new leaf accounts without having to go through the usual sign-up process.
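If you'd rather script the leaf-account creation, the Organizations `CreateAccount` call needs little more than a name and a unique email address. A minimal sketch (the account name, the plus-addressed email convention and the role name default are examples, not prescriptions) that builds the request you would hand to boto3's `organizations` client or the equivalent CLI command:

```python
def account_request(name: str, email: str) -> dict:
    """Build the arguments for an Organizations CreateAccount call.

    Pass the result to boto3, e.g.
    boto3.client("organizations").create_account(**account_request(...)).
    Account creation is asynchronous; poll DescribeCreateAccountStatus
    to find out when the new account is ready.
    """
    return {
        "AccountName": name,
        # Every AWS account needs a unique email address; a plus-suffix
        # on a shared mailbox is a common convention (hypothetical address).
        "Email": email,
        # Name of the administrative role Organizations creates in the
        # new account, assumable from the master account.
        "RoleName": "OrganizationAccountAccessRole",
        "IamUserAccessToBilling": "DENY",
    }
```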
When you create a new account via the console, although the root user exists, it's not accessible until you set a password for it. I'd recommend leaving these root users alone unless you find that you need them for something (e.g. CloudFront keypairs) later. The only time you will definitely need to use them is if you decide to close the account later.
The more accounts you have, the more difficult it can be to keep track of users and access controls. It's easy to quickly create a user in a sub-account, assign them some permissions and then forget about it. However, you're just giving yourself a headache later on, when you have to trudge through your sub-accounts disabling credentials for an employee who's changing position or leaving the company. I've found that the most practical solution is to only create user accounts for your human employees in the master account. You might even go so far as to enforce a policy that no access keys may be generated in any child accounts. Resources in those accounts should be able to receive the permissions they need through assuming roles, so there should be little need for an application or service to use actual access keys.
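One way to enforce such a policy, assuming you attach SCPs to your child accounts, is a blanket deny on the relevant IAM actions. Deny statements in an SCP override any Allow granted inside the account. The action names are real IAM actions; the overall shape is a sketch you would adapt to your organisation.

```python
# Hypothetical SCP for child accounts: nobody in them may create IAM users
# or mint long-lived access keys; humans live in the master account and
# workloads get credentials by assuming roles.
NO_LOCAL_USERS_SCP = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyLongLivedCredentials",
            "Effect": "Deny",
            "Action": [
                "iam:CreateUser",       # human users belong in the master account
                "iam:CreateAccessKey",  # applications should assume roles instead
            ],
            "Resource": "*",
        }
    ],
}
```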
But, I hear you ask, how can our users access resources in sub-accounts, if they're not allowed to have IAM users there? Traditional IAM user permissions can't cross accounts. Well, AWS provides a mechanism for this, called the Security Token Service (STS). Using this, our users can assume roles in other accounts, which give them the permissions they need by proxy. The idea behind this method is that we stop thinking about users, and think instead about the various functions our users perform. For example:
- Operations: users who can access production nodes
- Developers: users with read access to all logs, and full access to non-production environments
- Auditors: users who can access specific S3 buckets
- Network: users who can administer VPCs and access flow logs
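Mechanically, assuming a role is a single STS call that trades the caller's master-account credentials for temporary credentials in the target account. A sketch using boto3 (the account id, role name and session name are placeholders):

```python
def role_arn(account_id: str, role_name: str) -> str:
    """ARN of an IAM role in a member account."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"

def assume(account_id: str, role_name: str, mfa_serial: str, token: str) -> dict:
    """Swap master-account credentials for temporary ones in a sub-account."""
    import boto3  # imported lazily so the pure helper above stands alone
    sts = boto3.client("sts")
    resp = sts.assume_role(
        RoleArn=role_arn(account_id, role_name),
        RoleSessionName="developers",  # appears in the sub-account's CloudTrail
        SerialNumber=mfa_serial,       # ARN of the caller's MFA device
        TokenCode=token,               # current six-digit OTP
    )
    # Temporary AccessKeyId, SecretAccessKey and SessionToken
    return resp["Credentials"]
```

In practice the AWS CLI and SDKs make this call for you when a profile is configured with a role, as we'll see below, so you rarely need to write it by hand.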
We will create IAM roles in each sub-account to match each of these functions. This way, we can think of permissions in terms of the actions each role will need to perform in each account. For example, users who assume the 'Operations' role will likely have expansive permissions in the production account, but perhaps may only access production-specific buckets in a dedicated S3 storage account. Perhaps the 'Developers' role has read-only access to logs and metrics in the production account, but administrator access in a sandbox account.
Because we need to create and manage these roles across many accounts, you might consider using orchestration tools such as Terraform, or AWS's own CloudFormation. Terraform in particular has support for managing multiple accounts using the usual features available in AWS shared credential files. Codifying our permissions in this way allows us to make use of version control, such as git, in order to provide us with an audited history of changes to permissions across all our accounts. This is a real time-saver if you have a strict change-control process, or have to periodically review permissions as part of a compliance framework, for example the PCI-DSS.
For extra security, it's important to ensure that our users all have MFA enabled, and we can enforce this by only allowing users logged-in with MFA to assume our new roles. To achieve this, we need to set a custom trust policy on the roles, such as the one you can see in this GitHub gist.
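The shape of such a trust policy is roughly as follows: the role trusts the master account as principal, but only when the calling session carries the `aws:MultiFactorAuthPresent` flag. The condition key is a real IAM key; building the document in Python here is just for illustration, and the master account id is a placeholder.

```python
def mfa_trust_policy(master_account_id: str) -> dict:
    """Trust policy for a sub-account role: only MFA-authenticated
    principals from the master account may assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Trust the whole master account; which users may actually
                # assume the role is decided by group policies over there.
                "Principal": {"AWS": f"arn:aws:iam::{master_account_id}:root"},
                "Action": "sts:AssumeRole",
                # Refuse sessions that did not authenticate with MFA.
                "Condition": {"Bool": {"aws:MultiFactorAuthPresent": "true"}},
            }
        ],
    }
```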
Once our roles are defined, we need to create matching groups in our master account. We can then give these groups permissions to 'assume' the appropriate roles in each account. Again, if you are using one, it would be smart to manage these groups in your orchestration tool. Each group will need an in-line policy which empowers its users to assume the appropriately named role, as you can see in this example GitHub gist.
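The in-line policy itself is short: it grants `sts:AssumeRole` on the matching role ARN in each sub-account. A sketch (the account ids and role name are placeholders):

```python
def group_assume_policy(account_ids: list[str], role_name: str) -> dict:
    """In-line policy for a master-account group, letting its members
    assume the identically named role in each listed sub-account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": [
                    f"arn:aws:iam::{acct}:role/{role_name}"
                    for acct in account_ids
                ],
            }
        ],
    }
```

Naming the role identically in every account keeps these policies mechanical to generate, which is exactly where an orchestration tool earns its keep.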
The final step is to add users to the master account, configure their MFA, and then add these users to the groups which represent the roles they will perform. From this point forward, our users will need to tweak their local AWS config files in order to access the resources they want. For each user, create a single access key. They can enter this information into their config file in the usual way. Then, they will configure additional profiles which use these master credentials to assume roles in other accounts. Take a look at this GitHub gist for an example configuration.
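For illustration, an entry along these lines in `~/.aws/config` is one way to wire this up (the account ids, role name and user name are placeholders); with `mfa_serial` set, the CLI prompts for an OTP before assuming the role:

```ini
# ~/.aws/config -- all identifiers below are placeholders
[default]
region = eu-west-1

[profile production]
# Role to assume in the production sub-account
role_arn = arn:aws:iam::222222222222:role/operations
# Credentials used for the sts:AssumeRole call -- the user's single
# access key, stored under [default] in ~/.aws/credentials as usual
source_profile = default
# The user's MFA device in the master account; the CLI prompts for a code
mfa_serial = arn:aws:iam::111111111111:mfa/alice
```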
Users will then be able to use the AWS command-line tools in the same way as they would usually, but they can make use of the `--profile` option in order to specify which sub-account profile they would like to access.
Similarly, AWS console access is possible, using the regular role-switching mechanism you can find documented in the AWS knowledge base.
As with everything in the cloud, this way isn't the only way, and might not be the best way for your company. Whatever road you choose, make sure you're taking full advantage of all the security and organisation features your cloud provider has to offer. A little bit of extra work now will likely save you time later as your business grows.