Managing the privileges available to a container is important to the ongoing integrity of the container and the host on which it runs. With privilege comes power, and the potential to abuse that power, wittingly or unwittingly.
A simple container example serves to illustrate:
$ sudo docker container run -itd --name test alpine sh
8588dfbfc89fc5761c11ebff6c9319fb655da92a1134cd5810031149e5cfc6e0
$ sudo docker container top test -eo pid
PID
2140
$ ps -fp 2140
UID PID PPID C STIME TTY TIME CMD
root 2140 2109 0 10:31 pts/0 00:00:00 sh
A container is started in detached mode; we retrieve the process ID from the perspective of the default (or host's) PID namespace, list the process, and find that the UID (user ID) associated with the container's process is root. It turns out that the sets of UIDs and GIDs (group IDs) are the same for the container and the host, because containers are started as the privileged user with UID/GID=0 (aka root, or the superuser). It's a big ask, but if the container's process were able to break out of the confines of the container, it would have root access on the host.
There are lots of things we can do to mitigate this risk. Docker removes a lot of potentially pernicious privileges by dropping capabilities and applying other security mechanisms, in order to minimize the potential attack surface. We can even make use of user namespaces, by configuring the Docker daemon to map a UID/GID range on the host onto another range in the container. This means a container's process, running as the privileged root user, will map to a non-privileged user on the host.
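For reference, remapping is enabled with a daemon-level configuration change; a minimal sketch, assuming the daemon reads its configuration from /etc/docker/daemon.json and uses the default dockremap user, might look like this:
{
    "userns-remap": "default"
}
The subordinate UID/GID ranges used for the mapping are taken from /etc/subuid and /etc/subgid on the host.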
If you're able to make use of the --userns-remap config option on the daemon to perform this mapping, you absolutely should. Unfortunately, it's not always possible or desirable to do so - another story, another post! This puts us back to square one; what can we do to minimize the risk? The simple answer is that we should always be guided by the principle of least privilege. Often, containers need privileges that are associated with the root user, but if they don't, then you should take action to run your containers as a benign user. How do you achieve this?
A Simple Example
Let's take a simple Dockerfile example, which defines a Docker image for the AWS CLI. This use case might be more suited to a local developer's laptop than to a sensitive, production-based environment, but it will serve as an illustration. The image enables us to install and run AWS CLI commands in a container, rather than on the host itself:
FROM alpine:latest
# Define build argument for AWS CLI version
ARG VERSION
# Install dependencies, AWS CLI and clean up.
RUN set -ex && \
    apk add --no-cache \
        python \
        groff \
        less \
        py-pip && \
    pip --no-cache-dir install awscli==$VERSION && \
    apk del py-pip
CMD ["help"]
ENTRYPOINT ["aws"]
Assuming the contents of the above are in a file called Dockerfile, located in the current working directory, we can use the docker image build command to build this image. Assuming we have also made the local user a member of the docker group, which, for convenience, provides unfettered access to the Docker CLI (something that should only ever be done in a development environment), the following will create the image:
$ docker image build --build-arg VERSION="1.14.38" -t aws:v1 .
We could then check that the image works as intended by running a container derived from it. This is equivalent to running the command aws --version in a non-containerized environment:
$ docker container run --rm --name aws aws:v1 --version
aws-cli/1.14.38 Python/2.7.14 Linux/4.4.0-112-generic botocore/1.8.42
This is all well and good, but as we didn't take any action to curtail any privileges, the container ran as the root user, with UID/GID=0. This level of privilege is not necessary to run AWS CLI commands, so let's do something about it!
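Before we do, we can confirm the privilege level by overriding the image's entrypoint with the id command (a quick, indicative check; the group list in the output is trimmed):
$ docker container run --rm --entrypoint id --name aws aws:v1
uid=0(root) gid=0(root) groups=0(root),...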
Using a Non-privileged User
To fix this, we can add a non-privileged user to the image, and then 'set' the user for the image to the non-privileged user, so that a derived container's process is no longer privileged. The changes to the Dockerfile might look something like this:
FROM alpine:latest
# Define build argument for AWS CLI version
ARG VERSION
# Install dependencies, AWS CLI and clean up.
RUN set -ex && \
    apk add --no-cache \
        python \
        groff \
        less \
        py-pip && \
    pip --no-cache-dir install awscli==$VERSION && \
    apk del py-pip && \
    addgroup aws && \
    adduser -D -G aws aws
USER aws
WORKDIR /home/aws
CMD ["help"]
ENTRYPOINT ["aws"]
All we've done is add two commands to the RUN instruction: one to add a group called aws, and one to add a user called aws that belongs to the aws group. In order to make use of the aws user, however, we also have to set the user with the USER Dockerfile instruction, and whilst we're at it, we'll set the working context in the filesystem to its home directory, courtesy of the WORKDIR instruction. We can re-build the image, tagging it as v2 this time:
$ docker image build --build-arg VERSION="1.14.38" -t aws:v2 .
Now that we have a new variant of the aws image, we'll run up a new container, but we'll not specify any command line arguments, which means the argument for the aws command will be help, as specified by the CMD instruction in the Dockerfile:
$ docker container run --rm -it --name aws aws:v2
Unsurprisingly, this will list help for the AWS CLI, which is piped to less, giving us the opportunity to poke around whilst the container is still running. In another terminal on the host, if we repeat the exercise we carried out earlier, when we looked for the container's process(es), we get the following:
$ docker container top aws -eo pid
PID
2436
2487
$ ps -fp 2436,2487
UID PID PPID C STIME TTY TIME CMD
rackham 2436 2407 0 14:27 pts/0 00:00:00 /usr/bin/python2 /usr/bin/aws he
rackham 2487 2436 0 14:27 pts/0 00:00:00 less -R
It reports that the processes are running with the UID associated with the user rackham. In actual fact, the UID 1000 is associated with the user rackham on the host, but in the container, the UID 1000 is associated with the user aws:
$ id -u rackham
1000
$ docker container exec -it aws id
uid=1000(aws) gid=1000(aws) groups=1000(aws)
What really matters is the UID, not the user that it translates to, as the kernel works with the UID when it comes to access control. With the trivial changes made to the image, our container is happily running as a non-privileged user, which should provide us with some peace of mind.
IDs and Bind Mounts
There is something missing from the AWS CLI image, however. In order to do anything meaningful, the AWS CLI commands need access to the user's AWS configuration and credentials, so that they can access the AWS API. Obviously, we shouldn't bake these into the image, especially if we intend to share the image with others! We could pass them as environment variables, but whilst this might be a means for injecting configuration items into a container, it's not safe for sensitive data, such as credentials. If you allow others access to the same Docker daemon, without limiting that access using an authorization plugin, environment variables will be exposed to those users via the docker container inspect command.
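To illustrate the point, anyone who can talk to the daemon could read a running container's environment with something like the following (the variable names shown here are purely hypothetical, and the output is truncated):
$ docker container inspect --format '{{ .Config.Env }}' aws
[AWS_ACCESS_KEY_ID=AKIA... AWS_SECRET_ACCESS_KEY=... PATH=/usr/local/sbin:/usr/local/bin:...]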
Another approach would be to bind mount the files containing the relevant data into the container at run time. In fact, if we want to make use of the aws configure command to update our local AWS configuration, this is the only way we can update those files when using a container.
On Linux, the AWS config and credentials files are normally located in $HOME/.aws, so we need to bind mount this directory inside the container, at the /home/aws/.aws location of the container's user. We need to do this each time we want to execute an AWS CLI command using a container. Let's try this out, and attempt to list the instances running in the default region, which is specified in the AWS config file located in /home/aws/.aws. This command is equivalent to running aws ec2 describe-instances:
$ docker container run --rm -it --mount type=bind,source=$HOME/.aws,target=/home/aws/.aws \
--name aws aws:v2 ec2 describe-instances
You must specify a region. You can also configure your region by running "aws configure"
That didn't go too well! The error message would suggest that the aws command can't find the files. After we've ascertained that the local user's UID/GID is 1001, if we run another container, override the container's entrypoint, and run ls -l ./.aws, we can see the reason for the error:
$ id
uid=1001(baxter) gid=1001(baxter) groups=1001(baxter),27(sudo),999(docker)
$ docker container run --rm -it --mount type=bind,source=$HOME/.aws,target=/home/aws/.aws \
--entrypoint ls --name aws aws:v2 -l ./.aws
total 8
-rw------- 1 1001 1001 149 Feb 13 16:20 config
-rw------- 1 1001 1001 229 Feb 13 15:42 credentials
The files are present inside the container, but they are owned by UID/GID=1001. Remember, whilst we didn't specify a deterministic UID/GID for the container's user in the image, the addgroup and adduser commands created the aws user with UID/GID=1000. There is a mismatch between the UID/GIDs, and the file permissions are such that the container's user cannot read or write to the files.
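We can confirm the UID that was baked into the image with another quick check, overriding the entrypoint with the id command (indicative output):
$ docker container run --rm --entrypoint id --name aws aws:v2
uid=1000(aws) gid=1000(aws) groups=1000(aws)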
This is a big problem. We've been careful to ensure that our container runs with diminished privileges, but we've ended up with a problem to resolve as a consequence.
We could try to circumvent this problem by using the --user config option to the docker container run command, and specify that the container gets run with UID/GID=1001 instead of 1000:
$ docker container run --rm -it --mount type=bind,source=$HOME/.aws,target=/home/aws/.aws \
--user 1001:1001 --name aws aws:v2 ec2 describe-instances
You must specify a region. You can also configure your region by running "aws configure".
This error message is starting to become familiar. The reason, this time, is that there is no 'environment' ($HOME, to be precise) for a user with UID/GID=1001, which the AWS CLI needs in order to locate the config and credentials files. This is because there is no user configured in the container's filesystem with UID/GID=1001. We might be tempted to pass a HOME environment variable to the docker container run command, or even to alter the Dockerfile to provide deterministic values for the UID/GID. If we succumb to these seductions, then we're in danger of making the image very specific to a given host, and of relying too much on a consumer of our image to figure out how to work around these idiosyncrasies. A better option would be to add the aws user after the container has been created, which gives us the ability to add the user with the required UID/GID. Let's see how to do this.
Defer Stepping Down to a Non-privileged User
The image for the AWS CLI is immutable, so we can't define a 'variable' aws user in the Dockerfile. Instead, we can make use of an entrypoint script, which will get executed when the container starts. It replaces the aws command that is specified as the entrypoint in the Dockerfile. Here's a revised Dockerfile:
FROM alpine:latest
# Define build time argument for AWS CLI version
ARG VERSION
# Add default UID for 'aws' user
ENV AWS_UID=1000
# Install dependencies, AWS CLI and clean up.
RUN set -ex && \
    apk add --no-cache \
        python \
        groff \
        less \
        py-pip \
        su-exec && \
    pip --no-cache-dir install awscli==$VERSION && \
    apk del py-pip && \
    mkdir -p /home/aws
COPY docker-entrypoint.sh /usr/local/bin/
WORKDIR /home/aws
CMD ["help"]
ENTRYPOINT ["docker-entrypoint.sh"]
In addition to changing the entrypoint and copying the script from the build context with the COPY instruction, we've added an environment variable specifying a default UID for the aws user (in case the user neglects to provide one), removed the commands from the RUN instruction for creating the user, and added a command to create the mount point for the bind mount. We've also added a utility to the image, called su-exec, which will enable our script to step down from the root user to the aws user at the last moment.
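If you haven't come across su-exec before, it simply switches to the specified user and then exec's the given command. A rough, interactive illustration using a plain alpine container (package installation output omitted):
$ docker container run --rm -it alpine sh
/ # apk add --no-cache su-exec
/ # adduser -D aws
/ # su-exec aws id
uid=1000(aws) gid=1000(aws) groups=1000(aws)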
Let's get to the entrypoint script itself:
#!/bin/sh
# If --user is used on command line, cut straight to aws command.
# The command will fail, unless the AWS region and profile have
# been provided as command line arguments or envs.
if [ "$(id -u)" != '0' ]; then
    exec aws "$@"
fi
# Add 'aws' user using $AWS_UID and $AWS_GID
if [ ! -z "${AWS_GID+x}" ] && [ "$AWS_GID" != "$AWS_UID" ]; then
    addgroup -g $AWS_GID aws
    adduser -D -G aws -u $AWS_UID aws
else
    adduser -D -u $AWS_UID aws
fi
# Step down from root to aws, and run command
exec su-exec aws aws "$@"
When the script is invoked, it is running with the all-powerful UID/GID=0, unless the user has invoked the container using the --user config option. As the script needs root privileges to create the aws user, if it's invoked as any other user, it won't be possible to create the aws user. Hence, a check is made early on in the script, and if the user associated with the container's process is not UID=0, then we simply use exec to replace the script with the aws command and any arguments passed at the end of the command which invoked the container (e.g. ec2 describe-instances). In this scenario, the command will fail unless the default region and credentials have been provided as command line arguments or environment variables.
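Such an invocation might look something like the following, once the image below has been built and tagged as aws:v3 (the region and credential values are hypothetical placeholders, and the earlier caveat about exposing credentials through environment variables still applies):
$ docker container run --rm -it --user 1001:1001 \
    --env AWS_DEFAULT_REGION=eu-west-1 \
    --env AWS_ACCESS_KEY_ID=<access key ID> \
    --env AWS_SECRET_ACCESS_KEY=<secret access key> \
    --name aws aws:v3 ec2 describe-instances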
What we would prefer the user to do instead is specify an environment variable, AWS_UID (and optionally, AWS_GID), on the command line, which reflects the owner of the AWS config and credentials files on the host. Using this variable, the script will create the aws user with the corresponding UID/GID, before the script is replaced with the desired AWS CLI command, which is executed as the aws user, courtesy of the su-exec utility. First we must re-build the image, and when that's done, let's also create an alias for invoking the AWS CLI container:
$ docker image build --build-arg VERSION="1.14.38" -t aws:v3 .
$ alias aws='docker container run --rm -it --mount type=bind,source=$HOME/.aws,target=/home/aws/.aws --env AWS_UID=$UID --name aws aws:v3'
In the Docker CLI command we've aliased, we've defined the AWS_UID environment variable for use inside the container, setting it to the UID of the user invoking the container. All that's left to do is test the new configuration, using the alias:
$ aws ec2 describe-instances --query 'Reservations[*].Instances[*].[InstanceId,State.Name]'
[
    [
        [
            "i-04d3a022e5cc0a140",
            "terminated"
        ]
    ],
    [
        [
            "i-009efe47f59402b4e",
            "terminated"
        ]
    ],
    [
        [
            "i-0ad081df0fbe1d9e4",
            "running"
        ]
    ]
]
This time we're successful!
Stepping down from the root user for our containerized AWS CLI is a fairly trivial example use case. The technique of stepping down to a non-privileged user in an entrypoint script, however, is very common for applications that require privileges to perform some initialisation prior to invoking the application associated with the container. You might want to create a database, for example, or apply some configuration based on the characteristics of the host, or the command line arguments provided at run time.
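A generic sketch of the pattern might look something like this (the myapp user, paths and initialisation steps are purely hypothetical placeholders, and the user is assumed to already exist in the image):
#!/bin/sh
# Hypothetical example: perform privileged initialisation, then step down.
if [ "$(id -u)" = '0' ]; then
    # e.g. prepare a data directory that the non-privileged user will own
    mkdir -p /var/lib/myapp
    chown -R myapp:myapp /var/lib/myapp
    # Re-run this script as the non-privileged 'myapp' user
    exec su-exec myapp "$0" "$@"
fi
# By this point we're running as 'myapp'; hand over to the application
exec myapp "$@"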
Summary
If we hadn't undertaken this exercise to reduce the privileges available inside a container derived from our AWS CLI image, the task of creating the image would have been quite straightforward. However, in taking the time and expending a little effort, we have taken a considerable step towards minimizing the risk of privilege escalation inside the container, which in turn helps to reduce the risk of compromising the host itself. Running containers with a non-privileged user is one of many steps we can take to secure the containers we run, especially when they are deployed to a production environment.
If you want to find out what else you can do to make your containers more secure, check out my hosted training course - Securing Docker Container Workloads.