Docker is a containerization service, designed for running apps in their own environment on any system. It’s intended to be platform-agnostic, but if you need to store data on a disk, that can be done with volume and bind mounts.
Use an External Database or Object Store
This is the method that most people will recommend. Storing state as files on disk isn’t in line with Docker’s model, and while it can be done, it’s always worth asking first: do you really need to?
For example, let’s say you’re running a web application in Docker that needs to store data in a database. It doesn’t make much sense to run MySQL in a Docker container, so you should instead deploy MySQL on RDS or EC2 and have the Docker container connect to it directly. The Docker container is then entirely stateless, as it is intended to be; it can be stopped, started, or hit with a sledgehammer, and a new one can be spun up in its place, all without data loss. Using IAM permissions, this can be accomplished securely, entirely within your VPC.
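As a concrete sketch, the container would receive the database endpoint and credentials at run time, for example through environment variables (the image name, hostname, and variable names here are placeholders your application would define):

docker run -d --name webapp \
  -e DB_HOST=mydb.abc123.us-east-1.rds.amazonaws.com \
  -e DB_USER=app \
  -e DB_PASSWORD=example-password \
  -p 80:80 \
  my-web-app:latest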
If you really need to store files, such as user-uploaded photos and video, you should be using AWS’s Simple Storage Service (S3). It’s much cheaper than EBS-based storage, and far cheaper than EFS storage, which is your primary choice for a shared filesystem for ECS containers. Rather than storing a file on disk, you upload it directly to S3. This method also allows you to run additional processing on uploaded content using Lambda functions, such as compressing images or video, which can save you a lot on bandwidth costs.
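As a quick illustration, copying a file into a bucket from the command line looks like this (the bucket name is a placeholder); in a real application you’d make the equivalent call through an AWS SDK rather than shelling out:

aws s3 cp ./uploads/photo.jpg s3://my-app-uploads/photo.jpg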
Simple Solution: Mount a Drive to a Container
Docker has two ways to achieve persistence: volume mounts and bind mounts. Bind mounts allow you to mount a particular location on your server’s filesystem to a location inside the Docker container. The mount can be read-only, or read/write, in which case files written by the Docker container will persist on disk.
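As a minimal sketch, a read-only bind mount with the --mount flag looks like the following; the host path is a placeholder, and dropping the readonly option makes it writable by the container:

docker run -d --name webtest \
  --mount type=bind,source=/srv/site-content,target=/usr/share/nginx/html,readonly \
  nginx:latest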
You can bind individual host directories to target directories in the Docker container, which is useful, but the recommended method is to create a new “volume,” managed by Docker. This makes it easier to back up, transfer, and share volumes between different instances of containers.
A word of caution: If you don’t have direct access to the server you’re running Docker on, as is the case with managed deployments like AWS’s Elastic Container Service (ECS) and Kubernetes, you’ll want to be careful with this. It’s tied to the server’s own disk space, which is usually ephemeral. You’ll want to use an external file store like EFS to achieve real persistence with ECS (more on that later).
However, bind and volume mounts do work well if you’re simply using Docker to run an easy installation of an app on your server, or just want quick persistence for testing purposes. Either way, the method of creating volumes will be the same regardless of where you’re storing them.
You can create a new volume from the command line with:
docker volume create nginx-config
And then, when you go to run your Docker container, link it to the target in the container with the --mount flag:
docker run -d --name devtest --mount source=nginx-config,target=/etc/nginx nginx:latest
If you run docker inspect, you’ll see the volume listed under the Mounts section.
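For example, assuming the devtest container created above, you can pull out just the mount information with a format string:

docker inspect --format '{{ json .Mounts }}' devtest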
If you’re using Docker Compose, the setup is easy as well. Simply add a volumes entry for each container service you have, then map a volume name to a location in the guest. You’ll also need to provide a list of volumes in a top-level volumes key for Compose to provision.
version: "3.0" services: web: image: nginx:latest ports: - "80:80" volumes: - nginx-config:/etc/nginx/ volumes: nginx-config:
This will create the volume automatically for this Compose project. If you’d like to use a premade volume created outside of Compose, specify external: true in the volume configuration:
volumes:
  cms-content:
    external: true
If you’d instead like to do a bind mount and not bother with volumes, simply enter a host path in place of the volume name, and forgo the top-level volumes key entirely.
version: "3.0" services: web: image: nginx:latest ports: - "80:80" volumes: - /docker/nginx-config/:/etc/nginx/
You can read Docker’s full documentation on using volumes with Compose if your use case requires something more specific than this.
For Managed Deployments, Use a Shared File System (AWS EFS)
If you’re deploying on AWS ECS, you won’t be able to use a normal bind or volume mount, because once you shut down the container, it probably won’t be running on the same machine the next time it starts up, defeating the purpose of persistence.
However, you can still achieve persistence using another AWS service: Elastic File System (EFS). EFS is a shared network file system. You can mount it on multiple EC2 servers, and they will all see the same data. For example, you could use this to host the static content and code for your website, then run all of your worker nodes on ECS to handle the actual serving of your content. This gets around the restriction of not storing data on disk, because the volume mount is bound to an external drive that persists across ECS deployments.
To set this up, you’ll need to create an EFS file system. This is fairly straightforward and can be done from the EFS Management Console, but make a note of the file system ID (it looks like fs-12345678), as you’ll need it to work with the volume.
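If you’d rather script it, the same thing can be done with the AWS CLI; this is just a sketch, and the creation token is an arbitrary string of your choosing:

aws efs create-file-system --creation-token docker-storage \
  --tags Key=Name,Value=docker-storage

The command’s output includes the FileSystemId you’ll need below.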
If you need to manually add or change files in your EFS volume, you can mount it on any EC2 instance. You’ll first need to install amazon-efs-utils:
sudo yum install -y amazon-efs-utils
And then mount it with the following command, using the ID:
sudo mount -t efs fs-12345678:/ /mnt/efs
This way, you can directly view and edit the contents of your EFS volume as if it were another HDD on your server. You’ll want to make sure you have nfs-utils installed for this all to work properly.
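On Amazon Linux and other yum-based distributions that’s another one-line install (on Debian or Ubuntu the equivalent package is nfs-common):

sudo yum install -y nfs-utils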
Next, you’ll have to hook up ECS to this volume. Create a new task definition in the ECS Management Console. Scroll to the bottom, and select “Configure Via JSON.” Then, replace the empty “volumes” key with the following JSON, adding the “family” key at the end:
"volumes": [ { "name": "efs-demo", "host": null, "dockerVolumeConfiguration": { "autoprovision": true, "labels": null, "scope": "shared", "driver": "local", "driverOpts": { "type": "nfs", "device": ":/", "o": "addr=fs-XXXXXX.efs.us-east-1.amazonaws.com,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport" } } } ], "family":"nginx",
Replace fs-XXXXXX.efs.us-east-1.amazonaws.com with your EFS file system’s real address. After you save, you should see a new volume listed in the task definition.
You can use this in your container definition as a mount point. Select “Add Container” (or edit an existing one), and under “Storage And Logging,” select the newly created volume and specify a container path.
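In the task definition JSON, that selection ends up as a mountPoints entry on the container. A sketch is below; the container path is an assumption (an nginx web root here) and should be whatever location your app expects its files in:

"mountPoints": [
    {
        "sourceVolume": "efs-demo",
        "containerPath": "/usr/share/nginx/html",
        "readOnly": false
    }
],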
Save the task definition, and when you run a task or service with this new definition, all of the containers will be able to access your shared file system.