Round Robin MongoDB backups to S3 with Tar
22 Aug 2012Have you been looking for an easy way to back something up to the cloud with minimum effort? Having explored several options we settled on the most simple solution available. Tar and Amazon S3.
The Tar program provides the ability to create tar archives, as well as various other kinds of manipulation. For example, you can use Tar on previously created archives to extract files, to store additional files, or to update or list files which were already stored. Initially, tar archives were used to store files conveniently on magnetic tape. The name “Tar” comes from this use; it stands for tape archiver. Despite the utility’s name, Tar can direct its output to available devices, files, or other programs (using pipes), it can even access remote devices or files (as archives).
backup.sh
tarsplitter.sh
Make sure that s3cmd
or s3multiput
is in your environment path.
Running on an AWS instance
As long as you are on an AWS instance you have the s3multiput
utility installed and ready to start using these scripts right away. We noticed on our AWS instance that s3multiput
did not work because FileChunkIO was not installed. The S3 command line tools are written in Python, so we installed FileChunkIO with the following command:
$ sudo easy_install FileChunkIO
Non-AWS scenarios
If you are not on an AWS instance, you have to install s3cmd. Unfortunately the S3 tools available in AWS are not yet packaged for Ubuntu, which however has some native package support for other AWS services.
Installing S3 Tools on Ubuntu
If you happen to have Ubuntu 12.04 LTS you can safely install s3cmd
with apt-get
.
$ sudo apt-get install s3cmd
Otherwise we recommend that you install from source.
Installing S3 Tools on RPM-based systems
Users of Suse (Novell) and RedHat based Linux distributions are encouraged to add our package repository to their package managers. That way you’ll always stay up to date with your s3cmd package.
As stated above it is best to add the package repository to stay up to date with the S3 tools.
Installing S3 Tools from source
Check out the source of S3 tools:
$ git clone git://github.com/s3tools/s3cmd.git
$ cd s3cmd
$ sudo python setup.py install
The above requires that you have the Python distutils module. On a Debian system (such as Ubuntu):
$ sudo apt-get install python-setuptools
Configure s3cmd
You have to run s3cmd --configure
in order to make s3cmd
work. This will take you through a set of guided prompts setting up your access key and secret key as well as encryption.