Distributing files across a web farm or cluster.
We have hundreds of servers in several locations. As part of our web
content management we need to push content out frequently, sometimes several
times an hour or more.
To date, we have used a mixture of HTTP downloads and rsync scripts to
accomplish this. Now we are testing a new mixture that we hope will scale
out.
In our central location we have a large archive with all the files we
need to distribute. Each remote datacenter has a single node that helps
with distribution. We take the archive (let's pretend it's a FreeBSD ISO
file) and make it available via HTTPS, so we can download it over the
internet between our datacenters rather than over our MPLS or other
expensive transit links. Using metalink files, you can also list the
internal source as a lower-preference fallback.
Then, within each datacenter, we share the file via BitTorrent, with the
single node mentioned above acting as both the seed and the tracker for that
datacenter. Encryption is optional.
This works well for single files, such as large tar files. We still need to
experiment with many individual small files.
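To make the source-preference idea concrete, here is a minimal sketch of a Metalink 3.0 file written by hand from a shell function. It is not what our tooling produces byte for byte. The helper name write_metalink is hypothetical, md5sum is assumed to be installed, and the hostnames and the "internal"/"external" location labels simply reuse the ones assumed in this post.

```shell
#!/bin/sh
# Sketch: write a Metalink 3.0 file listing three sources for one archive.
# write_metalink is a hypothetical helper; hostnames are the ones assumed in this post.
write_metalink() {
    file=$1   # archive to describe
    out=$2    # metalink file to write
    name=$(basename "$file")
    sum=$(md5sum "$file" | cut -d' ' -f1)
    size=$(wc -c < "$file" | tr -d ' ')
    # Higher preference values are tried first; the plain internal HTTP
    # source is listed last as the fallback.
    cat > "$out" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<metalink version="3.0" xmlns="http://www.metalinker.org/">
  <files>
    <file name="$name">
      <size>$size</size>
      <verification>
        <hash type="md5">$sum</hash>
      </verification>
      <resources>
        <url type="https" location="external" preference="100">https://bt.master.internet.ip/$name</url>
        <url type="bittorrent" location="internal" preference="100">http://bt.master/$name.torrent</url>
        <url type="http" location="internal" preference="10">http://bt.master/$name</url>
      </resources>
    </file>
  </files>
</metalink>
EOF
}
```

Calling write_metalink FreeBSD-8.4-BETA1-i386-dvd1.iso f.metalink would write a file of roughly the same shape as the one generated in step 3 below.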
1. Start a tracker in each datacenter (assume bt.local.dc).
o bttrack --dfile /tmp/dfile --port 85 --reannounce_interval 5
2. Create a torrent file of the archive at the central distribution point (assume bt.master).
o btmakemetafile http://bt.local.dc:85/announce FreeBSD-8.4-BETA1-i386-dvd1.iso
3. Make a metalink file describing the archive and the torrent.
o echo -e "external 100 https % https://bt.master.internet.ip \n internal 100 bittorrent % http://bt.master \n internal 10 http % http://bt.master" | metalink -d md5 FreeBSD-8.4-BETA1-i386-dvd1.iso | sed 's/<url preference="100" location="internal" type="bittorrent">\(.*\)<\/url>/<url preference="100" location="internal" type="bittorrent">\1.torrent<\/url>/g' > f.metalink
4. Make the archive, the torrent file, and the metalink file available over HTTPS.
o cp FreeBSD-8.4-BETA1-i386-dvd1.iso /var/www
o cp f.metalink /var/www
o cp FreeBSD-8.4-BETA1-i386-dvd1.iso.torrent /var/www
5. On the node that is your tracker in each datacenter, start aria2c to download the metalink. It will
then download the torrent and start to seed as it downloads the archive
with a multi-part HTTPS download.
o aria2c --seed-ratio=0.0 --disable-ipv6 -V -d /var/www http://bt.master/f.metalink
6. On your endpoints, start aria2c to download the torrent. They will then automatically download
the file in the torrent from the swarm. Set a post-download hook to
finish the job.
o aria2c --seed-ratio=0.0 --disable-ipv6 -V -d /var/www --on-bt-download-complete=nextsteps.sh http://bt.local.dc/FreeBSD-8.4-BETA1-i386-dvd1.iso.torrent
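The post does not show what nextsteps.sh contains, so the following is only a sketch of what such a hook might look like. aria2 calls the hook with three arguments: the download GID, the number of files, and the path of the first file. The .md5 sidecar file, the .ready marker, and the function name finish_download are all assumptions made for illustration. md5sum is assumed to be available on the endpoint.

```shell
#!/bin/sh
# Sketch of a nextsteps.sh hook. aria2 invokes it as:
#   nextsteps.sh <GID> <number of files> <path of first file>
# The .md5 sidecar and the .ready marker are hypothetical conventions.

finish_download() {
    gid=$1
    path=$3
    expected_file="$path.md5"   # hypothetical: checksum shipped next to the archive

    if [ ! -f "$path" ]; then
        echo "aria2 GID $gid: $path not found" >&2
        return 1
    fi

    # Optional extra check; aria2 has already verified the torrent pieces.
    if [ -f "$expected_file" ]; then
        expected=$(cut -d' ' -f1 < "$expected_file")
        actual=$(md5sum "$path" | cut -d' ' -f1)
        if [ "$expected" != "$actual" ]; then
            echo "aria2 GID $gid: checksum mismatch for $path" >&2
            return 1
        fi
    fi

    # Tell whatever deploys the content that the archive is complete.
    touch "$path.ready"
}

# Run only when called with aria2's three arguments, so the file can also be sourced.
if [ "$#" -eq 3 ]; then
    finish_download "$@"
fi
```

Because --on-bt-download-complete fires when the download finishes but before seeding ends, a hook like this can run while the endpoint keeps seeding to its neighbours.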
Comments
bbcp, bbftp, gridftp