Building a Private Cloud: Experimenting with GlusterFS

I've had a nagging idea of how I *want* cloud computing to show up in my day-to-day life. I currently spend my workday on a desktop computer with a pile of storage, then use an Android tablet for couch surfing and an Android phone when out of the house. I rent a small virtual private server for my web hosting needs, and use Amazon S3 for encrypted backups of about 10 gigs of my most critical personal data.

I'll come right out and say I love Google's applications. Having my mail, documents, and calendar available on all those devices, and on any other system I log into via a browser, is immensely valuable to me. There is, however, an obvious problem: a lot of personal data is now in the hands of a very large corporation, and within reach of the US government should they decide they want a look at it. I have no guarantees about what it will be used for, how safely it's being protected, who within those organizations is taking a peek, and so on. A big part of our lives trickles out into our digital footprint and, just as I wouldn't want anyone rummaging around inside my head, I'd like to have a similar sense of security for my data.

In a perfect world, I'd like my own personal cloud solution, globally accessible on any of my devices, but secured and backed by encrypted storage such that no corporation or government has access to it. Something I could run on a server myself, or lease from a provider, knowing that the provider has no access to the data.

This could obviously be a huge undertaking, but in the interest of starting small and simple, I wanted to first poke around with distributed file storage. Being a Fedora user and Red Hat employee (full disclosure), GlusterFS was the first thing to come to mind.

Gluster appears to be readily available in Fedora 17. With the following packages installed:


glusterfs-3.2.5-6.fc16.x86_64
glusterfs-server-3.2.5-6.fc16.x86_64
glusterfs-fuse-3.2.5-6.fc16.x86_64

I was able to start the server without any configuration changes:


$ service glusterd start

You'll want to make sure the correct ports are open on your firewall. In my case I just shut the firewall down, but the Gluster docs cover what needs to be done quite nicely.
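For anyone who'd rather not drop the firewall entirely, here is a rough sketch of the iptables rules as I understand them from the 3.2-era docs: TCP 24007-24008 for management, plus one port per brick counting up from 24009. The `BRICKS` and `DRY_RUN` variables and the helper names are my own scaffolding, not anything Gluster provides, and the port numbers are worth checking against the docs for the version you're running.

```shell
#!/bin/sh
# Sketch only: print (or apply) iptables rules for a GlusterFS 3.2-style server.
# Port numbers are an assumption taken from the 3.2 docs; verify for your version.
BRICKS=${BRICKS:-1}      # illustrative: number of bricks hosted on this server
DRY_RUN=${DRY_RUN:-1}    # 1 = only print the commands, 0 = run them (needs root)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# usage: open_gluster_ports NUMBER_OF_BRICKS
open_gluster_ports() {
    bricks=$1
    last=$((24009 + bricks - 1))   # bricks take consecutive ports from 24009
    # -I inserts at the top of the chain so the rules land before any REJECT.
    run iptables -I INPUT -p tcp --dport 24007:24008 -j ACCEPT
    run iptables -I INPUT -p tcp --dport 24009:"$last" -j ACCEPT
}

open_gluster_ports "$BRICKS"
```

Run as-is, it only prints the two rules so they can be reviewed first. Setting `DRY_RUN=0` and running as root applies them.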

Naturally you'd eventually add more trusted servers, but after the step above glusterd appears to already treat the local server as available for storage. The next step was to create a volume, starting with just a distributed configuration (even though I think I'd ultimately want a replicated one):


$ mkdir /cloudfs
$ gluster volume create test-volume transport tcp kramer.local.rm-rf.ca:/cloudfs
Creation of volume test-volume has been successful. Please start the volume to access data.
$ gluster volume start test-volume
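I haven't tried it yet, but a replicated setup should look something like the sketch below: probe the second server into the trusted pool, then create the volume with `replica 2` and one brick per server. The hostnames, the volume name `backup-volume`, and the `run` and `make_replica_volume` helpers are placeholders of mine. By default it only prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch only: commands for a two-way replicated volume across two servers.
# Hostnames, volume name and brick path below are placeholders.
DRY_RUN=${DRY_RUN:-1}    # 1 = only print the commands, 0 = run them (needs root)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# usage: make_replica_volume VOLUME BRICK_PATH LOCAL_HOST PEER_HOST
make_replica_volume() {
    vol=$1; brick=$2; local_host=$3; peer_host=$4
    # Add the second server to the trusted pool.
    run gluster peer probe "$peer_host"
    # "replica 2" keeps a full copy of every file on each of the two bricks.
    run gluster volume create "$vol" replica 2 transport tcp \
        "$local_host:$brick" "$peer_host:$brick"
    run gluster volume start "$vol"
}

make_replica_volume backup-volume /cloudfs server1.example.com server2.example.com
```

The brick directory has to exist on both servers before the create step, the same as the `mkdir /cloudfs` above.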

And now to mount that volume and make use of it:


$ mkdir -p /cloudfs-mounted
$ mount -t glusterfs myserver:/test-volume /cloudfs-mounted

This appears to be all there is to it.


$ cp anaconda-ks.cfg /cloudfs-mounted
$ ls /cloudfs
anaconda-ks.cfg

I was able to repeat the mount step on another system and have access to the exact same data. So it was possible to set up a network filesystem in just a few minutes, which frankly impressed me.
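When juggling mounts on more than one machine, it helps to confirm that a path really is backed by Gluster and isn't just an empty local directory. A minimal sketch, assuming the FUSE client shows up in `/proc/mounts` with the filesystem type `fuse.glusterfs`. The `is_gluster_mount` helper is my own invention, and its optional second argument is only there so the check can be tried against a saved copy of the mount table.

```shell
#!/bin/sh
# Sketch only: report whether a path is currently a GlusterFS FUSE mount.

# usage: is_gluster_mount MOUNT_POINT [MOUNT_TABLE]
is_gluster_mount() {
    mount_point=$1
    table=${2:-/proc/mounts}
    # Fields in the mount table: device, mount point, filesystem type, options...
    awk -v mp="$mount_point" \
        '$2 == mp && $3 == "fuse.glusterfs" { found = 1 } END { exit !found }' \
        "$table" 2>/dev/null
}

if is_gluster_mount /cloudfs-mounted; then
    echo "/cloudfs-mounted is a GlusterFS mount"
else
    echo "/cloudfs-mounted is not mounted via GlusterFS"
fi
```

Something like this could guard a backup script so it refuses to write into the directory when the volume isn't actually mounted.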

Clearly there is no authentication in play here and the data is not encrypted on the gluster servers. The only security here is provided by the firewall rules, and I don't think anything is encrypted during transmission either. Solutions for this sound like they're on the way, and I'm also going to try to keep an eye on HekaFS, which sounds like exactly the end result I want.

Overall I'm quite pleased with what I'm seeing so far. I'm going to experiment with setting up a replicated volume for local backups of my wife's laptop and such, and see how it goes from there.

With something up and running, the next big question on my mind is how could I get at this from an Android device?