Storage: Filesystems & Backups

This is the third part of a series of articles that describe the way I run my computers. These articles are not intended as tutorials and do not contain the details to mirror my setup. Feel free to ask for details if you are interested. This article is about how I do backups and synchronization of my home directory.

Synchronized home directories

I have two main computers: my desktop and a small server in the basement. The server is on 24/7, but the desktop is only on when I use it. When I'm at home I use the desktop and when I'm not at home I log into the server.
The main purpose of connecting to my own server is to get to my files, so anything that is important has to be on the server. On the other hand I don't want to run my home directory over NFS; it's just not fast enough.

Syncing with Unison

The solution is to use Unison to synchronize the home directories of both computers.
Unison is an rsync-like synchronization tool. Its most important feature is that it does not copy entire files, but only the parts that were changed. It differs from rsync in that it is a two-way tool, even capable of resolving conflicting changes.
This tool is run automatically when I start or stop my desktop and makes sure that my home directory remains identical between the two systems.
Although it was not intended as such, this copy is my first backup.
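
To make the two-way idea concrete, here is a minimal sketch of the kind of decision a two-way synchronizer has to make for every file. It is not Unison's actual code. It assumes that the state at the last successful sync is remembered, and that each side is described by a mapping from path to a content fingerprint. The function name, the action labels and the file names are made up for illustration.

```python
def reconcile(archive, a, b):
    """Decide what to do with each path.

    archive: path -> fingerprint at the last successful sync
    a, b:    path -> fingerprint on each side right now
    A missing key means the file does not exist there.
    """
    actions = {}
    for path in sorted(set(archive) | set(a) | set(b)):
        old, in_a, in_b = archive.get(path), a.get(path), b.get(path)
        if in_a == in_b:
            continue  # both sides already agree, nothing to do
        if in_a == old:
            actions[path] = "B -> A"  # only B changed (or deleted) the file
        elif in_b == old:
            actions[path] = "A -> B"  # only A changed (or deleted) the file
        else:
            actions[path] = "conflict"  # both sides changed it differently
    return actions


# Hypothetical example: A is the desktop, B is the server.
archive = {"notes.txt": "v1", "todo.txt": "v1", "old.txt": "v1"}
desktop = {"notes.txt": "v2", "todo.txt": "v2", "old.txt": "v1"}
server = {"notes.txt": "v1", "todo.txt": "v3"}
print(reconcile(archive, desktop, server))
```

A one-way tool only needs the two current states. The remembered state is what makes it possible to tell "changed on one side" apart from "changed on both sides".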

Hiding the delay

The biggest disadvantage is that it takes about 10 minutes to process my home directory. I don't want to wait 10 minutes when booting my system, so Unison is started as early as possible in the boot sequence and runs in the background.
This leaves a small risk window as my home directory is being modified while I'm already logged in. In practice I have never detected any problems with it though. Changes to the files that are needed at startup are rare.

During shutdown the delay becomes very obvious, but that's not a problem.

Starting the computer is still slow, so I usually don't wait for it to boot but go for coffee instead. By the time I get back the computer has started, but I would still need to log in, and it takes a few more minutes before all my programs are started. I don't do that.

The trick is that I've configured my desktop to log in my user automatically (without asking for a password). I'm the only user of this desktop, but it's still a security risk. I've solved this by starting the screensaver immediately after login. My programs are started right away but can't be used until the screensaver is unlocked, which requires my password.
This works fine, except for programs that ask for input such as a password. Sometimes this is an advantage though. For example my instant message client won't go online before I'm actually at the computer.

Not a network filesystem

I've considered replacing the system above with some kind of network filesystem like glusterfs. However, there is an advantage to not syncing immediately. If I accidentally overwrite or delete something (yes, that does happen) a network filesystem would immediately propagate my mistake to the other computer.
In my current setup this will not happen until I turn my desktop off. I can just log in to the server and copy the previous version back.
Besides, I couldn't get any network/cluster filesystem to work reliably. Each one has a single point of failure, lacks stability, doesn't allow concurrent access from multiple locations or requires special hardware.

Disk Backups

Backups are made on the server. The server always has a copy of my home directory that's less than 24 hours out of sync. The backups run very early in the morning when I'm supposed to be asleep, so usually everything is synced properly.

Every night a copy of my homedir is made to a separate disk. Instead of making full copies I use Storebackup. Storebackup only copies the files that have been changed. Unchanged files are hard-linked to the previous backup so they don't take up any extra space.
Storebackup also deals with cleaning up old backups when I run out of disk space.
During the first month all daily backups are kept. After that only one copy per week is retained while the other days of the week are deleted. After three months it's reduced to one copy per month.
Although Storebackup supports compressed backups I don't use that option. By keeping uncompressed copies of my data it's very easy to restore backups. Just go to the directory and copy the files.
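
The hard-link idea is easy to illustrate. The sketch below is not how Storebackup itself is implemented. It is a simplified illustration that assumes the previous backup lives on the same filesystem as the new one, and the function name and paths are made up. Unchanged files are hard-linked to the previous snapshot and everything else is copied.

```python
import filecmp
import os
import shutil


def snapshot(source, previous, target):
    """Copy source into target, hard-linking files that are unchanged
    compared to the previous snapshot (previous may be None)."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            new = os.path.normpath(os.path.join(target, rel, name))
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)  # unchanged: reuse the existing data blocks
            else:
                shutil.copy2(src, new)  # new or changed: store a real copy


# Hypothetical usage:
# snapshot("/home/me", "/backup/2024-01-01", "/backup/2024-01-02")
```

Every snapshot looks like a complete copy of the directory tree, which is why restoring is just a matter of copying files back.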

At this moment my homedir is 6.5G. The backups are 14G and go back 6 months, including some very big changes.
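
The thinning schedule (everything for a month, then weekly, then monthly) can be sketched in a few lines. This is only an illustration of the policy, not Storebackup's configuration or code. It assumes that "a month" means 30 days and "three months" means 90 days, and that the oldest backup of each week or month is the one that survives.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, today):
    """Return the set of backup dates that survive the thinning policy."""
    keep = set()
    seen_weeks = set()
    seen_months = set()
    for day in sorted(backup_dates):
        age = (today - day).days
        if age <= 30:
            keep.add(day)  # first month: keep every daily backup
        elif age <= 90:
            week = tuple(day.isocalendar()[:2])  # (ISO year, ISO week)
            if week not in seen_weeks:
                seen_weeks.add(week)
                keep.add(day)  # one backup per week
        else:
            month = (day.year, day.month)
            if month not in seen_months:
                seen_months.add(month)
                keep.add(day)  # one backup per month
    return keep


# Hypothetical example: 180 daily backups ending today.
today = date(2024, 7, 1)
dates = [today - timedelta(days=n) for n in range(180)]
print(len(dates), "backups made,", len(backups_to_keep(dates, today)), "kept")
```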

Tape backups

I used to have a fancy Bacula setup to make backups on tape, but my tape streamer died and has not been replaced yet.
Tapes are slow but big. I used them to make backups of the entire system. A friend used to store a few of those tapes for me, in case my house burns down.

Remote backups

Once in a while I create a GPG-encrypted tarball of my homedir and upload it to a friend's hard drive. Just in case a plane crashes into the house.
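
As a rough sketch, the tarball and the encryption step could look like this. It assumes public-key encryption with the `gpg` command-line tool. The recipient address, the paths and the function names are placeholders, and the upload itself is left out.

```python
import os
import subprocess
import tarfile


def make_tarball(directory, tar_path):
    """Pack directory into a gzip-compressed tarball."""
    with tarfile.open(tar_path, "w:gz") as tar:
        tar.add(directory, arcname="home")
    return tar_path


def gpg_command(tar_path, recipient):
    """Build the gpg call that encrypts tar_path to tar_path + '.gpg'."""
    return ["gpg", "--encrypt", "--recipient", recipient,
            "--output", tar_path + ".gpg", tar_path]


def encrypted_backup(directory, tar_path, recipient):
    """Create the tarball, encrypt it and remove the unencrypted copy."""
    make_tarball(directory, tar_path)
    subprocess.run(gpg_command(tar_path, recipient), check=True)
    os.remove(tar_path)
    return tar_path + ".gpg"


# Hypothetical usage (needs gpg and a public key for the recipient):
# encrypted_backup("/home/me", "/tmp/home.tar.gz", "me@example.org")
```

Only the `.gpg` file ever has to leave the house, so the friend storing it cannot read the contents.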

Temporary internet sharing directory

Sharing files with people on the internet can be a pain. On my webserver I have a special directory that is automatically emptied every day. If I want to temporarily offer a file to someone I put it in that directory. I can then forget about it; the next day it will be deleted.
A robots.txt file tries to keep search engines out of this directory.
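
The daily cleanup does not need to be more than a few lines run from a scheduler such as cron. Below is a minimal sketch of such a job. The function name and the directory path are placeholders, not the actual setup.

```python
import os
import shutil


def empty_directory(directory):
    """Delete everything inside directory, but keep the directory itself."""
    with os.scandir(directory) as it:
        entries = list(it)  # collect first, then delete
    removed = []
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            shutil.rmtree(entry.path)  # remove subdirectories recursively
        else:
            os.remove(entry.path)  # regular files and symlinks
        removed.append(entry.name)
    return sorted(removed)


# Hypothetical usage, e.g. once a day:
# empty_directory("/var/www/tmp-share")
```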

to be continued

RAID

Every time I mention RAID someone needs to mention that RAID is no backup. Yes I know, no need to mail me about it. Unfortunately a proper solution is expensive and I'm just a cheap student. All important files are backed up, as you may have read above.
It's not realistic to give the same treatment to all my music and movies, it's just too much data.
Secondly, as a computer addict I have a need for the primary function of RAID, which is to keep my computer going if a disk breaks down.
I use three RAIDs, probably overkill, but it's fun.

1 raid0. My homedir is on RAID0. That may seem stupid, but I have plenty of backups. For my home directory I want speed. If one of the disks dies I will build a new raid0 and restore a backup.
2 raid1. My / and /boot are on raid1. As long as a single disk survives my computer can still boot, and it's fast.
3 raid6. Music, movies and other big files are on RAID6. RAID6 is slow, but it can deal with two broken disks. These files are not important enough to spend a lot of money on backups, but it would be very annoying to lose them.
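
The trade-off between these levels comes down to simple arithmetic: how much usable space is left and how many disks may fail. The sketch below shows it for the standard RAID levels. The disk counts and sizes in the example are made up and say nothing about the actual arrays.

```python
def raid_summary(level, disks, disk_size_gb):
    """Return (usable capacity in GB, number of disk failures survived)."""
    minimum = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum:
        raise ValueError("unsupported RAID level: %r" % level)
    if disks < minimum[level]:
        raise ValueError("RAID%d needs at least %d disks" % (level, minimum[level]))
    if level == 0:
        return disks * disk_size_gb, 0  # striping only, no redundancy
    if level == 1:
        return disk_size_gb, disks - 1  # every disk holds a full copy
    if level == 5:
        return (disks - 1) * disk_size_gb, 1  # one disk's worth of parity
    return (disks - 2) * disk_size_gb, 2  # RAID6: two disks' worth of parity


# Hypothetical example with 500 GB disks:
for level, disks in [(0, 2), (1, 2), (6, 6)]:
    usable, failures = raid_summary(level, disks, 500)
    print("raid%d: %d GB usable, survives %d failed disk(s)" % (level, usable, failures))
```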

I used to have a raid5+LVM for /usr and /var, but I gave up on that. A fresh install is much faster, as long as I have a backup of /etc (/var/lib/dpkg and /usr/local are also nice to have). Nowadays it's just a bunch of disks in an LVM pool.

NFS

NFS is used to share files between computers. For example, there is a Usenet client running on my system that exports its downloads over NFS. Nothing fancy here, but I couldn't leave it out.

Filesystems

I'm probably the only fan of reiserfs left. Supposedly reiserfs is less reliable than ext2/3/4. I haven't noticed it yet, and I've seen my share of broken filesystems.
If it's important you need backups anyway.
It seems that most filesystems are optimized for largish files (1 MB or more). In reality 99% of the files on my system are only a few kilobytes in size, especially on the mail partition and the backup partition. For these kinds of loads all other filesystems are slow and inefficient compared to reiserfs.
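
A claim like this is easy to check on any system. The sketch below walks a directory tree and reports which fraction of the files is at or below a size threshold. The 4096-byte default and the example path are arbitrary choices for illustration.

```python
import os


def small_file_fraction(root, threshold=4096):
    """Fraction of files under root whose size is at most threshold bytes."""
    total = 0
    small = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            try:
                size = os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable, skip it
            total += 1
            if size <= threshold:
                small += 1
    return small / total if total else 0.0


# Hypothetical usage:
# print("%.1f%% small files" % (100 * small_file_fraction("/var/mail")))
```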

There are many stories online of people complaining about reiserfs. Even though my own experiences do not agree I have accepted the fact that reiserfs is less reliable.
For the data that I do not back up I use other filesystems. At the moment it's a mix of ext3, ext4, xfs and btrfs, but that could change at any time.