CAPSLOCK2000's blog
Submitted by CAPSLOCK2000 on 10 January, 2010 - 13:16
120/10 mbit internet at home. My internet connection is faster than my internal network :)
I kinda forgot to post this, but I got a new job at the University of Tilburg and rented a house in that town.
New house and job
Submitted by CAPSLOCK2000 on 5 January, 2010 - 17:12
Rented a house, I must be the youngest "Senior" in the city.
Started on a new job as Unix admin for the University of Tilburg. This is just like playtime. Big computers and lots of coffee :)
IPv6 glue in the .name TLD
Submitted by CAPSLOCK2000 on 26 November, 2009 - 11:21
After an initial false start, it is now really possible to add IPv6 glue records for .name domains.
I'm so happy. :) :) :)
This was the last bit needed to complete my IPv6 setup. Now my network is safe in case IPv4 suddenly disappears :)
This also allowed me to pass the Hurricane Electric IPv6 certification:
Personalized weather prediction
Submitted by CAPSLOCK2000 on 10 November, 2009 - 22:00
Here's a one-liner to generate a personalized weather prediction.
Quite a fancy name for a line of pixels, but allow me to explain.
In the Netherlands there are two kinds of weather, raining and about to start raining (four if you add hail and snow). If you don't want to get wet every time you go outside you need to plan your trips. Okay, I'm exaggerating but you get the point.
While there are many nice websites, browser widgets and desktop applications dedicated to the weather, most of them are not very practical. They will give you a very detailed report on the weather throughout the day for your entire region. Typically I'm just interested in the weather at my place for the next hour or two.
My simple solution is to take an animated prediction of my country, cut out a small area around my house from every frame and append all these images side by side. This gives a small rectangle that will immediately show if and when it will rain during the next three hours.
The end result is rather small, but that is the entire point. Here's an example:
| Current Prediction | ![]() |
| Fixed example | ![]() |
Black means dry, blue means rain, the darker the more rain. Every sub-image represents 10 minutes.
This is generated with the following command:
wget http://www2.buienradar.nl/forecast/nl-eu-forecast/loop.gif -O - | \
  convert '-[1-99]' -crop 10x10+414+563 -alpha Off +repage +append ~/public_html/weer.png
Note that this is useless to you unless you live in Maastricht. If you live somewhere else in the Netherlands you can adjust the coordinates to point to your own city.
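To point the crop at another city, only the geometry argument of -crop changes. Here is a minimal sketch of building that argument, assuming you have looked up the pixel position of your own town on the radar image yourself. The function name is made up for illustration.

```python
def crop_geometry(left, top, size=10):
    """Build an ImageMagick geometry string of the form WIDTHxHEIGHT+X+Y."""
    if left < 0 or top < 0 or size <= 0:
        raise ValueError("offsets must be >= 0 and size must be > 0")
    return f"{size}x{size}+{left}+{top}"


# The Maastricht crop used in the command above.
print(crop_geometry(414, 563))
```

The two offsets are the top-left corner of the square, counted in pixels from the top-left corner of the radar image.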
Upgraded Drupal to version 6
Submitted by CAPSLOCK2000 on 29 September, 2009 - 18:14
I just upgraded this website from Drupal 4 to Drupal 6. First I upgraded to Drupal 5, fixed a few problems, and then upgraded to Drupal 6. All in all it was fairly easy. Drupal's upgrade system is quite good.
Audio: Weighted Playlists, exploitation vs. exploration
Submitted by CAPSLOCK2000 on 29 September, 2009 - 15:32
This is the fourth part of a series of articles that describe the way I run my computers. These articles are not intended as tutorials and do not contain the details to mirror my setup. They are written in one go, and most certainly contain errors. Feel free to ask for details if you are interested. I will try to update whenever it's appropriate. This article is about how I manage my audio files.
The Problem
Before we get to the technical part of this article I will have to describe my musical preferences.
- I like to explore new music; I constantly download new and unknown music.
- I like variation; listening to the same album three times in a row is not for me.
- I like albums; some songs need to be heard in their context to fully appreciate them.
- 90% of everything is crap; most albums contain only one or two decent songs.
As a consequence of these four points I have a lot of music that I hardly know, and a lot of music that is rather bad. If I only listened to the music that I know to be good I would never hear anything new. If I played all songs randomly I would be listening to bad music most of the time. In Artificial Intelligence we would call it a problem of exploration versus exploitation.
Gathering Information
So I want a playlist that balances the quality of a song with when it was played last.
To do so, this information needs to be recorded. The last play date is easy: my audio player (Amarok) automatically keeps track of that. The quality has to be determined manually. I rate my songs on a 1-10 scale (and 0 for unrated files). Rating takes a lot of time. It took me three years, and I'm still not completely done.
The solution
My solution to the playlist problem is to associate each level with a time period, which indicates how often I want to hear a song.
For example:
| Rating | Delay |
| 10 | 2 months |
| 9 | 3 months |
| 8 | 5 months |
| 7 | 8 months |
| 6 | 12 months |
| 5 | 22 months |
| 2-4 | 26 months |
| 1 | never |
This means that my favorite songs are added to the playlist two months after they have been played for the last time. Most songs are only played once every 22 months. Almost two years between plays seems like an awfully long time, but remember that these are the songs that I do not consider very good (but not bad either), or that I want to hear again before assigning a final rating. I believe that you should hear something at least twice before you can decide on it. There is a lot of music that I couldn't appreciate the first time I heard it, so when in doubt I don't delete.
Low Ratings
Rating 1 songs are never automatically queued. It's stuff that I don't want to hear, but don't want to delete either.
Rating 2-4 is lumped together in one group. Anything below rating 5 I consider bad music. I want to hear those songs once in a while in case my taste or opinion changes, but they shouldn't dominate my playlist. This playlist is limited to 150 songs.
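The rules above can be sketched in a few lines of Python. This is only an illustration of the logic, not how Amarok implements its Smart Playlists. The 30-day month, the function names and the choice of which low-rated songs survive the 150-song cap are my assumptions.

```python
from datetime import datetime, timedelta

# Delay in months per rating, copied from the table above.
# Rating 1 is absent on purpose: those songs are never queued.
DELAY_MONTHS = {10: 2, 9: 3, 8: 5, 7: 8, 6: 12, 5: 22, 4: 26, 3: 26, 2: 26}

DAYS_PER_MONTH = 30  # assumption: a month is approximated as 30 days


def is_due(rating, last_played, now):
    """Return True if a song with this rating should be queued again."""
    months = DELAY_MONTHS.get(rating)
    if months is None:  # rating 1, or unrated (0) which is handled elsewhere
        return False
    return now - last_played >= timedelta(days=months * DAYS_PER_MONTH)


def build_playlist(songs, now, low_limit=150):
    """songs is an iterable of (title, rating, last_played) tuples."""
    good, low = [], []
    for title, rating, last_played in songs:
        if not is_due(rating, last_played, now):
            continue
        (low if rating < 5 else good).append(title)
    # Ratings 2-4 share one group that is capped (assumption: keep the first ones).
    return good + low[:low_limit]


if __name__ == "__main__":
    now = datetime(2009, 9, 29)
    songs = [
        ("favourite", 10, datetime(2009, 7, 1)),
        ("played recently", 10, datetime(2009, 9, 1)),
        ("don't want to hear", 1, datetime(2000, 1, 1)),
        ("bad but kept", 3, datetime(2007, 1, 1)),
    ]
    print(build_playlist(songs, now))
```

New and unrated music is not covered by this sketch, because it has no rating to look up.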
Implementation
Amarok 1.4 has everything that is needed. I use the Smart Playlist feature to automatically generate playlists that match the above specification.
Besides the categories above I also have a special playlist for new and unrated music that's also added to the mix.
Example
Right now my playlist looks like this:
| Rating | # Songs |
| 10 | 39 |
| 9 | 50 |
| 8 | 68 |
| 7 | 88 |
| 6 | 265 |
| 5 | 489 |
| 2-4 | 150 |
| new | 321 |
Notes
Albums
I mentioned earlier that I like albums. Amarok has a nice "random album" feature, that randomly selects an album from the current playlists, and plays all songs from that album that are on the playlist. When used with the above set of playlists it ensures that songs from the same album are always played in order, but the good songs will be on the list more often than the bad tracks.
Size
I try to keep my playlist between 1000 and 1500 songs. A short playlist loads much faster, but a long playlist increases the probability that more than one song from the same album is on the list.
Storage: Filesystems & Backups
Submitted by CAPSLOCK2000 on 18 September, 2009 - 13:15
This is the third part of a series of articles that describe the way I run my computers. These articles are not intended as tutorials and do not contain the details to mirror my setup. Feel free to ask for details if you are interested. This article is about how I do backups and synchronization of my home directory.
E-mail: Centralized server
Submitted by CAPSLOCK2000 on 17 September, 2009 - 19:44
This is the second part of a series of articles that describe the way I run my computers. This one is about e-mail. These articles are not intended as tutorials and do not contain the details to mirror my setup. Feel free to ask for details if you are interested.
The goals
The goal of my mail system is to centralize all my e-mail in one location. I receive a lot of mail, on many different e-mail addresses. Centralizing is the only way to keep it manageable and pay proper attention to all mail.
I have a number of requirements:
- Mail must be accessible with a normal mail client, webmail is too slow for me
- Mail must be accessible over the web
- Mail must be searchable
- Mail must be filtered into folders
- Spam must be removed
One important requirement I have for all my software is that I do not want to be locked into old, unmaintained or bad software. Therefore I try to use open standards and file formats as much as possible. Every bit of software mentioned on this page has at least one decent alternative. If needed I could replace every program without any loss of data or functionality.
Receiving mail
E-mail arrives at my system in two different ways. The first is by direct SMTP, and the second is through fetchmail.
Fetchmail is a program that fetches mail from POP3 and IMAP servers. Upon retrieving an e-mail it is handed over to my SMTP server. There is one e-mail account that does not work (directly) with fetchmail, and that is my Hotmail address. For that I use a program called Hotway which is a http2pop3 proxy. It acts like a pop3 server, but in the background it logs into the Hotmail website and retrieves the e-mails over HTTP. It's butt-ugly, but it works.
Filtering spam
From now on all mail is handled as if it arrived directly over SMTP.
Let's follow a mail through the system.
The first thing to do is spam-checking. The longer you wait with that, the more time you waste on mail that will be deleted anyway. Spamassassin with the Bayesian classifier is my favorite anti-spam solution.
When I say that it's the first thing, I really mean it. My mail server does not acknowledge the reception of an e-mail until it has been scanned. If the system thinks it's spam it will not acknowledge the reception, but refuse it. The sending server will then have to bounce my error message back to the sender.
However, most spam has a falsified sender address. Sending it back will only increase the problem. Therefore, if the server is truly convinced that it is spam, it will _not_ refuse it. Instead it will accept it and delete it immediately. No need to burden the sending server any more with a mail that's spam anyway.
The other side of the story is mail that the system cannot decide on. These mails are put into a special mail folder that I manually check about once a week. It's been months since my last false positive, but I prefer to err on the side of caution.
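The four possible outcomes can be summarized as a small decision function. This is a sketch of the policy only. The score thresholds are invented for the example, since the real values are not given here, and the actual decision is made by the mail server together with Spamassassin, not by Python.

```python
# Hypothetical thresholds; the real ones are not given in this article.
DISCARD_SCORE = 15.0  # assumption: certainly spam, accept and silently delete
REJECT_SCORE = 5.0    # assumption: probably spam, refuse during the SMTP session
UNSURE_SCORE = 3.0    # assumption: undecided, park for the weekly manual check


def smtp_decision(score):
    """Map a spam score to one of the four outcomes described above."""
    if score >= DISCARD_SCORE:
        return "accept-and-discard"
    if score >= REJECT_SCORE:
        return "reject"
    if score >= UNSURE_SCORE:
        return "deliver-to-review-folder"
    return "deliver"


if __name__ == "__main__":
    for score in (0.5, 3.5, 7.0, 20.0):
        print(score, smtp_decision(score))
```

The order of the checks matters: the "truly convinced" case has to be tested before the plain "reject" case, otherwise the most obvious spam would still be bounced.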
Bayesian classifier
Spamassassin includes a Bayesian classifier, that is trained on your own e-mail. It learns the differences between e-mails that you have designated 'spam' or 'ham' (not-spam). For this purpose I have two special mail folders named 'spam' and 'ham'. Any mail that I put into these folders is automatically added to the training set of the classifier. If I find a mail that is misclassified I put it into those folders.
(This way of training, called Train-On-Error, is not optimal. It will not learn changes in your behaviour until you find out that it has been making mistakes. Training on my entire mail archive is not feasible because it would take too much time. I intend to write something that uses a sample of recent mail.)
Sorting into mailboxes
Now that the mail has been accepted, it needs to be delivered into a local mail folder.
I use the Sieve mail filtering language. Sieve is a standardized language to define mail filters. There is a special protocol for (remotely) managing Sieve scripts. The advantage of Sieve is that all filtering is done on the server, but I can still write filters from within my mail client.
The filter itself is not very interesting, but one nice detail is that I use special-purpose e-mail addresses to simplify filtering. For example, if I were to send you an e-mail, I would use the e-mail address "casper-you@gielen.name". That makes it very easy to filter your replies into a mailbox. A nice little extra is that it helps me to manage spam. If I received spam on casper-you@gielen.name, I would know that you are somehow responsible, as I did not use that address with anybody else. I would then know not to trust you, and I could add that e-mail address to my blacklist without the risk of dropping any real mail. Now that I've posted that address on this page I should probably add it to the blacklist straight away.
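To illustrate the idea of per-correspondent addresses, here is a small Python sketch. The real filtering is done in Sieve on the server. The function names, the "INBOX" and "DISCARD" results and the blacklist contents are made up for the example.

```python
def tag_from_recipient(address, user="casper", separator="-"):
    """Return the tag in user-tag@domain, or None if the address has no tag."""
    local_part, _, _domain = address.partition("@")
    prefix = user + separator
    if local_part.startswith(prefix) and len(local_part) > len(prefix):
        return local_part[len(prefix):]
    return None


def route(address, blacklist=frozenset()):
    """Pick a destination for a mail based on the address it was sent to."""
    tag = tag_from_recipient(address)
    if tag is None:
        return "INBOX"
    if tag in blacklist:
        return "DISCARD"  # this address has leaked, drop everything sent to it
    return tag  # use the tag as the folder name


if __name__ == "__main__":
    print(route("casper-you@gielen.name"))
    print(route("casper-you@gielen.name", blacklist={"you"}))
    print(route("casper@gielen.name"))
```

Because every tag is handed out to exactly one correspondent, blacklisting a tag cannot hit mail from anybody else.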
Storing mail
All mail is stored in the Maildir format. This format stores each message in a separate file. Also useful is that the filename gives some information about the status of the mail, such as whether it has been read or whether it is flagged as important.
I use this to automatically clean up some mailboxes. I follow a bunch of mailing lists that are also archived on the internet, so there is no point in keeping my own copy of those mails for more than a few days. I wrote a script that deletes mail from those folders that has been read, is not flagged as important and has not been accessed for thirty days.
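My actual script is not shown here, but the selection logic could look roughly like the sketch below. It assumes the usual Maildir naming, where the flags follow ":2," at the end of the filename (S for seen, F for flagged), and it assumes the filesystem records access times. It only lists candidates; deleting them is left to the caller.

```python
import os
import time

THIRTY_DAYS = 30 * 24 * 60 * 60  # in seconds


def maildir_flags(filename):
    """Return the set of flag letters after ':2,' in a Maildir filename."""
    _, sep, flags = filename.rpartition(":2,")
    return set(flags) if sep else set()


def expired_messages(cur_dir, now=None, max_age=THIRTY_DAYS):
    """Yield paths of read, unflagged messages not accessed for max_age seconds."""
    now = time.time() if now is None else now
    for name in sorted(os.listdir(cur_dir)):
        flags = maildir_flags(name)
        if "S" not in flags or "F" in flags:
            continue  # unread or flagged as important: keep it
        path = os.path.join(cur_dir, name)
        if now - os.stat(path).st_atime >= max_age:
            yield path
```

A caller would loop over expired_messages() for the cur directory of each mailing list folder (the path is whatever your own setup uses) and call os.remove() on each result. If the filesystem is mounted with noatime, the modification time would be a safer thing to check.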
Another advantage is that all the standard Unix tools can be used to manage those files. It also avoids most problems associated with simultaneous connections to the same mailbox, as it is no longer necessary to lock entire mailboxes.
The biggest disadvantage is that you'll soon have many thousands of files and most filesystems have terrible performance on many small files. Fortunately it's no problem for Reiserfs.
Reading mail
Mail is served over IMAP through Dovecot, a small but very powerful IMAP server. IMAP is really the only choice for this. POP3 is not suitable for leaving mail on the server, and whatever Exchange uses is only compatible with Outlook.
I read most of my mail through KMail, closely followed by Icedove (aka Mozilla Thunderbird). When logged in remotely I tend to fall back to mutt. mutt is still the most powerful mail client that I know of. If I need to do anything really fancy, then mutt is usually the best choice.
I also run the Ilohamail webmail client. It's a very simple webmail client, but it is extremely fast. I use the webmail when I'm not allowed, or don't have the time, to log into my own system with SSH.
Miscellaneous
That's about it. I could tell a lot more, for example that everything is secured with SSL, but I think this article is long enough as it is. Good night.
Audio: 2.1 + headphones for the lazy
Submitted by CAPSLOCK2000 on 17 September, 2009 - 19:08
I've decided that I should document as much about my setup as possible. I'll start off easily with my audio setup.
My computer has a pair of normal speakers and a subwoofer. During the night I use headphones. In the past this involved plugging the headphones into the amplifier to switch.
However the Creative SoundBlaster Audigy allows for a much more nifty setup. I suppose that other 5.1 capable sound cards are probably able to do the same.
Pixelpipe
Submitted by CAPSLOCK2000 on 9 September, 2009 - 13:10
Testing pixelpipe. Pixelpipe promises to forward this message to multiple sites. Think of it as a repeater for social sites. This would nicely complement Friendfeed, which aggregates social sites. My goal is to have one or two sites that I have to follow and update instead of a gazillion different sites.
Test
Submitted by CAPSLOCK2000 on 9 September, 2009 - 13:00
Testing pixelpipe. Pixelpipe promises to forward this message to multiple sites. Think of it as a repeater for social sites. This would nicely complement Friendfeed, which aggregates social sites. My goal is to have one or two sites that I have to follow and update instead of a gazillion different sites.
Gandi supports IPv6 glue records for .name
Submitted by CAPSLOCK2000 on 31 August, 2009 - 15:29
I've finally gotten IPv6 glue records for my domain. If I can trust Hurricane Electric's IPv6 progress report, I'm the first one who has IPv6 glue records in the .name zone.
I've been trying to get IPv6 glue records for some time, but it's not easy for a .name domain if your registrar doesn't support it.
There is a trick where you use another registrar just to push the glue records, but .name is quite rare and I have not found anyone that would provide that service to me at a reasonable price.
Gandi promised a solution a while ago. It didn't work for me at first, but now it does. Just register your DNS servers with them and add them to your domain, just like you would do with IPv4. There is one caveat: you can register only one IP address per hostname. That means that you will need to add separate records for IPv4 and IPv6. That shouldn't be a problem for anyone though.
update: I celebrated too early. Even though the interface accepted the request, the IPv6 glue records are not actually published. I've mailed Verisign and they promised that support for the .name domain would be re-enabled on November 7. I'm looking forward to it.
update2: It works now! http://www.gielen.name/IPv6 glue for NAME
DNSSEC made easy with zonesigner
Submitted by CAPSLOCK2000 on 28 August, 2009 - 14:01
I've just tested zonesigner from dnssec-tools.org. It was surprisingly easy. If you think that DNSSEC is a complex mess you should try zonesigner. It's pretty much as close to a turn-key solution for DNSSEC as possible. You don't really need to understand what's happening. Just follow the instructions and you'll be fine.
Adding DNSSEC to your domain is still not for everybody, but if you feel confident about administering BIND, then DNSSEC should be within your reach as well.
Amarok 1.4 not rebuilding collection
Submitted by CAPSLOCK2000 on 11 August, 2009 - 13:20
Mostly a reminder to myself:
When Amarok 1.4 refuses to rebuild its collection, even when starting with a new configuration, try "touch /dir/to/mp3".
PS. Yes, I still use Amarok 1.4. Dynamic playlists have become a requirement for me. While the "fuzzy" playlists of Amarok 2 are a nice idea, they do not provide the level of control that is available in Amarok 1.4.
The real show-stopper for me is that there are no relative time intervals in Amarok 2. I want to be able to ask for "All tracks that have not been played in the last 3 months". Amarok 2 only supports absolute times (e.g. "All tracks that have not been played since 01-01-2003").
PS2. What I want is queries like:
(rating == 5 && last_played > 3 months) ||
(rating == 4.5 && last_played > 4 months) ||
(rating == 4.0 && last_played > 5 months)
I'll probably have to write it myself.
Recovering an inactive and dirty RAID
Submitted by CAPSLOCK2000 on 11 August, 2009 - 11:26
Last night my computer crashed, and today my RAID5 array wouldn't start. That was extra painful as the array was already one drive short.
dmesg looked something like this (the device is /dev/md6):
[ 2577.637615] raid5: device sdb5 operational as raid disk 0
[ 2577.637626] raid5: device hde5 operational as raid disk 3
[ 2577.637633] raid5: device hdg5 operational as raid disk 2
[ 2577.637639] raid5: device hdc5 operational as raid disk 1
[ 2577.638576] raid5: allocated 5264kB for md6
[ 2577.638629] raid5: cannot start dirty degraded array for md6
[ 2577.639471] RAID5 conf printout:
[ 2577.639476] --- rd:5 wd:4
[ 2577.639482] disk 0, o:1, dev:sdb5
[ 2577.639488] disk 1, o:1, dev:hdc5
[ 2577.639494] disk 2, o:1, dev:hdg5
[ 2577.639499] disk 3, o:1, dev:hde5
[ 2577.639504] raid5: failed to run raid set md6
[ 2577.639509] md: pers->run() failed ...
I don't have backups for that data, so I was adamant about getting the array back to work. After all, I knew it was at least 99% correct. I would be more than happy to accept a little corruption to get the majority of my data back.
I was able to force my kernel to believe that the disks were okay using the following commands.
WARNING 1: don't try this unless you are desperate and all alternatives have failed.
WARNING 2: I wrote the following down from memory. My brain does not do raid, and I'm not entirely sure the procedure below is complete and correct. However, as the information below is hard to find, I decided to write it down anyway. Look it up _before_ you use it!
(In this example I'll use /dev/md6 as the raid device)
1. Stop the array:
mdadm -S /dev/md6
2. Make the array read only:
echo 1 > /sys/module/md_mod/parameters/start_ro
3. Force the kernel to believe the array is clean:
echo "clean" > /sys/block/md6/md/array_state
(ignore the error message)
4. Restart the array:
mdadm -A --force /dev/md6
You can now take a look at /proc/mdstat to see if the array has been started, and then try to mount it. You'll probably need to do an fsck.
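If you'd rather not eyeball /proc/mdstat, the check can be scripted. Below is a minimal sketch that assumes the usual "mdX : state ..." layout of that file. The function name is made up and the sample text in the test is written by hand, not captured from my machine.

```python
def array_state(mdstat_text, device="md6"):
    """Return the state word for a device (e.g. 'active' or 'inactive'),
    or None if the device is not listed at all."""
    for line in mdstat_text.splitlines():
        name, sep, rest = line.partition(" : ")
        if sep and name == device:
            words = rest.split()
            return words[0] if words else None
    return None


if __name__ == "__main__":
    # Reads the live file; only meaningful on a Linux box with md arrays.
    with open("/proc/mdstat") as handle:
        print(array_state(handle.read()))
```

Only try to mount the array when this reports "active". Anything else means the forced assembly did not take.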
(Now would be a good time to consider a backup strategy)






