Gitorious went down this morning

Our frontend web server went down at 6:24 CET this morning. We will be updating this post as we bring the server back up. Here’s what we know right now:

  • At 6:24 CET a kernel oops occurred. The alarms at our hosting provider went off, and the server was rebooted.
  • Since the file system holding the repositories hadn’t had a full consistency check since August 2012, a fsck was started.
  • When fsck hadn’t completed at 8:00 CET, the server was routinely rebooted, and another fsck process was started at 8:04 CET.
  • The last time we ran a full fsck on the file system, it took about 2.5 hours. Since then, however, we have installed dedicated storage for our servers, which has higher I/O capacity than the storage we were running on in August last year.
  • 10:06 CET: The server is back up. We will upgrade the kernel and do another reboot, which will hopefully resolve the kernel issue we encountered earlier today. Expect a few minutes of downtime shortly.
  • 10:13 CET: All systems are running again, with an updated kernel.

2 Comments

  1. Kioob
    Posted February 26, 2013 at 8:40 am | Permalink

    Hi,

    what kind of FS do you use to have 2.5 hours of fsck ? ext3 ?

  2. Marius Mathiesen
    Posted February 26, 2013 at 8:52 am | Permalink

    @Kiiob: we’re currently on ext2 (or ext3, I don’t remember exactly). We had scheduled a conversion to ext4, but our main ops guy is at the hospital :-(

