|
Posted by: stak
Tags: mozilla, hardware, projects
Posted on: 2014-10-28 21:24:15
I've been wanting to build a NAS (network-attached storage) box for a while now, and the ominous creaking noises from the laptop I was previously using as a file server prompted me to finally take action. I wanted to build rather than buy because (a) I wanted more control over the machine and OS, (b) I figured I'd learn something along the way and (c) thought it might be cheaper. This blog posts documents the decisions and mistakes I made and problems I ran into.
First step was figuring out the level of data redundancy and storage space I wanted. After reading up on the different RAID levels I figured 4 drives with 3 TB each in a RAID5 configuration would suit my needs for the next few years. I don't have a huge amount of data so the ~9TB of usable space sounded fine, and being able to survive single-drive failures sounded sufficient to me. For all critical data I keep a copy on a separate machine as well.
I chose to go with software RAID rather than hardware because I've read horror stories of hardware RAID controllers going obsolete and being unable to find a replacement, rendering the data unreadable. That didn't sound good. With an open-source software RAID controller at least you can get the source code and have a shot at recovering your data if things go bad.
With this in mind I started looking at software options - a bit of searching took me to FreeNAS which sounded exactly like what I wanted. However after reading through random threads in the user forums it seemed like the FreeNAS people are very focused on using ZFS and hardware setups with ECC RAM. From what I gleaned, using ZFS without ECC RAM is a bad idea, because errors in the RAM can cause ZFS to corrupt your data silently and unrecoverably (and worse, it causes propagation of the corruption). A system that makes bad situations worse didn't sound so good to me.
I could have still gone with ZFS with ECC RAM but from some rudimentary searching it sounded like it would increase the cost significantly, and frankly I didn't see the point. So instead I decided to go with NAS4Free (which actually was the original FreeNAS before iXsystems bought the trademark and forked the code) which allows using a UFS file system in a software RAID5 configuration.
So with the software decisions made, it was time to pick hardware. I used this guide by Sam Kear as a starting point and modified a few things here and there. I ended up with this parts list that I mostly ordered from canadadirect.com. (Aside: I wish I had discovered pcpartpicker.com earlier in the process as it would have saved me a lot of time). They shipped things to me in 5 different packages which arrived on 4 different days using 3 different shipping services. Woo! The parts I didn't get from canadadirect.com I picked up at a local Canada Computers store. Then, last weekend, I put it all together.
It's been a while since I've built a box so I screwed up a few things and had to rewind (twice) to fix them. Took about 3 hours in total for assembly; somebody who knew what they were doing could have done it in less than one. I mostly blame lack of documentation with the chassis since there were a bunch of different screws and it wasn't obvious which ones I had to use for what. They all worked for mounting the motherboard but only one of them was actually correct and using the wrong one meant trouble later.
In terms of the hardware compatibility I think my choices were mostly sound, but there were a few hitches. The case and motherboard both support up to 6 SATA drives (I'm using 4, giving me some room to grow). However, the PSU only came with 4 SATA power connectors which means I'll need to get some adaptors or maybe a different PSU if I need to add drives. The other problem was that the chassis comes with three fans (two small ones at the front, one big one at the back) but there was only one chassis power connector on the motherboard. I plugged the big fan in and so far the machine seems to be staying pretty cool so I'm not too worried. Does seem like a waste to have those extra unused fans though.
Finally, I booted it up using a monitor/keyboard borrowed from another machine, and ran memtest86 to make sure the RAM was good. It was, so I flashed the NAS4Free LiveUSB onto a USB drive and booted it up. Unfortunately after booting into NAS4Free my keyboard stopped working. I had to disable the USB 3.0 stuff in the BIOS to get around that. I don't really care about having USB 3.0 support on this machine so not a big deal. It took me some time to figure out what installation mode I wanted to use NAS4Free in. I decided to do a full install onto a second USB drive and not have a swap partition (figured hosting swap over USB would be slow and probably unnecessary).
So installing that was easy enough, and I was able to boot into the full NAS4Free install and configure it to have a software RAID5 on the four disks. Things generally seemed OK and I started copying stuff over.. and then the box rebooted. It also managed to corrupt my installation somehow, so I had to start over from the LiveUSB stick and re-install. I had saved the config from the first time so it was easy to get it back up again, and once again I started putting data on there. Again it rebooted, although this time it didn't corrupt my installation. This was getting worrying, particularly since the system log files provided no indication as to what went wrong.
My first suspicion was that the RAID wasn't fully initialized and so copying data onto it resulted in badness. The array was "rebuilding" and I'm supposed to be able to use it then, but I figured I might as well wait until it was done. Turns out it's going to be rebuilding for the next ~20 days because RAID5 has to read/write the entire disk to initialize fully and in the days of multi-terabyte disk this takes forever. So in retrospect perhaps RAID5 was a poor choice for such large disks.
Anyway in order to debug the rebooting, I looked up the FreeBSD kernel debugging documentation, and that requires having a swap partition that the kernel can dump a crash report to. So I reinstalled and set up a swap partition this time. This seemed to magically fix the rebooting problem entirely, so I suspect the RAID drivers just don't deal well when there's no swap, or something. Not an easy situation to debug if it only happens with no swap partition but you need a swap partition to get a kernel dump.
So, things were good, and I started copying more data over and configuring more stuff and so on. The next problem I ran into was the USB drive to which I had installed NAS4Free started crapping out with read/write errors. This wasn't so great but by this point I'd already reinstalled it about 6 or 7 times, so I reinstalled again onto a different USB stick. The one that was crapping out seems to still work fine in other machines, so I'm not sure what the problem was there. The new one that I used, however, was extremely slow. Things that took seconds on the previous drive took minutes on this one. So I switched again to yet another drive, this time an old 2.5" internal drive that I have mounted in an enclosure through USB.
And finally, after installing the OS at least I've-lost-count-how-many times, I have a NAS that seems stable and appears to work well. To be fair, reinstalling the OS is a pretty painless process and by the end I could do it in less than 10 minutes from sticking in the LiveUSB to a fully-configured working system. Being able to download the config file (which includes not just the NAS config but also user accounts and so on) makes it pretty painless to restore your system to exactly the way it was. The only additional things I had to do were install a few FreeBSD packages and unpack a tarball into my home directory to get some stuff I wanted. At no point was any of the data on the RAID array itself lost or corrupted, so I'm pretty happy about that.
In conclusion, setup was a bit of a pain, mostly due to unclear documentation and flaky USB drives (or drivers) but now that I have it set up it seems to be working well. If I ever have to do it over I might go for something other than RAID5 just because of the long rebuild time but so far it hasn't been an actual problem.
|
(ZFS rebuilds are in theory faster since they only care about used space.)
Back when arrays of 15 commodity drives were a thing I had to worry about, I used to trade in drives from my collection of spares so they wouldn't all be of the same vintage. They were mostly 15x1TB, though, and rebuilds would still take a while. I also configured them as RAID-Z3 (or a stripe across three RAID5 arrays) with a hot spare, but I digress.
This is all good information, thanks, since I'm probably going to be doing this for myself pretty soon. I think I will go with Linux, though.