Blog


Building a NAS

2014-10-28 21:24:15

I've been wanting to build a NAS (network-attached storage) box for a while now, and the ominous creaking noises from the laptop I was previously using as a file server prompted me to finally take action. I wanted to build rather than buy because (a) I wanted more control over the machine and OS, (b) I figured I'd learn something along the way and (c) I thought it might be cheaper. This blog post documents the decisions and mistakes I made and the problems I ran into.

First step was figuring out the level of data redundancy and storage space I wanted. After reading up on the different RAID levels I figured 4 drives with 3 TB each in a RAID5 configuration would suit my needs for the next few years. I don't have a huge amount of data so the ~9TB of usable space sounded fine, and being able to survive single-drive failures sounded sufficient to me. For all critical data I keep a copy on a separate machine as well.
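As a quick check on the capacity figure above: RAID5 spends one drive's worth of space on parity, so usable space is roughly (number of drives - 1) x drive size. Here is a minimal sketch of that arithmetic. The function name is made up for illustration, and it ignores filesystem overhead and the TB-versus-TiB difference.

```python
def raid5_usable_tb(num_drives: int, drive_size_tb: float) -> float:
    """Approximate usable capacity of a RAID5 array, in TB.

    One drive's worth of space goes to parity; filesystem overhead
    is not accounted for here.
    """
    if num_drives < 3:
        raise ValueError("RAID5 needs at least 3 drives")
    return (num_drives - 1) * drive_size_tb


# 4 drives of 3 TB each, as in the build described above.
print(raid5_usable_tb(4, 3))
```

For 4 drives of 3 TB this gives 9, which matches the ~9TB mentioned above.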

I chose to go with software RAID rather than hardware because I've read horror stories of hardware RAID controllers going obsolete and being unable to find a replacement, rendering the data unreadable. That didn't sound good. With an open-source software RAID controller at least you can get the source code and have a shot at recovering your data if things go bad.

With this in mind I started looking at software options - a bit of searching took me to FreeNAS which sounded exactly like what I wanted. However after reading through random threads in the user forums it seemed like the FreeNAS people are very focused on using ZFS and hardware setups with ECC RAM. From what I gleaned, using ZFS without ECC RAM is a bad idea, because errors in the RAM can cause ZFS to corrupt your data silently and unrecoverably (and worse, it causes propagation of the corruption). A system that makes bad situations worse didn't sound so good to me.

I could have still gone with ZFS with ECC RAM but from some rudimentary searching it sounded like it would increase the cost significantly, and frankly I didn't see the point. So instead I decided to go with NAS4Free (which actually was the original FreeNAS before iXsystems bought the trademark and forked the code) which allows using a UFS file system in a software RAID5 configuration.

So with the software decisions made, it was time to pick hardware. I used this guide by Sam Kear as a starting point and modified a few things here and there. I ended up with this parts list that I mostly ordered from canadadirect.com. (Aside: I wish I had discovered pcpartpicker.com earlier in the process as it would have saved me a lot of time). They shipped things to me in 5 different packages which arrived on 4 different days using 3 different shipping services. Woo! The parts I didn't get from canadadirect.com I picked up at a local Canada Computers store. Then, last weekend, I put it all together.

It's been a while since I've built a box so I screwed up a few things and had to rewind (twice) to fix them. Took about 3 hours in total for assembly; somebody who knew what they were doing could have done it in less than one. I mostly blame lack of documentation with the chassis since there were a bunch of different screws and it wasn't obvious which ones I had to use for what. They all worked for mounting the motherboard but only one of them was actually correct and using the wrong one meant trouble later.

In terms of the hardware compatibility I think my choices were mostly sound, but there were a few hitches. The case and motherboard both support up to 6 SATA drives (I'm using 4, giving me some room to grow). However, the PSU only came with 4 SATA power connectors which means I'll need to get some adaptors or maybe a different PSU if I need to add drives. The other problem was that the chassis comes with three fans (two small ones at the front, one big one at the back) but there was only one chassis power connector on the motherboard. I plugged the big fan in and so far the machine seems to be staying pretty cool so I'm not too worried. Does seem like a waste to have those extra unused fans though.

Finally, I booted it up using a monitor/keyboard borrowed from another machine, and ran memtest86 to make sure the RAM was good. It was, so I flashed the NAS4Free LiveUSB onto a USB drive and booted it up. Unfortunately after booting into NAS4Free my keyboard stopped working. I had to disable the USB 3.0 stuff in the BIOS to get around that. I don't really care about having USB 3.0 support on this machine so not a big deal. It took me some time to figure out what installation mode I wanted to use NAS4Free in. I decided to do a full install onto a second USB drive and not have a swap partition (figured hosting swap over USB would be slow and probably unnecessary).

So installing that was easy enough, and I was able to boot into the full NAS4Free install and configure it to have a software RAID5 on the four disks. Things generally seemed OK and I started copying stuff over.. and then the box rebooted. It also managed to corrupt my installation somehow, so I had to start over from the LiveUSB stick and re-install. I had saved the config from the first time so it was easy to get it back up again, and once again I started putting data on there. Again it rebooted, although this time it didn't corrupt my installation. This was getting worrying, particularly since the system log files provided no indication as to what went wrong.

My first suspicion was that the RAID wasn't fully initialized and so copying data onto it resulted in badness. The array was "rebuilding" and I'm supposed to be able to use it then, but I figured I might as well wait until it was done. Turns out it's going to be rebuilding for the next ~20 days because RAID5 has to read/write the entire disk to initialize fully and in the days of multi-terabyte disks this takes forever. So in retrospect perhaps RAID5 was a poor choice for such large disks.
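To put the ~20 day estimate in perspective, here is a rough back-of-the-envelope sketch of the per-disk throughput it implies. It assumes a 3 TB disk is exactly 3e12 bytes and that initialization touches every byte of each disk once. Both are simplifications, and the 50 MB/s figure at the end is a purely hypothetical rate for comparison, not something I measured.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def implied_rate_mb_per_s(disk_bytes: float, days: float) -> float:
    """Average throughput (MB/s) needed to touch every byte in `days`."""
    return disk_bytes / (days * SECONDS_PER_DAY) / 1e6


def days_to_initialize(disk_bytes: float, rate_mb_per_s: float) -> float:
    """Days needed to touch every byte at a steady `rate_mb_per_s`."""
    return disk_bytes / (rate_mb_per_s * 1e6) / SECONDS_PER_DAY


rate = implied_rate_mb_per_s(3e12, 20)
print(f"implied rate: {rate:.2f} MB/s")

# Hypothetical comparison rate, chosen only for illustration.
print(f"at 50 MB/s: {days_to_initialize(3e12, 50):.2f} days")
```

The implied rate works out to under 2 MB/s per disk, so the estimate presumably reflects throttling or contention with the copying I was doing rather than raw disk speed, though that is a guess.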

Anyway in order to debug the rebooting, I looked up the FreeBSD kernel debugging documentation, and that requires having a swap partition that the kernel can dump a crash report to. So I reinstalled and set up a swap partition this time. This seemed to magically fix the rebooting problem entirely, so I suspect the RAID drivers just don't deal well when there's no swap, or something. Not an easy situation to debug if it only happens with no swap partition but you need a swap partition to get a kernel dump.

So, things were good, and I started copying more data over and configuring more stuff and so on. The next problem I ran into was the USB drive to which I had installed NAS4Free started crapping out with read/write errors. This wasn't so great but by this point I'd already reinstalled it about 6 or 7 times, so I reinstalled again onto a different USB stick. The one that was crapping out seems to still work fine in other machines, so I'm not sure what the problem was there. The new one that I used, however, was extremely slow. Things that took seconds on the previous drive took minutes on this one. So I switched again to yet another drive, this time an old 2.5" internal drive that I have mounted in an enclosure through USB.

And finally, after installing the OS at least I've-lost-count-how-many times, I have a NAS that seems stable and appears to work well. To be fair, reinstalling the OS is a pretty painless process and by the end I could do it in less than 10 minutes from sticking in the LiveUSB to a fully-configured working system. Being able to download the config file (which includes not just the NAS config but also user accounts and so on) makes it pretty painless to restore your system to exactly the way it was. The only additional things I had to do were install a few FreeBSD packages and unpack a tarball into my home directory to get some stuff I wanted. At no point was any of the data on the RAID array itself lost or corrupted, so I'm pretty happy about that.

In conclusion, setup was a bit of a pain, mostly due to unclear documentation and flaky USB drives (or drivers) but now that I have it set up it seems to be working well. If I ever have to do it over I might go for something other than RAID5 just because of the long rebuild time but so far it hasn't been an actual problem.


Prompts with ANSI escape sequences

2014-10-19 21:35:15

Tip of the day: If your shell uses the readline library (e.g. bash), and you have ANSI escape sequences in your prompt, you should surround the ANSI escape sequences with \001 and \002 so that readline knows they are "invisible characters". If you don't, readline can end up mis-positioning your cursor and generally screwing up the display.

For example, I used to have this in my bashrc file:

export PS1='\u@\033[01;31m\h\033[m \W$ '

and that caused problems if for example I had a long command and used ctrl+a to get to the start of it.

Now I have this:

export PS1='\u@\001\033[01;31m\002\h\001\033[m\002 \W$ '

and all is well.

The \001 and \002 are defined as RL_PROMPT_START_IGNORE and RL_PROMPT_END_IGNORE in readline.h.
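To illustrate why the markers matter, here is a small Python sketch (not anything bash or readline actually runs) that wraps ANSI color sequences in \001 and \002 and then counts the characters that would be treated as visible. The helper names and the sample prompt text are made up. The sample stands in for a prompt after \u, \h and \W have been expanded.

```python
import re

# Matches ANSI SGR (color/style) sequences such as "\033[01;31m" or "\033[m".
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")

# Same byte values as the constants in readline.h.
RL_PROMPT_START_IGNORE = "\x01"
RL_PROMPT_END_IGNORE = "\x02"


def wrap_invisible(prompt: str) -> str:
    """Surround every ANSI SGR sequence with the readline ignore markers."""
    return ANSI_SGR.sub(
        lambda m: RL_PROMPT_START_IGNORE + m.group(0) + RL_PROMPT_END_IGNORE,
        prompt,
    )


def visible_width(prompt: str) -> int:
    """Count characters outside of \\001...\\002 regions."""
    ignored = re.compile(r"\x01.*?\x02", re.DOTALL)
    return len(ignored.sub("", prompt))


raw = "kats@\x1b[01;31mhost\x1b[m dir$ "  # made-up expanded prompt
fixed = wrap_invisible(raw)
print(visible_width(raw), visible_width(fixed))
```

Without the markers the escape bytes are counted as if they took up columns on screen, which is the kind of miscount that leads to the cursor ending up in the wrong place. Note that wrap_invisible is not idempotent: calling it on an already-wrapped prompt would wrap the sequences a second time.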


Google-free android usage

2014-10-18 22:42:19

When I switched from using a BlackBerry to an Android phone a few years ago it really irked me that the only way to keep my contacts info on the phone was to also let Google sync them into their cloud. This may not be true universally (I think some Samsung phones will let you store contacts to the SD card) but it was true for the phone I was using then and is true on the Nexus 4 I'm using now. It took a lot of painful digging through Android source and googling, but I successfully ended up writing a bunch of code to get around this.

I've been meaning to put up the code and post this for a while, but kept procrastinating because the code wasn't generic/pretty enough to publish. It still isn't but it's better to post it anyway in case somebody finds it useful, so that's what I'm doing.

In a nutshell, what I wrote is an Android app that includes (a) an account authenticator, (b) a contacts sync adapter and (c) a calendar sync adapter. On a stock Android phone this will allow you to create an "account" on the device and add contacts/calendar entries to it.

Note that I wrote this to interface with the way I already have my data stored, so the account creation process actually tries to validate the entered credentials against a webhost, and the contacts sync adapter is actually a working one-way sync adapter that will download contact info from a remote server in vcard format and update the local database. The calendar sync adapter, though, is just a dummy. You're encouraged to rip out the parts that you don't want and use the rest as you see fit. It's mostly meant to be a working example of how this can be accomplished.

The net effect is that you can store contacts and calendar entries on the device so they don't get synced to Google, but you can still use the built-in contacts and calendar apps to manipulate them. This benefits from much better integration with the rest of the OS than if you were to use a third-party contacts or calendar app.

Source code is on Github: staktrace/pimple-android.


Devil's advocate on solar

2014-10-15 22:17:55

Sometimes I wonder if solar panels really are better than fossil fuels. On the one hand, yay renewable energy. On the other hand, aren't we trying to reflect away more of the sun's energy to avoid amplifying global warming? Solar panels seem to work counter to that goal - they capture more and more of the sun's energy and trap it here.

Arguably that energy ends up mostly dissipated as heat (e.g. in the case of a solar-powered data center), and that would be the same if the data center were powered by fossil fuels. But I can't help thinking of our planet as a mostly-closed system that ordinarily emits about the same amount of energy that it takes in (that may not be right). Solar panels seem like they would upset that balance.

I wonder if anybody has done studies to measure this sort of thing.


Maker Party shout-out

2014-09-18 10:48:00

I've blogged before about the power of web scale; about how important it is to ensure that everybody can use the web and to keep it as level a playing field as possible. That's why I love hearing about announcements like this one: 127K Makers, 2513 Events, 86 Countries, and One Party That Just Won't Quit. Getting more people all around the world to learn about how the web works and keeping that playing field level is one of the reasons I love working at Mozilla. Even though I'm not directly involved in Maker Party, it's great to see projects like this having such a huge impact!


Cracking libxul

2014-05-22 09:02:20

For a while now I've been wanting to take a look inside libxul to see why it's so big. In particular I wanted to know what the impact of using templates so heavily in our code was - things like nsTArray and nsRefPtr are probably used on hundreds of different types throughout our codebase. Last night I was having trouble sleeping so I decided to crack open libxul and see if I could figure it out. I didn't persist enough to get the exact answers I wanted, but I got close enough. It was also kind of fun and I figured I'd post about it partly as an educational thing and partly to inspire others to dig deeper into this.

First step: build libxul. I had a debug build on my Linux machine with recent gecko, so I just used the libxul.so from that.

Second step: disassemble libxul.

objdump -d libxul.so > libxul.disasm

Although I've looked at disassemblies before I had to look at the file in vim a little bit to figure out the best way to parse it to get what I wanted, which was the size of every function defined in the library. This turned out to be a fairly simple awk script.

Third step: get function sizes. (snippet below is reformatted for easier reading)

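awk 'BEGIN { addr = 0; label = "" }
     # A function header looks like "<hex address> <symbol>:"
     /:$/ && !/Disassembly of section/ {
         # Hex-to-decimal conversion of a string may depend on the awk
         # implementation; with gawk, strtonum("0x" $1) is the explicit form.
         naddr = sprintf("%d", "0x" $1)
         print (naddr - addr), label   # size of the previous function
         addr = naddr
         label = $2
     }' libxul.disasm > libxul.sizes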

For those of you unfamiliar with awk, this identifies every line that ends in a colon, but doesn't have the text "Disassembly of section" (I determined this would be sufficient to match the line that starts off every function disassembly). It then takes the address (which is in hex in the dump), converts it to decimal, and subtracts the address of the previous matching line from it, which gives the size of the previous function. Finally it dumps out the size/name pairs. I inspected the file to make sure it looked ok, and removed a bad line at the top of the file (easier to fix it manually than fix the awk script).
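For anyone who finds awk hard to read, the same idea can be sketched in Python. This is an illustrative equivalent, not what I actually ran: the regex and function name are my own, and the sample input is a made-up fragment in the general shape of objdump -d output. Unlike the awk version it doesn't emit the bogus first line, and like the awk version it never emits a size for the last function.

```python
import re

# A function header in the disassembly looks like:
#   0000000000001130 <main>:
HEADER = re.compile(r"^([0-9a-fA-F]+) <(.+)>:$")


def function_sizes(disasm_lines):
    """Yield (size, name) pairs; size is the distance to the next header."""
    prev_addr = None
    prev_name = None
    for line in disasm_lines:
        m = HEADER.match(line.rstrip("\n"))
        if not m:
            continue  # instruction lines, blank lines, section banners
        addr = int(m.group(1), 16)
        if prev_name is not None:
            yield addr - prev_addr, prev_name
        prev_addr, prev_name = addr, m.group(2)


# Made-up sample input for illustration.
sample = """\
Disassembly of section .text:

0000000000001000 <foo>:
    1000:\t55\tpush   %rbp
0000000000001010 <bar>:
    1010:\tc3\tret
0000000000001040 <baz>:
"""

for size, name in function_sizes(sample.splitlines()):
    print(size, name)
```

On a real dump you would pass an open file object instead of the sample lines. As with the awk script, the "size" includes any padding between functions, so treat the numbers as approximate.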

Now that I had the size of each function, I did a quick sanity check to make sure it added up to a reasonable number:

awk '{ total += $1 } END { print total }' libxul.sizes
40263032

The value spit out is around 40 megs. This seemed to be in the right order of magnitude for code in libxul so I proceeded further.

Fourth step: see what's biggest!

sort -rn libxul.sizes | head -n 20
57984 <_ZL9InterpretP9JSContextRN2js8RunStateE>:
43798 <_ZN20nsHtml5AttributeName17initializeStaticsEv>:
41614 <_ZN22nsWindowMemoryReporter14CollectReportsEP25nsIMemoryReporterCallbackP11nsISupports>:
39792 <_Z7JS_Initv>:
32722 <vp9_fdct32x32_sse2>:
28674 <encode_mcu_huff>:
24365 <_Z7yyparseP13TParseContext>:
21800 <_ZN18nsHtml5ElementName17initializeStaticsEv>:
20558 <_ZN7mozilla3dom14PContentParent17OnMessageReceivedERKN3IPC7MessageE.part.1247>:
20302 <_ZN16nsHtml5Tokenizer9stateLoopI23nsHtml5ViewSourcePolicyEEiiDsiPDsbii>:
18367 <sctp_setopt>:
17900 <vp9_find_best_sub_pixel_comp_tree>:
16952 <_ZN7mozilla3dom13PBrowserChild17OnMessageReceivedERKN3IPC7MessageE>:
16096 <vp9_sad64x64x4d_sse2>:
15996 <_ZN7mozilla12_GLOBAL__N_119WebGLImageConverter3runILNS_16WebGLTexelFormatE17EEEvS3_NS_29WebGLTexelPremultiplicationOpE>:
15594 <_ZN7mozilla12_GLOBAL__N_119WebGLImageConverter3runILNS_16WebGLTexelFormatE16EEEvS3_NS_29WebGLTexelPremultiplicationOpE>:
14963 <vp9_idct32x32_1024_add_sse2>:
14838 <_ZN7mozilla12_GLOBAL__N_119WebGLImageConverter3runILNS_16WebGLTexelFormatE4EEEvS3_NS_29WebGLTexelPremultiplicationOpE>:
14792 <_ZN7mozilla12_GLOBAL__N_119WebGLImageConverter3runILNS_16WebGLTexelFormatE21EEEvS3_NS_29WebGLTexelPremultiplicationOpE>:
14740 <_ZN16nsHtml5Tokenizer9stateLoopI19nsHtml5SilentPolicyEEiiDsiPDsbii>:

That output looks reasonable. Top of the list is something to do with interpreting JS, followed by some HTML name static initializer thing. Guessing from the symbol names it seems like everything there would be pretty big. So far so good.

Fifth step: see how much space nsTArray takes up. As you can see above, the function names in the disassembly are mangled, and while I could spend some time trying to figure out how to demangle them it didn't seem particularly worth the time. Instead I just looked for symbols that started with nsTArray_Impl which by visual inspection seemed to match what I was looking for, and would at least give me a ballpark figure.

grep "<_ZN13nsTArray_Impl" libxul.sizes | awk '{ total += $1 } END { print total }'
377522

That's around 377k of stuff just to deal with nsTArray_Impl functions. You can compare that to the total libxul number and the largest functions listed above to get a sense of how much that is. I did the same for nsRefPtr and got 92k. Looking for ZNSt6vector, which I presume is the std::vector class, returned 101k.

That more or less answered the questions I had and gave me an idea of how much space was being used by a particular template class. I tried a few more things like grouping by the first 20 characters of the function name and summing up the sizes, but it didn't give particularly useful results. I had hoped it would approximate the total size taken up by each class but because of the variability in name lengths I would really need a demangler before being able to get that.
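Short of a real demangler (c++filt from binutils would be the proper tool), one crude way around the variable name lengths is to use the length prefixes that the mangling scheme already puts in front of each name component. The sketch below is an assumption-laden approximation, not a demangler: it only handles symbols starting with _ZN, stops at the first thing that isn't a length-prefixed identifier (template arguments, for example), and lumps everything else under "(other)". The function names are made up for illustration.

```python
import re
from collections import defaultdict


def nested_name_parts(symbol):
    """Return the leading <length><identifier> components of a _ZN... symbol."""
    if not symbol.startswith("_ZN"):
        return []
    pos = 3
    parts = []
    while pos < len(symbol):
        m = re.match(r"\d+", symbol[pos:])
        if not m:
            break  # template args, qualifiers, end of nested name, etc.
        length = int(m.group(0))
        start = pos + len(m.group(0))
        parts.append(symbol[start:start + length])
        pos = start + length
    return parts


def sizes_by_prefix(size_lines, depth=1):
    """Sum sizes from 'SIZE <symbol>:' lines, keyed by the first `depth` parts."""
    totals = defaultdict(int)
    for line in size_lines:
        fields = line.split()
        if len(fields) != 2:
            continue
        size = int(fields[0])
        symbol = fields[1].lstrip("<").rstrip(">:")
        parts = nested_name_parts(symbol)
        key = "::".join(parts[:depth]) if parts else "(other)"
        totals[key] += size
    return dict(totals)


# A few lines taken from the top-20 list above.
sample = [
    "43798 <_ZN20nsHtml5AttributeName17initializeStaticsEv>:",
    "21800 <_ZN18nsHtml5ElementName17initializeStaticsEv>:",
    "20558 <_ZN7mozilla3dom14PContentParent17OnMessageReceivedERKN3IPC7MessageE.part.1247>:",
    "16952 <_ZN7mozilla3dom13PBrowserChild17OnMessageReceivedERKN3IPC7MessageE>:",
    "32722 <vp9_fdct32x32_sse2>:",
]
print(sizes_by_prefix(sample, depth=1))
```

With depth=1 the two mozilla::dom functions get summed under "mozilla", and raising depth splits things more finely. For templated classes like nsTArray_Impl the method name comes after the template arguments, so this sketch would only ever see the class name, which happens to be the grouping I was after.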


Brendan as CEO

2014-03-31 16:37:55

I would not vote for Brendan if he were running for president. However I fully support him as CEO of Mozilla.

Why the difference? Simply because as Mozilla's CEO, his personal views on LGBT (at least what one can infer from monetary support to Prop 8) do not have any measurable chance of making any difference in what Mozilla does or Mozilla's mission. It's not like we're going to ship Firefox OS phones to everybody... except LGBT individuals. There's a zero chance of that happening.

From what I've read so far (and I would love to be corrected) it seems like people who are asking Brendan to step down are doing so as a matter of principle rather than a matter of possible consequence. They feel very strongly about LGBT equality, and rightly so. And therefore they do not want to see any person who is at all opposed to that cause take any position of power, as a general principle. This totally makes sense, and given two CEO candidates who are identical except for their views on LGBT issues, I too would pick the pro-LGBT one.

But that's not the situation we have. I don't know who the other CEO candidates are or were, but I can say with confidence that there's nobody else in the world who can match Brendan in some areas that are very relevant to Mozilla's mission. I don't know exactly what qualities we need in a CEO right now but I'm pretty sure that dedication and commitment to Mozilla's mission, as well as technical expertise, are going to be pretty high on that list. That's why I support Brendan as CEO despite his views.

If you're reading this, you are probably a strong supporter of Mozilla's mission. If you don't want Brendan as CEO because of his views, it's because you are being forced into making a tough choice - you have to choose between the "open web" affiliation on your personal identity and the "LGBT" affiliation on your personal identity. That's a hard choice for anybody, and I don't think anybody can fault you regardless of what you choose.

If you choose to go further and boycott Mozilla and Mozilla's products because of the CEO's views, you have a right to do that too. However I would like to understand how you think this will help with either the open web or LGBT rights. I believe that switching from Firefox to Chrome will not change Brendan or anybody else's views on LGBT rights, and will actively harm the open web. The only winner there is Google's revenue stream. If you disagree with this I would love to know why. You may wish to boycott Mozilla products as a matter of principle, and I can't argue with that. But please make sure that the benefit you gain from doing so outweighs the cost.


Javascript login shell

2014-03-23 10:46:29

I was playing around with node.js this weekend, and I realized that it's not that hard to end up with a Javascript-based login shell. A basic one can be obtained by simply installing node.js and ShellJS on top of it. Add CoffeeScript to get a slightly less verbose syntax. For example:

kats@kgupta-pc shelljs$ coffee
coffee> require './global.js'
{}
coffee> ls()
[ 'LICENSE',
  'README.md',
  'bin',
  'global.js',
  'make.js',
  'package.json',
  'scripts',
  'shell.js',
  'src',
  'test' ]
coffee> cat 'global.js'
'var shell = require(\'./shell.js\');\nfor (var cmd in shell)\n  global[cmd] = shell[cmd];\n'
coffee> cp('global.js', 'foo.tmp')
undefined
coffee> cat 'foo.tmp'
'var shell = require(\'./shell.js\');\nfor (var cmd in shell)\n  global[cmd] = shell[cmd];\n'
coffee> rm 'foo.tmp'
undefined
coffee>

Basically, if you're in a JS REPL (node.js or coffee) and you have access to functions that wrap shell utilities (which is what ShellJS provides some of) then you can use that setup as your login shell instead of bash or zsh or whatever else you might be using.

I'm a big fan of bash but I am sometimes frustrated with some things in it, such as hard-to-use variable manipulation and the fact that loops sometimes create subshells and make state manipulation hard. Being able to write scripts in JS instead of bash would solve that quite nicely. There are probably other use cases in which having a JS shell as your login shell would be quite handy.


The Project Premortem

2014-03-14 21:33:23

The procedure is simple: when the organization has almost come to an important decision but has not formally committed itself, [Gary] Klein proposes gathering for a brief session a group of individuals who are knowledgeable about the decision. The premise of the session is a short speech: "Imagine that we are a year into the future. We implemented the plan as it now exists. The outcome was a disaster. Please take 5 to 10 minutes to write a brief history of that disaster."

(From Thinking, Fast and Slow, by Daniel Kahneman)

When I first read about this, it immediately struck me as a very simple but powerful way to mitigate failure. I was interested in trying it out, so I wrote a pre-mortem story for Firefox OS. The thing I wrote turned out to be more like something a journalist would write in some internet rag 5 years from now, but I found it very illuminating to go through the exercise and think of different ways in which we could fail, and to isolate the one I thought most likely.

In fact, I would really like to encourage more people to go through this exercise and have everybody post their results somewhere. I would love to read about what keeps you up at night with respect to the project, what you think our weak points are, and what we need to watch out for. By understanding each other's worries and fears, I feel that we can do a better job of accommodating them in our day-to-day work, and work together more cohesively to Get The Job Done (TM).

Please comment on this post if you are interested in trying this out. I would be very happy to coordinate stuff so that people write out their thoughts and submit them, and we post all the results together (even anonymously if so desired). That way nobody is biased by anybody else's writing.


More books

2014-02-11 21:58:38

It's been a while since I've posted, but I've been getting through a lot of books that have been queued up on my reading list for a while. Quick rundown:

The Old Way (Elizabeth Marshall Thomas) - An account of the Kalahari bushmen, written by one of the first outsiders to live/interact with them. At a high level the book is similar to The World Until Yesterday, in that it relates how a pre-agricultural civilization used to live. I found it pretty interesting but probably not everybody's cup of tea. Books like these always make me more aware of how so many things we assume are "normal" really aren't.
Nothing to Envy (Barbara Demick) - The book follows the lives of a few people who lived in and escaped from North Korea. Quite well written. There were definitely some parts that took me by surprise - one of those "fact is stranger than fiction" things. If anybody ever does a detailed psychoanalysis of Kim Jong Il I would like to read it.
Mindset: The New Psychology of Success (Carol Dweck) - This is one of those books everybody should read, ideally before they have children. Life-changing in some ways. I think this book has been popular enough that some of its messages have seeped out into "general knowledge" but there's still a lot of stuff there that I hadn't encountered before.
Moonwalking with Einstein (Joshua Foer) - Ehh. It was certainly an entertaining read, but of little practical value. He describes how to create memory palaces so that you can rapidly memorize things like decks of cards, but that sort of stuff doesn't help me with being absent-minded and forgetting where I left my phone. There's some good discussion in the book about the pros and cons (and history) of developing your memory which I found interesting.
Revelation Space (Alastair Reynolds) - Science fiction book. Pretty good overall although I was unsatisfied with the ending.
Your Money or Your Life (Vicki Robin, Joe Dominguez) - Pretty comprehensive book on personal finance management. I only skimmed this because there wasn't much in here that I didn't already know, either from reading The Wealthy Barber or my own experiments. But a good book if you're looking for something in this category.
Influence: The Psychology of Persuasion (Robert Cialdini) - Another must-read book. All about the subtle ways people exert influence on you, and what you can do to defend against it. What surprises me here is how easy it is to drastically improve the odds that somebody will agree to do something they fundamentally don't want to just by using a few of these tricks. (You can also use this knowledge to influence others, although the book is not written from that standpoint.)
Drawing on the Right Side of the Brain (Betty Edwards) - This book teaches you how to draw, and more importantly, how to see things differently. I haven't finished this book yet but I have gotten through enough to know it's good. If you're looking for a hobby I suggest picking up this book. Note that my best drawings prior to starting this book are in the form of stick figures, but I'm already confident that I will be able to draw well after finishing this book and practicing some.
Dogfight (Fred Vogelstein) - I started this book recently but abandoned it. I don't know why I even started to read it, but it wasn't worth the time.

That is all.

