An ode to architecture

Posted by: stak
Posted on: 2009-12-23 20:29:15

I was reading this article about MinWin and what they're trying to accomplish. I hadn't really understood (or cared) what it was before, but after reading that article I gotta say, I'm really glad that they're doing it. I can feel their pain in trying to untangle large piles of old code, because I've been doing much the same thing at work for the past little while.

There's code that's been lying around for 8-9 years, predating anybody on my team. There are some fundamental differences between code like this and "new" code. For one, there's nobody around who still understands what the code does, and obviously any documentation is so out of date that its only purpose is to mislead you. The only way to understand it is by examining the code itself - you have to read it, poke at it, and break it.

I believe that in pretty much any real-world production system, built under real-world constraints, there will be some code like this. In order to maintain it, or even to rewrite it, being able to read and understand code is a fundamental skill. I think of this as another argument against writing documentation. If developers need to acquire the skill of reading code anyway, then you might as well use that skill on new code as well. This makes the documentation redundant.

Of course, that's not all. Usually when reading new code, variable names and classes do have some relationship to the concepts they represent. The older the code is, the weaker that relationship becomes, because the concepts get shifted and skewed whereas the names do not. As far as I can tell the main reason this happens is that it's just a chore to rename things, particularly in systems where the code has been branched into different versions. Integrating fixes after you've renamed things (particularly in Java, where you have to rename the file if you rename the class) is a major pain with no tangible benefit.

I feel the blame here lies mostly on revision control systems. Renaming a file in, say, Bazaar is trivial compared to the same operation in Perforce. An even better revision control system (which incidentally is also my solution to the subjective readability problem) would store the parsed syntax tree of the code rather than a flat text file so that operations like variable renaming could be tracked as a single change and integrated into branches trivially. Such an RCS would also have a long list of other advantages, but I'm not going to get into that until I start writing one :)
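
To make the syntax-tree idea a bit more concrete, here's a rough sketch in Python using the standard `ast` module. It is only an illustration of treating a rename as one operation on a parsed tree, not a design for an actual RCS. The `RenameVariable` class, the `rename` helper, and the `area` example are all made up for this sketch, and the rename is deliberately naive: it ignores scoping and renames every identifier with a matching name.

```python
import ast


class RenameVariable(ast.NodeTransformer):
    """Rename every occurrence of one identifier in a parsed syntax tree.

    Hypothetical helper: scope-unaware, so it only suits small examples.
    """

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Name(self, node):
        # Uses of the variable inside expressions and assignments.
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node):
        # Function parameters are stored as `arg` nodes, not `Name` nodes.
        if node.arg == self.old:
            node.arg = self.new
        return node


def rename(source, old, new):
    """Apply the rename to source text by going through the syntax tree."""
    tree = RenameVariable(old, new).visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9 or later


# Two versions of the same function: trunk, and a branch carrying a fix.
trunk = "def area(w, h):\n    return w * h\n"
branch = "def area(w, h):\n    return abs(w * h)\n"

# The rename is described once, as (old name, new name), and replayed on
# each branch's tree. The branch's fix does not get in the way.
renamed_trunk = rename(trunk, "w", "width")
renamed_branch = rename(branch, "w", "width")
print(renamed_branch)
```

The point of the sketch is that the change being integrated is "w becomes width", not a set of edited text lines, so it applies cleanly to the branch even though the branch's lines differ from trunk's.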

Another thing mentioned in the MinWin article is how "countless spaghetti strands extend outwards from the core of Windows to the layers higher up in Windows" - this is basically the programmer's version of dependency hell. When you have dependencies running amok between different parts of the code, everything gets really bad really fast. It becomes easy to end up with circular dependencies - to solve that you either end up compiling both pieces of code as a unit so the compiler can deal with it, or changing one of the dependencies to be some sort of runtime/reflection thing, which makes the code an order of magnitude harder to follow. Code like this is also (by definition) not very modular, and so is hard to unit-test.
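
Circular dependencies are at least easy to detect mechanically, even if they're painful to remove. Below is a minimal sketch of a depth-first search that reports one cycle in a dependency map. The `find_cycle` function and the package names (`ui`, `layout`, `text`) are hypothetical and not taken from any real codebase.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of names, or None.

    `deps` maps each package to the list of packages it depends on.
    """
    IN_PROGRESS, DONE = 1, 2
    state = {}   # package -> IN_PROGRESS or DONE; missing means unvisited
    path = []    # packages on the current depth-first path

    def visit(pkg):
        state[pkg] = IN_PROGRESS
        path.append(pkg)
        for dep in deps.get(pkg, ()):
            if state.get(dep) == IN_PROGRESS:
                # `dep` is already on the current path, so we looped back.
                return path[path.index(dep):] + [dep]
            if dep not in state:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[pkg] = DONE
        return None

    for pkg in deps:
        if pkg not in state:
            cycle = visit(pkg)
            if cycle:
                return cycle
    return None


# Hypothetical packages: `text` reaching back up into `ui` closes a loop.
tangled = {"ui": ["layout"], "layout": ["text"], "text": ["ui"]}
layered = {"ui": ["layout"], "layout": ["text"], "text": []}

print(find_cycle(tangled))  # ['ui', 'layout', 'text', 'ui']
print(find_cycle(layered))  # None
```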

When we were writing Mango, one of the design principles we enforced was that even though all the code was compiled together as a unit for production use, the packages in the code were arranged in a DAG. This allowed us to build -- and more importantly, test -- subsets of the rendering engine with each layer adding more functionality to the previous subset. The MinWin team seems to be realizing similar benefits in being able to build standalone subsets of Windows for different purposes. In my subjective opinion, of all the design decisions we made, this was probably the single most useful one. Without it, the code would have collapsed in on itself and become an unmaintainable mess within a year, given the rate at which we were churning out code.

Enforcing that design decision from the start was key, though. As I'm discovering with my current refactoring efforts, it is extremely difficult to handle code that was developed without that sort of modularity. Coercing the code into a more elegant design requires several passes of refactoring and lots of time. I'm just thankful I have a smaller codebase to deal with than the tar pit that is Windows.



(c) Kartikaya Gupta, 2004-2024. User comments owned by their respective posters. All rights reserved.