So, as the article has to do with "code compression" and the comments seem to take various interpretations of the word "compression", I will lend my interpretation as it fits in this blog.
I take it to mean "concept compression", or the ability to express more abstractions concisely. Less code doing more work. Higher levels of abstraction. He identifies "patterns" as a source of the problem, mainly in that they don't provide the right level of abstraction and instead turn into copy and paste, leading to further bloat. I can't remember if he says it directly, but a Lisp/Scheme programmer would manipulate the language: write a macro, or somehow roll the pattern into an abstraction available from a keyword, or as a "property" available to all data structures that "inherit" from the macro/keyword. I think that is what he is really referring to.
I also feel his pain... For me, it isn't that the code is "bloated" necessarily, it is that the cost to maintain it is SUBSTANTIALLY larger than the initial cost to develop it. Part of that might be that the code takes on Von Neumann machine characteristics as more and more features are bolted onto it. But a more likely source of issue is related to the lack of abstraction in general. Patterns (and here I mean any pattern, not just the GoF) internal to the system aren't abstracted up a level, which is another way of saying that the dev doesn't produce more features with LESS code. (In most languages it is harder to do abstraction than copy and paste, so the dev does a lot of C&P or leans on the IDE to do it.) And I think this is the heart of what he is talking about with "compression". It is about finding a way to write the code so that you get more functionality with less code. It is also about using a tool that makes it easier for the dev to abstract than to C&P.
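To make the copy-and-paste vs. abstraction point concrete, here is a minimal sketch in Python (my choice of language, not the article's). It assumes a hypothetical "try it, log the failure, try again" pattern that would otherwise be pasted around every flaky call. The names `logged_retry`, `flaky_fetch` and the `log` attribute are made up for illustration. A Lisp macro could go further than a decorator can, so treat this as a rough analogue only.

```python
import functools


def logged_retry(attempts):
    """Roll the 'try, log, retry' pattern into one reusable decorator.

    Assumes attempts >= 1. All names here are hypothetical.
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as error:
                    # Record the failure and fall through to the next attempt.
                    last_error = error
                    wrapper.log.append(
                        f"{func.__name__} failed on attempt {attempt}: {error}"
                    )
            # Every attempt failed, so surface the last error to the caller.
            raise last_error

        wrapper.log = []
        return wrapper

    return decorate


calls = {"count": 0}


@logged_retry(attempts=3)
def flaky_fetch():
    """Stand-in for an unreliable operation: fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("not yet")
    return "payload"


result = flaky_fetch()
```

Each new call site now costs one line (`@logged_retry(...)`) rather than another pasted try/except loop, which is the "more functionality with less code" trade in miniature.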
I have found that this is language agnostic for the most part. The trick is:
- Realizing when to use which language
- Creating the system in a way that allows different languages to play together
But, this isn't necessarily easy with the state of the current toolsets.
Another piece of the puzzle is that smaller code bases and more abstractions usually (maybe always) mean more conceptually advanced models and topics. Which in turn means that more time goes into thinking about the model than writing the code. It is probably common for an experienced dev to spend a week thinking over the various abstractions of a problem and writing 5-10 lines of code to support it, where in the past that dev would have spent 10 minutes thinking and written 500 lines. This becomes more difficult when there is a deadline. I tend to think that regardless of the toolset, there is a fight-or-flight response that happens to devs when timeline pressure is added. That causes tunnel vision, which in turn means they can't write the short version of the code, only the brute-force version. Even when a dev might know that time in design is time well spent, the panic to "just get it done" kicks in. I would be surprised to find out that I am the only dev that this has happened to.
Related is the "trainability" of that codebase. It isn't that the code is hard to read, it just requires the next dev to take a step back when they read it. Pretend that there is no deadline to learn the code and be productive.
If the code reads top to bottom in a procedural way, then it is easy to understand. Humans can grok process pretty quickly. If there are a handful of classes holding state, it is a little more complex, but there will still be a state transition graph to map the process (at least conceptually). But making the jump from (for example) a graph-based implementation to a constraint-solving type of implementation can be a head scratcher for the person learning the code. Now the graph rendering portion of the code might be non-deterministic, or non-procedural.
I have found that humans need a picture in their minds. From the picture they start to draw the model. An analogy might be a 2-D rendering that becomes 3-D as more and more is learned about it. This is, I think, one of the keys to writing large codebases and maintaining them... The model has to be simple to grok, and it has to be obvious how the code follows the model. And the model is tied in large part to the abstractions that are available. In fact, the abstractions are likely the "language" of the model.
Finally, the last point that I saw come up in the comments over and over was the fact that he wants to do it with one person. The team isn't something that he really drills into, but therein lies the crux of the "People who get it" and the "People who don't get it" argument he lays out. The people who get it realize that you don't need a team to support a large number of dev efforts. In fact, it might be preferable to not have a team at all. So, if the assumption is first and foremost that there are only 1-2 people, then this is a totally different problem space altogether. If you can't throw more resources (up to a limit) at the problem, then popular opinion says that the work can't be done. It is like the more shovels you have to move a pile of dirt, the more dirt you can move. This only holds true if everyone has more or less the same toolset and knowledge level. But, suppose one day someone shows up with a steam shovel... Then it really isn't a fair comparison anymore. The steam shovel can move the whole pile of dirt alone in less time than it takes all the individual shovels.
There are plenty of examples out there. I cite PG at Viaweb, or RMS on the Lisp machine code that kept pace with Symbolics. RMS might be urban legend now, I would have to check... But, I have my own examples of this happening, so I know it can be done. And not through a hero's effort. At least, as far as man hours go... The initial thought process might be a hero's effort. Someone has to figure out how to build a steam shovel and realize that it can be applied to moving dirt, but once that work is done, it is "easy" for the driver to go through the physical process.
What is needed to make this happen is a toolkit that frees the dev to work the model and the code in parallel. An OO analogy might be a UML/Code integration package that did 100% seamless round-tripping. Picture making radical structural changes in an OO language via a UML GUI with 100% certainty that it didn't break the code. Picture being able to write the code, reverse engineer the UML, adjust it when the UML "picture" isn't correct and have the resultant code compile. Another example might be realizing that a hierarchy should really be a delegate, and swapping it out with a line or two of code. Or, realizing that switching to multiple inheritance might be a better abstraction mechanism for your problem space... (try to do that in your average language!) Pre-processor macros in a sufficiently advanced language can solve 80% of these problems in a lightweight way. But - they have to be easier to use than not use in order for them to be effective.
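As a rough illustration of the "hierarchy should really be a delegate" swap, here is a small Python sketch. Python has no Lisp-style macros, so a class decorator stands in for one; the `delegate` helper and the `Stack` class are hypothetical names invented for this example, not part of any library.

```python
def _make_forwarder(attr_name, method_name):
    """Build a method that forwards the call to the wrapped object."""
    def forwarder(self, *args, **kwargs):
        target = getattr(self, attr_name)
        return getattr(target, method_name)(*args, **kwargs)

    forwarder.__name__ = method_name
    return forwarder


def delegate(attr_name, method_names):
    """Class decorator (hypothetical): expose selected methods of self.<attr_name>."""
    def decorate(cls):
        for method_name in method_names:
            setattr(cls, method_name, _make_forwarder(attr_name, method_name))
        return cls

    return decorate


# Before the swap this might have been `class Stack(list)`, which inherits
# every list method whether it makes sense for a stack or not.
# After the swap, only the listed methods are forwarded to the inner list.
@delegate("_items", ["append", "pop", "__len__"])
class Stack:
    def __init__(self):
        self._items = []


stack = Stack()
stack.append(1)
stack.append(2)
top = stack.pop()
```

Once the helper exists, the structural change is the one decorator line, which is about as close to "a line or two of code" as a language without macros gets. The catch is the same as with macros: someone has to write `delegate` first, and it has to be easier to reach for than inheritance.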
I think the industry is heading towards this toolkit. Someone said that most languages are just a half-baked, partial implementation of Lisp, and that might be true. So, given enough time to catch up, most languages will become a Lisp? I have said that if Lisp just had CPAN, then there would be few excuses... But, in general, the sentiment is probably right on. There are a handful of features that really make that language powerful, and if they were just available to all languages, then it would make the ceiling higher. Of course, people argue the challenge to beginners, and that is, in fact, probably one of the biggest gaps. But - then, you wouldn't expect a beginner chem major to start off splicing DNA, so I don't know why we expect beginning comp scis to understand everything right away. That is really the point of an advanced concept. The trick is to NOT take it off the table completely... Rather, let it be available when it is needed. If a language doesn't do macros, or expose the symbol tree in code, or have regular syntax, then the ceiling comes down.
I also tend to agree that the IDE is currently a bandaid and that will need to change for the toolkit to come about. The IDE should be the last in a series of steps. I think Unix has it more or less right, where the tools are written in libraries available at the command line and the last thing someone does is write a GUI around it. I think Unix has it wrong where no one spends any time on the GUI :). But Windows has it backwards, where the GUI is the first thing that a beginning programmer learns/writes about. There needs to be a healthy balance. And there needs to be more standardization in GUIs. Code Generation too... There are a bunch of efforts in this area to make more intuitive GUIs that convey more information faster. But all of that work needs to come after the robust libraries exist. The thing I point to first and foremost with GUIs is the lack of automatability. Which - probably your average user doesn't need (yet). But - if there is a bunch of functionality locked up in the GUI (which seems to always be the case) and you want to integrate that with something else, then you need a way to get access to the code in an automated way. Since GUIs also tend to be closed (at least on Microsoft), the options are things like SendKeys and Mouse movements, which are a bear to make work consistently. But, as it relates to Steve's article, GUIs have very little to do with code compression. If anything, a GUI will limit it. The GUI cannot keep pace with the model in someone's head. If it did, it would just be rendering data points in visual space. Writing a GUI is mostly about choosing how the model becomes concrete. And once that GUI exists, it becomes the handcuffs that prevent change.
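The "library first, front end last" ordering can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the word-counting task and the names `word_counts`, `format_report` and `main` are invented for the example. The point is that the functionality lives in plain functions any script can call, and the command-line wrapper (or a GUI later) is a thin layer on top.

```python
import argparse
from collections import Counter


def word_counts(text):
    """Core logic: a pure function with no I/O, callable from any script."""
    return Counter(text.lower().split())


def format_report(counts, top):
    """Presentation step that any front end (CLI today, GUI later) can reuse."""
    return "\n".join(f"{word}: {n}" for word, n in counts.most_common(top))


def main(argv=None):
    """Thin command-line wrapper: parse arguments, then call the library."""
    parser = argparse.ArgumentParser(description="Count words in a string.")
    parser.add_argument("text")
    parser.add_argument("--top", type=int, default=3)
    args = parser.parse_args(argv)
    print(format_report(word_counts(args.text), args.top))


# Automation never has to go through a front end at all:
report = format_report(word_counts("the cat and the hat"), top=1)
```

Nothing here needs SendKeys or simulated mouse movement to integrate with something else, because the front end never owned the functionality in the first place.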
Perhaps this will change with AJAX (which looks more like data driven GUI rendering to me than the good ol' servlet/applet model). Or maybe it will be XAML or some other Microsoft project... Maybe it will be a total transformation of the "desktop" into something more along the lines of Johnny Lee's Wiimote hacks. I don't know. But I probably feel more strongly than Steve that the IDE is a crutch that programmers could do without. Well... Still need a text editor, so how about an extensible one... vi or emacs :)
