Part 1 - Basic Observations


Demystifying Applications

	The GOTO has been the whipping boy of bad style.  Whether deservedly or not,
	it became the poster child of what not to do in programming.  We should answer
	the question: why were GOTOs so bad?  By necessity, programming relies
	on a tremendous number of relationships.  The GOTO is one form of
	relationship, a very SUBTLE and NON-OBVIOUS relationship between
	two pieces of logic in different parts of a file.  The term "spaghetti code" was
	most likely coined as a direct result of GOTO statements.  The
	term describes a program where the relationships cannot be easily
	seen or understood.

	Anything that helps us see the relationships in a concise manner is useful.


Injecting Intelligence

	A batch program is supposed to translate a file with a thousand records.
	One of the records has unusual data that the code in the batch
	program does not account for.  The simplest example of
	this is a missing null check (a field in the record was null, but the program
	expected every record to have a value for that field).

	But the programmer was running it in "Debug Mode".  So when the error
	occurs, he not only gets to see the special scenario and the
	problematic code that doesn't account for it, but he can even hack around
	the problem by changing the field value in run-time memory.  He doesn't actually
	change the code, but he has added some on-the-fly intelligence to the
	program by doing some of its work for it.  Later on, of course, he'll fix
	the program itself so it's smart enough to handle this special situation.
	But in order to finish the current batch, which he doesn't wish to
	restart, he uses the Debugging Tool to modify the program's memory
	and offer a bit of assistance (intelligence) to get it along.

	A program is a bunch of computer instructions that control various parts of
	the computer (memory being the main one).  The debugging tool
	can issue computer instructions of its own over the same resources.
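	Here is a rough sketch of that idea, with the debugger's role played by a callback.
	Everything in it (translateBatch, toUpperName, the record shape) is made up for
	illustration; a real debugger patches memory from outside the program rather than
	through a hook the program offers.

```javascript
// Hypothetical batch translator. The third argument plays the part of the
// person sitting at the debugger: it is handed the record that failed and
// returns a repaired copy so the batch can keep going.
function translateBatch(records, translate, onProblem) {
  const results = [];
  for (let i = 0; i < records.length; i++) {
    const record = records[i];
    try {
      results.push(translate(record));
    } catch (err) {
      // Do not restart the batch: patch the data in memory and retry once.
      const repaired = onProblem(record, i, err);
      results.push(translate(repaired));
    }
  }
  return results;
}

// The translator assumes every record has a name (no null check).
const toUpperName = (record) => ({ id: record.id, name: record.name.toUpperCase() });

const records = [
  { id: 1, name: "ann" },
  { id: 2, name: null }, // the unusual record
  { id: 3, name: "cy" },
];

// The helper supplies the missing intelligence; the translator code is untouched.
const output = translateBatch(records, toUpperName, (record) => ({
  ...record,
  name: "unknown",
}));

console.log(output.map((r) => r.name)); // [ 'ANN', 'UNKNOWN', 'CY' ]
```

	The permanent fix still belongs in toUpperName; the hook only gets the current batch finished.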


Debugging

	The most common debug technique is isolation.  Create an environment which allows
	quick trial-&-error cycles, to discover what you do not know yet (e.g. what is the
	proper parameter I need to pass to this Oracle API function call).  People can
	scream for better documentation all they want, but the good programmer never lets
	himself be dependent on others.  He always has a way past the "person information
	bottleneck".  He has learned how to self-discover, and this really is the most
	important part of programming.  There are entire realms of useful facts no one will
	tell you about.  Certainly making this information easier to discover for all
	who want to know it is one of the major areas that can be improved on in our field
	of computer science.  Verbal knowledge transfer is, of course, momentary and doesn't
	do the next guy looking for facts any good.  And the problem with written documentation
	is that it's not a direct artifact of software development, and no business-process can
	provide adequate oversight to force it into a reliable (i.e. non-optional, 
	always-correct and current) artifact.  So the solution seems to be in using 
	the required artifacts of software development as the source of information.  They 
	have the supreme luxury of being a current and accurate reflection of the real 
	application, and they may even have the additional benefit of being historical (who made
	what change when) if the artifacts are files and have been placed in version-control.
	
	
Too Many Options

	So corporate likes to send us these e-mails with lengthy instructions: "Click this link, then go 
	under XYZ, and then choose ABC, etc."  Why can't the URL have a direct link that does the equivalent
	of all that jumping around?  Why must my brain temporarily hold those names in my head, so that I can
	manually pick from the list of options the one that applies to me?

	I suspect it's because they want you to land on their home page, filled with tons of options, hoping
	that in your hunt for the one option you want to click, you'll notice all the other things they have
	to offer.  This is usually a case of false self-importance, as their site has nothing valuable to offer.
	10 times out of 10, I'm visiting their page because I'm required to: to complete an annual training
	course or some other corp. task.  I do not cruise their page in my spare time; no value could be
	derived from that.  My entry point
	should be the e-mail that says "complete this task by Aug. 28th".  Anything between that e-mail and the task
	is just in the way.

	But let's look at this from another perspective.  What if, buried in the massive corp pages, there was something
	useful for employees, something that could make them more effective in their normal job role?  But it's buried.
	Why does the home page have so many options?  Have they let it bloat by refusing to trim the tree of worthless
	outdated crap, because they're bureaucratic and can never get approval to remove items?  Is that decision hard
	because they never know if somebody, somewhere, still might need it?  And is the firestorm for removing an item 1
	person uses more real than the value of having a concise home page for the 100,000+ users who navigate to it?

	Take Google's home page.  A massive company filled with lots of high-paid talent, brimming to the seams
	with ideas.  And yet their landing page has only a few options, the most relevant ones for the
	average man.

	This brings to mind a feature seen in Windows.  It's the "remember the most commonly used"
	menu options, where an application keeps track of what options a user selects.  The idea is
	that these are the user's favorite options, and favorite options should be shown first, before
	any others.  It's a way of reducing the massive sets into something relevant.  The tree is automatically
	trimmed when the user's behavior changes and they, for whatever reason, stop using option X.
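	A minimal sketch of such a self-trimming menu follows.  I am assuming a sliding window
	of recent selections as the way old habits fade; I do not know how Windows actually does it,
	and the class and option names are invented.

```javascript
// Hypothetical adaptive menu: only options used within the last
// `windowSize` selections are shown, most-used first.
class AdaptiveMenu {
  constructor(options, windowSize) {
    this.options = options;
    this.windowSize = windowSize;
    this.recent = []; // the last `windowSize` selections, oldest first
  }

  select(option) {
    this.recent.push(option);
    if (this.recent.length > this.windowSize) this.recent.shift();
  }

  visible(limit) {
    const counts = new Map();
    for (const o of this.recent) counts.set(o, (counts.get(o) || 0) + 1);
    return this.options
      .filter((o) => counts.has(o)) // unused options drop out by themselves
      .sort((a, b) => counts.get(b) - counts.get(a))
      .slice(0, limit);
  }
}

const menu = new AdaptiveMenu(["Open", "Save", "Print", "Export", "Macros"], 4);
["Print", "Save", "Save", "Save"].forEach((o) => menu.select(o));
const before = menu.visible(2);
console.log(before); // [ 'Save', 'Print' ]

// The user's behavior changes: Print falls out of the window.
["Export", "Export"].forEach((o) => menu.select(o));
const after = menu.visible(2);
console.log(after); // [ 'Save', 'Export' ]
```

	Nobody had to approve the removal of "Print" from the short list; it simply stopped being used.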



Expression

	A grammar's syntax is a definition of the available terms used to express something.  If
	we attempt to make a grammar solve a very large problem domain, then the set of terms it
	needs is quite large.

	The problem can be the syntax itself, if it becomes too broad.

	One example of this comes to mind with .aspx files.  They combine the HTML syntax, a syntax
	for describing logic to a client-side browser, with ASP.NET C# syntax, which is meant for describing
	logic to server-side IIS.  It is ALWAYS better to make two files (.htm and .aspx), because
	the .htm editor is very quick.  The set of acceptable syntax it is checking is much smaller, because
	it gets to ignore the whole server-side syntax.

	What we want to avoid is "comprehensive grammars" because they are bloated sets of options.
	Instead, we want "context-specific grammars" that tackle just 1 problem and do it well, in a 
	highly-expressive manner.



Sequence and Relationships

	As I'm writing some ideas down in notepad for a new UI design, I realize I'm organizing my ideas
	by using just a few simple techniques: order (top to bottom listing) and indentation.  
	The indentation is to denote sections of related items, to break up the document into its logical groups.
	That is, things indented under a title section are a part of that title.  The title is the context
	for all the items indented below it.

	I'm wondering if all expressiveness relies on these two components: the ability to relate things,
	and the ability to order things.
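	To test the idea, here is a small sketch that recovers a tree from nothing but line order
	and tab indentation.  The function name and the sample notes are my own inventions.

```javascript
// Turn tab-indented lines into a tree using only order and indentation.
function parseOutline(text) {
  const root = { title: null, children: [] };
  const stack = [{ depth: -1, node: root }];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    const depth = line.match(/^\t*/)[0].length; // number of leading tabs
    const node = { title: line.trim(), children: [] };
    // Pop back to the nearest line that is less indented: that is the context.
    while (stack[stack.length - 1].depth >= depth) stack.pop();
    stack[stack.length - 1].node.children.push(node);
    stack.push({ depth, node });
  }
  return root;
}

const outline = parseOutline(
  "Toolbar\n\tSave button\n\tUndo button\nStatus bar\n\tLine number"
);

console.log(outline.children.map((n) => n.title)); // [ 'Toolbar', 'Status bar' ]
console.log(outline.children[0].children.map((n) => n.title)); // [ 'Save button', 'Undo button' ]
```

	Order gives the sequence of siblings, and indentation gives the parent relationship; the parser needs nothing else.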


Ad-hoc Expressiveness

	THE most useful computer application (arguably) is e-mail.  E-mail is free form and has no structure,
	only a handful of formatting tools beyond normal letters and symbols (e.g. font-styles, colors, 
	bullets, etc.)
	
	A good writer will use devices, like indentation, to organize the message and its meaning.  This
	organization could be formalized, but there really is no need to.  The kind of
	expressiveness needed in written correspondence varies greatly depending on the subject-material
	and purpose of the e-mail.  While formalization of these expressions (i.e. a grammar definition) could bring
	benefits (e.g. type-checking, intelli-sense, and better querying), it just doesn't seem worth the effort.

	So creating new rules of expression "on-the-fly" to meet the need of a particular communication
	is commonplace.  



Domains

	Domains are finite sets of options.

	Computer logic essentially rests on domains, because an instance of a domain can be
	represented as a number.  Computers are really good with numbers, a fact which everyone
	knew back in the early days of computers, but a fact which is now sometimes forgotten.  
	Computers have become excellent at solving our high-level problems, and we rarely
	see the numeric calculations behind the scenes anymore.

	SOME EXAMPLES:

		- Drop Down Lists & Radio Buttons

		- Intelli-sense (which essentially is a DDL).

		- Trying to get my Dad's computer working.  Mother-board wouldn't recognize the old 
		  hard-drive he had.  There were a variety of IDE options in the BIOS.  One was called "PIO",
		  with a Drop-Down List of values called "None", "Mode 0", "Mode 1", "Mode 2", and "Mode 3".
		  Obviously these names didn't help me very much in guessing what PIO meant, and what each 
		  option would do for me.
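	The claim that an instance of a domain is just a number can be sketched with the PIO
	list from the last example.  The helper names (encode, decode, bitsNeeded) are mine, and I still
	do not know what the modes mean; the point is only that the machine never needs the names.

```javascript
// A domain is a finite, ordered list of options; each option's index is its number.
const PIO_MODES = Object.freeze(["None", "Mode 0", "Mode 1", "Mode 2", "Mode 3"]);

function encode(domain, option) {
  const n = domain.indexOf(option);
  if (n === -1) throw new RangeError(`"${option}" is not in the domain`);
  return n;
}

function decode(domain, n) {
  if (!Number.isInteger(n) || n < 0 || n >= domain.length) {
    throw new RangeError(`${n} is outside the domain`);
  }
  return domain[n];
}

// How many bits are enough to store any value of the domain.
function bitsNeeded(domain) {
  return Math.max(1, Math.ceil(Math.log2(domain.length)));
}

console.log(encode(PIO_MODES, "Mode 2")); // 3
console.log(decode(PIO_MODES, 0));        // None
console.log(bitsNeeded(PIO_MODES));       // 3
```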


	The keyboard, with its discrete buttons, is the best way to create domain values.  For example,
	when you want to Save a document, does a power-user move the mouse up to the File menu, and down
	to the Save option?  Or is the sequence Ctrl+S quicker for them?



Light-Weight and Keyboard-Driven

	Considering Notepad.exe remains one of the most popular Win32 programs
	for high-end computer users (i.e. developers), there is an obvious value in
	light-weight/keyboard-driven tools.

	Simply put, the keyboard is the best interface with machines.  It has LOTS of buttons.  And buttons 
	are great because they have clear distinct states (on/off).  Though the mouse can mimic natural
	physics and the ambiguity of the real world, this turns out to be a slow and imprecise communication tool.

	Thus a responsive application that can be driven strictly through the keyboard is the most useful kind of app.

	I wanted Zippy to be LIGHT-WEIGHT.  So I thought an MS-DOS text-editor written in C would be the
	simplest kind of application.  I started with VIM, which is open source, and has all the File
	I/O, string manipulation, reg-exp, etc. that I would need.



Best Part of the Mouse

	...is the scroll-wheel.  And I don't believe it's simply because scrolling vertically through documents is a common
	task.  In fact, it might be that the task became popular because of the scroll-wheel (causing HTML pages to go to
	longer formats).  The mouse pointer is scanning over an X,Y coordinate system, while the scroll wheel is just 
	one-dimensional.  AND, the scroll-wheel does this in larger increments (low-res grid vs. high-res free-form).
	
	In short, the scroll-wheel allows fewer options.  The mouse-pointer allows vastly more options.  And this is
	the rope that a user commonly hangs themselves on: too many options.  E.g. navigating through a 5 level menu system
	on a high-resolution screen: one mistake and you get to start over.

	It's the difference between a multiple-choice question and an essay question.  With a few clicks of the mouse-wheel,
	you can select A, B, or C, like a Drop-Down List.
	
	It's all sets.
	
	(think smaller sets)



Type-Checking vs. Intelli-Sense

	Both enforce the limitation of a domain; it's just a matter of when.  This same question of "when" applies
	to any user-interface.  Type-checking is like a batch report that lists all of your mistakes at the end.
	Intelli-sense is like a drop-down box that only allows you to select valid options.  Programmers
	and developers alike cheer when they get intelli-sense.  Knowing about mistakes sooner rather than later
	is always valuable.
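	The two timings can be put side by side in a few lines.  This is only a sketch; the COLORS
	domain and both function names are invented for the comparison.

```javascript
const COLORS = ["red", "green", "blue"];

// "Type-checking" timing: accept anything now, report every mistake at the end.
function checkAll(values, domain) {
  return values
    .map((value, index) => ({ value, index }))
    .filter(({ value }) => !domain.includes(value));
}

// "Intelli-sense" timing: only ever offer members of the domain,
// so an invalid value cannot be chosen in the first place.
function suggest(prefix, domain) {
  return domain.filter((option) => option.startsWith(prefix));
}

const mistakes = checkAll(["red", "gren", "blue", "bleu"], COLORS);
console.log(mistakes); // [ { value: 'gren', index: 1 }, { value: 'bleu', index: 3 } ]

console.log(suggest("b", COLORS));  // [ 'blue' ]
console.log(suggest("gr", COLORS)); // [ 'green' ]
```

	Both functions lean on the same domain; the only difference is whether it is consulted before or after the typing is done.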

Part 2 - A better IDE?


The Data Model

	The fundamental building block of all data-model formalization is the STRUCT.  I am referring to the concept
	as it appears in the C programming language, where data can be abstracted through the use of composition:
	either of the same type (i.e. arrays) or of mixed types (the named members of a struct).

	I like the term STRUCT because I think C was the language that made this concept of composition for
	data-types widely known.  Many later languages took what C had done and enhanced it.  And almost every one of those languages
	decided to use a new name, probably because they believed, in their hubris, that the enhancements were so
	significant as to warrant a new term.  "Abstract data type" is the proper term for this concept, but it is
	inconveniently long.  The worst name was the one that became the most commonly used: "type".  I do not wish to refer
	to this concept as "type", only because that word has a broader meaning (it is related to the word "typical"),
	a concept that does play a role in other parts of computer science.  TYPE is, in fact, the basic
	concept of all abstraction, not just data abstraction (e.g. logical abstraction).

	Data-Models are really constraints: a way to reduce the domain of possibilities/options.  You could
	always fall back on byte[] to describe data, but this is a completely open-ended/unrestricted data-type.
	So anything would go.  It is better to model the constraints, as they limit the options and give us
	type-checking and intelli-sense.



Creating Models

	- Models are basically descriptions of patterns in reality, of underlying cause and effects at work.
	  A tool that allows the quick creation of new models (truths) is essential for clear communication
	  with a computer.  The more expressive we are in our intent and meaning, the better we are able
	  to withstand the change that revelations bring (e.g. "Oops, we forgot to talk to Mary, and she says
	  that waivers must be included!  Crap!").

	- CHANGE: Our models are always too simplistic initially, and will need to be enhanced later.
	  But all modeling changes must consider what the effect will be to pre-existing data and logic made 
	  under the old simplistic model.

	- DEBUGGING: Modeling rules will have errors, and a tool is needed to see the state-transitions
	  in order to find the bugs and fix them.



Change Management

	Software development must be regarded as something that changes, for more reasons than just shifting
	user requirements.  Let me give a simple example.  Developer #1 comes in and writes a working application.
	But because of deadlines it was not a work of art.  Developer #2 comes along and needs to make a functional
	enhancement to the application.  But before they can even touch the code, they must first look at the code
	and try to understand it, so they may know where the additional functionality can be injected without causing
	harm to any of the existing functionality.  This means Developer #2 tries to make sense of what
	Developer #1 had written.  #2 uncovers a few things in the code that made it difficult for them to understand it:
		- unclear or misleading names.  Name identifiers are ways of relating things together in the code
		  logic.  The application itself does not care what names are used; only the relationship matters
		  to the application.  But a name that accurately reflects the meaning of the thing being discussed
		  really helps in human comprehension.  I once knew a girl in CS class who liked to name her
		  variables "tree" and "flower".  Imagine if meaningless names like that had been used in the
		  logic below.  It's plenty hard enough to understand with the good names.

			var principal = (houseCost * 0.20);
			var interestRatePerMonth = interestRatePerYr / 12;
			var loanTermMonths	 = loanTermYr * 12;
			var monthlyPayment	 = ((houseCost-principal) * 
							(interestRatePerMonth / 
							 ( 1 - Math.pow(interestRatePerMonth+1,-1 * loanTermMonths) )
							)
						   );

		- dead code.  And in many cases it is not immediately obvious that code is unused.  Only a 
		  careful examination of logic in other parts of the app reveals that certain code could
		  never be reached.  Dead code can be very misleading when trying to understand the purpose 
		  and logic of an existing application.
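	To make the payment logic in the first bullet something you can actually run, here it is
	wrapped in a function with one worked input.  The function name and the sample numbers are
	my assumptions.  I have also used "downPayment" for the 20% figure that the snippet above
	calls "principal", since the amount actually borrowed is houseCost minus that figure; treat that
	renaming as my reading of the code, not a correction of record.

```javascript
// Standard fixed-rate amortization: payment = L * r / (1 - (1 + r)^-n)
// where L is the amount borrowed, r the monthly rate, n the number of months.
function monthlyPayment(houseCost, interestRatePerYr, loanTermYr) {
  const downPayment = houseCost * 0.20; // 20% paid up front (assumed, as in the snippet)
  const loanAmount = houseCost - downPayment;
  const interestRatePerMonth = interestRatePerYr / 12;
  const loanTermMonths = loanTermYr * 12;
  return (
    loanAmount *
    (interestRatePerMonth /
      (1 - Math.pow(1 + interestRatePerMonth, -loanTermMonths)))
  );
}

// A 300,000 house at 6% per year over 30 years: 240,000 is borrowed.
const payment = monthlyPayment(300000, 0.06, 30);
console.log(payment.toFixed(2)); // about 1438.92
```

	Note that the formula divides by zero when the interest rate is 0; a real version would need that case handled separately.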

	So developer #2 wants to fix these things that impeded their ability to understand the existing logic.  They want to
	do this not only to help out the next developer who may have to look at this and go through the same process of
	understanding; #2 also wants to fix it to add immediate clarity, to help them keep things straight as
	they make their enhancement.  They may have an understanding for the moment, but there are a lot of things they
	need to keep in their head.  It really helps to write that understanding down in the current code.

	The development environment cannot forbid these kinds of changes.  It must expect them to occur on a
	regular basis, and the amount of effort involved in making these kinds of changes should be
	reduced to its bare minimum to keep software development efficient.


	
	One of the BIGGEST advantages Visual Studio's text-editor has over VI is tab-indentation.  You can highlight
	a block of text and either press TAB or SHIFT+TAB to move it all one tab right or left.  Why is this so useful?
	Because a common task in the evolution of software logic is to wrap a whole block of code in an extra if-statement
	(or a while-loop, or some other construct which has a block of code as an inner component).  Developers also have
	to commonly get rid of a construct that they previously thought they needed.  So that's where SHIFT+TAB 
	comes in handy.
	
	You want to make it easy to change logic, so that a programmer does not stall out, trying to plan really far ahead
	like a chess player, not wanting to make a wrong decision before he types out a lengthy bit of code.
	The decision to include or not include an if-statement should not have to be made now.  It should be trivially easy
	to add later.  In many cases, we do not know if we need the if-statement until the block of code has been
	written (test run, details in the block bring new revelations to light, etc.).  There is a natural 
	order to discovery, depending on the particular problem.  The tool should not get in the way of this order.  
	It should not impose an unnatural order of decision-making.
	
	

	So what if we devised a new IDE that wasn't a dumb text editor, but instead allowed changes to be
	made at a slightly higher level of understanding?  If the IDE could understand that a member name was being
	changed, it could automatically record this as ("oldName" -> "newName"), and any future code we run into
	that still has the "oldName" could be automatically updated to the "newName".

	Such an IDE would have to be careful that it properly understood all the potential types of changes that can
	be made to an application.
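	As a rough sketch of the ("oldName" -> "newName") record, here is a rename log that can be
	replayed over code written against the old names.  RenameLog and the sample identifiers are
	invented, and the replacement is deliberately naive: it works on raw text, so it knows nothing
	about scope, strings or comments, which is exactly the understanding a real IDE would have to add.

```javascript
// Hypothetical rename log: remembers every rename and replays it on old code.
class RenameLog {
  constructor() {
    this.renames = new Map(); // old identifier -> current identifier
  }

  record(oldName, newName) {
    // Collapse chains: if a -> b was recorded and now b -> c, make a -> c.
    for (const [from, to] of this.renames) {
      if (to === oldName) this.renames.set(from, newName);
    }
    this.renames.set(oldName, newName);
  }

  apply(source) {
    // Replace whole identifiers only, so "custNmX" is not touched by "custNm".
    return source.replace(/[A-Za-z_$][\w$]*/g, (id) =>
      this.renames.has(id) ? this.renames.get(id) : id
    );
  }
}

const log = new RenameLog();
log.record("custNm", "customerName");
log.record("customerName", "clientName");

const updated = log.apply("print(custNm); customerName = custNm + custNmX;");
console.log(updated); // print(clientName); clientName = clientName + custNmX;
```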

	KINDS OF CHANGES:

		DATA:
			- change data-type or member's name (doesn't affect data, but does affect logic built on the old name)
			- change inheritor (e.g. was Object, now it's System.UI.Control)
			- add new data-type
			- delete old data-type
			- add new member
			- delete old member
			- change data-type of member
				- This includes the problem where method overloading was used for default-value 
				  parameter passing, and the addition of one new parameter caused the logic of 
				  the method to shift.  The old method was VIRTUAL, and classes had overridden it.  
				  But now the logic had shifted to a new method with more parameters.  The super 
				  version of the old method now calls the new method.  But callers that directly 
				  use the new method will be calling the super version of that new method.  Those that
				  overrode the original method are now bypassed.

				  As near as I can tell, method overloading is really used for two different purposes:
					- Default values for optional parameters.
					- data-type switch, for variance of logic that all wants to use the same method 
					  name.  E.g. Convert.ToString(), which has many overloads, all with 1 
					  parameter (either int, double, DateTime, etc.).

				  Perhaps we could get rid of the low-level technique - overloading - which has just a bit
				  too much freedom, and instead make the two practical purposes above their own
				  distinct features.  Then we'd avoid the nightmare caused when extra parms are added
				  to methods, which in the past could unintentionally swap callers to a different (and
				  unwanted) overloaded method.

		LOGIC:
			- change literal value
			- change operator (e.g. < to <=)
			- change option (e.g. enum value, member name, method call overload)
			- change variable's name 
			- for things with order, e.g. statements in a block, expressions, parameter lists, etc.
				- append new
				- insert new
				- delete
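	The overloading problem described under "change data-type of member" can be reproduced in a
	few lines.  JavaScript has no overloading, so I imitate it with two differently named methods; the
	Exporter classes are hypothetical.  The second half shows the alternative argued for above, where
	"default value for an optional parameter" is its own feature and there is only one method to override.

```javascript
// A hypothetical library class after its logic shifted to a new method.
class Exporter {
  // The original method. Subclasses were invited to override it.
  write(text) {
    return this.writeEncoded(text, "utf-8"); // now it only supplies the default
  }
  // Added later, when callers needed to pass an encoding.
  writeEncoded(text, encoding) {
    return `[${encoding}] ${text}`;
  }
}

// A subclass written BEFORE writeEncoded existed.
class AuditedExporter extends Exporter {
  write(text) {
    return "AUDIT " + super.write(text);
  }
}

const audited = new AuditedExporter();
const viaOld = audited.write("hello");                  // override runs
const viaNew = audited.writeEncoded("hello", "utf-16"); // override is bypassed
console.log(viaOld); // AUDIT [utf-8] hello
console.log(viaNew); // [utf-16] hello

// Alternative: one method with a real default value.
class Exporter2 {
  write(text, encoding = "utf-8") {
    return `[${encoding}] ${text}`;
  }
}
class AuditedExporter2 extends Exporter2 {
  write(text, encoding) {
    return "AUDIT " + super.write(text, encoding);
  }
}

const audited2 = new AuditedExporter2();
console.log(audited2.write("hello"));           // AUDIT [utf-8] hello
console.log(audited2.write("hello", "utf-16")); // AUDIT [utf-16] hello
```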


	Another Example:

		new BenefitType( treeItem.e ) => BenefitType.Get( treeItem.e )

	This is a real example, where for a large project, I needed to change how BenefitTypes were created.
	I needed to use a unique hash, so that the code requesting a BenefitType did not always get a new
	object, it could get an existing object.  So that's why all the callers had to switch from the
	class constructor to a static method.
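	A stripped-down sketch of that switch is below.  The real BenefitType took an XML element
	(treeItem.e); here I key the cache on a plain string code, which is my simplification.

```javascript
// Hypothetical BenefitType with a static lookup that reuses existing objects.
class BenefitType {
  constructor(code) {
    this.code = code;
  }

  // Callers that used to write `new BenefitType(code)` now call this instead.
  static get(code) {
    let existing = BenefitType.cache.get(code);
    if (existing === undefined) {
      existing = new BenefitType(code);
      BenefitType.cache.set(code, existing);
    }
    return existing;
  }
}
BenefitType.cache = new Map(); // code -> the one shared BenefitType for it

const first = BenefitType.get("DENTAL");
const second = BenefitType.get("DENTAL");
console.log(first === second); // true: the same object is handed out twice
console.log(new BenefitType("DENTAL") === new BenefitType("DENTAL")); // false
```

	Any caller still using the constructor quietly gets a private copy, which is why every caller had to be found and changed.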

	Here's a slightly harder example (it's contrived, but believable):

		new BenefitType() => BenefitType.Get( treeItem.e )

	This change assumes that every caller is going to have access to an XMLElement.  It can often
	be the case that we increase the parms of a method, and we know (or expect) that every caller
	should be able to meet the new burden.  Those that cannot are INTERESTING odd-ball scenarios
	that we want to know about, because they may tell us more about the real problem at hand.


Part 3 - Foundations and Changing Sets Over Time

FOUNDATION: "that which does not move".

Logical components can build on top of other logical components. When programmers
say that "a low-level change broke the whole system" they are referring to a component
that many other components built on top of. (I like the term "FOUNDATION component",
one that we really wish didn't have to be changed - ever). And at the time of the change,
no one knew all of the other components built on top of it. So no one knew
what impact that change would have on the whole system.

Let me give a specific example.

There was a very large enterprise web application. One part of it was called BPA.
The screens for BPA use floating DIV tags with style="width: 100%". A new requirement
came along, and this was added to all HTM pages in the system...

<html xmlns="http://www.w3.org/1999/xhtml">

This fundamentally CHANGES how Internet Explorer works. BPA was built on top
of IE, and expected style="width: 100%" to be the width of the screen relative
to a tag's position on the screen.
But now that 1999/xhtml is added, the CSS-engine of IE (a very low-level component)
has been changed, and things built on top of it are affected.
No longer is width:100% relative to the screen's width; it's relative to the entire
page's width, including the parts not visible (off the right-hand side of the screen).
The people adding 1999/xhtml did not know the BPA screens
would break. They were probably not even aware of all the style behavior
changes that came along with the 1999/xhtml addition.


So the question is, how can you introduce "enhancements" to low-level components
without breaking existing functionality? In an ideal sense, you'd want to know
ALL DEPENDENCIES. That is to say, if we change the CSS behavior, we would love
a list of places in the current application affected by that change. The question
should be asked: could a large system be built while always maintaining a complete
list of its own dependencies? Why do we not have these dependencies listed in the
large apps you see today? Why are so many of our relationships casual/indirect/implicit?
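If such a list did exist, asking it "what is affected?" would be the easy part. Here is a
sketch, assuming every component declares what it was built on. The DependencyIndex class
is invented, and so is the "Login page" entry; only BPA and the CSS engine come from the
story above.

```javascript
// Hypothetical index of "built on top of" relationships.
class DependencyIndex {
  constructor() {
    this.dependents = new Map(); // foundation -> Set of things built directly on it
  }

  declare(component, builtOn) {
    for (const foundation of builtOn) {
      if (!this.dependents.has(foundation)) {
        this.dependents.set(foundation, new Set());
      }
      this.dependents.get(foundation).add(component);
    }
  }

  // Everything built on `changed`, directly or through other components.
  affectedBy(changed) {
    const seen = new Set();
    const queue = [changed];
    while (queue.length > 0) {
      const current = queue.shift();
      for (const dependent of this.dependents.get(current) || []) {
        if (!seen.has(dependent)) {
          seen.add(dependent);
          queue.push(dependent);
        }
      }
    }
    return [...seen];
  }
}

const index = new DependencyIndex();
index.declare("floating DIV layout", ["IE CSS engine"]);
index.declare("BPA screens", ["floating DIV layout"]);
index.declare("Login page", ["IE CSS engine"]);

const affected = index.affectedBy("IE CSS engine");
console.log(affected); // [ 'floating DIV layout', 'Login page', 'BPA screens' ]
```

The hard part is not the lookup. It is getting every declare() call written, and kept true, in the first place.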

The notion of an ABSOLUTE FOUNDATION, of something that will not
change, is an illusion, one of THE most dangerous illusions in the whole IT
industry. In short, EVERYTHING is susceptible to change. Progress has
been made in the IT industry by the careful guesswork of which things are
MORE likely to change, and which things are LESS likely to change, or
change LESS OFTEN. And people in the industry have discovered that this
guesswork is quite maddening, as you are wrong about 10% of the time.
It's a very costly 10%, and it seems that no amount of forethought
can drive it any lower.

Let's use a simpler example (with a timeline)...

Jan
	create a data-type with 3 distinct values

		enum AddressType
			Home
			Office
			Mailing


Feb
	write some logic that makes a decision based on
	data of that type...

		if ( addressType == AddressType.Home )
		{
			...
		}
		else
		{
			...
		}


		if ( addressType != AddressType.Office )
		{
			...
		}


March

	The application purrs along.  We process millions of records, with a wide variety 
	of all 3 types.  All is well.


April

	Add a new value to that data-type...

		enum AddressType
			Home
			Office
			Mailing
			BILLING

May

	More data runs.  Some of this data has the new value "BILLING".



The original data-type called "AddressType" is a FOUNDATION SET.
Other things (e.g. logic) were built on top of it. At the time
of writing that logic, it was assumed AddressType only had 3 values.

Now it would be nice if, at the moment we change AddressType,
we could instantly find out what was built on top of it: what
other pieces were built with the assumption that AddressType
only had 3 values. Because that assumption has changed, and
anything built on that assumption is now potentially affected.
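One way to picture this, purely as a sketch: each piece of logic records the values it
was written against, and a check compares that record with the set as it stands today. The
function names and labels are invented, and this only works when the logic and the set live
where they can see each other.

```javascript
// The set as it stands after the April change.
const AddressType = Object.freeze({
  Home: "Home",
  Office: "Office",
  Mailing: "Mailing",
  Billing: "Billing",
});

const assumptions = [];

// Logic declares which values existed at the time it was written.
function writtenAgainst(label, knownValues) {
  assumptions.push({ label, knownValues });
}

// List every piece of logic that has never "seen" some current value.
function staleLogic(enumObject) {
  const current = Object.values(enumObject);
  return assumptions
    .map(({ label, knownValues }) => ({
      label,
      unseen: current.filter((value) => !knownValues.includes(value)),
    }))
    .filter((entry) => entry.unseen.length > 0);
}

writtenAgainst("Home vs. everything else (Feb)", ["Home", "Office", "Mailing"]);
writtenAgainst("billing export (May)", ["Home", "Office", "Mailing", "Billing"]);

const stale = staleLogic(AddressType);
console.log(stale);
// [ { label: 'Home vs. everything else (Feb)', unseen: [ 'Billing' ] } ]
```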

But dependency relationships like this cannot always be maintained.
AddressType could be defined in library X, and the logic defined
in program Y. And who knows where program Y is. Dependencies of
this nature can span across many kinds of barriers (e.g. program
Y is maintained by a completely different company, and X doesn't
even know about their existence).
It is easy for Y to build on top of X, without X knowing about it.
So the relationship is one-way: X <= Y.

CONSTRUCTION DEPENDENCIES are, by their very nature, one-way.
Extra record keeping can make them two-way, but the sheer number
of dependencies, and the different types of dependencies, and the
kind of barriers that these dependencies span across, all lead us
to the conclusion that comprehensive record keeping is impossible.
There is almost nothing you can do to maintain all of your
construction dependencies as two-way relationships.

So how do humans adapt when they learn new facts? How do
humans handle growing sets? Don't humans build logic, based on
assumptions? Are humans maintaining bi-directional dependencies in
their head all the time? If so, then once an assumption changes,
can a human instantly know everything built on that assumption,
everything that must now be revisited? That doesn't seem right.
Is it not true that we also have disconnected (i.e. one-way)
dependencies? An assumption changes,
and only several months down the road do we find ourselves using
a familiar piece of logic (built on the assumption), but a sense
of the NEW hits us and we realize that for the first time we
are using a new kind of data through our old logic, and
we feel the need to pause and REVISIT that logic before
actually using it on the NEW. Is that not a more accurate
example of human learning?

So let us consider for a moment what we mean by REVISIT.
Logic, like the if-statements above, is a concrete implementation
of some need (e.g. business requirement). It is a codified
version of a reason. That is to say, we
can ask the question "why is the if-statement coded like that?", and
the answer would lead us back to some reason. It could be a business
reason, could be a technical reason, whatever, doesn't matter. What
matters is the REASON has an explanation of why the codified
logic is the way it is, and as assumptions are changed
we must revisit the REASON to understand how the codified logic can
be rebuilt to address the new assumptions.

To put it another way, the code performs a task. It performs
this task faithfully, over and over again, without ever asking
the reason why it must be done that way. It is mechanical.
There was something else that drove its creation into being. The
details of the code, the decision-making it goes through and
the actions it takes, give us some idea of what original force
created it. But in many cases, the original force has more
information than what the code currently tells us. Here is
an example where a few details are left out of the code (because,
remember, the code only needs to PERFORM the task, and as quickly
as possible; it does not need anything else).

            RuleList list = f.GetBenefitRules(bs, decisionPrefix, planVariableCode, planValue);
            foreach (Rule rule in list)
            {
                ...
                if (rule.SakExcludedRule > 0 && !nulls.ContainsKey(rule.SakExcludedRule))
                    nulls.Add(rule.SakExcludedRule, null);
            }

            IBenefitSet p = bs.Parent;
            while (p != null)
            {
                list = f.GetBenefitRules(p, decisionPrefix, planVariableCode, planValue);
                foreach (Rule rule in list)
                {
                    if (rule.SakExcludedRule > 0)
                    {
                        if (!nulls.ContainsKey(rule.SakExcludedRule))
                            nulls.Add(rule.SakExcludedRule, null);

                        ...
                    }

                    if (nulls.ContainsKey(rule.Sak))
                        continue;

                    ...
                }

                p = p.ParentSet(benefitClassification);
            }

We can see that for rules with SakExcludedRule > 0, that value is added to a hash called
"nulls". And we see that if another rule has its Sak in that nulls hash,
then that rule will be skipped. But why? What business reason drove
this design? The reason was to let users create high-level rules
that would apply to many low-level benefits, but then to also allow
exceptions at the lower level. So SakExcludedRule is a way for a low
level rule to exclude a high level rule from the claim-engine's consideration.
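Stripped of everything else, the reason can be written down as a much smaller piece of
logic. This is only a sketch of the idea, not the real engine: the data shape (id,
excludesRuleId, parent) and the sample rules are invented, and unlike the code above it
applies the skip check at every level.

```javascript
// Walk from the lowest benefit set up through its parents. A rule may name
// one higher-level rule to exclude; excluded rules are skipped when reached.
function applicableRules(benefitSet) {
  const excluded = new Set();
  const result = [];
  for (let set = benefitSet; set !== null; set = set.parent) {
    for (const rule of set.rules) {
      // Same order as the original: note the exclusion first, then test the rule.
      if (rule.excludesRuleId > 0) excluded.add(rule.excludesRuleId);
      if (excluded.has(rule.id)) continue;
      result.push(rule);
    }
  }
  return result;
}

const plan = {
  parent: null,
  rules: [
    { id: 1, excludesRuleId: 0, name: "high-level rule A" },
    { id: 3, excludesRuleId: 0, name: "high-level rule B" },
  ],
};
const benefit = {
  parent: plan,
  rules: [{ id: 2, excludesRuleId: 1, name: "low-level exception to A" }],
};

const forBenefit = applicableRules(benefit).map((r) => r.id);
const forPlan = applicableRules(plan).map((r) => r.id);
console.log(forBenefit); // [ 2, 3 ]
console.log(forPlan);    // [ 1, 3 ]
```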

Understanding these reasons becomes important when a new concept, like
Rule Versioning (modifying a rule inactivates it, and it gets replaced by a new
rule), enters the picture and affects the logic above. In order to change
the logic above correctly, we must go back to the original reasons for why
the logic was written like this.


It is also quite common for code to be a combination of many
driving forces. While a business spec might only say "return the
count of benefit groups", the code might in fact be a lot more
verbose:

        public int GetCount(Classification c)
        {
            if (this._countCache.ContainsKey(c))
                return this._countCache[c];

            int count = 0;
            foreach (BenefitGroup g in c.RootGroups)
                count += CountSubGroups(g);
            this._countCache[c] = count;

            return count;
        }
        Dictionary<Classification, int> _countCache;

There are other requirements mixed in with this code, such
as "the user needs to see the count quickly". And the
assumption, "classifications do not change while this application's
process is alive, so we can use the application's memory to cache
the count".

Sometimes code is wordy because it tells a machine, step by step,
how to perform a task. Collectively the steps result in meeting a
business need, but it's not always obvious what the need is by
just looking at the steps.

	SPEC: Re-order the ArrayList so that an exact match comes first
	LOGIC:
            for (int i = 0; i < totalCount; i++)        // this is scanning the whole list
            {
                Benefit b1 = trimList[i];
                if (i > 0 && b1.ServiceCode == code)    // "code" is what we're searching on (i.e. the "exact match")
                {
                    trimList.Insert(0, trimList[i]);    // this is putting it at the front
                    trimList.RemoveAt(i + 1);           // remove the original, which has shifted down by one
                    break;
                }
                else if (i == 0 && b1.ServiceCode == code)  // if already at the front, break out, no need to
                    break;                                  // waste CPU cycles examining the rest of the list
            }



What is critical here is that we do not lose the link
between codified logic and original reasons. Should any
tool ever make us aware that "our code is wrong and cannot
handle value X properly", then we must be able to trace back
to the original intent of the logic to enhance it for this
new scenario.

But even more importantly, we must understand that many times
no tool will tell us that "hey your code is wrong!". It
may hum along just fine with the new value. Like our very
first example:


		if ( addressType != AddressType.Office )
		{
			...
		}

Something will screw up eventually down the line (e.g. report totals
are off), but it's a bit of a problem for us when it doesn't crash sooner
rather than later. It leaves us with a mystery, for which we must now
spend a lot of time back-tracking to where the problem started.

The compiler will not complain about the code above when we add BILLING
as another acceptable value for AddressType. And when we run
with new data, some of which has BILLING, the code executes without
a hitch. The current logic already handles BILLING. In fact, it had
made a decision about what to do with BILLING, even before BILLING
was ever conceived. Such is the power of "everything BUT these values":
the exclusive set. (Both the NOT operation and the ELSE statement
are examples of this.)

At the time that logic was written, AddressType had 3 values. Did
the developer assume that AddressType would always have 3 values? Even
if he knew AddressType might get new values in the future, he could not
foresee what those new values would be. Did he really intend for this
logic to be applied against those new values, when he could not know
the specifics about them?

AddressType, like all sets, represents a problem domain. Humans,
as they learn, expand their sets. A child might think the only kinds
of apples are red and green ones. But later on they learn to make
finer distinctions, between Red Delicious (which suck) and Jonathan
(which are quite good). Perhaps there is a DEFINITIVE set of the kinds of
apples that exist in reality. But humans do not know it up front,
and for all practical purposes we can say that we never know it, or
at least are never sure if there might be one more exception to the
rule, a new distinction to be discovered. Our learning is open ended.
We expect to discover new things, but this expectation doesn't stop us
from building logic based on the details we have acquired up to
that point. We realize that as we learn new things, some of our logic
will have to be thrown out or revisited. We accept the potential for
future re-work, and we handle it quite well. How is it that we handle
it so well?

So if logic uses exclusion sets (everything BUT some values) from
a domain, then the logic is actually referring to all the other
values of the domain...AND any future new values that get added to
the domain! The 1st part we want. The 2nd part, we do not want, or at
least we are not sure if we want.
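
The difference between the two kinds of sets can be sketched in code. This is
only an illustration, not the original application's logic: the value names
Home and Mailing are made up (the text only names Office and, later, BILLING),
and both method names are hypothetical.

```csharp
using System;

// Hypothetical enum. Home and Mailing are assumed names for the
// original values; Billing is the value added later.
public enum AddressType { Home, Office, Mailing, Billing }

public static class AddressRules
{
    // Exclusion set: "everything BUT Office". This already has an
    // answer for Billing, and for any value added after it.
    public static bool NeedsHandlingExclusive(AddressType t)
    {
        return t != AddressType.Office;
    }

    // Inclusion set: only the values known when the logic was written.
    // A value nobody has made a decision about is reported, not guessed at.
    public static bool NeedsHandlingInclusive(AddressType t)
    {
        switch (t)
        {
            case AddressType.Home:
            case AddressType.Mailing:
                return true;
            case AddressType.Office:
                return false;
            default:
                throw new NotSupportedException(
                    "No decision has been made for AddressType." + t);
        }
    }
}
```

With the first method, BILLING quietly gets the same treatment as every other
non-Office value. With the second, BILLING fails loudly the first time it shows
up, which forces someone to go back to the reasons and make a decision.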

Let us go back to humans. I like both Jonathans and Gala apples.
Both are red. The only red apples I know of "to date" are Jonathans,
Gala, and Red Delicious. I keep my logic simple and say "I like any
red apple, except Red Delicious". I come across a red apple. And I
notice it's not Red Delicious, so I'm about ready to dig in. But then
I notice that it doesn't look like a Jonathan or a Gala. I've discovered
something NEW. And now I'm not sure if my original logic
will apply to this new apple. Chances are good. 2 out of 3 red apples
have proven to be tasty so far, so I'm betting this one will be too.
But I cannot say for sure, because my original logic was created when
I thought there were only 3 kinds of red apples. Unlike a computer
program, which would just blindly apply the logic, a person knows not
to, because it's for a scenario that is different.

So to build better computer programs, that are more adaptable to change,
we could try and apply a few of the concepts from human learning:
- all logic keeps a tie back to a reason for its existence
- all logic keeps a tally of the number of times it has executed against
different kinds of data. If it can recognize a new situation, then it
can know when it is entering uncharted waters and should proceed more
slowly (i.e. re-examining the whys before actually executing the current
codified logic, since that logic was built on old assumptions and
the NEW situation might change one of them).
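
The two bullets above could be sketched roughly like this. ReasonedRule and all
of its members are hypothetical names made up for this illustration, and "kind
of data" is simplified to "distinct input value"; a real program would need its
own notion of what counts as a new situation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: pairs a piece of logic with the reason it exists
// and a tally of the distinct inputs it has already been run against.
public class ReasonedRule<T>
{
    private readonly Func<T, bool> _logic;
    private readonly Dictionary<T, int> _tally = new Dictionary<T, int>();

    // The tie back to why this logic exists.
    public string Reason { get; private set; }

    public ReasonedRule(string reason, Func<T, bool> logic)
    {
        Reason = reason;
        _logic = logic;
    }

    // True when this rule has never been applied to this input before.
    public bool IsNew(T input)
    {
        return !_tally.ContainsKey(input);
    }

    // How many times the rule has been applied to this input.
    public int TimesSeen(T input)
    {
        int n;
        return _tally.TryGetValue(input, out n) ? n : 0;
    }

    // Runs the logic and updates the tally. What "proceed more slowly"
    // means is left to the caller, e.g. logging Reason when IsNew is true.
    public bool Apply(T input)
    {
        int n;
        _tally.TryGetValue(input, out n);   // n stays 0 for an unseen input
        _tally[input] = n + 1;
        return _logic(input);
    }
}
```

A caller would check IsNew before Apply. For the apple rule ("I like any red
apple, except Red Delicious"), an unfamiliar red apple would be flagged as new,
and the caller could stop and show the Reason instead of blindly digging in.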


This last bullet is probably the reason humans do not like doing repetitive
tasks. They like variety. They like to learn. They do not want to do
a mechanical task 1,000 times in a row, because they are discretely AWARE
of each occurrence, and AWARE of the blatant similarity in each occurrence.
The sameness is killer. Humans cannot turn off the introspection, it is always
on (well, unless we're sleepwalking). We are not merely DOING, we are also
THINKING about what we are doing, and trying to make sense of it. It is the
struggle of "making sense" that we love, because it's an indication of
learning, discovering something new - if only just slight variations in
experience. In the absence of the struggle, in the presence of perfect
SAMENESS, we are bored senseless. Our raison d'etre is gone. Solitary
confinement is the prime example of how sameness is killer, because of the
acute absence of anything to learn.

The last bullet also highlights how this human concept, INTROSPECTION (record
keeping of every task we perform and identifying it as expected/familiar or
new), could never be applied to all computer programs. Sometimes, we really
want them to be dumb. The virtue of dumb and mechanical is that it never gets
bored. It never seeks to pause and learn. It just executes, and does it with
blazing, no-holds-barred speed.

Our very first example, the IE CSS engine change, is a good example of where
INTROSPECTION would be very difficult to apply. The logic is "width: 100%".
This statement, when issued to the browser, is expecting that the div tag
will be stretched to the far right of the viewable window. After handing
off this instruction to the browser, should it always check to make
sure that what it expected to happen did in fact happen?


Let's back up again, and consider humans. Do we ever take action before
fully realizing a new situation is upon us, and that our existing logic really
won't work? Another way of asking this is, do we make mistakes? (Of course).
And what do we do? Well, at some point we realize the mistake, and then we
do a bit of back-tracking to find the cause. We repeat the same task we
just did, but a little slower, looking for something out of the norm.

What this implies is that humans do in fact have highly-compressed logic,
which we can use without much thought. I recall driving down a familiar
highway at night and realizing that even while my brain was completely elsewhere,
my eyes watched the yellow line and adjusted my hand on the steering wheel,
almost like they were doing this mechanically without any need for conscious thought.

Back to the IE CSS-engine, we issue "width: 100%" to the browser mechanically,
without any checks. It worked a million times before, why should we doubt it?
But a little later, we realize that we cannot see an item on the page, because
it's way too far to the right, off the viewable window. Hmmm, we've never had
to scroll to the right before in order to see that field. Something is amiss!
And when something is amiss, we fall back and start wondering if one of our
assumptions is wrong. We replay, more slowly, and instead of using our optimized
thoughtless logic, we now use our back-up "whys" to guide us, and check outputs
after each one of our actions, to see if they match what we expect. Eventually,
we discover that issuing "width: 100%" does not stretch the floating panel as
we expected it to. It stretches it much wider. We do not know why exactly,
but we have narrowed the problem down to this, and we can begin to wonder, "well
where did our assumption come from, that width:100% would only stretch to the
window's right side?". Where did it come from indeed. Well, we learned it by
experimenting with IE. We played around with these styles and width:100% always
did the same thing. But why is it now different? Our browser version hasn't
changed. Must be some other field on the page that is affecting the behavior
of this style-setting. We trim down the logic, reduce it to the bare essentials.
Still it misbehaves. Maybe we notice the xmlns="http://www.w3.org/1999/xhtml".
If we do, then like everything else, we'd try removing it to see if it has an
effect. Ah ha! It does! We have learned something new. Or maybe we don't notice
it. Or we see it, but just assume that it's a non-functional statement.
So we don't bother trying to remove it. But we go to an
old version of the app, where the floating panel still works. We do the same
reduction. At some point, we would diff the two files and see the
xmlns="http://www.w3.org/1999/xhtml" as the only thing left to try, the only
outstanding difference. Upon removing it, we would discover that this harmless
looking statement is indeed affecting the functionality.

So the point is codified/optimized logic came from somewhere. We need to keep that
tie whenever something new is discovered. We may have to recreate the codified
logic due to a shift in our assumptions. Our higher-level intentions rarely change
drastically. It's the low-level details though that will be in constant flux
as minor shifts in our knowledge of the world occur. Low-level code is in fact quite
expendable, or at least it should be. There is a disconcerting question for any
large app that has lived 5+ years: what percentage of it is dead code? What percentage
of it is written in a contorted fashion? And why did it get like this? Well, for a
large app, we've lost the ties to original intent/original reason. The codified logic, in
its current state, becomes a trusted source of rationale. SOMEBODY asked for the app
to work this way, we are no longer sure why exactly. But in the absence of requirements
or specs or an actual USER who can tell us the business, we have to assume the existing
codified logic is correct. And when new assumptions come along, we cannot properly
mix them with the old assumptions to create new clean logic. We are not even able to
identify if an assumption is new. We just learned it, and it seems like it's hot off
the press, straight from the USERs. But did the original developers already know about
this assumption? Since we don't have the list of original assumptions, how would we
even be able to tell if this is a new one?

A great deal of time is spent reverse-engineering the existing application's codified
logic, to figure out what assumptions the original developers were working with. That
logic has been tested in the fires of reality. We DO NOT trust written documentation!
Who knows when that doco was written. It was probably written at a very naive stage
of the development process, when everyone thought everything was fine, they had it all
figured out, and they had enough time on their hands to actually write doco. Then the
deadline approaches, and the app isn't working. A veteran developer works the 11th hour
discovering all kinds of important details no one had bothered to learn previously,
things fundamental to making the app work. He barely has time to breathe. Does he write
down any of those details he just discovered? Can he even remember them all after
the task is done? Or is his brain so overloaded he goes straight to bed, and he's
just amazed that he even got the app to actually WORK, to pass all the test cases?
Don't ask him how exactly it works, thank goodness it just does. Of course his details
ARE written down...but it's in CODIFIED LOGIC. That's good enough
to make the computer do the task, and consequently that's all anyone really cares
about at the moment. When we revisit this logic to ENHANCE it, there will be some
cursing as some other developer has to reverse-engineer the code to find out what
that previous developer discovered in that 11th hour.




MIXING

The REAL problem with code is that it's a mix of needs. This is to say, as
we codify needs, they do not remain isolated/separated. They get mixed
in together with other needs. The level of mixing varies, but at the extreme
level the code is a slurry where the different parts can no longer be discerned.
In fact, we can't even be sure what all the original needs were. If we look,
the code is an onslaught of low-level details. We start with questions like "why
would they increment i after the comparison to x?", but if there are too
many of these questions all in the same logic, our hypothesis
seeking/testing becomes overwhelmed and the whole reverse-engineering effort becomes
very very difficult.

Think of code like cookie dough. It starts out as sugar, flour, eggs, etc.
But then these all get mixed together, and now it's hard to change our
minds. After a few stirs of the handle, taking out the flour is no longer
an option (unless you have tweezers and some spare time).

A lot of development can be described as tweezer work. Users/managers
realize they really want wheat flour instead of white, and the tedious pain begins.

All IT philosophies stress "know what you want before you build it".
But this cannot be realized in an ABSOLUTE sense. Clearer communication and better
requirements gathering help to the extent that they can, but there will always be changes in
direction. That's just inevitable, so the real push in IT comes from the
other direction. In small degrees, developers have found
ways to keep the needs separated during the codify process. That is, we've
found ways to avoid unnecessary mixing. This is a step in the direction of
an IT Holy Grail: agile applications, where on a whim we can change the
requirements, adjust one small part of the app, and the rest of it doesn't break.
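
As one small illustration of avoiding unnecessary mixing, consider the GetCount
example from earlier, where "return the count" was blended with "the user needs
to see the count quickly". A rough sketch of keeping those two needs apart is
below. CachedLookup is a hypothetical helper invented for this illustration,
not something from the original application.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: holds only the "must be fast" need and its
// assumption (results do not change while the process is alive).
// It knows nothing about what is being computed.
public class CachedLookup<TKey, TValue>
{
    private readonly Func<TKey, TValue> _compute;
    private readonly Dictionary<TKey, TValue> _cache =
        new Dictionary<TKey, TValue>();

    public CachedLookup(Func<TKey, TValue> compute)
    {
        _compute = compute;
    }

    // Returns the remembered value, computing it only on first request.
    public TValue Get(TKey key)
    {
        TValue value;
        if (!_cache.TryGetValue(key, out value))
        {
            value = _compute(key);
            _cache[key] = value;
        }
        return value;
    }

    // If the "does not change" assumption ever breaks, this is the only
    // place that has to react; the computing logic is untouched.
    public void Clear()
    {
        _cache.Clear();
    }
}
```

The counting itself would then live in a plain method that only counts, and
something like new CachedLookup<Classification, int>(CountGroups) would be
wrapped around it (CountGroups being a made-up name here). If users later say
classifications can change after all, that is a change to the wrapper, not
tweezer work inside the counting loop.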