Tuesday, August 29, 2017

How Codebases Rot

A lot of guys in my line of work have no use for sports.  But the passing, a few months ago, of the great sports writer and commentator Frank Deford means I'm going to talk about sports.  Or at least about sports writing.  On NPR every Wednesday morning for decades, in what The New Yorker's Nicholas Dawidoff calls a "breezy vernacular," Frank told a long-form sports story that made me feel like I -- a non-athlete and professional nerd -- understood what sports was about.

Frank was up front with Dawidoff about his secret -- Chekhov's rifle.  Chekhov, the great Russian playwright, said that if there's a rifle above the fireplace in Act One, by the end of the play someone had better take it down and use it.  Deford always opened his reports with some interesting sports tidbit, then spun a seemingly unrelated tale that always managed to find its way back to home plate and that opening tidbit.  Without quite realizing it, you knew he was going to do this, so you listened.

Despite years of listening to Frank Deford, I'm no good at sports tidbits.  But I do know history-of-computing tidbits.  The venerable Fred Brooks authored one of the earliest and still-best books on software engineering project management and left us with Brooks' Law:
How does a large software project get to be one year late? One day at a time!
The notion of incremental crapulence attends us in many ways other than just schedule slippage.  I'm going to focus on one of those other ways, with an example.

Code Rot

"Code rot" is industry jargon for the effects of various forces that tend, over time, to make code more buggy, less maintainable, and less reusable.  We use the metaphor of decay to illustrate that this seems to happen without deliberate effort.  Code that sits largely unchanged will still start to "smell," giving off odors such as API skew, abandoned idioms, obsolete patches, and the like.  Especially when its original developers move on to other companies or tasks, rotten stanzas of code whose original purpose is known only to the departed author begin to fester and beget bugs.

Let me tell you a typical story of rot.

There once was a great software stack for a best-selling appliance.  It incorporated everything, as we say, "from sand to pixels" -- custom hardware and drivers all the way through the operating system and middleware applications to an attractive web GUI.  But as with many such stacks it had evolved in places, stagnated in others, and was about as rotten as a codebase could get while still being profitable.  It had originally been written by a small startup and arrived at a major corporation through a series of acquisitions and reorganizations.  The OS was ten years old (although security-patched).  The build had splintered into a half-dozen incompatible tool chains, the bugaboo of more than a decade of unbridled developer preference.

The newly minted head of the fifty-strong development team immediately prescribed a refresh.  The OS was traded out for a brand-new commercial-grade Linux distro.  The build was normalized to a single compiler and set of libraries.  A handful of the best and brightest on this team worked six months to refresh the entire system from top to bottom.  All the old code rot was gone; the system gleamed and hummed like a lovingly restored Shelby Mustang.

Everything new is old again

All was fine until a developer told me he'd been chastised for making some needed changes in a particular application, which we'll call Whiskey.  We'll call the developer Mike.

Whiskey was the newest member of the software stack.  Prior to the great refresh, it had been held up as a glimpse of the future.  It used the latest tools and adhered to the 2011 standard of C++, with aspirations to move to C++14.  After the refresh, it was almost entirely free of rot.

Almost.  Soon after, the lead developer -- we'll call him Peter -- had checked into the Whiskey code base a new and safer database layer and used it in one of Whiskey's modules.  Mike had made some changes to that module, and Peter was furious.  So we called a meeting of all the stakeholders, including George the system architect.

"Mike didn't understand the workings of this code," began Peter.  "It uses a generic framework and relies on generated modules to do the detailed work."  Mike had unknowingly made his changes in generated code.  Okay, generic frameworks and modules are good.  Automatic generation of tedious code is good.  Should be easy to get everyone on the same page.

"All right, let's start at the beginning," I said.  "Where did this framework come from?"  Come to find out it had been copy-pasted from elsewhere in the system and modified to interface with Whiskey's tool chain and reporting APIs.  "So we have our first problem.  We have two slightly different versions of the same code being compiled and run in different apps.  That's undesirable from a maintenance perspective."  Our refresh team had spent considerable effort collecting and centralizing all pseudo-shared and pseudo-reused code.

Turns out this "generic" framework wasn't as generic as Peter had advertised.  It had to be modified for use in each different application, and Peter hadn't bothered to backport or refactor in order to achieve his goals for genuine reuse.  As usual, schedule pressure had persuaded him to add that to the tech-debt pile and move on.

"I saw the code generator," said Mike.  "But it wasn't being invoked in the build.  There was just this file in the SCM repository and that's what needed to be changed.  So that's where I made the change."  That surely merited a follow-up.

"Wait -- so the generated code was checked into the SCM?"

"Yeah, it had to be," said Peter.  "The generated code needed some rework in order to fit into Whiskey, so I manually ran the code generator, made the edits, and then checked in the edited, generated code."  George was starting to look a little worried.

I pressed the question.  "So it sounds like Mike was only following the policy you yourself put into place."  Manually-generated, hand-patched code was part of the rot we'd worked so hard to eradicate.  It was bothersome for developers unfamiliar with parts of the system to have to figure out the "secret sauce" for building the parts that weren't automated.  "And that's not a good policy, certainly not one we want to embrace moving forward."

Of course Peter knew that.  You didn't get to be a lead developer without knowing what best practices were supposed to be.  You were supposed to go back and fix the code generator so that the code it generated worked wherever it was supposed to.  You were supposed to incorporate the code generation step into the build so that developers would know to make their changes to the generator's prototypes.  But schedule pressure and momentum had overcome Peter's desire to do the right thing instead of the expedient thing, and now there was Mike lying there under the bus and George shaking his head at seeing athlete's foot start to grow on his brand-new creation.  (See, we're still sort of talking about sports.)

Happy ending, of course.  Mike committed to learning the ins and outs of the code generator so that he could backport his fixes.  Peter committed to fixing the code generator and integrating it into the Whiskey build, pulling his one-off changes out of the SCM.
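The fix generalizes beyond Whiskey: generated code is built, not committed.  Here's a minimal sketch in Python of what that looks like; the generator, schema format, and file names are hypothetical stand-ins of my own invention, not Whiskey's actual tooling.

```python
# Sketch of "generate at build time, never commit the output".
# render_accessors and the whitespace-separated "schema" format are
# hypothetical stand-ins for a real code generator and its input.
from pathlib import Path


def render_accessors(schema_text: str) -> str:
    """Toy code generator: emit one accessor declaration per schema field."""
    lines = ["// AUTO-GENERATED -- do not edit by hand; fix the generator instead."]
    for field in schema_text.split():
        lines.append(f"std::string get_{field}(const Row& row);")
    return "\n".join(lines) + "\n"


def build_step(schema_path: Path, out_path: Path) -> None:
    """Run unconditionally as part of the build.

    Only the schema and the generator live in the SCM; the output file
    is recreated on every build and never checked in.
    """
    out_path.write_text(render_accessors(schema_path.read_text()))
```

The point is that the SCM holds only the inputs; because the generation step runs on every build, there is no hand-patched output for the next developer to trip over.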

Jay's Corollary to Brooks' Law

So here's my corollary to Brooks:
How do codebases rot?  One shortcut at a time.
Keep in mind that everything still worked after the shortcuts.  The code built and ran correctly.  Out of the 400 or so source files that made up Whiskey, only one of them needed this special handling.  But that's only at the outset.  Incremental crapulence is, well, incremental.  A shortcut doesn't seem so intolerable at the time you take it.  The first file to become a special case is the first spore of rot.  People normalize to it and begin to work that way habitually.  The Peters help us meet important deadlines.  The Mikes do their best to keep rot at bay.  And the Georges keep us profitable.

Fred Brooks didn't write a book on how to manage a software development project.  Brooks used software development as a framework to describe how people behave in groups when given a task to accomplish.  And you either come to grips with that group behavior or you go under, no matter what your task.  One of Brooks' discoveries -- the self-organizing team -- is still the core of nearly all today's Agile methods.  Some people are the short-cut developers who get quickly to the milestones and others are the housekeeper developers who keep things tidy and maintainable.  It's like having the heavy-hitters who can knock one out of the park, and the clean-up batters who can hit consistently to some neglected corner of the infield, if not always for a home run.  You need all of those to make a team, so that their strengths compound and their weaknesses cancel out.

There, a baseball metaphor.  I've shot Chekhov's rifle and brought us back to what Frank Deford was doing all those years.  Deford wasn't really writing about sports.  Like Chekhov and Brooks, he was using what he knew to frame important insights about people.  And codebases rot because people are people, and they take one seemingly insignificant shortcut at a time.  It's management's job to make sure the clean-up batters on the software team have not only the key positions in the lineup, but are allowed to do what they do to save the team.  Thanks, Frank.  I'll miss you.

Wednesday, March 29, 2017

Why the passage of HJR86 is so bad

Let's say one thing up front.  When the U.S. House of Representatives passed HJR86 yesterday, sending it to President Trump's desk, nothing actually changed.  The FCC rule -- the one that requires ISPs to get your consent before they sell your private information -- had not yet gone into effect.  The rhetoric in the blogosphere leading up to the vote, however, made it sound like people believed their private data was already protected.  It's a no-brainer that people to whom you entrust such intimately personal information as what you do online ought to get your permission before they sell it to third parties for their profit.  During the debate on the House floor, one opponent of the bill to abolish the rule urged its supporters to leave Capitol Hill for five minutes and try to find three normal people who didn't want the opt-in consent requirement.  Even worse, since Congress nixed the rule via the "nuclear option" of the Congressional Review Act, no similar rule can be made except by explicit act of Congress.  A future FCC or FTC is now forbidden to regulate ISPs in the way Americans overwhelmingly want.

It gets worse.

To understand the real impact of this bill, we need to dissect the argument by which big corporate ISPs fought to have it passed.  Social networks like Facebook and portals like Google are allowed to sell the information they obtain when you use their services.  If you like a post on Facebook, Facebook is allowed to record that you did that and use it for its own marketing purposes, or to sell it to partners for whatever they want to use it for.  It's part of the terms of service.  If you don't like it, don't use Facebook.  (I don't like it, so I don't use Facebook.)

Big ISPs like Comcast argued they should be able to do the same thing.  They persuaded the more business-friendly factions of Congress and the FCC that they were being treated unfairly and demanded a level playing field in order to compete for marketing-data dollars with other major players.  Facebook is able to build up profiles of its users based on their activity.  Turns out those profiles are worth quite a bit of money in the marketplace of attention.  People looking to promote goods and services want to efficiently target their ad dollars.  They'd rather pay Facebook for a list of web sites you visit than to shotgun their promotions to everyone hoping to interest even just a tiny fraction.

That's a great lesson in market forces until you realize that Comcast isn't at all like Facebook for this particular regulatory purpose.

I control the information I put out on social media.  The social media provider may indeed sell that, but he can only sell what I voluntarily provide.  Facebook doesn't know that you also lurk anonymously on Brony forums unless you explicitly tell them so.  The same is generally true for any service endpoint.  What they can know about you is generally limited to what you have to share with them within the confines of the service they provide to you.  The profile my travel booking site has on me is limited to the travel habits they can infer from my use of their service.  My pizza-ordering habits are known to the local pizzeria and not generally to anyone else.  They can sell my pizza profile to the highest bidder, but there's a limit to how much such a thing is worth.

My ISP sees everything.  That puts them in a unique position to build up a much more comprehensive profile than any one service endpoint could achieve.  That's immediately alarming because it enables metadata analysis.

The practice of analyzing metadata rocketed to public attention when Edward Snowden revealed that American intelligence services were routinely collecting the metadata from the communications of millions of unsuspecting Americans.  Phone call metadata includes the number dialed and how long the call lasted, but not the content of the conversation.  Thanks to the limits of aging regulations, that information isn't covered by the Fourth Amendment, just like what's written on the outside of an envelope.  You have a right to privacy in what was said, but you don't have a right to privacy in the fact that the conversation took place via a voluntarily-contracted third-party service.

Metadata analysis attempts to infer useful information from those facts, without having to delve into content.  And we do it because it works.  Not only does it work, it works very well.  Everyone knows about the intelligence-gathering applications.  But they don't necessarily know that even ordinary, commercially-available cybersecurity solutions use it.  ISPs and enterprise businesses rely heavily on metadata analysis systems to protect their networks from intrusions and exploits.  They know how valuable metadata is.

And it works better the more comprehensive the metadata.  All someone could learn from my pizza-ordering habits is that I don't like anchovies or chewy crust.  That profile has limited value because it comes from only one sector of my daily activity.  What if the metadata profile were able to aggregate information from several unconnected sectors?  ISPs know how much more valuable their particular metadata is.

And worst of all, what if an ISP could do this regardless of any privacy agreements I have with the endpoint providers?

The analogy that's going around the net today is to the phone company.  Let's say the county health department calls me up to give me the results of an anonymous test.  Let's say it's bad news.  So I call up my doctor and discuss the diagnosis and treatment.  Then I call my mom to tell her what's up.  Individually, each of those phone calls is protected by prior agreements of client privilege and privacy.  My doctor isn't selling my medical records in order to make a buck on the side.  But the phone company is in a unique position to know that I had phone calls, in rapid succession, with (1) a health monitoring facility, (2) a doctor, and (3) a close relative.  That information alone might be very interesting to my insurance company or employer because of what can be easily inferred from it.  And this is a fairly on-the-nose example.  In real life, metadata analysis is able to infer an astonishing amount of correct information from even more nebulous connections.

As a matter of policy, the phone company doesn't sell that sort of information.  But that's exactly what ISPs can do.  They can sell to anyone, for any reason, a comprehensive profile of you that has been acquired using their comparatively godlike powers of observation over all facets of your life.  That comprehensive perspective is why they aren't like social media or the other limited forums to which they insist they should be compared.  The voluntary and limited usage from which Facebook must infer its profile of you is precisely what justifies its being allowed to sell it.

Internet service isn't an optional novelty these days.  You don't have the luxury of just not using the Internet.  Even for the most disadvantaged Americans, access to services such as low-cost healthcare and public assistance requires access to the Internet to manage the case.  While we're not yet to the point of the Internet being a mandatory service, we're close.  Close enough to regulate ISPs as a service that people cannot easily choose not to have.  And in most markets there isn't meaningful competition for broadband access.  That's why one of the rules that HJR86 eliminated would have prevented broadband ISPs operating practically as monopolies in a market from insisting that you opt into their data-sharing program as a condition of service.

Comcast and others insist they just want a level playing field.  But it's not level; it's significantly downhill for them compared to the companies they designate as competitors.  They insist they should be allowed to merely innovate like all the others.  But their ability to see everything you do gives them the power to create a profile none of their competitors can hope to match.  Now you see why they're antsy to enter the market for your private data.  It would certainly be "innovative" for my doctor to sell all his patient data to pharmaceutical firms.  They'd be better able to target their ads and he'd be able to recognize a new revenue stream.  But we instantly recognize that would be immoral.  And the new FCC, Comcast, and Congress don't want to talk about whether or not their policy toward ISPs is moral.  Businesses make money, therefore it must be good.

Friday, March 7, 2014

Rage Dictionary: essential complexity

Essential complexity is the inherent complexity of any solution to a problem after paring away the particulars of how the solution is implemented by some engineering means.  A measure of the simplicity and efficiency of a candidate solution is how well it solves the entire problem using as few constructs as possible.
"Consider the complexity on a scale from 1 to Java..." --one of my software engineers
In programming terms, the essential complexity of a task such as "Determine whether a list contains duplicate elements" dictates that every element must be examined at least once -- and, if equality comparison is all you have, potentially every pair.  There is the question of loop optimization, to be sure, but essential complexity looks at the problem space and the algorithm, while the remaining complexity is taken up with the mechanics of the data structure, its iterators, and whatever syntactical framework is required.
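To make the division concrete, here's a small sketch in Python (mine, not from any canonical source).  Both functions solve the same problem; the essential work of examining elements is identical, while the accidental mechanics differ.

```python
def has_duplicates_pairwise(items):
    """Comparison-only approach: examines pairs of elements, O(n^2)."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_hashed(items):
    """Hash-set approach: each element examined once, O(n) expected.

    The essential work (look at every element) remains; the accidental
    pairwise comparisons are traded away for a better data structure.
    """
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

Neither version escapes the essential requirement of visiting every element; the second merely sheds accidental complexity, at the cost of requiring hashable elements.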

Programs therefore commit one of two converse errors.  Incorrect programs often oversimplify the problem and miss corner cases.  Corner cases are part of the essential complexity of the problem; handling them elegantly and efficiently is part of the task.

But more often programmers err by addressing parasitic or imaginary requirements through a plethora of poorly-managed, strung-together components, an overgeneralized framework, or other elements that add complexity to the program solution well above and beyond what is required to solve the problem.

Wednesday, March 5, 2014

Rage Dictionary: tracking development

Tracking development is a special case of recurring engineering in which the integration of an outside (typically third-party) application or code-base component requires non-trivial revision in a glue layer or the component code itself every time the foreign component is revised.  It should be avoided for several reasons.
  • Recurring engineering generally exceeds the one-time cost of proper up-front design over the life of the product.
  • It complicates component upgrades and reduces the incentive to stay current.
  • Pollution of third-party code bases may introduce defects for which the third-party vendor's test program has no test.
  • It is a common omission in workflow when a project is transferred among developers.
  • End users who know the component is included may be surprised when it behaves differently from how they expect.
Tracking development is not merely coping with changes introduced in newer versions of a foreign component.  That is ordinary software refreshment and occurs on an ad hoc basis according to the component's development plan.  Instead, tracking development denotes modifications made to the foreign component itself for inter-operation with the host system, such as adding local options to a configuration file by means of modifying the parsing function.  These modifications must be repeated for every release of the component.

Techniques for avoiding the burden of tracking development include (in order of preference):
  1. Careful selection of the foreign component, and limiting interaction with it to its documented out-of-the-box behavior.
  2. Collaboration with the component vendor to support the needed interaction.
  3. Wrappers and glue layers that translate between the component's out-of-the-box behavior and the needed behavior.
  4. Expressing code-level modifications as patch(1) files.
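As an illustration of technique 3, here's a minimal sketch in Python.  Everything in it is hypothetical -- the "vendor" parser stands in for any third-party component whose code you must not touch -- but it shows the shape of a glue layer that adds local behavior without modifying the foreign component.

```python
# Sketch of a glue layer that adds local configuration options on top of a
# vendor component without modifying the component itself.  vendor_parse is
# a hypothetical stand-in for a third-party parser (simple key=value lines).

def vendor_parse(text: str) -> dict:
    """Stand-in for the third-party parser we must not modify."""
    pairs = (line.split("=", 1) for line in text.splitlines() if "=" in line)
    return {key.strip(): value.strip() for key, value in pairs}


def load_config(vendor_text: str, local_text: str) -> dict:
    """Glue layer: parse the vendor config as shipped, then overlay local
    options kept in a separate file and parsed separately.  The vendor
    component's code and file format are untouched."""
    config = vendor_parse(vendor_text)
    config.update(vendor_parse(local_text))  # local options win on conflict
    return config
```

Because the vendor parser is invoked only through its documented interface, upgrading the component becomes a drop-in replacement rather than a fresh round of re-patching.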

Tuesday, February 11, 2014

The Right Way to Get Support

Forwarded to me by a front-line customer support representative:
To Hosting Provider: [redacted]
re: Domain [redacted]
My SQL DB has decided that it will pretend to be invisible. I have tried to repair, update, and upgrade to no avail. Besides playing it comfort music and preparing it grilled cheese sandwiches and tomato soup, I am officially lost. Oh ya, and the Backup in [web panel] won't load... Totally not zesty. Ya dig?
As the senior engineer I deal with escalations of last resort.  By the time a customer's issue reaches me it has already gone through two tiers of support, and I have nowhere to escalate it further.  Sadly, unlike the Supreme Court, I can't just let stand the lower court's decision.  I have to fix it.

Software developers of all ranks hate escalated support requests more than an empty soda machine.  In practical terms escalations wreck a well-planned day, since they typically have to be dealt with before going on to forward-looking work -- and production schedules don't yield.  In softer, fuzzier terms, software developers take a hit to morale every time their software doesn't work as planned, or some user has failed to grok its obvious genius.

The excerpt above is how you blow past all that and get good service.  The user has a point.  He's tried all the "usual stuff" before picking up the phone.  And he tells the representative this, so we know where to start.  But the obvious win is the humor.  It's going to win you top-notch effort from all involved, including the senior engineer who may eventually have to drop what he's doing and investigate your problem.

An approach like this is far and away better than berating the support representative with profanity, telling him how much your business is losing for every second your web site isn't working, demanding to speak to supervisors, etc.  A soft answer turneth away wrath, saith the good book.  In the same vein, a cleverly worded, deferential, entertaining trouble ticket not only turneth away wrath and getteth really good service, it can windeth up immortalized on someone's blog.

Tuesday, December 3, 2013

What's in a name?

What's a developer? A programmer? A software engineer?  I think it matters, because how we describe ourselves tells us a lot about how we approach our jobs.  And that's not a value judgment, because the industry is quite vast and thus necessarily encompasses a wide range of approaches.  Here's my take.

Computer Scientist

This describes the field in academic terms.  For decades "computer science" referred to the abstract principles by which computing happens, and it's what went on the diplomas, regardless of what graduates actually went on to do.  People who call themselves computer scientists want you to see them as theorists and brainiacs, even if they also write code.  They tend toward the analytical activity in the industry and will produce more paper than code if you let them.  They are notorious for setting aside "good enough" in the pursuit of "best."

However, lately the industry has suffered from a lack of computer science among its practitioners -- many of whom enter with a paucity of formal training.  Early Fortran is an example of what you get without some good theory behind you -- a disorganized mess.  Sound computation theory gave us such abstractions as map-reduce (the genius behind the Google search engine's scalability) and useful technologies such as face recognition.  Closer to home, practitioners with a sound theoretical vocabulary are more apt to understand the more esoteric factors that contribute to the success of a software project.

Software Engineer

Full disclosure:  I'm an engineer who frequently writes software.  That's different in many respects from being a software engineer, as I will explain.  The term "engineer" was thrown around far too often in the 1990s and 2000s, used to describe practically any technical job.  I bristled when the term "sales engineer" entered the business vocabulary.

True engineering is a licensed profession.  The road to certification is long and rigorous, and requires passing the Core Engineering curriculum in college.  All engineers, regardless of specialization, must do this.  Now very few software jobs require an expert understanding of thermodynamics and differential equations, but that's not the point.  The core engineering curriculum teaches a specific, rigorous approach to problem analysis and solving.  It also teaches the important concept of how technical requirements and limitations are balanced against schedules, budgets, and natural limitations.

Computer scientists learn this too, but not in nearly as helpful and rigorous a way.  Engineering is not just building machines -- it's how to think about all the things that contribute to a problem and its various solutions.  It's about creating methods that keep you safe and honest, and sticking to them.  It's about knowing as much as you can before you need to know it.  Engineers ask questions like "What happens if this piece breaks?"  "Can this be made simpler?"  "How are we going to build and test this?"  "Is this component good enough?"

The fact remains that as engineers use the term, there is very little engineering in software engineering.  But a "software engineer" is someone who wants you to know that he's producing a software product on a prescribed budget, within a predictable schedule, having the functional properties listed, and with as few parasitic behaviors as possible.  Well-engineered software is everything you need and nothing you don't, no surprises -- software engineers are those who know how to get that.

Software Architect

This term just needs to go away.  And I say that from the perspective of someone who does a lot of what this title is meant to convey -- the high-level, overarching design of software systems.  I embrace the activity, but I eschew the abstractionist overtone.

The title connotes that a certain person is responsible for the high-level or critical design elements of large-scale software, while the detailed component designs are left to others and do not concern the architect.  My father and sister are architects of the brick-and-mortar variety, and they assure me the title is misplaced.  High-level designs in physical architecture are not credible if they gloss over the problems that arise in the details.  So too with software.

This truth is confirmed from my personal experience with some of the so-called architects of systems and software.  I have seen so many high-level designs fail miserably because of naive assumptions for what the components are and what they will do, and most importantly:  how they will interact.  System behavior is not generally determined by the elegance of the high-level design.  It is not determined by the individual behavior of components -- although this is sometimes true, such as in the case of a mis-tuned database.  System behavior is most visibly the product of the interactions among components, most of which were not foreseen or accounted for by the architect.  The ability to see these interactions and explore their behavior before building the system is what constitutes good architecture.

This isn't a rant against high-level design, which obviously has to occur and be done correctly.  It's a harangue against those who practice only this type of design and consider it an appropriate division of labor.  Good design simultaneously considers the system at multiple levels of abstraction and implementation, and as proper subsystems having their own unique holistic behavior.  Insisting upon the title "software architect" conveys to me a dangerous indifference toward where the success of a software design truly lies, and improperly aggrandizes the role of high-level design.

Programmer / Coder / Analyst

I've long considered these to be mostly obsolete terms.  They're ironically the most straightforward, at least in terms of their common-sense meanings.

In decades of doing this kind of work, I've never figured out what an "analyst" is, either in software or in engineering.  In my experience, analysis (in the dictionary sense) is something everyone needs to do.  "Requirements analysis" is, at first, just reading comprehension.  Breaking down requirements into actionable tasks may elude some people, but really it's part of everyone's job.  Everyone learns to do it, so there isn't any need for a dedicated analyst.

Programmer used to be a broad, general term.  But now it seems synonymous with "coder," which in turn seems to suggest the foot-soldier of the software industry.  Again, this is part of everyone's job who has hands on the design and production of software, so one should never be just a coder.

In a more amusing sense, I consider these titles part of some 1970s ideal of software workflow in which necktie-wearing geniuses wrote software on paper pads -- yes, I still have a pad of "coding forms" -- and turned them over to talented typists who punched them onto manila cards without understanding what they meant.

Developer

I think "developer" is coming to the fore as a replacement for "engineer," and I applaud the change.  Not because I'm an elitist engineer, but because I think the development of software encompasses a range of activities that doesn't fit any of the earlier pigeonholes well.

Classical engineering operates in a world dominated by small design vocabularies and rigid silos of activity from design to test to manufacturing and operation.  It is successful in this pattern, but the pattern doesn't always fit software well.  The component software producer must have a broader scope of activity than the classical engineer, and therefore a broader scope of competency.

Calling yourself a software developer tells me you can work in requirements analysis, design, programming, unit and integration testing, and deployment with relatively equal success.  While the production of the actual finished code is obviously paramount, the software developer has a larger perspective.

More so than the others, this title lends itself to specialized prefixes.  People introduce themselves to me, for example, as "Windows developers" or "firmware developers," and this tells me important information.  However, the term "web developer" has come to comprise an astounding range of talent, from highly-skilled and highly-trained practitioners who apply their talents to products for the Web, to self-proclaimed (and often astonishingly incompetent) folks who simply read a book on PHP and now offer their services to a paying public.

Lest elitism rear its ugly head again, some of the worst program code I've ever seen has been written by electrical engineers and academic scientists -- in other words, by people with appreciable intellect and training (albeit in other fields).  "Web developers" most often arise from the ranks of graphic designers and other people skilled in visual or literary arts, who discover suddenly that computer programming is now part of their jobs.  The smarter and richer ones hire competent software developers to produce the technology behind their web sites.  But the height of pretense, in my book, is to install Wordpress, upload some content, and call yourself a "web developer."

That's probably going to be an unpopular opinion.  Production pressures and values dictate that this is probably how ninety percent of all web sites have been and will be produced.  And that's as it probably should be, since honestly I'd rather have the visual artists concentrating on what they do best, and because gear-heads are notorious for producing very ugly web sites.  But equating "web development" with "software engineering" or "computer science," when what you are doing is content management, is a bit unfair to the technically savvy whose title you're borrowing.

So there's the cast of characters as I see them, in late 2013.  Imagine what we'll be calling ourselves in ten years.