Tuesday, August 29, 2017

How Codebases Rot

Frank Deford
A lot of guys in my line of work have no use for sports.  But the passing, a few months ago, of the great sports writer and commentator Frank Deford means I'm going to talk about sports.  Or at least about sports writing.  On NPR every Wednesday morning for decades, in what The New Yorker's Nicholas Dawidoff calls a "breezy vernacular," Frank told a long-form sports story that made me feel like I -- a non-athlete and professional nerd -- understood what sports was about.

Frank was up front with Dawidoff about his secret -- Chekhov's rifle.  Chekhov, the great Russian playwright, said that if there's a rifle above the fireplace in Act One, by the end of the play someone had better take it down and use it.  Deford always opened his reports with some interesting sports tidbit, then spun a seemingly unrelated tale that always managed to find its way back to home plate and that opening tidbit.  Unconsciously you knew he was going to do this, so you listened.

Despite years of listening to Frank Deford, I'm no good at sports tidbits.  But I do know history-of-computing tidbits.  The venerable Fred Brooks authored one of the earliest and still-best books on software engineering project management, The Mythical Man-Month, and left us with Brooks' Law:
How does a large software project get to be one year late? One day at a time!
The notion of incremental crapulence shows up in many ways beyond schedule slippage.  I'm going to focus on one of those other ways, with an example.

Code Rot

"Code rot" is industry jargon for the effects of various forces that tend, over time, to make code more buggy, less maintainable, and less reusable.  We use the metaphor of decay to illustrate that this seems to happen without deliberate effort.  Code that sits largely unchanged will still start to "smell," giving off odors such as API skew, abandoned idioms, obsolete patches, and the like.  Especially when its original developers move on to other companies or tasks, then rotten stanzas of code whose original purpose is known only to the departed author begin to fester and beget bugs.

Let me tell you a typical story of rot.

There once was a great software stack for a best-selling appliance.  It incorporated everything, as we say, "from sand to pixels" -- custom hardware and drivers all the way through the operating system and middleware applications to an attractive web GUI.  But as with many such stacks it had evolved in places, stagnated in others, and was about as rotten as a codebase could get while still being profitable.  It had originally been written by a small startup and arrived at a major corporation through a series of acquisitions and reorganizations.  The OS was ten years old (although security-patched).  The build had splintered into a half-dozen incompatible tool chains, the legacy of more than a decade of unbridled developer preference.

The newly minted head of the fifty-strong development team immediately prescribed a refresh.  The OS was swapped out for a brand-new commercial-grade Linux distro.  The build was normalized to a single compiler and set of libraries.  A handful of the best and brightest on this team worked six months to refresh the entire system from top to bottom.  All the old code rot was gone; the system gleamed and hummed like a lovingly restored Shelby Mustang.

Everything new is old again

All was fine until a developer told me he'd been chastised for making some needed changes in a particular application, which we'll call Whiskey.  We'll call the developer Mike.

Whiskey was the newest member of the software stack.  Prior to the great refresh, it had been held up as a glimpse of the future.  It used the latest tools and adhered to the C++11 standard, with aspirations to move to C++14.  After the refresh, it was almost entirely free of rot.

Almost.  Soon after, the lead developer -- we'll call him Peter -- had checked into the Whiskey code base a new and safer database layer and used it in one of Whiskey's modules.  Mike had made some changes in that module, and Peter was furious.  So we called a meeting of all the stakeholders, including George, the system architect.

"Mike didn't understand the workings of this code," began Peter.  "It uses a generic framework and relies on generated modules to do the detailed work."  Mike had unknowingly made his changes in generated code.  Okay, generic frameworks and modules are good.  Automatic generation of tedious code is good.  Should be easy to get everyone on the same page.

"All right, lets start at the beginning," I said.  "Where did this framework come from?"  Come to find out it had been copy-pasted from elsewhere in the system and modified to interface with Whiskey's tool chain and reporting APIs.  "So we have our first problem.  We have two slightly different versions of the same code being compiled and run in different apps.  That's undesirable from a maintenance perspective."  Our refresh team had spent considerable effort collecting and centralizing all pseudo-shared and pseudo-reused code.

Turns out this "generic" framework wasn't as generic as Peter had advertised.  It had to be modified for use in each application, and Peter hadn't bothered to backport or refactor it to achieve genuine reuse.  As usual, schedule pressure had persuaded him to add that to the tech-debt pile and move on.

"I saw the code generator," said Mike.  "But it wasn't being invoked in the build.  There was just this file in the SCM repository and that's what needed to be changed.  So that's where I made the change."  That surely merited a follow-up.

"Wait -- so the generated code was checked into the SCM?"

"Yeah, it had to be," said Peter.  "The generated code needed some rework in order to fit into Whiskey, so I manually ran the code generator, made the edits, and then checked in the edited, generated code."  George was starting to look a little worried.

I pressed the question.  "So it sounds like Mike was only following the policy you yourself put into place."  Manually-generated, hand-patched code was part of the rot we'd worked so hard to eradicate.  It was bothersome for developers unfamiliar with parts of the system to have to figure out the "secret sauce" for building the parts that weren't automated.  "And that's not a good policy, certainly not one we want to embrace moving forward."

Of course Peter knew that.  You didn't get to be a lead developer without knowing what best practices were supposed to be.  You were supposed to go back and fix the code generator so that the code it generated worked wherever it was supposed to.  You were supposed to incorporate the code generation step into the build so that developers would know to make their changes to the generator's prototypes.  But schedule pressure and momentum had overcome Peter's desire to do the right thing instead of the expedient thing, and now there was Mike lying there under the bus and George shaking his head at seeing the athlete's foot start to grow on his brand-new creation.  (See, we're still sort of talking about sports.)
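
Here's roughly what the right policy looks like in practice.  This is only a sketch -- the generator script, the prototype file, and the output paths below are hypothetical stand-ins, not Whiskey's actual tooling -- but it captures the rule we wanted: the prototype and the generator live in the SCM, the build invokes the generator every time, and the generated code lands in the build output tree where nobody is tempted to hand-edit it or check it in.

#!/usr/bin/env python3
# generate_db_layer.py -- hypothetical example, not Whiskey's real generator.
# Invoked by the build, never run and patched by hand.  It reads a prototype
# file that IS checked into the SCM and writes generated C++ into the build
# output tree, which is NOT checked in.
import sys
from pathlib import Path

TEMPLATE = """// GENERATED FILE -- DO NOT EDIT.
// Change the prototype ({proto}) or the generator itself and rebuild.
#include "db_layer.h"

DbTable make_{table}_table() {{
    return DbTable("{table}");
}}
"""

def main() -> int:
    if len(sys.argv) != 3:
        print("usage: generate_db_layer.py <prototype> <output-dir>", file=sys.stderr)
        return 1
    proto = Path(sys.argv[1])      # e.g. src/db_layer.proto (lives in the repo)
    out_dir = Path(sys.argv[2])    # e.g. build/generated (never committed)
    out_dir.mkdir(parents=True, exist_ok=True)
    for table in proto.read_text().split():
        (out_dir / f"{table}_table.cpp").write_text(
            TEMPLATE.format(proto=proto.name, table=table))
    return 0

if __name__ == "__main__":
    sys.exit(main())

Wire a step like that into whatever build system the project uses and the "secret sauce" evaporates: the only things left for a developer like Mike to edit are the prototype and the generator itself, which is exactly where his changes belonged.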

Happy ending, of course.  Mike committed to learning the ins and outs of the code generator so that he could backport his fixes.  Peter committed to fixing the code generator and integrating it into the Whiskey build, pulling his one-off changes out of the SCM.

Jay's Corollary to Brooks' Law

So here's my corollary to Brooks:
How do codebases rot?  One shortcut at a time.
Keep in mind that everything still worked after the shortcuts.  The code built and ran correctly.  Out of the 400 or so source files that made up Whiskey, only one needed this special handling.  But that's only at the outset.  Incremental crapulence is, well, incremental.  A shortcut never seems so intolerable at the time you take it.  The first file to become a special case is the first spore of rot.  People normalize to it and begin to work that way habitually.  The Peters help us meet important deadlines.  The Mikes do their best to keep rot at bay.  And the Georges keep us profitable.

Fred Brooks didn't really write a book about how to manage a software development project.  Brooks used software development as a framework to describe how people behave in groups when given a task to accomplish.  And you either come to grips with that group behavior or you go under, no matter what your task.  One of Brooks' insights -- the self-organizing team -- is still the core of nearly all of today's Agile methods.  Some people are the short-cut developers who get to the milestones quickly, and others are the housekeeper developers who keep things tidy and maintainable.  It's like having the heavy hitters who can knock one out of the park, and the clean-up batters who can hit consistently to some neglected corner of the infield, if not always for a home run.  You need all of those to make a team, so that their strengths compound and their weaknesses cancel out.

There, a baseball metaphor.  I've fired Chekhov's rifle and brought us back to what Frank Deford was doing all those years.  Deford wasn't really writing about sports.  Like Chekhov and Brooks, he was using what he knew to frame important insights about people.  And codebases rot because people are people, and they take one seemingly insignificant shortcut at a time.  It's management's job to make sure the clean-up batters on the software team not only hold key positions in the lineup but are also allowed to do what they do to save the team.  Thanks, Frank.  I'll miss you.

Wednesday, March 29, 2017

Why the passage of HJR86 is so bad

Let's say one thing up front.  When the U.S. House of Representatives passed HJR86 yesterday, sending it to President Trump's desk, nothing actually changed.  The FCC rule -- the one that requires ISPs to get your consent before they sell your private information -- had not yet gone into effect.  The rhetoric in the blogosphere leading up to the vote, however, made it sound like people believed their private data was already protected.  It's a no-brainer that people to whom you entrust such intimately personal information as what you do online ought to get your permission before they sell it to third parties for their profit.  During the debate on the House floor, one opponent of the bill to abolish the rule urged its supporters to leave Capitol Hill for five minutes and try to find three normal people who didn't want the opt-in consent requirement.  Even worse, since Congress nixed the rule via the "nuclear option" of the Congressional Review Act, no similar rule can be made except by explicit act of Congress.  A future FCC or FTC is now forbidden to regulate ISPs in the way Americans overwhelmingly want.

It gets worse.

To understand the real impact of this bill, we need to dissect the argument by which big corporate ISPs fought to have it passed.  Social networks like Facebook and portals like Google are allowed to sell the information they obtain when you use their services.  If you like a post on Facebook, Facebook is allowed to record that you did that and use it for its own marketing purposes, or to sell it to partners for whatever they want to use it for.  It's part of the terms of service.  If you don't like it, don't use Facebook.  (I don't like it, so I don't use Facebook.)

Big ISPs like Comcast argued they should be able to do the same thing.  They persuaded the more business-friendly factions of Congress and the FCC that they were being treated unfairly and demanded a level playing field in order to compete for marketing-data dollars with other major players.  Facebook is able to build up profiles of its users based on their activity.  Turns out those profiles are worth quite a bit of money in the marketplace of attention.  People looking to promote goods and services want to efficiently target their ad dollars.  They'd rather pay Facebook for a list of web sites you visit than shotgun their promotions to everyone hoping to interest even just a tiny fraction.

That's a great lesson in market forces until you realize that Comcast isn't at all like Facebook for this particular regulatory purpose.

I control the information I put out on social media.  The social media provider may indeed sell that, but it can only sell what I voluntarily provide.  Facebook doesn't know that you also lurk anonymously on Brony forums unless you explicitly tell them so.  The same is generally true for any service endpoint.  What they can know about you is generally limited to what you have to share with them within the confines of the service they provide to you.  The profile my travel booking site has on me is limited to the travel habits they can infer from my use of their service.  My pizza-ordering habits are known to the local pizzeria and not generally to anyone else.  They can sell my pizza profile to the highest bidder, but there's a limit to how much such a thing is worth.

My ISP sees everything.  That puts them in a unique position to build up a much more comprehensive profile than any one service endpoint could achieve.  That's immediately alarming because it enables metadata analysis.

The practice of analyzing metadata rocketed to public attention when Edward Snowden revealed that American intelligence services were routinely collecting the metadata from the communications of millions of unsuspecting Americans.  Phone call metadata includes the number dialed and how long the call lasted, but not the content of the conversation.  Thanks to aging legal doctrine, that information isn't covered by the Fourth Amendment, just like what's written on the outside of an envelope.  You have a right to privacy in what was said, but you don't have a right to privacy in the fact that the conversation took place via a voluntarily-contracted third-party service.

Metadata analysis attempts to infer useful information from those facts, without having to delve into content.  And we do it because it works.  Not only does it work, it works very well.  Everyone knows about the intelligence-gathering applications.  But they don't necessarily know that even ordinary, commercially-available cybersecurity solutions use it.  ISPs and enterprise businesses rely heavily on metadata analysis systems to protect their networks from intrusions and exploits.  They know how valuable metadata is.

And it works better the more comprehensive the metadata.  All someone could learn from my pizza-ordering habits is that I don't like anchovies or chewy crust.  That profile has limited value because it comes from only one sector of my daily activity.  What if the metadata profile were able to aggregate information from several unconnected sectors?  ISPs know how much more valuable their particular metadata is.

And worst of all, what if an ISP could do this regardless of any privacy agreements I have with the endpoint providers?

The analogy that's going around the net today is to the phone company.  Let's say the county health department calls me up to give me the results of an anonymous test.  Let's say it's bad news.  So I call up my doctor and discuss the diagnosis and treatment.  Then I call my mom to tell her what's up.  Individually, each of those phone calls is protected by prior agreements of client privilege and privacy.  My doctor isn't selling my medical records in order to make a buck on the side.  But the phone company is in a unique position to know that I had phone calls, in rapid succession, with (1) a health monitoring facility, (2) a doctor, and (3) a close relative.  That information alone might be very interesting to my insurance company or employer because of what can be easily inferred from it.  And this is a fairly on-the-nose example.  In real life, metadata analysis is able to infer an astonishing amount of correct information from even more nebulous connections.
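
If it sounds like it would take a sophisticated system to make that inference, it wouldn't.  Here's a toy sketch -- the call records, the categories, and the two-hour window are all invented for illustration, and real metadata analysis is enormously more capable -- but it shows how little machinery it takes to turn bare facts about who called whom into a guess about my health.

#!/usr/bin/env python3
# Toy illustration only: invented call metadata (no content, just who and when),
# and a crude rule that flags a likely health event when calls to a health
# facility, a doctor, and a family member occur within a short window.
from datetime import datetime, timedelta

# (timestamp, category of the number dialed) -- the kind of thing a carrier
# or ISP can see without ever hearing a word of the conversation.
calls = [
    (datetime(2017, 3, 29, 9, 5),  "county health department"),
    (datetime(2017, 3, 29, 9, 40), "physician's office"),
    (datetime(2017, 3, 29, 10, 2), "family member"),
    (datetime(2017, 3, 29, 18, 0), "pizzeria"),
]

WINDOW = timedelta(hours=2)
PATTERN = {"county health department", "physician's office", "family member"}

def flags_health_event(records):
    """True if every category in PATTERN shows up within one WINDOW."""
    for i, (start, _) in enumerate(records):
        seen = {cat for ts, cat in records[i:] if ts - start <= WINDOW}
        if PATTERN <= seen:
            return True
    return False

print(flags_health_event(calls))   # True -- and no one listened to anything

Three timestamps and three categories are enough.  Now imagine the same kind of logic running over every connection an ISP carries for you.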

As a matter of policy, the phone company doesn't sell that sort of information.  But that's exactly what ISPs can do.  They can sell to anyone for any reason a comprehensive profile of you that has been acquired using their comparatively godlike powers of observation over all facets of your life.  That comprehensive perspective is why they aren't like social media or other limited forums to which they insist they should be compared.  The voluntary and limited information from which Facebook has to infer its profile of you is precisely what justifies its being allowed to build it.

Internet service isn't an optional novelty these days.  You don't have the luxury of just not using the Internet.  Even for the most disadvantaged Americans, access to services such as low-cost healthcare and public assistance requires Internet access just to manage a case.  While we're not yet to the point of the Internet being a mandatory service, we're close.  Close enough to regulate ISPs as providers of a service that people cannot easily choose not to have.  And in most markets there isn't meaningful competition for broadband access.  That's why one of the rules that HJR86 eliminated would have prevented broadband ISPs operating practically as monopolies in a market from insisting that you opt into their data-sharing program as a condition of service.

Comcast and others insist they just want a level playing field.  But it's not level; it's tilted steeply in their favor compared to the companies they designate as competitors.  They insist they should be allowed merely to innovate like all the others.  But their ability to see everything you do gives them the power to create a profile none of their competitors can hope to match.  Now you see why they're antsy to enter the market for your private data.  It would certainly be "innovative" for my doctor to sell all his patient data to pharmaceutical firms.  They'd be better able to target their ads and he'd be able to realize a new revenue stream.  But we instantly recognize that would be immoral.  And the new FCC, Comcast, and Congress don't want to talk about whether or not their policy toward ISPs is moral.  Businesses make money, therefore it must be good.