Sunday, September 2, 2012

Bugs, TDD and Functional Programming

I'm watching a BBC series produced long ago called "The Machine That Changed The World". At some point, as the narrator compares building software to building airplanes, buildings, and bridges, Mitch Kapor (founder of Lotus) says:
So if you've got something that's not holding its weight well, you look to
see if the joint is tight, or if the screws are right. You don't have to
go and analyze the whole building. Well, software doesn't work like that. If
you see a problem when you attempt to execute a certain command, there is no
simple and direct way of knowing which part of the code could have the problem.
In some sense it could be almost anywhere. And so the detective problem of
hunting down the source of the problem is enormously harder than in physical
media because the digital media don't obey the same simplifying law of
proximity of cause and effect.
I see both TDD and functional programming as ways to change that.

For TDD, it's quite obvious -- in fact, it's one of the benefits lauded by proponents of the practice. If some bug shows up in the tests, it must be related to the change you just made. Since changes are supposed to be small if you are doing it right, the code path you must follow to find the bug is quite restricted.
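A minimal sketch of that loop in Python (the function and test names here are made up for illustration, not taken from the post): write a small failing test, write just enough code to pass it, and repeat. If the test breaks later, the suspect code is whatever changed since the last green run.

```python
# Green step: just enough code to make the test below pass.
def slugify(title):
    return title.lower().replace(" ", "-")

# Red step (written first): a small, focused test. When it fails
# after a change, the bug must live in the few lines just touched.
def test_slugify_replaces_spaces():
    assert slugify("Hello World") == "hello-world"

test_slugify_replaces_spaces()
```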

The same holds for functional programming, with the advantage of also applying to bugs that did not show up in the tests. A functional program is supposed to be a composition of functions, and each function should be referentially transparent: side-effect free, so that its output relies solely on its input, and pure, so that it always returns the same output for a given input. So, once you can reproduce the problem -- once you have an input that causes the bug -- you can walk down the functions that handle it and check whether each one returns the correct output for the input it receives.

Add both together, and software starts looking a lot more like building an airplane or a bridge. In this respect, anyway.

When I left work for the weekend on Friday, I had just started feeding input from another module of the system we are building, as my own module finally got to the point where one could start doing integration tests. There is a bug: some data I expected to be produced isn't showing up.

As it happens, however, the code that looks at that data is functional and was developed using TDD. If it had gotten the input I expected, it would have returned the output I wanted. That means that when I arrive at work tomorrow, I'll look at the input being provided, knowing it will not be the same as the input the system was tested with. I'll see what's different, and, knowing that, I'll be able to tell exactly which piece of code should have handled it, and find out why it isn't doing what it's supposed to do.

It really beats the alternative.

17 comments:

  1. You mention TDD, but it's not clear how writing tests first helps you achieve the goal. I fully agree on the utility of having a strong test suite. But writing tests first is the basic tenet of TDD, I guess.

  2. And what happens when the bug is not caught by the test, like most bugs seem to be? Oh.

    I think I understand why people think that functional programming makes debugging and finding bugs easier. In the normal world, programs are huge and written by large teams of average, below-average, and above-average programmers, and they are generally not TDDed at all. The very fact of being functional usually means that the system is at "toy" scale and doesn't solve any kind of real-world problem, and it usually means that the programmers involved are the best in their field, because it is an obscure practice.

    If you put these things together, of course bugs are easier to fix! This illusion would rapidly vanish if functional programming went mainstream.

    All the things you just said were believed and repeatedly stated by lovers of OOP in the 80s and early 90s -- for the same reasons. When OOP went mainstream, it was rapidly found to be oversold too.

    1. The problem is not that functional programming is being oversold; it's the programmers that are oversold. Programming is done badly today because most programmers don't even want to try thinking "better" and coding by following strict rules, as functional programming would require them to. Overall, it was the same problem for OOP. OOP is still the wonderful solution it was said to be in the 80s; it's just that bringing it mainstream attracted to programming a lot of people who are not qualified enough to use OOP well (plus, the OOP languages that went mainstream were inherently flawed). Programming should be an elite activity, because programming well is truly difficult. Programming itself is too mainstream, and that is why it is so difficult to turn good practices into the commonly accepted way to do things.

  3. @Debasish TDD serves multiple purposes. First, it dramatically decreases the footprint of code that has to be read to find the cause of a bug when the bug is found by the tests. Second, it helps create a strong test suite. The main problem, as I see it, with not using TDD to create a test suite is that you'll usually either test to the specification, in which case you'll usually leave code untested, or test to the code, in which case you aren't really testing anything, just increasing test coverage.

    The only way I see to avoid that when you test after the fact is by pruning any code that is not being tested (aside from trivial stuff like getters and setters). Maybe there are other techniques, but I don't know them.

    As a practical matter, I never saw a good comprehensive test suite that was not produced by TDD.

    @CopPorn In answer to your first question, that's where the functional programming comes in -- as I clearly said in the post. The rest of what you said are just fallacies, and, thus, not worth responding to. Here's a list of the ones I identified:

    http://en.wikipedia.org/wiki/Fallacy#Fallacy_of_accident_or_sweeping_generalization
    http://en.wikipedia.org/wiki/Fallacy#Irrelevant_conclusion
    http://en.wikipedia.org/wiki/Fallacy#Affirming_the_consequent.2FDenying_the_antecedent
    http://en.wikipedia.org/wiki/Fallacy#Fallacy_of_false_cause
    http://en.wikipedia.org/wiki/Fallacy#Straw_man

    In the future, try to resort to verbal fallacies -- they are much harder to detect.

    @Alex In a sense, pure, strictly typed functional languages reduce the problem by raising the bar, even more so the complete ones. IMHO, that's the main reason why the *second* programming paradigm in the world (Lisp was created before COBOL) has never reached the mainstream. Note the "O" in "IMHO" -- that's my opinion, not anything I'm putting forward as fact (as opposed to the main content of the post).

    1. TDD also ensures that you build meaningful APIs and helps avoid anti-patterns such as anaemic domain models.

  4. Whenever I have a tricky bug, I try to think of a test I could have written that would have trapped it. Invariably, the bugs that really matter can't be tested for: a memory leak or memory spike under certain conditions on an embedded device, or a user interaction problem that occurs only in certain scenarios. Or unit tests mock out the part of the system most likely to fail: the test runs fine against the mock database or the mock web service, but problems occur when run against the real system.

    The kinds of bugs that unit tests could catch are, I find, better fixed by thinking about your code more carefully and using better abstractions (like those FP provides). The tricky ones will only be outed through relentless, manual testing.

    FP is good; TDD is quicksand.

  5. @anonymous Do you think that neither specific scenarios nor integration tests can be TDDed? Both of these things are done on my team on a regular basis.

    Automated tests can be written for catching memory leaks too, but yes, they require more work and are probably not going to come from TDD. Not sure I see the importance of that observation. TDD is just a tool, a means to an end.

  6. James, TDD is a very expensive, time-consuming tool. You spend twice the time, write twice the code, and have to make twice as many changes when you need to redesign your system. It's quicksand.

    We're programmers, so we like the idea that we can just write a program to test our program. But all nontrivial bugs are too complicated to catch with tests, so in practice they usually aren't.

    The only way you know your program works is through black-box testing. Unit tests, at best, have you fix situations that are vanishingly unlikely to come up in your actual production environment; at worst, they slow you down, make changes more complicated and still leave you vulnerable to the big bugs, which your test suite is unlikely to catch.

  7. @anonymous You are lacking backing for your claims.

    First off, there's no indication that TDD is that time-expensive -- I don't recall any study putting it at over 30% extra time, and there have been plenty of studies that actually show an increase in productivity. Simply put, the time lost to writing tests is gained back by finding the bugs faster (as I mentioned in my post -- in fact, that's what the post is about). There's as yet no clear indication of how productivity is affected overall, but there is absolutely no support for the 100% extra time claim -- neither from studies nor from simple logic.

    Then you introduce a "trivial bug" concept and claim all bugs caught by TDD are trivial. As a measure of external quality, that is outright contradicted by studies: there's a strong correlation between TDD and external quality.

    Then there's the matter of internal quality -- the superior design of programs written with TDD, resulting from the low coupling required by tests and the reduced disincentive to refactor due to good code coverage. While I personally believe in that from my own experience, the studies are not clear in this respect.

    See chapter 12 of Making Software for references to these studies (32 studies, about half of which are industry studies -- the "real world" you speak of).

    1. @Daniel This is the experience that I and many of my colleagues have had. Glad to look into the studies you cite, but I'll have to get back to you on that.

      For me, TDD is the height of cargo-cult programming. Again, programmers love the idea that we can program our testing, and lots of famous people advocate it, so it's popular.

      But TDD means every new feature, every bugfix, every redesign requires maintaining tests. It increases your maintenance burden and forces you to write code in a warped, step-by-step process where, at the end of a few hours, you've written tests to exercise every line, when a simple mental walkthrough and code review would've done the same thing with less wasted time and more opportunity for improvement. (Other programmers are much better than the test suite you write yourself at telling you where things can go wrong. After all, your test suite merely replicates your assumptions, and it's the things you've neglected to think about that will get you.)

      I've also found that TDD-based programs have the opposite of high internal quality. If you want to do tests first and get good code coverage, some simple, elegant designs (like making a class a private, nested class) don't work. You have to break open your encapsulations if you want full coverage.

      I'm not "introducing" the concept of a trivial bug. I gave several examples of real-life problems: invalid sequencing in response to UI events, memory leaks, deadlocks, etc. These cannot be easily caught by unit tests. TDD catches obvious problems, which will come up in your code reviews and manual testing anyway.

      The fundamental idea of TDD is that you can avoid making bugs by writing code that will show you that the bugs you've thought of are unlikely to happen. That's stupid. The bugs that are going to take time for you are the ones you didn't anticipate. By all means, put invariant checks in your code, but don't think that codifying them into a test suite buys you a lot. Real problems are caught only through high-level manual testing, code review, deep thought and better abstractions.

      I've done a lot of TDD in my career, and the projects that got completed and worked well were the ones where we eschewed TDD and instead instituted code review and did a lot of manual testing. The projects where I've tried TDD either never got finished or had to abandon TDD to make any progress. I'm positive that you can finish TDD projects, but only if you're willing to sacrifice a lot of time.

      My alternative to TDD is going to sound trite, but it's just to think more carefully about your code. Walk through it relentlessly, have others review it, and use tried-and-true abstractions. I agree with you on FP here: the ability to use a map or a fold instead of spreading that logic through a clumsy for loop helps rule out problems. And of course, static typing is great: it lets you enforce invariants in a systematic way that doesn't count on manual labor or rely on you to obsessively check them throughout your codebase.

    2. I realize my view is unpopular, but Wil Shipley, at least, agrees with me: http://wilshipley.com/blog/2005/09/unit-testing-is-teh-suck-urr.html

  8. @anonymous

    I'm sorry but this is just a strawman: "The fundamental idea of TDD is that you can avoid making bugs by writing code that will show you that the bugs you've thought of are unlikely to happen."

    TDD is *not* first about avoiding bugs. It's about feedback and design.

    You say design is compromised by TDD, and you use the notion of 100% coverage and breaking encapsulation as an example. 100% coverage is neither the point nor considered a good thing. If you need a private inner class, then use one. Why test it directly? Why not test it through the external class?
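    One way to read that point, sketched in Python with made-up class names: the nested helper stays private, and the tests exercise it only through the public interface, so coverage never requires breaking encapsulation.

```python
class PriceList:
    class _Entry:                      # private helper, never tested directly
        def __init__(self, name, cents):
            self.name = name
            self.cents = cents

    def __init__(self):
        self._entries = []

    def add(self, name, cents):
        self._entries.append(self._Entry(name, cents))

    def total_cents(self):
        return sum(e.cents for e in self._entries)

# The tests drive _Entry indirectly, through the public API:
pl = PriceList()
pl.add("tea", 150)
pl.add("mug", 700)
assert pl.total_cents() == 850
```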

    A sequence of UI interactions causing a problem that you couldn't think of in your tests? Maybe you're carrying around too much state. TDD isn't what's making your code complex and stateful.

    Regarding experience and failed projects using TDD: I'm not sure what to say there. It sounds like the tool wasn't helping you get the job done, and not using it may have been the right answer in those cases. On every project (three) I've been on in the past three years we've used TDD, and we've delivered working code into production multiple times. Obviously that's not proof that it's more effective than not doing TDD, but it should serve as a counter to the implication that it just doesn't work.

    Lastly, I'm in 100% agreement with thinking more carefully about your code. The moment someone thinks TDD means you think less about the code you're writing or changing, they've lost. TDD is not about thinking less; it's about faster feedback.

  9. I definitely believe TDD carries much value, though I would stake that it may be effective for only 80% of the problems we run into. There are nefarious situations that don't lend themselves to automated testing, but that is no reason to dismiss all of TDD as trivial.

    I took over an app that had been worked on by six different people over a couple of years. It was non-functional and unusable. I vowed to fix something only if I could first automate a test for it. My first test case involved making the app digest a spreadsheet and then inspecting the database tables to confirm the results. Hardly "unit test" grade, and quite slow, but it gave me something repeatable and confirmable that I could run over and over.

    After six months of work with that painful process, I started to get comfortable enough with the codebase that I could throw away chunks of frivolous anti-patterns, thereby decreasing coupling while increasing cohesion.

    Since the Swing UI was not practical to auto-test, I started to lighten it by migrating the gobs of business logic coded into buttons into a service layer. The service layer was testable, allowing me to make the UI purely a wiring of buttons, behavior, and other things that didn't need such automation.

    I reached a point where I started deleting entire classes of functionality that had been written but were not used, not practical, and downright in my way. My total code count was dropping, yet this app was being used by a team of finance people to actually do their jobs. When I presented it to program management, they were stunned that my demo worked 100%. I reached a point where, if someone reported a bug, I was able to write a test that exposed it, fix the code, and release the patch the next day. The users really liked that.

    I credit being willing to make everything testable as the reason this kind of positive feedback was possible. In fact, one time I didn't run the test suite for two days. When I did, half of the tests failed. I had made too many changes and couldn't spot what broke everything. I threw away two days of work, started fresh the next day, and had the problem fixed a day later. Something that would have taken me 2-3 weeks without such a test suite took 3 days.

    I know this is highly anecdotal, but that was some of the happiest coding I ever did. Not just because I was writing software and test software, but because I handed my users a tool they could use effectively. We met weekly, and they always had new feature requests. I cranked out releases at least once a week, and felt like I was really doing what I had signed up for.

    Yours in software development, GLT

  10. One day I want to see people complaining about actual problems with TDD, like duplicating tests in multiple files, writing tests for stuff that won't break, or not ruthlessly refactoring based on test smells.

    Testing will never catch a bug as you develop software. It will help catch regressions from refactoring, but the primary reason to do it is to speed up your feedback cycle. It slows you down for the first hour or so of a feature, but after that you are gaining time until you finish it. Saying that TDD is expensive is an exceptionally naive view of things; it is only expensive if you are writing a ton of code without any sort of feedback, and that is usually a pretty terrible idea.

    Its secondary purpose is to help with internal code quality. You can't really tell whether something is a pain to use until you use it. Tests tell you how good your APIs are and how flexible your design is. This is pretty damn valuable information.

    Its third purpose is to give you confidence in changing your code. With a full test suite, you can change the implementation of a given thing and immediately see the impact it has on the rest of the class. And by immediately, I mean a fraction of a millisecond. This gives a huge amount of confidence, and it means you don't have parts of your app that people are terrified to touch because they might break something.
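    A sketch of that confidence, with illustrative names: swap the implementation behind the same contract, and the same assertions tell you instantly whether anything observable changed.

```python
def total_old(xs):
    # original hand-rolled implementation
    s = 0
    for x in xs:
        s += x
    return s

def total_new(xs):
    # changed implementation, same contract
    return sum(xs)

# the "suite": identical assertions run against either implementation
for total in (total_old, total_new):
    assert total([]) == 0
    assert total([1, 2, 3]) == 6
```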

    There are a ton of problems and difficulties with TDD. But you never hear about any of them from the detractors, only things that highlight how poorly they understand the practice.
