https://www.facebook.com/notes/kent-beck/when-tdd-doesnt-matter/797644973601702
A group of students apologized for "not using TDD" today. It was like
they were apologizing to their dentist for not flossing--they should do
it but just weren't motivated to take the trouble. In that moment I
realized that TDD is a small part of a large and complicated space. As
long as the students mindfully chose where they wanted to be in the
space, I didn't really care where they ended up.
Here's the
explanation I invented. I use three dimensions to characterize getting
feedback for programming decisions. I think these three are the most
important ones to consider, plus visualizing in three dimensions is
(just barely) possible for me. Here they are:
One
dimension to choose when seeking feedback is how much scope you are
interested in. If you tried to take in absolutely every potential effect
of a programming decision, you would have to analyze its global economic
and social impact. Scope trades off between the tactical utility of
feedback and leverage. Knowing you have a syntax error means you
definitely need to make changes; knowing that a decision might have
adverse effects on society as a whole doesn't help minute-to-minute,
even if it's very important.
The
second dimension characterizing programming feedback is how clear you
want to be about the consequences of your decisions. Sometimes you just
want to wait and see what happens; sometimes you want to predict up
front exactly what you are going to see, and if you see anything
different at all you want to know about it unambiguously. Clarity trades
off between the effort needed to specify expectations and the
information produced when you have actual data to compare to the
expectations.
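As a concrete sketch of the two ends of this axis (the function and numbers below are invented for illustration, not from the post): a log line you read and interpret is the wait-and-see end, while an exact assertion states the expectation up front and flags any deviation unambiguously.

```python
import logging

logging.basicConfig(level=logging.INFO)

# A hypothetical function under observation.
def discount(price, rate):
    """Price after applying a fractional discount rate."""
    return price * (1 - rate)

# Low clarity: wait and see. Cheap to write, but a human has to look
# at the output and decide whether anything is wrong.
logging.info("discounted price: %s", discount(200, 0.5))

# High clarity: the expected value is specified exactly, up front.
# Any difference at all is reported immediately and unambiguously.
assert discount(200, 0.5) == 100.0
```

The effort gradient runs the same direction Beck describes: the assertion costs more to specify, and pays back more information when real behavior arrives to compare against it.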
Scope x Clarity
For
TDD, I chose to get feedback at the next level above the compiler. The
compiler tells you whether it's worthwhile trying to run a program. The
tests (potentially) provide much more feedback about whether a program
is "good". A secondary effect of writing tests is that you can
double-check your programming work. If you get to the same answer two
different ways, you can have a fair amount of confidence that the code
behaves as expected. Feedback with larger scope takes more work to
gather, so test-level feedback is a reasonable tradeoff.
On the
clarity axis, for TDD I chose the full monty--binary feedback, red or
green, good or trash. Not all feedback can be cast in binary terms.
Reduced engagement in one part of a program can be compensated by
increased engagement in another part. Analyzing subtle tradeoffs takes
time, though, and I was looking for feedback that wouldn't require much
effort to analyze, to avoid interrupting programming flow. The
combination of these two choices--complete clarity and test-level
scope--defines what is conventionally called unit and integration tests.
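A minimal sketch of what that combination looks like in practice (the function and test here are hypothetical, not Beck's code): a unit test states an exact expectation up front, runs in well under a second, and yields strictly binary feedback--the assertion either holds (green) or raises (red), with no middle ground to analyze.

```python
# Hypothetical function under test.
def slope(x1, y1, x2, y2):
    """Slope of the line through two points."""
    return (y2 - y1) / (x2 - x1)

# A pytest-style test: complete clarity (an exact expected value)
# at test-level scope (one behavior of one function). The feedback
# is binary: the assertion holds or it raises AssertionError.
def test_slope_is_one_for_diagonal():
    assert slope(0, 0, 2, 2) == 1.0

# Running it directly gives the red/green signal in seconds.
test_slope_is_one_for_diagonal()
```

Because the expectation is computed a second way (by hand, in the test) and checked against the code's answer, the test also delivers the double-checking effect described above.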
The
final dimension along which to choose feedback is frequency.
Fortunately this can be measured in a tidy linear way (even if the effects
of delay are non-linear) from years to seconds (and milliseconds if you
listen to Gary Bernhardt, which I do).
TDD = Frequency:seconds * Scope:tests * Clarity:binary
TDD
is a little box in this big space of feedback, the combination of
expressing binary expectations, expressing them as tests, and receiving
feedback every few seconds about whether those expectations are being
met. This seems to me to be a sweet spot in the feedback space. You
don't get perfect information, but you get pretty good information at a
reasonable cost and quickly enough that you can a) fix problems quickly
and b) learn not to make the same mistake again.
Consequences
Expressing
feedback as a space suggests experiments. What if you relaxed the
frequency dimension? Would the additional scope you could cover provide
enough value to make up for the reduced timeliness? What new mistakes
could you catch if you reduced clarity and increased scope? How
frequently should such feedback be considered? If you increase the
modularity of the system, how much could you increase frequency? How
long would it take for the investment in design to be paid for (if ever)?
The
feedback space illustrates one of the dangers of feedback, putting the
response loop inside the measurement loop. This happens when businesses
manage quarter-to-quarter even though the effects of decisions aren't
known for years. Technically, if you got feedback about server load
every four minutes, you would be nuts to change server configuration
every minute. You're just going to cause oscillation. We can label parts
of the feedback space "here abide dragons".
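A toy simulation can make that dragon concrete (every number below is invented for illustration): a controller that adjusts capacity every minute, driven by a load reading refreshed only every four minutes, keeps repeating corrections against stale data and swings wildly, while the identical controller with fresh data settles onto the target.

```python
# Toy model: actual load = demand / capacity, target load is 50.
# The controller responds every minute; the measurement it responds
# to is refreshed only every `measure_every` minutes.
def simulate(measure_every, minutes=24, gain=0.6):
    demand, target = 100.0, 50.0
    capacity = 1.0
    measured = demand / capacity
    loads = []
    for t in range(minutes):
        if t % measure_every == 0:
            measured = demand / capacity   # fresh reading
        # React to the (possibly stale) measurement every minute.
        capacity *= 1 + gain * (measured - target) / target
        loads.append(demand / capacity)
    return loads

fresh = simulate(measure_every=1)  # response loop matches measurement loop
stale = simulate(measure_every=4)  # response loop inside measurement loop

print("fresh readings settle near the target:", [round(x) for x in fresh[-4:]])
print("stale readings keep oscillating:      ", [round(x) for x in stale[-4:]])
```

With fresh data each correction is proportional to the real error and the loop damps out; with stale data the same correction is applied four times over before new information arrives, so the system overshoots in alternating directions--exactly the oscillation the paragraph warns about.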
Now I can explain
when TDD doesn't matter. When the students understand the shape of
feedback space, when they understand the tradeoffs involved in moving
along each dimension, and when they understand the interaction of the
dimensions, then I don't care if they "do" TDD or not. I'd rather focus
on teaching them the principles than policing whether they are aping one
particular ritual. That's when TDD doesn't matter.