Programming Apprenticeships

This morning I read Reading Challenging Books With Kids is Fun and Probably Useful by Henrik Karlsson at Escaping Flatland. When Karlsson says,

the reason learning reading comprehension is hard, and why most never learn to do it in the true sense of the word, is that most of what we do when we read is hidden in our heads. Unlike a kid learning to cook by hanging out in the kitchen, a novice reader can’t figure out what to do by looking at Dad reading an essay with his face pressed against his phone.

[…]

Collins et al, who worked on cognitive apprenticeships, suggested that the reason so few can read is because they have never seen anyone do it. They have seen books. They have seen words. They have seen grave faces looking at words. But they have never seen the questions and strategies that are playing out behind those grave faces.

I was reminded of programming. (Sure, reading is even less physical than programming, which after all comes down to manipulating symbols that exist outside the programmer's head, rather than just looking at them. One could argue that it is possible to program without manipulating symbols, but that disagreement is a separate article.) So half an hour later, when my son (age four) sat down next to me as I was trying to refactor code, I started describing to him what was happening in my head, and it got interesting!

Before refactoring, we wanted to write a test to verify that we didn't accidentally change the existing behaviour of the method we were refactoring. That was easy to explain. (I'm not saying his four-year-old brain understood all of it, only that I was able to explain it.) It should also be said that we were working on the kind of legacy code where nobody really knows what it does, and nobody wants to pay anyone to find out either, so we were trying to get by with reading as little of the code in the underlying system as possible.

The method to be refactored was a query-type method, and it took an id number as a parameter. Thus, we needed to pass the id number of an existing object. We had a list of those in the test data import file, and I asked my son to pick one of the numbers, and he did. Fair enough. Then we created a test that asserted the result was not null. Also easy to explain.
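This was, in effect, a characterisation test. As a sketch, assuming JUnit and a hypothetical LegacyQueries.findDetails method standing in for the real one (the actual names are beside the point), it might look something like this:

    import org.junit.jupiter.api.Test;

    import static org.junit.jupiter.api.Assertions.assertNotNull;

    class FindDetailsCharacterisationTest {
        // Stand-in for the id number my son picked from the test data import file.
        private static final long EXISTING_ID = 4711L;

        @Test
        void existingIdGivesSomeResult() {
            // We don't yet know what the method should return; we only pin
            // down that it returns *something* for an id that exists.
            var result = LegacyQueries.findDetails(EXISTING_ID);
            assertNotNull(result);
        }
    }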

Then I set a breakpoint on the assertion of that test and ran it under the debugger to look at what sort of data the result was. The goal was easy to explain: we want to create a test with a parameter that generates a result that is good for testing. But what the heck constitutes a result that is good for testing? I can recognise it when I see it, but I had trouble explaining it. (I tried "it contains many details", but I guess the concept I was aiming for was some sort of conditional entropy. Not entirely sure, to be honest.)
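A breakpoint was the tool at hand, but the same inspection can be sketched without a debugger by temporarily printing the result; this hypothetical test method would sit next to the one above:

    @Test
    void whatDoesTheResultLookLike() {
        var result = LegacyQueries.findDetails(EXISTING_ID);
        // Temporary scaffolding: eyeball the output to judge whether this id
        // produces data rich enough to be worth asserting on.
        System.out.println(result);
        assertNotNull(result);
    }

Once we know what a good result looks like, the println would be replaced by concrete assertions on that result.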

It turned out that the id number my son selected resulted in an empty (but non-null) result set. We tried another, and it was also empty. At that point we had to dig into what the method was actually doing, to find an id number we could be sure would give a better result. That part was easy to explain, but why did I try two randomly selected id numbers before going deeper? Why not try five different numbers? Or give up and go deeper at the first failure? (I still don't know, but I do think it's an interesting question to try to answer!)
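Written out as code, the strategy we stumbled into might look roughly like the sketch below. The candidate ids and the isEmpty check on the result are assumptions; the shape of the search (probe a couple of ids, then fall back to reading the implementation) is the point:

    import java.util.List;

    class ProbeForGoodTestData {
        public static void main(String[] args) {
            // Ids picked more or less at random from the test data import file.
            List<Long> candidates = List.of(4711L, 1234L);

            for (long id : candidates) {
                var result = LegacyQueries.findDetails(id);
                // Non-null but empty results are no good for testing.
                if (result != null && !result.isEmpty()) {
                    System.out.println("Use id " + id + " in the test.");
                    return;
                }
            }
            // Two misses: time to read the method and work out which ids
            // it actually has interesting data for.
            System.out.println("No luck; dig into the implementation.");
        }
    }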

Vaguely recalling Accelerated Expertise, maybe this is actually the way to train new programmers? It gets them soaked in the full complexity of skilled software engineering right from the start, instead of letting them carry around wrong and dangerous simplifications that take a long time to correct.

Shadowing experienced programmers would also take the pressure off beginners to somehow be productive. They wouldn't have to commit anything; their job would be to sit around and ask stupid questions.