agile

Test so you have less code to test

by rbp posted at 2007-05-16 02:28 last modified 2008-01-23 18:27

More than one year ago, Danilo Sato and I started a pet project to practice TDD. We haven't made all that much progress so far, codewise (over one year! So much for "agile"...), but it's been a great learning exercise. It's a Python test-driven implementation of the Stratego game, and Danilo has documented our early programming sessions on his blog (in Portuguese; I'm waiting for him to catch up before I start bitching for an English translation, but I'll post stuff about it here as well). We've recently resumed work on it, and updates should show up both here and at Danilo's.

Anyway, I have since developed a taste for testing (a "teste", you could say) and, when I started cleaning up nnebs, I found myself writing tests for everything as I went through the code (no, it had no tests before; yes, it was my graduation project; anyone can have a Comp. Sci. B.Sc. these days...). Doctests, even, but more on that later.

One thing that struck me as interesting, in this process, came up while I was writing tests for some state-related classes. Nnebs is basically a spam filter, and I had decided its classifiers (as well as a few other classes) should be stateful, that is, they should be able to easily represent a particular configuration and switch back to it when requested. Since a few different classes would be stateful, I created a Stateful class from which classes wishing to be stateful should inherit. So far, so good. All these classes needed to do was inherit from Stateful and define which attributes would make up their states.
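For the curious, here is a rough sketch of what such a base class could look like. This is not the actual nnebs code: the names state_attributes, get_state and set_state, as well as the toy Classifier, are made up for illustration.

```python
class Stateful:
    """Hypothetical mixin: subclasses list the attributes forming their state."""

    # Assumed name; subclasses override this with their own attribute names.
    state_attributes = ()

    def get_state(self):
        """Return a snapshot of the attributes that make up the state."""
        return {name: getattr(self, name) for name in self.state_attributes}

    def set_state(self, state):
        """Switch back to a previously captured snapshot."""
        for name in self.state_attributes:
            setattr(self, name, state[name])


class Classifier(Stateful):
    """Toy classifier, only here to exercise the mixin."""

    state_attributes = ('threshold', 'trained_messages')

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.trained_messages = 0


classifier = Classifier()
saved = classifier.get_state()      # remember the initial configuration
classifier.threshold = 0.9
classifier.trained_messages = 42
classifier.set_state(saved)         # and switch back to it
```

Note that the snapshot here is shallow, so mutable attribute values would be shared between the object and its saved state; a real implementation would probably need to copy them.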

To describe a state itself, I created another class (aptly named "State") to hold attributes and their default values. It defined methods to add, delete, set and query attributes. It also defined that two state objects would be equal if their attributes and values were the same.
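A hand-rolled class along those lines might look roughly like the sketch below. It is a reconstruction from the description above, not the original code, so the method names (add, delete, set, get) and their exact behaviour are assumptions.

```python
class State:
    """Hand-rolled attribute container (a guess at the general shape)."""

    def __init__(self, **attributes):
        # Attribute names mapped to their (default) values.
        self._attributes = dict(attributes)

    def add(self, name, default=None):
        """Register a new attribute with its default value."""
        self._attributes[name] = default

    def delete(self, name):
        """Remove an attribute from the state."""
        del self._attributes[name]

    def set(self, name, value):
        """Change the value of an already registered attribute."""
        if name not in self._attributes:
            raise KeyError(name)
        self._attributes[name] = value

    def get(self, name):
        """Return the current value of an attribute."""
        return self._attributes[name]

    def __eq__(self, other):
        """Two states are equal if their attributes and values match."""
        if not isinstance(other, State):
            return NotImplemented
        return self._attributes == other._attributes
```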

The State class was about 30 lines of code and worked quite well. However, while writing its tests I noticed they seemed a bit too trivial, which struck me as odd. I was having to write tests like

>>> State(attr1=val1, attr2=val2) == State(attr1=val1, attr2=val2)
True
>>> State(attr1=val1).get('attr1')
val1

That just doesn't seem right :)

I figured I could delegate some of State's behaviour by making it a subclass of "dict". Great, there goes my "get" implementation. And my equality implementation. And my __delitem__ implementation. And my initialization implementation. And, well, I assume you can take it from here.

So State is now simply a dict. Running the tests confirmed that everything worked as before. My code is 30 lines of bug-food (a.k.a. "code") shorter. And I'm a lot happier with nnebs and more confident in its code.
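In sketch form, the dict-based version boils down to something like this. The attribute names and values are placeholders, and the real class may well carry a bit more than a docstring.

```python
class State(dict):
    """A state is just a mapping from attribute names to their values."""


state = State(attr1='val1', attr2='val2')

# Initialization, querying and equality all come for free from dict.
same = State(attr1='val1', attr2='val2')
value = state.get('attr1')
are_equal = state == same

# So do adding, setting and deleting attributes.
state['attr3'] = 'val3'
del state['attr2']
```

Everything exercised here is behaviour that dict already provides, which is exactly why the tests felt trivial.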


What's up, doctest?

by rbp posted at 2007-05-21 03:33 last modified 2008-01-23 18:26

I've been refactoring nnebs for a few days, now. Bugs found, some code rewritten, some mercilessly thrown away. But, more importantly, I've been adding doctests as I go.

This is my first real-world go at doctests. One thing I thought I'd miss was fixtures, but I found out I don't (not much, anyway). "Setting up" usually means creating an object, sometimes a few auxiliary variables, rarely anything more complex than that. "Tearing down" nearly always means, well, closing the docstring :). I'd use fixtures if I had them, but their absence doesn't bother me at this point.

I am a bit overwhelmed by the doctest-to-code ratio, though. Two-line methods often get a 10-line doctest. I am being very thorough and I know some of my tests could be more compact, but simulating all usage possibilities helps me rethink the code (especially after not touching it for almost a year). I get a clear view of how the code should work, both in words and in execution. On the other hand, when reading back the now-tested code I feel all those docstrings tend to clutter things up a bit.

Here's a silly, over-simplified, quasi-fictional example, similar to situations I've been facing these days. Say I have an object representing a word from an email message (remember, nnebs is a spam filter). This object stores the word itself and the number of times this word showed up in spam and nonspam messages. Now I want to define the addition of two such objects as a third object representing the same word, with added occurrence counts.

    def __add__(self, other):
        """Returns self + other.
        This sum is a Word object representing the same value
        as self and other, but containing the sum of the
        occurrences of both.

        >>> w1 = Word('semprini', spam=2, nonspam=5)
        >>> w2 = Word('semprini', spam=1, nonspam=10)
        >>> w1 + w2 == Word('semprini', spam=3, nonspam=15)
        True

        Words with different values cannot be added:

        >>> Word('semprini') + Word('dinsdale')
        Traceback (most recent call last):
            ...
        ValueError: Cannot sum distinct Words

        """
        if self.value != other.value:
            raise ValueError('Cannot sum distinct Words')
        return Word(self.value,
                    spam=self.spam+other.spam,
                    nonspam=self.nonspam+other.nonspam)

I like this docstring: it's very explicit about what the operation should (and should not) do. But it's also three times bigger than the code itself (more, if you notice I wrapped the last line to better fit this blog's layout), and after half a dozen of these there are basically only docstrings on my screen.

So they will almost surely be moved to a separate file. The doctest module integrates nicely with the unittest one, so a next step might be to follow Leo Rochael's suggestion and mix both. I considered starting the doctests file right away, but I find it helpful to write the examples while looking at the code (remember, this is not test-driven, I already had the code and didn't quite remember what all of it did). Of course, I must commit to being honest with myself and not adjusting the tests to the code, but I started this with refactoring in mind anyway, so more often than not I use the code to figure out what I originally expected it to do, write tests and then fix what's broken or unnecessary. It's always trivial to cut and paste all docstrings into a new file later.
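As a rough idea of what the separate-file approach could look like, the sketch below moves the Word examples into a text file and runs them as a unittest suite through doctest.DocFileSuite. The Word class is a minimal stand-in written for this example (its constructor and __eq__ are assumptions), and the file is created in a temporary directory only so the snippet is self-contained; in practice it would simply live next to the code.

```python
import doctest
import io
import os
import tempfile
import unittest


class Word:
    """Minimal stand-in for the Word class (assumed, not the nnebs one)."""

    def __init__(self, value, spam=0, nonspam=0):
        self.value = value
        self.spam = spam
        self.nonspam = nonspam

    def __eq__(self, other):
        if not isinstance(other, Word):
            return NotImplemented
        return (self.value, self.spam, self.nonspam) == (
            other.value, other.spam, other.nonspam)

    def __add__(self, other):
        if self.value != other.value:
            raise ValueError('Cannot sum distinct Words')
        return Word(self.value,
                    spam=self.spam + other.spam,
                    nonspam=self.nonspam + other.nonspam)


# The examples that used to live in the docstring, now as plain text.
EXAMPLES = """\
Adding two Words sums their occurrence counts:

>>> w1 = Word('semprini', spam=2, nonspam=5)
>>> w2 = Word('semprini', spam=1, nonspam=10)
>>> w1 + w2 == Word('semprini', spam=3, nonspam=15)
True

Words with different values cannot be added:

>>> Word('semprini') + Word('dinsdale')
Traceback (most recent call last):
    ...
ValueError: Cannot sum distinct Words
"""


def run_examples():
    """Write EXAMPLES to a text file and run it as a unittest suite."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'word_examples.txt')
        with open(path, 'w') as handle:
            handle.write(EXAMPLES)
        # globs makes Word visible to the examples in the text file.
        suite = doctest.DocFileSuite(path, module_relative=False,
                                     globs={'Word': Word})
        runner = unittest.TextTestRunner(stream=io.StringIO())
        return runner.run(suite)


result = run_examples()
```

The whole file counts as a single unittest test case, so a failure report points at the file rather than at an individual method. That is one trade-off to weigh against keeping the examples next to the code.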

But probably the best of both worlds would be to view the doctests as either part of the code or as one single extended interpreter session. That is, docstring folding ("go away, I want to look at the code!") and a docstring-only view ("go away, code!"). I wonder if (how) xemacs can do that...
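I don't have an editor answer yet, but the two views can at least be approximated outside the editor with the standard ast module. The sketch below is only that, a sketch: the function names code_only and docstrings_only are made up, ast.unparse needs Python 3.9 or later, and unparsing drops comments and the original formatting, so this is for reading rather than for editing.

```python
import ast

# A tiny sample module to try both views on (made up for this example).
SOURCE = '''
def total(spam, nonspam):
    """Return the total number of occurrences.

    >>> total(2, 5)
    7
    """
    return spam + nonspam
'''

DOCUMENTED = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def docstrings_only(source):
    """Return (name, docstring) pairs for every documented def or class."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DOCUMENTED[1:]):
            doc = ast.get_docstring(node)
            if doc is not None:
                found.append((node.name, doc))
    return found


def code_only(source):
    """Return the source with all docstrings removed ("go away, docs!")."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, DOCUMENTED) or not node.body:
            continue
        first = node.body[0]
        is_docstring = (isinstance(first, ast.Expr)
                        and isinstance(first.value, ast.Constant)
                        and isinstance(first.value.value, str))
        if is_docstring:
            # Keep the body non-empty so the result is still valid Python.
            node.body = node.body[1:] or [ast.Pass()]
    return ast.unparse(tree)


stripped = code_only(SOURCE)
docs = docstrings_only(SOURCE)
```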

