I've learned. I'll share.

October 22, 2009

Introducing pyrec: The cure to the bane of __init__

I finally discovered how cool GitHub is, and have started putting some code up there. My first entry is Record.py. I'm calling it "the cure to the bane of __init__" because

  1. Mutable data structures are a bane to concurrent (multi-threaded) code.
  2. Writing self.foo = foo, self.bar = bar, etc., is a huge waste of time.
  3. When you have lots of data structures in memory, tuple-based data uses 1/4 the memory of class-based data.
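A rough way to check the memory point yourself is to compare shallow sizes. This is only a sketch: it uses the standard library's namedtuple as the tuple-based stand-in (not pyrec itself), PointClass and PointTuple are made-up names, and the exact ratio depends on your Python version and how many fields you have.

```python
import sys
from collections import namedtuple


class PointClass(object):
    """Ordinary mutable class: every instance carries its own __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# Tuple-based equivalent; instances have no per-instance __dict__.
PointTuple = namedtuple("PointTuple", "x y")


def shallow_size(obj):
    """Size of the object plus its instance dict, if it has one."""
    size = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        size += sys.getsizeof(obj.__dict__)
    return size


class_size = shallow_size(PointClass(1, 2))
tuple_size = shallow_size(PointTuple(1, 2))
print("class-based: %d bytes, tuple-based: %d bytes" % (class_size, tuple_size))
```

The field values themselves are not counted, since both versions hold the same ones.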
As you can probably tell from my blog, I've experimented with a lot of wacky programming ideas and bending Python in ways it was never intended. But if there's one experiment that's been a success, it's Record.py. I use it for almost all of my classes. It's just so easy to use. I'm calling it "pyrec" because it's easier to write, google, etc.

So, go to the github repo or use it by following the really easy steps:

  1. Download Record.py from http://github.com/pthatcher/pyrec/blob/master/Record.py
  2. Put from Record import Record at the top of your code.
  3. Make a class by saying something like class Person(Record("name", "age"))
  4. Never write __init__ again (unless you want mutability).
Enjoy!
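If you're wondering how a class factory like this can work at all, here is a minimal sketch of the idea. To be clear, this is not the code in Record.py: make_record is a hypothetical stand-in I'm using for illustration, and the real Record has more to it (setField, alter, and so on).

```python
from operator import itemgetter


def make_record(*field_names):
    """Hypothetical stand-in for Record: builds an immutable tuple-based base class."""

    class _Base(tuple):
        __slots__ = ()
        _fields = field_names

        def __new__(cls, *values):
            if len(values) != len(field_names):
                raise TypeError(
                    "expected %d values, got %d" % (len(field_names), len(values))
                )
            return tuple.__new__(cls, values)

        def __repr__(self):
            pairs = ", ".join("%s=%r" % pair for pair in zip(field_names, self))
            return "%s(%s)" % (type(self).__name__, pairs)

    # One read-only property per field, backed by the tuple position.
    for index, name in enumerate(field_names):
        setattr(_Base, name, property(itemgetter(index)))
    return _Base


class Person(make_record("name", "age")):
    def greeting(self):
        return "Hi, I'm %s" % self.name


p = Person("Ada", 36)
print(p, p.greeting())
```

No __init__ anywhere, and assigning to p.name raises AttributeError because the properties have no setter.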

Update: A commenter (thanks Dan!) pointed out that this is a lot like namedtuple, added in Python 2.6, and asked why anyone should use this instead. Well, I have to admit that I probably would never have created pyrec if namedtuple had existed three years ago. I try to avoid NIH syndrome. But it didn't exist, so I wrote pyrec, and after using it for three years I have some experience with the little things that make a big difference (to me). Here are a few advantages pyrec has over namedtuple:

  1. It has a nicer interface. I prefer new(val1, val2) to _make([val1, val2]), alter to _replace, and class Person(Record("name", "age")) to Person = namedtuple("Person", "name, age")
  2. I added the setField methods. That's what I use 90% of the time. Only about 10% of the time do I use alter. setField is a lot more convenient.
  3. With pyrec, you can safely override __iter__ and __getitem__. For example, in Record.py, you'll see the implementation of a LinkedList. I tried doing that with namedtuple, but the overridden __getitem__ clobbers the name lookup and the overridden __iter__ clobbers the tuple unpacking.
  4. You can use tuple.__iter__(rec) to get around the latter, but pyrec's .values is a lot nicer.
  5. pyrec has .namedValues for ordered (field, value) pairs, unlike _asdict() which throws out the order. For many things I use pyrec for, this matters.
  6. You can improve it! Have you looked at the code for namedtuple? Ugly. This is pretty clean, so you can improve it very easily if you need additional functionality that will work with all of your records.
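For reference, here is the namedtuple side of that comparison, using only calls from the standard library. I'm not showing the pyrec equivalents (new, alter, .namedValues) because this is just a sketch of what you'd be comparing against; note that namedtuple's copy-with-changes method is spelled _replace, and the names alice and older are only for illustration.

```python
from collections import namedtuple

Person = namedtuple("Person", "name, age")

# Build from a sequence of values.
alice = Person._make(["Alice", 30])

# Copy with one field changed; the original is untouched.
older = alice._replace(age=31)

# Ordered (field, value) pairs, recovered by hand from _fields.
pairs = list(zip(Person._fields, older))

print(alice, older, pairs)
```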
If you don't care about those things, use namedtuple. It's still way better than mutable classes. But having used pyrec for three years, these little things matter to me, so I'm still going to use pyrec. If you want most of both worlds, I added NamedTuple to pyrec: a subclass of namedtuple that adds most of the pyrec goodness (everything but safe __getitem__ overloading). Thanks for the update, Dan.
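About that __getitem__ caveat: the sketch below shows the mechanism as I understand it. It assumes fields are exposed as property(itemgetter(i)), which is how I believe the 2.6-era namedtuple did it; other versions may implement field access differently, so treat this as an illustration rather than a statement about every namedtuple. ItemgetterPair and Clobbered are made-up names.

```python
from operator import itemgetter


class ItemgetterPair(tuple):
    """Hand-rolled pair whose named fields go through obj[index]."""

    __slots__ = ()
    first = property(itemgetter(0))
    second = property(itemgetter(1))


class Clobbered(ItemgetterPair):
    def __getitem__(self, index):
        # A custom lookup, as a linked list or similar container might define.
        return "custom item %r" % (index,)


pair = Clobbered((1, 2))

# Name lookup is routed through the overridden __getitem__.
print(pair.first)

# The stored value is still there if you bypass the override.
print(tuple.__getitem__(pair, 0))
```

Once the field properties depend on __getitem__, any subclass that redefines it changes what the field names return.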
