I've learned. I'll share.

March 19, 2008

Why are my monads so slow?

If you've read the last few posts, you know that I choose to use monadic parsing for some application code I am writing. At an abstract level, it turned out beautifully. I was able to write the parser very elgantly and compactly despite the serialization format having some unique requirements. But at a practical level, it failed pretty miserably. It was horribly slow. With a deadline looming, I had to abandon my monadic parser and go back to using an older serialization format. But not content to give up so easily, I decided to figure out why are my monads so slow? I hope that my investigation may be of help to you if you choose to use similar techniques in your code.

First, I tried running a profiler like cProfile. While such profilers normally work quite well, they don't work so well with lots of first-class functions. And since monadic code is full of first-class functions, the profiler didn't give much valuable information.

So, I had to do optimizing the "old fashioned" way. I wrote a benchmark and ran both of my serializers against it. To start, I had two data points: one in traditional imperitave code and one in fully monadic code, and this is what I got:

imperative: 0.465 seconds
monadic:    3.893 seconds
Monadic code was 8x slower! But why? To narrow it down, I added some more data points by implementing the serializer in gradually more monadic ways. For example, I wrote a serializer in a functional/immutable style that passes around a "stream" rather than a string, and then another that passed around "state" in a very monadic way, but withoug using the monad class that I used in Pysec. Then I got this:

imperative:                0.465 seconds
functional with stream:    1.018 seconds
almost monadic with state: 2.450 seconds
monadic:                   3.893 seconds
This gave some answers. One answer it gave was that passing around an immutable "stream" object is twice as expensive as merely passing around a string. Also, passing around a state object on top of that is even more expensive. With these clues, I was able to optimize in certain ways and reduce it to this:

imperative:                0.465 seconds
functional with stream:    1.018 seconds
almost monadic with state: 1.098 seconds
monadic:                   1.283 seconds

That's a nice 3x improvement. What things were I able to do? By far the biggest cost that I was able to eliminate was object creation. I flattened two layers of abstraction into one, and thus split the number of objects created in half. The second thing I did was I created a "read until" operation that could be used more efficiently in the "grammar" of the parser. This is a form of pushing performance-critical code down the layers of abstraction. Finally, I didn't used decorators, for some often-called functions.

In the end, it looks like monadic code is about 2-3x slower in Python. Almost all of that is actually the result of just doing things in a functional/immutable way. In particular, creating data structures appears to be the main bottleneck. In functional languagues, these types of operations are very common and optimized heavily, and so are typically very fast. It looks like it's not the same in Python. Just like you have to avoid recursion because there's no tail recursion, it looks like you have to avoid a functional/immutable coding style if you care about performance because object creation is so slow. On the other hand, if you don't mind the peformance hit, it makes the code much more elegant, just like recursion usually does.

One thing of interest is that there is essentially no performance penatly for using monads over using a functional/immutable code style. The 20% penatly seen between "almost monadic" and "monadic" is only because I'm wrapping the monad in a Parser class which allows nice operator overloading.

Here's a summary of what you can do to speed up any functional/immutable-style code, including monadic code when writing in Python:

  • Make object creation as fast as possible. Don't do anything fancy in __init__.
  • Use as few layers of abstraction as possible, especially when there is an object created in each layer.
  • Push common or expensive operations down the layers of abstaction, especially if it avoids creating objects.
  • Avoid using decorators for heavily used functions.
  • Don't use wrapper classes if you don't have to.

As a final thought, I'd like to mention that while there's currently a substantial performance penalty for using immutable data structures, that style is going to become increasingly important as we enter the many-core era. No matter what style of concurrency you like, immutable data is always easier to work with when concurrency is involved. Concurrency and mutable data are just a bad combo. I think that it's going to be very important for language designers to address this when working on the peformance of their languages. I certainly hope future versions Python are much faster with immutable data. If they are, then the peformance penatly of using Monads will almost disappear.

23 comments:

  1. I'd be curious to know if the JIT optimization performed by Psyco reduces the relative penalty for working with immutable objects.

    Seeing what PyPy does to this code would be even more interesting, but that project is so bleeding edge, I have no idea what other issues you would stumble on.

    ReplyDelete
  2. One thing worth keeping in mind about Python is that function calls are expensive in time, more so than most other languages. So to get the fastest Python code, write in an imperative style and rely heavily on builtins.

    Not good news for someone who wants to write fast functional code in Python. :D

    ReplyDelete
  3. Very interesting. D language version 2.x has true closures, I'd like to see how your code works there.

    ReplyDelete
  4. Python doesn't do any optimization for them at all. Monads are incredibly flexible, but as any decent C.S. major knows, everything comes at a price. Monads give a high level of abstraction and can model a huge variety of computations. There's no reason to expect them to be on par with simpler, to-the-point styles of programming.

    Monads have their place in programming, but that place is Haskell and other pure languages.

    ReplyDelete
  5. AFAIK, decorators only work at function creation time and do not penalize actual calls, except for an extra function call.
    Function calls are relatively expensive in Python, though.

    ReplyDelete
  6. When I was optimizing pyparsing, I found the @profile decorator described on the Python wiki to be very helpful.

    The most primitive monads (such as pyparsing's Literal and Word) are those that benefit most from code optimization.

    Function calls are peformance death in python, but some operations are more expensive then function calls. When optimizing Literal, I found that replacing

    if instring[curLoc:curLoc+literalLength]==literalstring:

    with

    if literalstring.startswith(instring, curloc):

    was quite a healthy win. I also apparently invented a cacheing system for exceptions that is unseen elsewhere in Python. Exceptions are used everywhere to signal when one of several parsing options doesn't match - this means that thousands of Exception instances are created and almost immediately discarded. Each monad ends up raising the same exception (with some minor, er, exceptions), so I create the Exception at monad creation time, and then just keep raising the cached Exception instead of creating a new one.

    ReplyDelete
  7. How about you check out this site for some information about writing good essay

    ReplyDelete
  8. here you can get more information about joining the Pakistan navy
    Join Pak Navy

    ReplyDelete
  9. Different views have been shared ibn this blog by expert writers who have their own choices about papers because most of the students are worried about their papers and they are feeling the need of qualified writers. dissertation writing services

    ReplyDelete
  10. Thanks for sharing this informative content.,
    Leanpitch provides online training in Scrum Master Certification during this lockdown period everyone can use it wisely.
    Join Leanpitch 2 Days CSM Certification Workshop in different cities.

    Scrum master certification online

    CSM certification online

    CSM online

    CSM online certification

    CSM online training

    CSM training online

    Scrum master training online

    ReplyDelete
  11. I think this is the best I’ve seen till now. You can certainly visit your expertise inside the article you write.

    성인야설

    ReplyDelete
  12. An outstanding share! I have just forwarded this onto a co-worker who was doing a little homework on this. And he actually bought me dinner simply because I discovered it for him… lol. So let me reword this…. Thanks for the meal!! But yeah, thanks for spending some time to talk about this topic here on your web site.

    마사지

    ReplyDelete
  13. I really like your blog site.. excellent shades & style. Do a person pattern this excellent website oneself or even have people hire an attorney to make it happen available for you? Please answer while I!|m seeking to style and design my very own blog as well as would wish to learn where u obtained this specific out of thanks a lot 스포츠마사지

    ReplyDelete
  14. To be honest, I do not have any idea about monads. I tried to understand its meaning from your post, but could not get it. However, I want to do more research on it, but I do not have time to do it because I have to find the best and Cheap Assignment Help UK based service to complete my assignment.

    ReplyDelete
  15. Good blog. I learned a lot from this blog. Are you also searching for University Assignment Help ? we are the best solution for you. We are best known for delivering cheap essays to students without having to break the bank

    ReplyDelete
  16. Your blogs are great.Are you also searching for NURSING PICO ESSAY WRITERS UK? we are the best solution for you. We are best known for delivering nursing writing services to students without having to break the bank.

    ReplyDelete
  17. This blog is too informative and very helpful. I really like your good work and very happy to read this. Now it's time to avail african gowns for ladies for more information.

    ReplyDelete
  18. what is monad's I don't know any information about monad's so please tell me some information about monad's and thanks for posting this information.

    ReplyDelete
  19. It provides a fantastic job of outlining the various difficulties we encounter and offering solutions for how to deal with them. I especially value the emphasis on fiscal policy and how it can be applied to boost the economy.
    Abogados de divorcio en atlantic city nueva jersey

    ReplyDelete
  20. Wow! what a wonderful blog for monadic parsing. I really like your good work and very happy to read your informative blog. Now it's time to avail Best 10 seat minibus hire in Dartford for more information.

    ReplyDelete

Blog Archive

Google Analytics