Python profiling of 2 string concatenation techniques



This is my OLD blog. I've copied this post over to my NEW blog at:

http://www.saltycrane.com/blog/2007/10/python-profiling-of-2-string/

You should be redirected in 2 seconds.



In Efficient String Concatenation in Python, the author tests the performace of 6 methods for concatenating strings in Python. I wanted to test the methods myself, for the experience, and also because the article was a few years old. It turns out the timing module he used for performance profiling is no longer included in Python 2.5. So I went to the Python documentation and found that there are 3 profilers currently included with Python 2.5: profile, cProfile, and hotshot. (See 25. The Python Profilers in the Library Reference for more info.) I made a quick choice to use cProfile and tried out the fastest and slowest of the 6 methods. Below is the code and the results. It turns out, for me, the second method is not as significantly different as the original test. (Maybe because this was improved between Python 2.2 and 2.5? I'm guessing.) However, it is much more concise, and appears to be the more elegant, declarative approach that I have founnd myself reading about recently.

test.py:
import cProfile

BIG_NUMBER = 1000000

def method1():
    mystring = ''
    for i in xrange(BIG_NUMBER):
        mystring += `i`
    return mystring

def method2():
    return ''.join([`i` for i in xrange(BIG_NUMBER)])

cProfile.run('method1()')
cProfile.run('method2()')

Results:
$ python test.py
         3 function calls in 2.515 CPU seconds
   
   Ordered by: standard name
   
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.515    2.515 <string>:1(<module>)
        1    2.515    2.515    2.515    2.515 test.py:5(method1)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
         
         
         4 function calls in 1.734 CPU seconds
   
   Ordered by: standard name
   
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.734    1.734 <string>:1(<module>)
        1    1.609    1.609    1.734    1.734 test.py:11(method2)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.125    0.125    0.125    0.125 {method 'join' of 'str' objects}

2 comments:

Unknown said...

Hi, how do you manage that the pygmented code block is in a cell with a slider and does not stretch to the far right?

It certainly is a HTML tag around it, which I don't know. In the source of this page I found only the <pre></pre> tags around it, but that does not have the same effect in my blogger blog. )-:

sofeng said...

Hi Sven,
I had to search around for this after I saw other blogs that used this technique also. I set the "overflow" ?attribute? to "auto" for the pre ?tag?. I set it in my blogger template for all pre blocks. You can also set it with inline style:
<pre style="overflow:auto">

About

This is my *OLD* blog. I've copied all of my posts and comments over to my NEW blog at:

http://www.saltycrane.com/blog/.

Please go there for my updated posts. I will leave this blog up for a short time, but eventually plan to delete it. Thanks for reading.