Python seems to use a lot of memory. So what exactly is the overhead of each type of value? Short answer:
Type | Overhead (bytes) |
int | 24 |
float | 24 |
tuple | 63 |
list | 101 |
dict | 298 |
old-style class | 345 |
new-style class | 336 |
subclassed tuple | 79 |
Record | 79 |
Record with old class mixin | 79 |
Record with new class mixin | 79 |
I measured these by running a simple program that loaded 1,000,000 values into a list and then called time.sleep(1000). I ran it once per value type and used "top" to see how much memory the process was using. From that reading I subtracted the baseline cost of a list whose entries all refer to the same value (14 bytes per entry), subtracted the size of the child value where there is one (usually an int, 24 bytes each), and divided by 1,000,000 to get a per-value figure. I'll include the code I ran at the end if you want to cut and paste. So what lessons do we learn from this?
- Class instances are very expensive, at over 300 bytes each (345 for old-style, 336 for new-style).
- Tuples have about 1/5 as much overhead (63 bytes).
- Records are almost as good as tuples (79 bytes), even when a mixin is added.
So, if you want to have lots of values in memory without using lots of memory, use Record.
If you want to run the test for yourself, here's the code. Just comment out the "make_val" that you want to test.
import time
from Record import Record

class TupleClass(tuple): pass
class RecordClass(Record("val")): pass

class OldClass:
    def __init__(self, val):
        self.val = val
    def method(self):
        pass

class NewClass(object):
    def __init__(self, val):
        self.val = val
    def method(self):
        pass

class RecordWithOldClass(Record("val"), OldClass): pass
class RecordWithNewClass(Record("val"), NewClass): pass

make_val = lambda i : 1 #nothing (base overhead)
#make_val = lambda i : i
#make_val = float
#make_val = lambda i : (i,)
#make_val = lambda i : [i]
#make_val = lambda i : {i:i}
#make_val = TupleClass
#make_val = RecordClass
#make_val = OldClass
#make_val = NewClass
#make_val = RecordWithOldClass
#make_val = RecordWithNewClass

count = 1000000
lst = [make_val(i) for i in xrange(count)]
time.sleep(100000)
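
The subtraction described in the measurement paragraph can be written out as a small helper. This is only a sketch of that arithmetic, not part of the original test program: the name `overhead_per_value` and its parameters are made up for illustration, the 14-byte and 24-byte defaults are the figures quoted in the post, and the input below is a hypothetical "top" reading chosen so that the result matches the tuple row of the table.

```python
def overhead_per_value(total_bytes, count=1000000,
                       list_entry_bytes=14, child_bytes=24):
    """Per-value overhead from a whole-process memory reading.

    total_bytes      -- memory reported by "top" for the run (assumed input)
    list_entry_bytes -- baseline cost per list entry (14 in the post)
    child_bytes      -- size of the wrapped child value, e.g. an int (24);
                        pass 0 when the value has no child (plain int, float)
    """
    per_entry = total_bytes / float(count)
    return per_entry - list_entry_bytes - child_bytes


# Hypothetical reading: 101,000,000 bytes for a list of 1,000,000 one-item tuples.
tuple_overhead = overhead_per_value(101000000)
print(tuple_overhead)  # 63.0
```

Working backwards like this also shows how sensitive the result is to the two baselines: an error of a few bytes in either one shifts every row of the table by the same amount.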
Fun - I was just playing with the same sort of thing and came to a similar conclusion. One item that's useful is a slotted new style object:
class Foo(object):
    __slots__ = []
These come out to be (on average) slightly more than 16 bytes each.
If you add values:
class Foo(object):
    __slots__ = ['a','b']
    def __init__(self):
        self.a = None
        self.b = None
They come out to be ~65 bytes each.
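
A quick way to see why the slotted version is smaller is to check whether instances still carry a per-instance `__dict__`. The sketch below assumes two illustrative classes, `Plain` and `Slotted`, that are not from the post. It does not measure bytes, it only shows the structural difference.

```python
class Plain(object):
    def __init__(self):
        self.a = None
        self.b = None


class Slotted(object):
    __slots__ = ['a', 'b']

    def __init__(self):
        self.a = None
        self.b = None


plain = Plain()
slotted = Slotted()

# A regular instance stores its attributes in a per-instance dict.
plain_has_dict = hasattr(plain, '__dict__')
# A slotted instance has fixed storage for 'a' and 'b' and no dict.
slotted_has_dict = hasattr(slotted, '__dict__')

# The trade-off: attributes not listed in __slots__ cannot be added later.
try:
    slotted.c = 1
    extra_attribute_rejected = False
except AttributeError:
    extra_attribute_rejected = True

print(plain_has_dict, slotted_has_dict, extra_attribute_rejected)
```

The cost is flexibility: a slotted class only accepts the attribute names it declares, so it suits small record-like objects created in large numbers.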
Wow Gary! I tried that too and it came to about a quarter the size of virtual memory. This is way over my head but clearly indicates that __slots__ might be the way forward when memory matters.
It would be interesting to see the exact same test run with a new-style class with __slots__ to eliminate the __dict__ attribute. According to your chart, the instance takes up 345 bytes, but the __dict__ should account for 298 of that, and that cost is eliminated when you use __slots__.
Oops, I should have read the preceding comments before posting mine! Sorry.
I am getting wildly bigger values. Try this:
#!/bin/sh
(
exec >test.py
echo "\
#!/usr/bin/python
import time"
i=300000
while test $i != 0; do
    echo "i$i=$i"
    # POSIX arithmetic; "i--" is not supported by every /bin/sh
    i=$((i - 1))
done
echo "time.sleep(3)"
)
chmod 755 test.py
echo "Before: `grep ^MemFree: /proc/meminfo`"
./test.py &
sleep 1
echo "During: `grep ^MemFree: /proc/meminfo`"
sleep 1
echo "During: `grep ^MemFree: /proc/meminfo`"
sleep 1
echo "During: `grep ^MemFree: /proc/meminfo`"
sleep 1
echo "During: `grep ^MemFree: /proc/meminfo`"
sleep 1
echo "During: `grep ^MemFree: /proc/meminfo`"
sleep 1
echo "During: `grep ^MemFree: /proc/meminfo`"
sleep 1
echo "During: `grep ^MemFree: /proc/meminfo`"
sleep 1
echo "During: `grep ^MemFree: /proc/meminfo`"
sleep 1
echo "During: `grep ^MemFree: /proc/meminfo`"
wait
echo "After: `grep ^MemFree: /proc/meminfo`"
# i=100000, output on my machine (x86-64):
#Before: MemFree: 308688 kB
#During: MemFree: 252144 kB
#During: MemFree: 227460 kB
#During: MemFree: 227468 kB
#During: MemFree: 227460 kB
#After: MemFree: 308200 kB
# Thus, (308200-227460)/100 = 807 bytes per each int variable
# i=300000
#Before: MemFree: 1007572 kB
#During: MemFree: 695952 kB
#During: MemFree: 421548 kB
#During: MemFree: 851084 kB
#During: MemFree: 835708 kB
#During: MemFree: 808800 kB
#During: MemFree: 795656 kB
#During: MemFree: 795656 kB
#During: MemFree: 795648 kB
#During: MemFree: 795656 kB
#After: MemFree: 1007448 kB
# Thus, (1007448-795648)/300 = 706 bytes per each int variable
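
The per-variable figures in the script's comments can be reproduced with a few lines of Python. This is a sketch of that arithmetic only: the helper name `bytes_per_variable` and the `bytes_per_kb` parameter are assumptions added for illustration. The comments above effectively treat 1 kB as 1000 bytes. As far as I know, /proc/meminfo reports in units of 1024 bytes, which would raise the results slightly. MemFree is also a system-wide number, so other processes can move it during the run.

```python
def bytes_per_variable(free_idle_kb, free_during_kb, variables,
                       bytes_per_kb=1000):
    """Estimate memory per variable from two MemFree readings (in kB)."""
    used_kb = free_idle_kb - free_during_kb
    return used_kb * bytes_per_kb / float(variables)


# Readings copied from the two runs in the comments above.
run_100k = bytes_per_variable(308200, 227460, 100000)
run_300k = bytes_per_variable(1007448, 795648, 300000)

# Same readings, assuming 1 kB = 1024 bytes instead.
run_100k_kib = bytes_per_variable(308200, 227460, 100000, bytes_per_kb=1024)

print(int(run_100k), int(run_300k), int(run_100k_kib))  # 807 706 826
```

Note that this script measures something different from the original post: each generated line creates a module-level name bound to an int, so the figure includes the source text, the compiled code, and the module dictionary entry, not just the int object.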