Disk space - Python dictionary vs list
I was asked to create an inverted index and save it in binary form in multiple ways (with and without compression).
Long story short, I noticed that the dict representation takes less disk space than the same data transformed into a list.
Sample:
    import pickle

    dic = {
        'w1': [1, 2, 3, 4, 5, 6],
        'w2': [2, 3, 4, 5, 6],
        'w3': [3, 4, 5, 6],
        'w4': [4, 5, 6]
    }
    dic_list = list(dic.items())

    with open('dic.pickle', 'wb') as handle:
        pickle.dump(dic, handle, protocol=pickle.HIGHEST_PROTOCOL)
    with open('dic_list.pickle', 'wb') as handle:
        pickle.dump(dic_list, handle, protocol=pickle.HIGHEST_PROTOCOL)
If you check the sizes of both files, you will notice the difference.
So, I would like to know how and why they differ. Any additional information would be much appreciated.
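To make the size comparison reproducible, here is a minimal sketch that writes both pickles and reads their sizes back with os.path.getsize. It assumes a temporary directory instead of the current one, and the helper name pickled_size is hypothetical, introduced only for this illustration.

```python
import os
import pickle
import tempfile

dic = {
    'w1': [1, 2, 3, 4, 5, 6],
    'w2': [2, 3, 4, 5, 6],
    'w3': [3, 4, 5, 6],
    'w4': [4, 5, 6],
}
dic_list = list(dic.items())


def pickled_size(obj, directory, name):
    """Pickle obj into directory/name and return the file size in bytes.

    Hypothetical helper, not part of the original question.
    """
    path = os.path.join(directory, name)
    with open(path, 'wb') as handle:
        pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)
    return os.path.getsize(path)


# A temporary directory keeps the sketch from leaving files behind.
with tempfile.TemporaryDirectory() as tmp:
    dict_size = pickled_size(dic, tmp, 'dic.pickle')
    list_size = pickled_size(dic_list, tmp, 'dic_list.pickle')

print('dict pickle:', dict_size, 'bytes')
print('list pickle:', list_size, 'bytes')
```

The exact byte counts depend on the pickle protocol in use, so none are quoted here.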
python python-3.x list dictionary pickle
edited Nov 19 '18 at 15:55
Martijn Pieters♦
asked Nov 19 '18 at 14:14
leoschet
2
Closely related: Python memory consumption: dict VS list of tuples
– Martijn Pieters♦
Nov 19 '18 at 14:29
2 Answers
The dic_list list consists of more objects. You have an outer list of tuples, each tuple a key-value pair. Each value is another list. Those tuples are the reason you need more space.
The dictionary pickle format doesn't have to use tuple objects to store key-value pairs; it is already known up front that a dictionary consists of a series of pairs, so you can serialise key and value per such pair directly without the overhead of a wrapping tuple object.
You can analyse pickle data with the pickletools module; using a simpler dictionary with just one key-value pair, you can see the difference already:
    >>> import pickle, pickletools
    >>> pickletools.dis(pickle.dumps({'foo': 42}, protocol=pickle.HIGHEST_PROTOCOL))
        0: \x80 PROTO      4
        2: \x95 FRAME      12
       11: }    EMPTY_DICT
       12: \x94 MEMOIZE    (as 0)
       13: \x8c SHORT_BINUNICODE 'foo'
       18: \x94 MEMOIZE    (as 1)
       19: K    BININT1    42
       21: s    SETITEM
       22: .    STOP
    highest protocol among opcodes = 4
    >>> pickletools.dis(pickle.dumps(list({'foo': 42}.items()), protocol=pickle.HIGHEST_PROTOCOL))
        0: \x80 PROTO      4
        2: \x95 FRAME      14
       11: ]    EMPTY_LIST
       12: \x94 MEMOIZE    (as 0)
       13: \x8c SHORT_BINUNICODE 'foo'
       18: \x94 MEMOIZE    (as 1)
       19: K    BININT1    42
       21: \x86 TUPLE2
       22: \x94 MEMOIZE    (as 2)
       23: a    APPEND
       24: .    STOP
If you consider EMPTY_DICT + SETITEM to be the equivalent of EMPTY_LIST + APPEND, then the only real difference in that stream is the addition of the TUPLE2 / MEMOIZE pair of opcodes. It's those opcodes that take the extra space.
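The same comparison can be made programmatically on the sample data from the question. The sketch below counts opcodes with pickletools.genops instead of reading the disassembly by eye; the helper name opcode_counts is hypothetical, and protocol 4 is assumed so that the opcode names match the output above.

```python
import pickle
import pickletools
from collections import Counter

dic = {
    'w1': [1, 2, 3, 4, 5, 6],
    'w2': [2, 3, 4, 5, 6],
    'w3': [3, 4, 5, 6],
    'w4': [4, 5, 6],
}
dic_list = list(dic.items())


def opcode_counts(obj, protocol=4):
    """Return a Counter mapping opcode names to their frequency in obj's pickle.

    Hypothetical helper for illustration only.
    """
    data = pickle.dumps(obj, protocol=protocol)
    # genops yields (opcode, argument, position) triples for each opcode.
    return Counter(opcode.name for opcode, _arg, _pos in pickletools.genops(data))


dict_ops = opcode_counts(dic)
list_ops = opcode_counts(dic_list)

# One TUPLE2 opcode is expected per key-value pair in the list version only.
print('TUPLE2 in dict pickle:', dict_ops['TUPLE2'])
print('TUPLE2 in list pickle:', list_ops['TUPLE2'])
```

With several items, the pickler may batch the pairs using MARK plus SETITEMS or APPENDS rather than one SETITEM or APPEND each, but the per-pair TUPLE2 opcode remains specific to the list version.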
edited Nov 19 '18 at 14:44
answered Nov 19 '18 at 14:21
Martijn Pieters♦
A dict can natively handle key-value pairs, while a list must use a separate container.
Your dict is a straightforward representation of Dict[K, V] - pairs plus some structure. Since the structure is runtime only, it can be ignored for storage.
    {'a': 1, 'b': 2}
Your list uses a helper for pairs, resulting in List[Tuple[K, V]] - pairs plus wrapper. Since the wrapper is needed to reconstruct the pairs, it cannot be ignored for storage.
    [('a', 1), ('b', 2)]
You can also inspect this in the pickle dump. The list dump contains markers for the additional tuples.
    pickle.dumps({'a': 1, 'b': 2}, protocol=0)
    (dp0    # <new dict>
    Va      # string a
    p1
    I1      # integer 1
    sVb     # <setitem key/value>, string b
    p2
    I2      # integer 2
    s.      # <setitem key/value>

    pickle.dumps(list({'a': 1, 'b': 2}.items()), protocol=0)
    (lp0    # <new list>
    (Va     # <marker>, string a
    p1
    I1      # integer 1
    tp2     # <make tuple>
    a(Vb    # <append>, <marker>, string b
    p3
    I2      # integer 2
    tp4     # <make tuple>
    a.      # <append>
While the surrounding dict and list are both stored as a sequence of pairs, the pairs are stored differently. For the dict, only key, value and stop are stored flatly. For the list, an additional tuple is needed for each pair.
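Because the extra tuple is paid once per pair, the size gap should grow with the number of entries. The following is a rough sketch of that effect, assuming simple string keys and integer values; the function name pair_overhead is hypothetical and not part of the answer above.

```python
import pickle


def pair_overhead(n, protocol=pickle.HIGHEST_PROTOCOL):
    """Return how many more bytes the list-of-tuples pickle needs than the dict pickle.

    Hypothetical helper: builds n key-value pairs and pickles them both ways.
    """
    d = {f'w{i}': i for i in range(n)}
    as_dict = pickle.dumps(d, protocol=protocol)
    as_list = pickle.dumps(list(d.items()), protocol=protocol)
    return len(as_list) - len(as_dict)


for n in (1, 10, 100, 1000):
    print(n, 'pairs ->', pair_overhead(n), 'extra bytes for the list form')
```

Passing protocol=0 to the same function lets you compare the text-based encoding shown above with the binary protocols.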
edited Nov 19 '18 at 15:17
answered Nov 19 '18 at 14:24
MisterMiyagi
Protocol 0 differs in enough ways from the current protocol revision 4 that I'd recommend against using it to illustrate what might be going on. pickletools.dis() is far more readable, at any rate.
– Martijn Pieters♦
Nov 19 '18 at 14:45
@MartijnPieters Well, the nesting is present either way and ASCII is less intimidating for some people. The point is that there is an additional container, not how these are encoded. That, plus your answer already showed the nicer dis output anyways. ;)
– MisterMiyagi
Nov 19 '18 at 14:52
The nesting is not present, not in the pickle.dumps() string, nor is the explanation of the opcodes. You had to add those yourself, or you used an IDE that detected you were dumping a pickle.
– Martijn Pieters♦
Nov 19 '18 at 15:07
@MartijnPieters I mean the tuples nested into the list. The relevant part is that the tuples must be dumped as well. I don't think it really matters where the formatting comes from; a pickle dump is low-level gibberish even with formatting/annotations for most people.
– MisterMiyagi
Nov 19 '18 at 15:14