Time various methods of removing a possibly-present item from a dict
```python
#!/usr/bin/python
# Python 2: time three ways of removing a possibly-present key from a dict.
import time

def new_d():
    return {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }

def time_func(iterations, f, *args):
    start = time.time()
    for i in xrange(iterations):
        f(*args)
    return time.time() - start

def try_del(key):
    # EAFP: delete and catch the KeyError if the key is absent
    d = new_d()
    try:
        del d[key]
    except KeyError:
        pass

def if_del(key):
    # LBYL: check for the key before deleting
    d = new_d()
    if key in d:
        del d[key]

def pop_del(key):
    # built-in: pop with a default never raises
    d = new_d()
    d.pop(key, None)

def succeed(f):
    f(3)  # key 3 is present

def fail(f):
    f(4)  # key 4 is absent

iterations = 1000000

print "pop succeed:", time_func(iterations, succeed, pop_del)
print "pop fail:", time_func(iterations, fail, pop_del)
print "try succeed:", time_func(iterations, succeed, try_del)
print "try fail:", time_func(iterations, fail, try_del)
print "if succeed:", time_func(iterations, succeed, if_del)
print "if fail:", time_func(iterations, fail, if_del)
```
Well, I didn't go into nearly the same level of detail, but I did feel that putting the results into a table makes it easier to consider what the data can tell us.
I should note that the precision in these tables is arbitrary; a single run on my random PC doesn't warrant this many digits. I kept them in case anyone else has a larger data set to compare this small sample against.
Table 1: execution time (in milliseconds), rounded to the nearest microsecond
case | succeed | fail |
---|---|---|
pop | 45.673 ms | 49.161 ms |
try | 56.651 ms | 72.481 ms |
if | 58.166 ms | 42.423 ms |
Table 2: execution times relative to the slowest test case
case | succeed | fail |
---|---|---|
pop | 63.0% | 67.8% |
try | 78.2% | 100.0% |
if | 80.3% | 58.5% |
Table 2 data has been rounded to the nearest tenth of a percent (that's about ±70 microseconds for the data from my PC).
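For anyone wanting to check the arithmetic, Table 2 is just each Table 1 time divided by the slowest case (try fail). A quick sketch in Python 3, with the values hard-coded from Table 1 above:

```python
# Recompute Table 2 from Table 1: each time relative to the slowest case.
times_ms = {
    ("pop", "succeed"): 45.673, ("pop", "fail"): 49.161,
    ("try", "succeed"): 56.651, ("try", "fail"): 72.481,
    ("if", "succeed"): 58.166,  ("if", "fail"): 42.423,
}
slowest = max(times_ms.values())  # try fail, 72.481 ms
for (case, outcome), ms in times_ms.items():
    print(f"{case} {outcome}: {ms / slowest:.1%}")
```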
The general approach to testing is right, but I'm not so sure about the premise of the testing or the actual methodology.
The premise
The assertion that catching exceptions is slow may be true, but a real-world measurement of try/except versus the other options isn't just an average of try-success and try-failure. The ratio of success to failure will vary with the situation, and I would hope that, as consenting adults, we apply try to data where we expect it to succeed the majority of the time. That is the situation where try is great, where it is indeed easier to ask forgiveness than permission. If that's not the case, then try/except is a rough way of dealing with the data structure. In that sense I don't like the reductive assumption of comparing try-success and try-failure against the other patterns as if both outcomes were equally likely. More generally, the discussion is a debate between using a built-in and using one of the EAFP or LBYL patterns **[*]**
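To make that concrete, here is a small sketch of the expected cost per million deletions once succeed and fail are weighted by how often the key is actually present. The millisecond figures are the Table 1 numbers above; the 50/90/99% hit ratios are arbitrary assumptions, not measured workloads.

```python
# Expected cost (ms per 1,000,000 deletions) for an assumed hit ratio,
# i.e. the probability that the key being deleted is actually present.
TABLE1_MS = {          # (succeed, fail) times from Table 1 above
    "pop": (45.673, 49.161),
    "try": (56.651, 72.481),
    "if":  (58.166, 42.423),
}

def expected_ms(succeed_ms, fail_ms, hit_ratio):
    return hit_ratio * succeed_ms + (1.0 - hit_ratio) * fail_ms

for hit_ratio in (0.5, 0.9, 0.99):   # arbitrary example workloads
    row = {case: round(expected_ms(s, f, hit_ratio), 1)
           for case, (s, f) in TABLE1_MS.items()}
    print(f"{hit_ratio:.0%} hits: {row}")
```

At a 99% hit rate, try sits close to its succeed time, which is exactly the kind of workload the EAFP argument has in mind.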
The methodology
The code is calling `new_d()` and passing the functions into `succeed()` or `fail()` 1000000 times, as well as calling the test subject functions, so these measurements also include calls to an additional two functions on every iteration. They aren't particularly busy functions, but function calls still take time. It's DRY, sure, and a nice demonstration of higher-order functions, which would be great in real code, but an empirical test setup should try to avoid measuring the test harness along with the subject **[^]**. Rolling your own `time_func` is not that easy; it's better to use `timeit`, which will isolate the function, do the setup, and measure it more independently.

Compare the results of the original script with these modifications (there's a sketch of what I mean after this list):

1. The first is the same as the original but removes the calls to `new_d()`. You might argue that a real-world function would need to call something to return a dict, but it's just as likely that it would be passed one as a parameter or directly access a global, class, instance, or locally scoped dictionary. Perhaps I'm wrong and a function would indeed usually build a new dict by calling another function, but this test is supposed to zoom in on the performance of the `pop`, `try`, and `if` approaches themselves.
2. The second keeps the calls to `new_d()`, but removes the calls to the `succeed` and `fail` wrappers in order to show the impact those have on the measurement, and switches to `timeit`. Since `timeit` performs the iteration loop for you, `time_func` had to go. In the original, `time_func` was only called once but internally ran 1000000 iterations; I could have run `timeit` once around `time_func`, or set the `time_func` iterations to 1, but a loop of 1 seemed silly, and the point is to use the machinery of `timeit`.
3. The third is a combination of the previous two.
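As a sketch of what I mean by the second modification, reconstructed from the description above rather than copied from the scripts actually run (Python 3 here):

```python
# Sketch of modification 2: keep new_d(), drop the succeed()/fail() wrappers,
# and let timeit drive the 1,000,000 iterations.
import timeit

SETUP = """
def new_d():
    return {1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
            11: 12, 13: 14, 15: 16, 17: 18, 19: 20}

def pop_del(key):
    d = new_d()
    d.pop(key, None)

def try_del(key):
    d = new_d()
    try:
        del d[key]
    except KeyError:
        pass

def if_del(key):
    d = new_d()
    if key in d:
        del d[key]
"""

iterations = 1000000
for name in ("pop_del", "try_del", "if_del"):
    for label, key in (("succeed", 3), ("fail", 4)):
        t = timeit.timeit(f"{name}({key})", setup=SETUP, number=iterations)
        print(f"{name} {label}: {t:.3f}s")
```

Because `timeit` compiles the statement string directly, the succeed/fail distinction becomes just a choice of literal key (3 is present, 4 is absent), with no wrapper call left in the timed path.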
These are my times for the four scripts, starting with the original and followed by my three modified versions.
There is some variability in these, so to take it to the next level and be scientific you should run each of them multiple times and, alongside an average, report at least some measure of spread such as the standard deviation.
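A minimal sketch of that idea for one variant (not the exact code behind the numbers below): `timeit.repeat` gives the rounds, and the `statistics` module gives the mean and spread.

```python
# 10 rounds of 1,000,000 iterations of the pop-succeed case, with mean and stdev.
import statistics
import timeit

SETUP = """
def new_d():
    return {1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
            11: 12, 13: 14, 15: 16, 17: 18, 19: 20}

def pop_del(key):
    d = new_d()
    d.pop(key, None)
"""

times = timeit.repeat("pop_del(3)", setup=SETUP, repeat=10, number=1000000)
print(f"pop succeed: mean={statistics.mean(times):.3f}s "
      f"stdev={statistics.stdev(times):.3f}s")
```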
So I did that, using 10 rounds of each. Apart from matplotlib flagging some outliers (which will be down to sporadic things happening on my system), there is a fairly clear pattern: everything is very quick, and apart from try-fail the variants all perform pretty much the same. Given each run is 1000000 iterations, a single call sits around 500 ns, so thinking this through as far as we have here is massively premature optimization. So stop it! Go and just use whatever works.
So I shamelessly borrowed the matplotlib code for the boxplots from https://stackoverflow.com/a/20132614/1335793. I made no attempt at making it more DRY, so it's for reference only, not an example of how you should do it.
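The borrowed snippet itself isn't reproduced here; a minimal sketch of a grouped boxplot in that style would look something like the following, with made-up placeholder numbers standing in for the six lists of round timings.

```python
# Placeholder sketch only: `results` would hold the 10 round timings per case;
# the random numbers here are stand-ins, not measured data.
import random

import matplotlib.pyplot as plt

cases = ["pop succeed", "pop fail", "try succeed",
         "try fail", "if succeed", "if fail"]
results = {case: [random.uniform(0.4, 0.6) for _ in range(10)] for case in cases}

fig, ax = plt.subplots()
ax.boxplot([results[c] for c in cases], labels=cases)
ax.set_ylabel("seconds per 1,000,000 iterations")
plt.setp(ax.get_xticklabels(), rotation=30, ha="right")
fig.tight_layout()
plt.show()
```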
[*] `pop` is the built-in that handles the key-not-found problem, `try` is the EAFP approach, and `if` is the LBYL approach, and many people (including the Python glossary) are of the opinion that LBYL is not Pythonic. The example of another thread deleting the key in between the first thread's successful key test and its access, so that the access throws an exception anyway, is valid, but the same entry also says that locks fix that issue. It's only really an issue in Python because the GIL does round-robin thread switching and can break execution at strange places (like in the middle of an if/elif/else block); for real concurrency or parallelism you wouldn't be using threads in most cases. The other argument relates to duck typing, where you aren't supposed to be doing `type()` checks (LBYL style), but the same advice then says you'd do `hasattr()` checks, or use Abstract Base Classes and do `isinstance()` or `issubclass()` tests, and how would you do those? Surely in LBYL style. So I don't see a solid basis for the claim that EAFP is better than LBYL: there are situations to use either, and situations where one is preferred. I tend to think the built-in is ideal because it's simpler, and you benefit from implementation improvements in the language without having to change your code. For example, `range()` in Python 2 was fine, but in Python 3 (like so many other things) it's lazy, so you get awesome pipeline behaviour and reduced memory usage. Sure, there was `xrange`, but my point is about benefiting from Python enhancements without rewriting code.

[^] As much as possible, anyway; in reality you can't eliminate all instrument interference from a measurement, and there is no such thing as objective truth, but I digress. Philosophy of science is a deep topic, and completely falsifiable, so maybe I'm wrong and there is objective truth, and maybe there is a god (the atheists aren't very scientific, to be honest; theirs is hardly a falsifiable position to hold).