2018-08-12
I recently had to construct a couple of NumPy arrays from a handful of files. I quickly did something like this:
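    arr = np.array([])
    for f in files:
        iarr = load_array(f)  # load_array: placeholder name for however each file gets read
        arr = np.concatenate([arr, iarr])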
This was taking a lot longer than I thought it should. There's a very simple reason: I was copying my `arr` variable `len(files)` times to construct the final `arr` (and with every iteration of the loop, `arr` was getting larger). This was of course unnecessary.
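To put a rough number on that copying, here is a small sketch that counts how many elements get written in each approach. It assumes every file yields a chunk of the same length, which is a simplification; the function names and the sizes are made up for illustration.

```python
def elements_copied_in_loop(n_chunks: int, chunk_size: int) -> int:
    """Elements written when concatenating onto a growing array in a loop."""
    total = 0
    current_len = 0
    for _ in range(n_chunks):
        # Each np.concatenate call allocates a new array holding everything
        # accumulated so far plus the new chunk, and copies all of it.
        current_len += chunk_size
        total += current_len
    return total


def elements_copied_once(n_chunks: int, chunk_size: int) -> int:
    """Elements written when all chunks are concatenated in a single call."""
    return n_chunks * chunk_size


# Assumed sizes, purely for illustration.
n_chunks, chunk_size = 100, 1_000
loop_copies = elements_copied_in_loop(n_chunks, chunk_size)
single_copies = elements_copied_once(n_chunks, chunk_size)
print(loop_copies, single_copies, loop_copies / single_copies)
# prints: 5050000 100000 50.5
```

The loop version grows quadratically with the number of chunks, while the single call grows linearly.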
A better (and, to those who like to use the label "pythonic", more pythonic) way to do it:
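    arr = np.concatenate([load_array(f) for f in files])  # load_array: placeholder for the per-file reader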
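When the loop body needs more than a one-liner (say, skipping empty files), the same idea still applies: collect the arrays in a plain Python list and concatenate once at the end. A minimal sketch, where `load_array` is a hypothetical stand-in for reading a file and the "skip empty" rule is only an example:

```python
import numpy as np


def load_array(index: int) -> np.ndarray:
    """Hypothetical stand-in for reading one file into an array."""
    return np.full(3, index, dtype=np.float64)


chunks = []
for i in range(4):
    chunk = load_array(i)
    if chunk.size == 0:
        # Example of per-file logic that would not fit in a comprehension.
        continue
    # Appending stores a reference to the array; no array data is copied here.
    chunks.append(chunk)

# The only copy of the data happens in this single call.
arr = np.concatenate(chunks)
print(arr.shape)
```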
So when it comes to repetitive NumPy concatenations... avoid them. A quick test in IPython:
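    import numpy as np
    N, SIZE = 100, 100_000  # assumed chunk count and length, for illustration only

    def bad():
        arr = np.array([])
        for _ in range(N):
            iarr = np.random.randn(SIZE)
            arr = np.concatenate([arr, iarr])
        return arr

    def good():
        arr = np.concatenate([np.random.randn(SIZE) for _ in range(N)])
        return arr

    %timeit bad()
    %timeit good()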
The output:
1.73 s ± 2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
247 ms ± 2.52 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Quite the difference.
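For anyone wanting to try this outside IPython, here is a rough sketch using the standard library's `timeit` module. The chunk count and length are assumptions of mine, so the numbers it prints will not match the timings above. The chunks are generated up front so that only the concatenation is timed, and the script also checks that both approaches produce the same array.

```python
import timeit

import numpy as np

# Assumed sizes, chosen only to keep the script quick to run.
N_CHUNKS = 200
CHUNK_SIZE = 1_000

rng = np.random.default_rng(0)
chunks = [rng.standard_normal(CHUNK_SIZE) for _ in range(N_CHUNKS)]


def repeated():
    """Concatenate onto a growing array, one chunk per iteration."""
    arr = np.array([])
    for chunk in chunks:
        arr = np.concatenate([arr, chunk])
    return arr


def single():
    """Concatenate all chunks in one call."""
    return np.concatenate(chunks)


# Both approaches should yield identical results.
same = np.array_equal(repeated(), single())

t_repeated = timeit.timeit(repeated, number=5)
t_single = timeit.timeit(single, number=5)
print(f"identical: {same}")
print(f"repeated: {t_repeated:.4f} s, single: {t_single:.4f} s (5 calls each)")
```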