
Python vs Matlab speed differences #57

Closed
kamwal opened this issue May 28, 2020 · 10 comments

@kamwal
Contributor

kamwal commented May 28, 2020

I found relatively large differences (~40 seconds per profile) in the speed of the update salinity mapping calculations. This could be related to the capability of my computer, or there may be something that could be improved in the Python code.
Has anyone seen similar behaviour with the Python code?

WMO num: 1901847
WMO boxes: wmo_boxes_ctd.mat
Reference data: CTD_for_DMQC_2019V01
Code evaluation times for Matlab and Python:
Matlab 1901847.docx
Python 1901847.docx

Config_file:

```
% ===============================
%
% Objective Mapping Parameters
%

% max number of historical casts used in objective mapping
CONFIG_MAX_CASTS=300

% 1=use PV constraint, 0=don't use PV constraint, in objective mapping
MAP_USE_PV=0

% 1=use SAF separation criteria, 0=don't use SAF separation criteria, in objective mapping
MAP_USE_SAF=0

% spatial decorrelation scales, in degrees
MAPSCALE_LONGITUDE_LARGE=8
MAPSCALE_LONGITUDE_SMALL=4
MAPSCALE_LATITUDE_LARGE=4
MAPSCALE_LATITUDE_SMALL=2

% cross-isobath scales, dimensionless, see BS(2005)
MAPSCALE_PHI_LARGE=0.1
MAPSCALE_PHI_SMALL=0.02

% temporal decorrelation scale, in years
MAPSCALE_AGE=5
MAPSCALE_AGE_LARGE=20

% exclude the top xxx dbar of the water column
MAP_P_EXCLUDE=100

% only use historical data that are within +/- yyy dbar from float data
MAP_P_DELTA=200
```

@cabanesc

Hi Kamila,

I noticed the same: longer run times with the Python code. In the Matlab version, most of the time is spent reading the climatology files (the update_salinity_mapping stage). I suspect it is the same with the Python code. Also, the Python code currently reads .mat files, which may not be optimal?

@edsmall-bodc
Collaborator

@cabanesc That's correct, this isn't optimal. We made this decision initially to make it very simple to compare the outputs with the Matlab version. It's a simple change, and @gmaze and I have already started discussing what we should use instead, perhaps even allowing users to choose which file formats to use.

Beyond that, I already have some places in the back of my mind that may be acting as bottlenecks. This is going to be a wider investigation task, as @kamwal has already labelled it. Once we have completed the investigation, we will create separate tasks for each problem area and then implement fixes and changes.

For now, what I would recommend is that UATs use this thread to post floats and parameter settings that seem to be excessively slow, so that we have examples that we can reproduce and test.

@edsmall-bodc
Collaborator

I've been doing some general experimentation, and I think a lot of the slowdown is most likely coming from the use of the stack and append functions for forming arrays.

Most of the time, when we are generating a large array, we know how big the final array is going to be. Because of this, we can create an NxM array in memory, fill it with placeholder values, and then overwrite those values with the correct ones as we work through the array. This means there is only one memory-allocation step, and it happens right at the beginning.

However, sometimes we don't know how large our final array is going to be, so instead we have to append columns or rows as we go through our analysis. This is much slower, because every time we append to an array we actually allocate a whole new block of memory for a slightly larger array, copy the old data and the new data into it, and then delete the old array. The bigger the array, the more time-consuming this is, which is why we are seeing slowdown for analyses that have a lot of profiles (the matrices slowly grow larger over time).

To add rows/columns to a matrix I have been using the numpy "stack" and "append" functions (or variants of them, such as vstack and hstack).

Basically, appending to a matrix is slow, but pre-allocating memory is fast (a rough sketch of the two patterns follows the list below). There must be better ways to do this, by either:

a) finding out ahead of time how large our final matrix will be, or
b) using better functionality from another library?
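
A minimal sketch of the contrast between the two patterns, with made-up sizes rather than anything taken from pyowc:

```python
import numpy as np

n_profiles, n_levels = 300, 1000

# Pattern 1: grow the array by appending one column per profile (slow).
# Every np.hstack call allocates a new, larger array and copies everything across.
grown = np.empty((n_levels, 0))
for _ in range(n_profiles):
    new_col = np.random.rand(n_levels, 1)       # stand-in for one profile's data
    grown = np.hstack((grown, new_col))

# Pattern 2: allocate the final array once, then fill it in place (fast).
prealloc = np.full((n_levels, n_profiles), np.nan)
for i in range(n_profiles):
    prealloc[:, i] = np.random.rand(n_levels)   # stand-in for one profile's data
```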

@gmaze, Do you have any recommendations on functions or techniques that could provide a speed up here?

It's possible that, after investigating this further, we will find we can actually preallocate the memory, but we would have to look a little deeper. For example, if a user highlights that they want 300 historical climatology points, we know that the mapped salinity matrix will be Nx300. Is there a way we can find out N right from the get-go? Perhaps after the find_besthist stage?

Any comments are, of course, welcome!

@edsmall-bodc
Collaborator

edsmall-bodc commented Jun 4, 2020

As an example, we have to continually add NaNs to all of the climatological profiles so that each column is the same length. I can't think of a way around that for this specific example, but I just wanted to show what I mean.
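
A hedged sketch of that NaN-padding step, using made-up profile data rather than the real climatology arrays:

```python
import numpy as np

# Ragged "profiles" of different lengths (illustrative data only).
profiles = [np.random.rand(np.random.randint(50, 200)) for _ in range(10)]

max_len = max(len(p) for p in profiles)
padded = np.full((max_len, len(profiles)), np.nan)  # one NaN-filled block, m x n
for j, p in enumerate(profiles):
    padded[:len(p), j] = p  # shorter profiles keep trailing NaNs in their column
```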

I'm planning on using a Python profiler to analyse the code at the end of this month (or at the end of testing); a small example is included after the list below. This will reveal things like

  • How long each function takes to run
  • How many times a function is called

I think we might even be able to add it to the CI pipeline to make sure certain functions continue to operate at an acceptable speed.
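
A small sketch of that kind of profiling with the standard-library cProfile module; the update_salinity_mapping shown here is just a stub standing in for the real routine:

```python
import cProfile
import pstats

def update_salinity_mapping():
    # placeholder for the real pyowc routine
    sum(i * i for i in range(10_000))

cProfile.run("update_salinity_mapping()", "profile.out")

stats = pstats.Stats("profile.out")
stats.sort_stats("cumulative").print_stats(20)  # per-function times and call counts
```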

@edsmall-bodc
Collaborator

@gmaze I will post the results from the profiler on this thread, if that suits you? I have a couple of profiles that are particularly slow, so I will run the same profiles through the Matlab version so we can compare results.

The problem is in update salinity mapping. In all the testing so far, the calibration section has been fast.

@gmaze
Member

gmaze commented Jul 9, 2020

> I've been doing some general experimentation, and I think a lot of the slowdown is most likely coming from the use of the stack and append functions for forming arrays.

> pre-allocating memory is fast

Yes, with the current loop design, we can't expect any significant changes before being able to pre-allocate arrays in memory.

However, we could also try some parallelization of the loop over profiles, gathering a set of arrays for each profile, and then simply sorting and concatenating/merging the arrays at the end of the process.
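
A rough sketch of that idea, assuming a hypothetical per-profile function map_one_profile() that returns one column of results (not an existing pyowc function):

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np

def map_one_profile(profile_index):
    # placeholder for the per-profile mapping work
    return np.random.rand(100)

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        columns = list(pool.map(map_one_profile, range(150)))
    # pool.map preserves input order, so the final merge is a single concatenation
    mapped = np.column_stack(columns)
```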

@apswong

apswong commented Jul 9, 2020

> For example, if a user highlights that they want 300 historical climatology points, we know that the mapped salinity matrix will be Nx300. Is there a way we can find out N right from the get-go? Perhaps after the find_besthist stage?

If CONFIG_MAX_CASTS = n = the number of historical profiles used in each mapping, then you'd need a matrix of m x n to put together la_grid_sal, etc., for input into the mapping routine, where m = the maximum length amongst the n historical profiles. You can determine what m is in find_besthist, then pre-assign the matrix in retr_region instead of appending rows in a loop (a rough sketch follows below).

Not sure if this will help or not.
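
A minimal sketch of that suggestion under those assumptions; the lengths array stands in for information coming out of find_besthist, and the fill loop stands in for retr_region:

```python
import numpy as np

n = 300                                        # CONFIG_MAX_CASTS
lengths = np.random.randint(50, 200, size=n)   # per-profile lengths, known after find_besthist
m = int(lengths.max())                         # longest selected historical profile

la_grid_sal = np.full((m, n), np.nan)          # pre-assign the m x n matrix once
for j, length in enumerate(lengths):
    la_grid_sal[:length, j] = np.random.rand(length)  # stand-in for one profile's salinity values
```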

@kamwal
Contributor Author

kamwal commented Aug 6, 2020

One approach to improving the speed of the calculations is to replace the current mechanism of running multiple floats in sequence with running multiple floats in parallel (a sketch is included after the timings below).
Our speed comparison between the two methods showed that, when running 4 floats with around 150 profiles each, the parallel method is about 1.4x quicker than running the floats in sequence.
Total elapsed time:

  • 4 floats run in sequence: 33646.36 seconds

  • 4 floats run parallel: 21228.26 seconds

Moreover, increasing the number of floats used in the parallel calculations should lead to a larger reduction in total calculation time compared with running them in sequence.
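
A hedged sketch of what running several floats in parallel could look like; run_one_float is a hypothetical wrapper around the full per-float run, and only WMO 1901847 comes from this thread:

```python
from multiprocessing import Pool

def run_one_float(wmo):
    # placeholder for a full per-float OWC run (mapping + calibration)
    print(f"processing float {wmo}")

if __name__ == "__main__":
    wmo_numbers = ["1901847", "float_2", "float_3", "float_4"]  # placeholders apart from 1901847
    with Pool(processes=4) as pool:
        pool.map(run_one_float, wmo_numbers)
```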

The second speed improvement between the Matlab and Python code will come from pre-allocating memory during the calculations. This work is still ongoing.

@kamwal kamwal transferred this issue from euroargodev/User-Acceptance-Test-Python-version-of-the-OWC-tool Feb 11, 2021
@kamwal
Contributor Author

kamwal commented Apr 28, 2021

The recent work on update salinity mapping to improve the speed of the pyowc code is described in more detail in #37.

@kamwal
Contributor Author

kamwal commented Oct 25, 2022

This ticket can be closed.
The speed of the Python code has been significantly improved.

@kamwal kamwal closed this as completed Oct 25, 2022