"compute_on_defer" not working with dask array #741
Comments
Thanks for reporting, @claudiofgcardoso. You're right that this changed with the new dask development. However, the output from your function seems to hint at something else being wrong. Can you investigate which of the two lines in your compute function triggers the error?
I changed the compute function to:

```python
def compute(fieldset):
    fieldset.W[0].data[:, 0, :, :] = 0.
    fieldset.W[1].data[:, 0, :, :] = 0.
```

The outcome was the same.
I believe this happens because I'm trying to do a direct assignment of individual elements (values on depth level 0) in a dask array, which does not seem to be possible for such arrays (https://stackoverflow.com/questions/40935756/item-assignment-not-supported-in-dask).
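For reference, a minimal sketch (with toy array shapes, not the actual Parcels field) of the lazy rebuild that dask does support: instead of assigning into the array, `da.where` constructs a new lazy array with the surface level zeroed.

```python
import numpy as np
import dask.array as da

# Toy stand-in for fieldset.W.data with dims (time, depth, lat, lon)
w = da.from_array(np.ones((2, 3, 4, 4)), chunks=(1, 3, 4, 4))

# Instead of w[:, 0, :, :] = 0 (item assignment), lazily rebuild the array:
depth_idx = da.arange(w.shape[1]).reshape((1, -1, 1, 1))
w_fixed = da.where(depth_idx == 0, 0.0, w)  # zero at depth level 0, unchanged below

result = w_fixed.compute()  # only now is any data materialized
```

The rebuilt array stays lazy until `.compute()` (or Parcels itself) pulls the data, so it fits the deferred-loading model.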
The dask implementation also seems to be incompatible with […]
Hmm, this is surprising, because in principle the […]
Here's the script […]
Thank you. Can you also provide the […]
Sorry about that! Here are the missing files: […]
I still can't run it because I'm missing the wind and mohid files. Could you either provide all the required files, or make your script simpler (but still failing) so that I can test it here?
You could run the script without the wind forcing (by setting wind=0), but it is better to include it to reproduce the errors.
Hi - just as a little background: Claudio, you are right - direct item assignment for 'lazy-load' dask arrays is not possible, as the dask documentation already says. This is because dask explicitly manages its data as a 'virtual array': a dask array is just a description of which arrays (by name/index) it contains and what their bounding-box values are. Dask arrays hold no data until the moment the values are actually needed (e.g. when `.compute()` is called).

That means that e.g. in your plotting call, changing […]

@erikvansebille and I had a talk about the behaviour of deferred arrays and how they work with dask, because it is a more involved problem. That is because fields not generated from NetCDF but by […]
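The 'virtual array' behaviour described above can be seen in a tiny standalone example (illustrative only; the variable names are made up): operations on a dask array only extend a task graph, and no values exist until the graph is executed.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.arange(8), chunks=4)
y = (x * 2).clip(max=10)   # only extends the task graph; no data is touched
print(type(y).__name__)    # still a lazy dask Array, not concrete numbers

z = y.compute()            # the graph is executed here, yielding numpy data
print(z)
```

This is why mutating the array in place, or inspecting it before `.compute()`, does not behave like it would for a plain numpy array.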
Hi @CKehl, thank you a lot for the detailed explanation of how dask works! I will test if your suggestion solves the issue for the […]

No need to thank me for that; I'm well aware that all user inputs are very important for you guys at this stage. I will keep doing this, as long as it helps!
For info, I tested the suggestion from @CKehl ([…])
I'm looking into it more in-depth now.

Update: I'm testing the potential fix locally. The issue here is that, because the plotting relies on having a graphical output, it cannot easily be tested in the CI workflow. Still, we'll take care of it. That also means that your original call to […]
Dear @claudiofgcardoso, […]
Hi @CKehl,

Sorry for the very late reply (I needed to finish a pending task before diving back into this one). I updated Parcels to the latest version. While testing compute_on_defer I realized that my compute functions still don't work:

```python
def compute(fieldset):
    for tind in fieldset.W[0].loaded_time_indices:
        fieldset.W[0].data[tind, 0] = 0
        fieldset.W[1].data[tind, 0] = fieldset.W[1].data[tind, 0].clip(min=0)
```

or the simplified version:

```python
def compute(fieldset):
    fieldset.W[0].data[:, 0, :, :] = 0.
    fieldset.W[1].data[:, 0, :, :] = 0.
```

I need this operation in order to set the ROMS vertical velocities at the surface to 0 m/s (otherwise particles may end up out of the domain with depth < 0). I believe this is related to the fact that dask arrays cannot be edited in place. I'll have to find a workaround for this one; any suggestion will be highly welcome! Will keep you updated about the other issues.

Cheers,
Hi @claudiofgcardoso, there are at least a few workaround options that could work in this case: […]

```python
def AdvectionEE_2fieldset(particle, fieldset, time):
    """Advection of particles using Explicit Euler (aka Euler Forward) integration for two fieldsets"""
    (u1, v1) = fieldset.UV1[time, particle.depth, particle.lat, particle.lon]
    (u2, v2) = fieldset.UV2[time, particle.depth, particle.lat, particle.lon]
    particle.lon += (u1 + u2) * particle.dt
    particle.lat += (v1 + v2) * particle.dt
```

Let me know whether these three workarounds have worked for you!
Thanks for the suggestions @erikvansebille, I'm trying to implement the following approach (last lines) in my advection kernel:

```python
def AdvectionRK4_3D(particle, fieldset, time):
    """Advection of particles using fourth-order Runge-Kutta integration including vertical velocity.
    W downward velocity (positive goes downward, negative goes upwards)
    Function needs to be converted to Kernel object before execution"""
    if particle.beached == 0:
        if fieldset.track_kernels: print('Particle [%d] advection' % particle.id)
        particle.prev_lon = particle.lon  # Set the stored values for next iteration.
        particle.prev_lat = particle.lat
        (u1, v1, w) = fieldset.UVW[time, particle.depth, particle.lat, particle.lon]
        lon1 = particle.lon + u1*.5*particle.dt
        lat1 = particle.lat + v1*.5*particle.dt
        (u2, v2) = fieldset.UV[time + .5 * particle.dt, particle.depth, lat1, lon1]
        lon2 = particle.lon + u2*.5*particle.dt
        lat2 = particle.lat + v2*.5*particle.dt
        (u3, v3) = fieldset.UV[time + .5 * particle.dt, particle.depth, lat2, lon2]
        lon3 = particle.lon + u3*particle.dt
        lat3 = particle.lat + v3*particle.dt
        (u4, v4) = fieldset.UV[time + particle.dt, particle.depth, lat3, lon3]
        particle.lon += (u1 + 2*u2 + 2*u3 + u4) / 6. * particle.dt
        particle.lat += (v1 + 2*v2 + 2*v3 + v4) / 6. * particle.dt
        d = particle.depth + w * particle.dt
        d += particle.Ws * particle.dt  # settling velocity: positive values for sinking
        # particle.depth = d if d > 0 else 0
        if d >= 0:
            particle.depth = d
        else:
            print('Particle [%d] left the domain through the surface with depth = %g' % (particle.id, d))
            particle.depth = 0
            particle.beached = 2
```

This approach does not allow using a 4th-order scheme for the vertical displacement of a particle (if 'dep1' were already above the surface, the calculation of 'dep2' would already throw an error). So I use the 4th-order advection scheme for horizontal advection and a simple Euler-forward step for vertical advection. It works, but it is not entirely accurate. I'm thinking that maybe I should just set positive velocities (W = upward velocity) at the surface to 0 in the ROMS files before loading them into Parcels.
Also, I noticed that in this new version of Parcels one cannot assign a variable named "p" within kernels in JIT mode, although it works in Scipy mode. I believe this is related to the fact that Parcels already uses the variable "p" for accessing particles within the simulation.
This fixes the issue that users can't use a variable named `p` in their custom Kernels (see also #741)
…r replication and fix OceanParcels#741
Hello all,

I realized that 'fieldset.U.data' is now a dask array:

```
dask.array<where, shape=(3, 28, 205, 223), dtype=float32, chunksize=(1, 28, 205, 223), chunktype=numpy.ndarray>
```

This makes even the simplest operation using 'compute_on_defer' fail: […]

Result:

```
NotImplementedError: Item assignment with <class 'tuple'> not supported
```

Any ideas of how to get around this?