common Lagrangian data structure #78

Open
miniufo opened this issue Jun 23, 2021 · 5 comments

Comments

@miniufo

miniufo commented Jun 23, 2021

I have been wondering whether we need a common Lagrangian data structure, analogous to xarray for labeled n-dimensional datasets, to describe large numbers of Lagrangian particles. Such data generally consist of a time series of positions plus associated data along each Lagrangian track. Examples are the simulated Lagrangian trajectories here, the GDP drifter dataset, the Argo float dataset, and quasi-Lagrangian datasets such as tropical cyclone best tracks and mesoscale eddy tracks.

As far as I know, pandas.DataFrame is commonly used to hold such data, with at least three columns: time, x_pos and y_pos. This is efficient and clear. However, we sometimes need to attach extra information to the DataFrame, such as ID, name, type, or status. So I think we could design a common Lagrangian data structure with which all these (quasi-)Lagrangian data and associated datasets can be described, accessed, stored, and manipulated efficiently.
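
As a minimal sketch of the layout described above: the column names `time`, `x_pos` and `y_pos` follow the text, while the sample values and the metadata keys are made up for illustration. `DataFrame.attrs` can hold a few metadata entries, but pandas documents it as experimental, which is part of the motivation for a dedicated structure.

```python
import pandas as pd

# One Lagrangian track: a time series of positions (values are illustrative).
track = pd.DataFrame(
    {
        "time": pd.date_range("2021-06-23", periods=4, freq=pd.Timedelta(hours=6)),
        "x_pos": [120.0, 120.5, 121.1, 121.8],  # e.g. longitude
        "y_pos": [20.0, 20.2, 20.5, 20.9],      # e.g. latitude
    }
)

# Extra information such as an ID or a name has no natural column;
# DataFrame.attrs (experimental in pandas) is one place to put it.
track.attrs["ID"] = "0001"
track.attrs["name"] = "example drifter"

print(track)
print(track.attrs)
```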

A rough sketch is to define a Particle class with ID, name, and records as its fields. Its records field is a pandas.DataFrame that stores the Lagrangian data. By overloading some of Particle's operators, we can make a Particle as simple to use as a pandas.DataFrame. Through inheritance, we can further define Drifter, Float, and TropicalCyclone subclasses tailored to each case.
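
One possible reading of this idea, as a hedged sketch rather than a finished design: the field names `ID`, `name` and `records` come from the text, while the `drogued` field on `Drifter` and the choice of which operators to delegate are assumptions made for illustration.

```python
from dataclasses import dataclass, field

import pandas as pd


@dataclass
class Particle:
    """Hypothetical container: metadata plus a DataFrame of records."""

    ID: str
    name: str = ""
    records: pd.DataFrame = field(
        default_factory=lambda: pd.DataFrame(columns=["time", "x_pos", "y_pos"])
    )

    def __len__(self):
        # Number of records along the track.
        return len(self.records)

    def __getitem__(self, key):
        # Delegate indexing to the DataFrame, so p["x_pos"] works like df["x_pos"].
        return self.records[key]


@dataclass
class Drifter(Particle):
    """Hypothetical subclass adding a drifter-specific field."""

    drogued: bool = True


d = Drifter(
    ID="0001",
    name="example drifter",
    records=pd.DataFrame(
        {
            "time": pd.to_datetime(["2021-06-23", "2021-06-24"]),
            "x_pos": [120.0, 120.5],
            "y_pos": [20.0, 20.2],
        }
    ),
)
print(len(d), d["x_pos"].tolist())
```

Composition (holding a DataFrame) is used here instead of subclassing pandas.DataFrame, since it keeps the metadata separate from the tabular records.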

Do you guys have any comments on this?

@rabernat
Collaborator

This is a great idea.

I would start by reviewing the CF conventions on Trajectory Data. Having all the data conform to those would probably be a great start.

Tagging @selipot, who has been thinking about this for GDP data.

@selipot

selipot commented Jun 23, 2021

Thank you for tagging me here. I have been thinking about this, in the sense that I wrote and submitted a proposal to the NSF EarthCube program to do just that: define a common Lagrangian data structure for the GDP and others. I am hoping to hear back in the fall. You can see the extent of the metadata available for the GDP at its ERDDAP server.

@rabernat
Collaborator

rabernat commented Jun 23, 2021

Shane, do you think the CF trajectory data / metadata conventions are enough? Or is something more needed?

@selipot

selipot commented Jun 23, 2021

That is what (or close to what) is used right now by the GDP and returned by the OSM ERDDAP server. I am not using these files because I like to have markers for "data gaps", or interruption markers, in what are otherwise regular-interval time series.
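
To illustrate what a gap marker could look like, here is a minimal sketch, assuming a six-hourly sampling interval and made-up positions; it is one possible approach, not the format used by the GDP or by the files mentioned above. The track is reindexed onto the full regular time grid so that missing samples show up as explicit rows.

```python
import pandas as pd

# Six-hourly positions with two samples missing (values are illustrative).
times = pd.to_datetime(
    ["2021-06-23 00:00", "2021-06-23 06:00", "2021-06-23 18:00", "2021-06-24 06:00"]
)
track = pd.DataFrame({"x_pos": [120.0, 120.5, 121.8, 123.0]}, index=times)

# Reindex onto the full regular grid; missing samples become NaN rows.
step = pd.Timedelta(hours=6)
grid = pd.date_range(times[0], times[-1], freq=step)
regular = track.reindex(grid)

# Boolean gap marker: True where no observation exists.
regular["gap"] = regular["x_pos"].isna()
print(regular)
```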

@miniufo
Author

miniufo commented Jun 24, 2021

Glad you guys brought me this information. I hope @selipot gets the funding so that we can start the Python implementation.

I hadn't noticed the CF conventions, but they do address many of my concerns. I also have some experience using both GDP data and tropical cyclone data. I have tried to abstract the Lagrangian data model as Particle here, where you can also find its subclasses TC and Drifter (or profiling Float). For a set of Particles, I defined a ParticleSet that is analogous to xarray.Dataset.
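
For readers without access to the linked code, a collection of tracks might be sketched as follows. This is a hypothetical illustration and not the implementation linked above: the method names `add` and `to_frame` and the sample values are assumptions.

```python
import pandas as pd


class ParticleSet:
    """Hypothetical collection of tracks keyed by particle ID."""

    def __init__(self):
        self._tracks = {}

    def add(self, ID, records):
        # Store one track (a DataFrame of records) under its ID.
        self._tracks[ID] = records

    def __len__(self):
        # Number of particles in the set.
        return len(self._tracks)

    def __getitem__(self, ID):
        return self._tracks[ID]

    def to_frame(self):
        # Stack all tracks into one long table with the ID as a column.
        stacked = pd.concat(self._tracks, names=["ID", "obs"])
        return stacked.reset_index(level="ID").reset_index(drop=True)


ps = ParticleSet()
ps.add("A", pd.DataFrame({"x_pos": [1.0, 2.0], "y_pos": [0.0, 0.5]}))
ps.add("B", pd.DataFrame({"x_pos": [5.0], "y_pos": [3.0]}))

table = ps.to_frame()
print(table)
```

The long table produced by `to_frame` stores tracks of different lengths back to back, which avoids padding shorter tracks.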

Here is a schematic plot:
[image: myaa_temp_screendump]

I hope that I am on the right track and that all these concerns can be merged to shape the Lagrangian data model.

A further thought: one may want to analyze the 3D structure of a mesoscale eddy (or tropical cyclone) in a translating cylindrical coordinate system. I hope the Lagrangian model could simplify this kind of analysis. Specifically, given an eddy's information, I could get the quasi-Lagrangian view of its 3D structure.
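
The coordinate change behind this idea can be sketched for a single horizontal level. This is a minimal sketch assuming a local Cartesian plane (not longitude/latitude on the sphere); the function name and arguments are hypothetical. A point and its velocity are expressed relative to a center that moves with the eddy or cyclone.

```python
import math


def to_translating_cylindrical(x, y, u, v, xc, yc, uc, vc):
    """Convert a point and its velocity to a frame moving with the center.

    (x, y), (u, v): position and velocity in a fixed Cartesian frame.
    (xc, yc), (uc, vc): position and translation velocity of the center.
    Returns radius, azimuth (radians, counterclockwise from +x),
    radial velocity, and azimuthal velocity.
    """
    dx, dy = x - xc, y - yc
    r = math.hypot(dx, dy)
    theta = math.atan2(dy, dx)
    # Velocity relative to the moving center.
    ur, vr = u - uc, v - vc
    # Project the relative velocity onto the radial and azimuthal directions.
    v_rad = ur * math.cos(theta) + vr * math.sin(theta)
    v_azi = -ur * math.sin(theta) + vr * math.cos(theta)
    return r, theta, v_rad, v_azi


# Center at (10, 5) moving eastward at 1; a point 3 east and 4 north of it.
r, theta, v_rad, v_azi = to_translating_cylindrical(13, 9, 1, 2, 10, 5, 1, 0)
print(r, theta, v_rad, v_azi)
```

Applying the same conversion at every depth and time step, with the center taken from the eddy track, would give the quasi-Lagrangian view mentioned above.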

Not sure if this is the right place to discuss this. I hope to see a repo for it. Or maybe we could start a session in Pangeo so that I can stay updated regularly.

3 participants