Question about cache_age #48
Hi! Sorry for the slow response here. Glad the package is useful! You're correct: the default setup for argodata is to never use a cache that persists between R sessions. This is because caching Argo files is hard: the realtime files get replaced with delayed-mode files, so if you include any realtime files in your analysis they may not exist later, and you'll end up with code that won't run because some realtime files that the index pointed to no longer exist. Other types of files, such as Sprof or delayed-mode files, work really well for caching and age-based invalidation. If you're using these files, you can use
...OR...
...OR...
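As one illustration of the cache-plus-periodic-refresh workflow described above: for delayed-mode or Sprof files, a persistent cache might look like the sketch below. The directory path is an arbitrary example, and the cache-age arguments are the ones named in this thread; check `?argo_update_global` and `?argo_update_data` for the exact units and defaults.

```r
library(argodata)

# Point argodata at a cache directory that persists between R sessions
# (any writable path works; this one is just an example)
argo_set_cache_dir("~/argo-cache")

# Refresh the cached index files only when they exceed a given age
# (assumption: the argument name follows this thread, and the value
# 24 is only plausible if the units are hours -- check the docs)
argo_update_global(max_global_cache_age = 24)

# Refresh previously downloaded data files on a longer schedule, which
# is reasonable for delayed-mode/Sprof files that rarely change
argo_update_data(max_data_cache_age = 7 * 24)
```

Because these calls contact the Argo mirror, they need network access and are best run once at the top of an analysis session rather than inside tight loops.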
If none of those options work for you, let me know what you would like! I'd be happy to consider other workflows. I hope that helps!
Flagging @richardsc, who might have other ideas!
We're using the delayed-mode S-files for our analysis, so I think periodically updating the cache works best for us! Thank you again for your help!
I am facing problems with mirroring data from a server to my local repository.
Hi @mlarriere! First of all, I think that the behaviour you are seeing is in fact the expected behaviour: the package doesn't know explicitly which files have been added or changed since the last refresh, because it only has the information contained in the index file, so exceeding the cache age (or forcing a re-download) re-fetches files rather than syncing them incrementally. Part of the reason for this is that the full archive changes not only because new files are added, but also because older files are reprocessed, whether due to changes in data QC (e.g. real time to delayed mode), blacklisted floats, changes to metadata, etc. The "archive" obtained from the Argo DAC is not a static archive of files, but a dynamic archive that is continuously updated. If what you want is an up-to-date mirror of the entire Argo DAC, I recommend you use the synchronization service provided by IFREMER, e.g. see:
Another way to approach this would be to update your local archive by adding only the floats that have been added in, say, the last month, by subsetting the index in time before you download the files. This isn't really a complete update; as I described above, it will not update any files from outside that time window. It should keep all the older files that you'd already downloaded, though.
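That subset-then-download approach might look something like this sketch, assuming the dplyr-style workflow from the argodata README: `argo_global_synthetic_prof()` returns the Sprof index (which has a `date` column), and `argo_download()` fetches whatever is not already in the cache.

```r
library(argodata)
library(dplyr)

argo_set_cache_dir("~/argo-cache")  # example path

# Keep only index entries from roughly the last 30 days
recent <- argo_global_synthetic_prof() %>%
  filter(date >= Sys.time() - 30 * 24 * 60 * 60)

# Download just those files into the local cache; files that are
# already cached are reused rather than re-fetched
paths <- argo_download(recent)
```

As noted above, this keeps the local copy growing with new profiles, but it will not refresh older files that were reprocessed outside the chosen window.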
Hi @richardsc, thank you again for your help!
We have been using your package argodata in R to download data from BGC-Argo floats, which works great and has been extremely useful! However, we wanted to ask: what is the difference between specifying the argument max_global_cache_age = Inf (or max_data_cache_age = Inf) within the function argo_update_global() (or argo_update_data()) versus specifying it when defining our cache directory with argo_set_cache_dir()?
And did we understand correctly that the difference between the functions to update the cache is that argo_update_global() updates the index file while argo_update_data() updates the data files?
Thank you!
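For readers skimming this thread, the two update helpers contrasted in the question above can be sketched as follows. The argument names are taken from the question itself; see the package documentation for their precise semantics.

```r
library(argodata)

# Refresh the cached *index* files (the global index of profiles);
# with age = Inf, a cached index never counts as stale, so whatever
# is already on disk is reused without contacting the server
argo_update_global(max_global_cache_age = Inf)

# Refresh the cached *data* files (the per-profile NetCDF files),
# with the analogous age argument applied to data rather than indexes
argo_update_data(max_data_cache_age = Inf)
```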