Alert failure- invasive species unit list dr27353 #246

Closed
turley85 opened this issue Jul 19, 2024 · 10 comments
Labels: Biosecurity Alerts for biosecurity species priority-high


@turley85

turley85 commented Jul 19, 2024

I think the alert for the NSW Invasive Species unit (BioSecurity alert for NSW_NPWS_InvasiveSpeciesUnit_list) is failing. In testing, it's taking many minutes to load a preview, then showing zero detections:
image

But R alerts are detecting occurrences during this same time period.
image

As such, I suspect the alert itself is failing or timing out.

@turley85
Author

Note, the shapefile for this alert is quite complex: https://spatial.ala.org.au/?pid=9433211

@turley85 turley85 changed the title from "Alert failure- invasive species unit" to "Alert failure- invasive species unit list dr27353" Jul 19, 2024
@turley85 turley85 added priority-high Biosecurity Alerts for biosecurity species labels Jul 19, 2024
@turley85
Author

turley85 commented Jul 19, 2024

I think these two alerts are failing in the same way too:

BioSecurity alert for NSW_NPWS_WesternRivers_Weeds_list and

BioSecurity alert for NSW_NPWS_Western_Weeds_list

@kylie-m kylie-m assigned kylie-m and unassigned kylie-m Jul 19, 2024
@kylie-m kylie-m added this to the 4.2.1 milestone Jul 19, 2024
@kylie-m

kylie-m commented Jul 19, 2024

Thanks for flagging, Andrew. The team are investigating.

@kylie-m

kylie-m commented Jul 22, 2024

Hi @turley85, thanks for the screenshot from the R alert. Could you please supply a few occurrenceIds / links that you were expecting that did not appear?

@turley85
Author

> Hi @turley85 , thanks for the screenshot, from the R alert, could you please supply a few occurrenceIds / links that you were expecting that did not appear?

https://biocache.ala.org.au/occurrences/4a42e85a-117c-4734-a663-bb90164f2991
https://biocache.ala.org.au/occurrences/d161667c-c613-434b-aa53-272945e8fb65
https://biocache.ala.org.au/occurrences/d87ee725-d644-4baf-b755-a9d95a5a72b9

@adam-collins
Contributor

Using spatialObject requires more care than the default workflow. General users of the ALA do not need to be aware of this because their objects are simplified by default. A spatialObject is not simplified automatically, which gives advanced users more control.

  • Use the smallest possible object that satisfies the requirements.
  • Test the object against the web service, e.g. https://biocache.ala.org.au/ws/occurrences/search?q=spatialObject:9433211 returns an error message.
  • Test the object again in the biocache UI, e.g. https://biocache.ala.org.au/occurrences/search?q=spatialObject:9433211 also fails, but without the specific error message.

In this case, the tests above fail and one reports that the object is too large.
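
The web service test above could be scripted roughly as follows. This is a minimal sketch, not an ALA-provided tool: the helper names are hypothetical, and both the `pageSize=0` parameter and the `totalRecords` response field are assumptions about the biocache-service response that should be checked against its documentation.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Endpoint taken from the test link in this thread.
BIOCACHE_WS = "https://biocache.ala.org.au/ws/occurrences/search"


def build_search_url(pid, page_size=0):
    """Build the occurrence search URL for a spatial object pid.

    page_size=0 is an assumption: it asks for counts only, if supported.
    """
    query = urllib.parse.urlencode(
        {"q": f"spatialObject:{pid}", "pageSize": page_size}
    )
    return f"{BIOCACHE_WS}?{query}"


def check_spatial_object(pid, timeout=60):
    """Return (ok, detail) for a single search using the given pid."""
    url = build_search_url(pid)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = json.load(response)
    except urllib.error.HTTPError as exc:
        # The service answered, but with an error status.
        return False, f"HTTP {exc.code}: {exc.reason}"
    except (OSError, ValueError) as exc:
        # Network failure, timeout, or a body that is not valid JSON.
        return False, str(exc)
    if not isinstance(body, dict):
        return False, "unexpected response shape"
    # "totalRecords" is an assumed field name in the JSON response.
    return True, f"totalRecords={body.get('totalRecords')}"
```

Calling `check_spatial_object(9433211)` would exercise the same request as the first test link. A `False` result suggests the object needs simplifying before it is used in an alert.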

Analysis of all future shapefiles is required. For this example:

  • The shapefile uploaded has a very fine resolution.
  • The shapefile is constructed from a list of points, each with a buffer, which are then aggregated.
  • There was no survey producing this area definition, so this resolution is arbitrary.
  • There exists a margin of error on occurrence coordinates.
  • Simplification to a threshold of approximately 100m appears acceptable.
  • This reduces the size of the .shp file by about 100 times.
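
Because the tolerance is later entered in degrees, it helps to see roughly what a given value means on the ground. The sketch below is an approximation only: it assumes about 111,320 m per degree of latitude and scales the east-west distance by the cosine of the latitude. The function name and the example latitude of -33 are illustrative assumptions, not values from this issue.

```python
import math

# Approximate length of one degree of latitude in metres (assumption).
METRES_PER_DEGREE_LAT = 111_320.0


def tolerance_in_metres(tolerance_deg, latitude_deg=0.0):
    """Return (north_south_m, east_west_m) for a tolerance in degrees."""
    north_south = tolerance_deg * METRES_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ns, ew = tolerance_in_metres(0.001, latitude_deg=-33.0)
print(f"0.001 degrees is about {ns:.0f} m north-south and {ew:.0f} m east-west")
```

Under these assumptions a tolerance of 0.001 degrees is on the order of 100 m, in line with the threshold discussed above, though the east-west figure is smaller away from the equator.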

The method of simplification to use for all future shapefiles:

  • Understand how a shapefile was constructed (survey, arbitrary buffers, etc).
  • Determine what simplification tolerance is acceptable. This will vary: a buffer of 1m requires a tolerance below 1m, while a buffer of several km allows a much larger tolerance. A 1m buffer may be appropriate for stationary species located with commercial GIS devices within a surveyed area of fine resolution.
  • Install QGIS.
  • Open the shapefile in QGIS.
  • Simplify using the menu item Vector | Geometry Tools | Simplify, e.g. use a tolerance of 0.001 degrees (approximately 100m).
  • Inspect the simplified area in comparison to the original shapefile and adjust tolerance if required.
  • Export the shapefile: right-click the simplified layer name in the layer section, select Export | Save Features As... from the popup menu, enter the destination file name, and click OK.
  • Zip the exported shapefile. Include all files that were created by the export.
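
To illustrate what the tolerance controls in the steps above, here is a minimal pure-Python sketch of Douglas-Peucker simplification, one common distance-based method. It is an illustration under stated assumptions, not QGIS's implementation: the function names are hypothetical, and it works on a plain list of (x, y) points forming an open line rather than on a shapefile.

```python
import math


def _point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        dist = _point_segment_distance(points[i], first, last)
        if dist > max_dist:
            index, max_dist = i, dist
    if max_dist <= tolerance:
        # Every intermediate vertex is close enough to be discarded.
        return [first, last]
    # Keep the farthest vertex and simplify each side separately.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.0002), (2, 0), (3, 0.0003), (4, 0)]
print(len(simplify(line, 0.001)), len(simplify(line, 0.0001)))
```

With the larger tolerance the small wiggles collapse into a single segment, while the smaller tolerance keeps every vertex. This is why a coarser tolerance shrinks the file and a finer one preserves small shapes.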

Example use cases:

  • A very large number of surveyed property boundaries.
      • Start with generalisation on the order of metres, depending on the normal travel distance of the specific species and the estimated measurement uncertainty.
      • If still too large for biocache-service, split into 2 or more files. This will require multiple lines in the species list.
  • A very large number of islands surveyed.
      • Start with generalisation on the order of hundreds of metres, depending on the smallest island.
      • If still too large, simplify larger islands more than the smaller islands.
      • If still too large for biocache-service, split into 2 or more files. This will require multiple lines in the species list.
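
The "split into 2 or more files" step could be planned with a simple greedy grouping like the sketch below. It is purely illustrative: no size limit is given in this thread, so `max_vertices` is a hypothetical budget, and the polygon names and vertex counts are made-up inputs.

```python
def split_by_vertex_budget(polygons, max_vertices):
    """Group (name, vertex_count) pairs into batches under a vertex budget.

    A polygon larger than the budget still gets a batch of its own.
    """
    groups, current, current_total = [], [], 0
    for name, count in polygons:
        # Start a new batch when adding this polygon would exceed the budget.
        if current and current_total + count > max_vertices:
            groups.append(current)
            current, current_total = [], 0
        current.append(name)
        current_total += count
    if current:
        groups.append(current)
    return groups


# Hypothetical polygons and budget, for illustration only.
batches = split_by_vertex_budget(
    [("a", 400), ("b", 500), ("c", 300), ("d", 900)], max_vertices=1000
)
print(batches)
```

Each resulting batch would be exported and uploaded as its own shapefile, with one species list line per uploaded object.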

@kylie-m

kylie-m commented Aug 7, 2024

Hi @turley85 - I'm happy to run through this in QGIS with you, either in our catchup tomorrow or another time that suits, if you'd like

@turley85
Author

turley85 commented Aug 7, 2024

@kylie-m I think we need more than just a correction to this file. Could a list of shapefile requirements be produced and given to users, to ensure their files match our requirements/standards?

@kylie-m kylie-m self-assigned this Aug 9, 2024
@kylie-m

kylie-m commented Aug 9, 2024

Hi Andrew, I've been working through this one for you, and it looks like we have a promising solution.

Following Adam's guidance above, the team have downloaded the shapefiles, reduced the complexity of each in QGIS, and uploaded a new copy. I'll include my testing steps below too.

  1. Invasive species unit, for BioSecurity alert for NSW_NPWS_InvasiveSpeciesUnit_list
  • Original shapefile: pid=9433211
  • Updated shapefile, simplified with a tolerance of 0.001 degrees (approx 111m): pid=9439584
  • For your reference, here is the saved session in Spatial Portal where you can view both layers together and compare them: https://spatial.ala.org.au/?ss=1723187675952

Testing steps:

  2. Western rivers, for BioSecurity alert for NSW_NPWS_WesternRivers_Weeds_list and BioSecurity alert for NSW_NPWS_Western_Weeds_list
  • Original shapefile: pid=9433227
  • Updated shapefile, simplified with a tolerance of 0.0001 degrees (approx 11m); the finer tolerance was needed to get a better match for the smaller details/shapes of these polygons: pid=9439588
  • For your reference, here is the saved session in Spatial Portal where you can view both layers together and compare them: https://spatial.ala.org.au/?ss=1723189955525

Test links following steps above:

  3. I also did a quick test of an alert; spatialObject:9439584 seemed to work well for my "Rivers-test2" alert in production.

  4. Actions for you:

  • Following on from this, could you try out the new spatial objects in your lists/alerts and see if everything appears to be working at the end of the process?

Longer term, I agree some guidelines for users on what they're supplying would be helpful. I spoke with Adam about this, though we don't have a single file size limit we can give them. As a starting point, we could test new files using the biocache web service link and if that fails, we could ask for complexity to be reduced. We can also run through the QGIS process we have used if needed, though I expect their experts are probably already across it.

@kylie-m

kylie-m commented Sep 13, 2024

Hi Andrew, I've taken another look at the NSW NPWS Invasive Species Unit List shapefile, and it's looking like the shapefile optimisation process I've used has worked.

@nickdos nickdos closed this as completed Sep 17, 2024