Reviewer note: if multiple instances of our API app are deployed simultaneously, the shared in-memory dictionary created here will not be shared between them. In that case we need to consider using Redis or a similar distributed in-memory data store designed for this sort of thing.
This PR implements a BYO-Polygon ("bring your own polygon") method, along with zonal stats for the downscaled CMIP6 data. These projects were combined here to demonstrate how the BYO-Polygon could be achieved, and then test it without messing with any currently working area queries.
BYO-Polygon
How it works
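The pattern described below boils down to a dictionary guarded by a `threading.Lock`. Here is a minimal standalone sketch; in the PR these live on the Flask app object as `app.uploaded_polygons` and `app.store_lock` and routes reach them via `current_app`, and the token format here is hypothetical:

```python
import threading
import uuid

# In the PR these are attached to the Flask app (app.uploaded_polygons,
# app.store_lock) and accessed inside routes via flask.current_app.
uploaded_polygons = {}
store_lock = threading.Lock()

def store_polygon(polygon_data):
    """Store uploaded polygon info under a new unique token."""
    polygon_id = uuid.uuid4().hex[:8]  # hypothetical token format
    # The `with` block lets only one thread touch the dict at a time;
    # the lock is released as soon as the block exits.
    with store_lock:
        uploaded_polygons[polygon_id] = polygon_data
    return polygon_id

def fetch_polygon(polygon_id):
    """Read polygon info; any route can look tokens up the same way."""
    with store_lock:
        return uploaded_polygons.get(polygon_id)
```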
The changes in `application.py` and the new `upload_polygon.py` route have all the important parts - the rest of the changes are mostly helper functions. Basically what we are doing is defining an application-level shared dictionary (`app.uploaded_polygons`) and a threading lock (`app.store_lock`) to protect that dictionary from concurrent access issues. When we read or write the dictionary, we do it inside a `with` block that allows only one thread to access the dictionary at a time. This momentarily blocks concurrent access, but the lock is released immediately when exiting the `with` block. Note that we need to use `from flask import ... current_app` in every route where we want to access this dictionary, in order to share the in-memory store between routes.

The idea is to upload the polygon info into this shared dictionary, where all routes can use it. Eventually, the user will have some kind of form input that lets them browse to their file and upload it. There are a lot of ways this could be structured, but for now I focused on testing a general upload method via `curl` that provides the user with a token that can potentially be used in any area query, just like any other place ID. We will test this using `curl` commands to mimic the request that will someday come from the form input.

Create a test polygon
The easiest way to do this is by going to https://geojson.io and creating a polygon, then saving it as a shapefile. The file will automatically be zipped, and we will upload that zip file into our Flask app.
Use this tool to create the polygon
Then save it as a shapefile
Start the app and upload polygon
Start the API as usual, then in a separate terminal use the following command to upload your zipped shapefile. Optionally, you can name the polygon.
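The exact command is not shown on this page; a hypothetical `curl` upload might look like the following (the route path `/upload_polygon` and the form field names `file` and `name` are assumptions, not necessarily the PR's exact API):

```sh
# Upload the zipped shapefile as multipart form data.
# Route path and form field names are illustrative assumptions.
curl -X POST \
  -F "file=@test_polygon.zip" \
  -F "name=Chena Slough test polygon" \
  http://127.0.0.1:5000/upload_polygon
```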
This should return some JSON, where you get a unique polygon ID representing your polygon and some other info:
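The sample response is missing from this page. Roughly, the shape would be something like the following, using the polygon ID from the example query below; the field names are illustrative, and the "other info" mentioned above is not reproduced here:

```json
{
  "polygon_id": "63_S3ues",
  "name": "Chena Slough test polygon"
}
```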
Use the unique polygon ID to get zonal stats
You can now use your polygon to get zonal stats by just substituting it where you would normally place one of our place IDs. For example, this is the polygon shown in the screenshot above:
http://127.0.0.1:5000/cmip6_downscaled/area/63_S3ues?vars=tasmax&models=6ModelAvg
The HUC10 for Chena Slough should give pretty similar results:
http://127.0.0.1:5000/cmip6_downscaled/area/190803060904?vars=tasmax&models=6ModelAvg
Expiration of the polygon is not handled yet, but we probably want to find some method of removing unused polygon info from the shared dictionary after a certain amount of time. It would be nice to persist the polygon info for a while, so the user can make multiple data requests from different endpoints without having to upload the polygon multiple times.
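One possible approach to expiration (a sketch of a future direction, not something this PR implements, and all names here are hypothetical): store an upload timestamp alongside each polygon and purge stale entries under the same lock whenever the store is touched.

```python
import threading
import time

POLYGON_TTL_SECONDS = 3600  # hypothetical: expire polygons after one hour

uploaded_polygons = {}       # polygon_id -> {"data": ..., "uploaded_at": ...}
store_lock = threading.Lock()

def put_polygon(polygon_id, data, now=None):
    """Store polygon data along with an upload timestamp."""
    now = time.time() if now is None else now
    with store_lock:
        uploaded_polygons[polygon_id] = {"data": data, "uploaded_at": now}

def get_polygon(polygon_id, now=None):
    """Return polygon data, purging any expired entries first."""
    now = time.time() if now is None else now
    with store_lock:
        # Drop every entry older than the TTL, not just the one requested.
        expired = [
            pid for pid, entry in uploaded_polygons.items()
            if now - entry["uploaded_at"] > POLYGON_TTL_SECONDS
        ]
        for pid in expired:
            del uploaded_polygons[pid]
        entry = uploaded_polygons.get(polygon_id)
        return entry["data"] if entry else None
```

This keeps polygons around for multiple requests across endpoints, while bounding how long unused entries linger.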
Zonal stats
The zonal stats implementation here is nothing fancy - it uses the vectorized method that is currently used by the ERA5 WRF and the CMIP6 FWI endpoints. There is probably some redundant code here between the point and area queries, but in the interest of testing the BYOP method, I haven't tried to trim this down yet.
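For reference, the vectorized idea is to select all pixels inside the polygon at once and aggregate with array operations, rather than looping over pixels. A generic NumPy sketch (not the PR's actual implementation; in practice the boolean mask would come from rasterizing the polygon onto the data grid):

```python
import numpy as np

def zonal_stats(data, mask):
    """Compute simple zonal statistics for pixels inside a polygon.

    data: 2D array of gridded values (e.g. tasmax for one model/year)
    mask: boolean 2D array, True where a pixel falls inside the polygon
          (in practice produced by rasterizing the polygon onto the grid)
    """
    values = data[mask]                 # vectorized selection, no pixel loop
    values = values[~np.isnan(values)]  # ignore nodata pixels
    return {
        "mean": float(values.mean()),
        "min": float(values.min()),
        "max": float(values.max()),
    }
```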
Note that CSV output is not yet implemented - we need to find a way to sneak the user-provided polygon name into the title of the CSV output. Right now, dropping this custom name into the `place_id` parameter of the CSV function will cause it to fail.