Add generic SQL queries to allow automatic update #606
Currently we need to run the query manually or ask admins to do it (we have the contacts). I have been trying to set this up automatically for a while, see https://github.com/usegalaxy-eu/issues/issues/510. With this query we will try to make it work on EU first (run the query monthly and export to Grafana, which gives us an API endpoint to pull from periodically with CoDex) and then convince admins of other servers to implement the same logic.
Ok, I don't have access to the usegalaxy-eu issues (they are not public), so it was harder to get the context. So the current process is to reach out to an admin, give them the SQL queries, receive the files, and drop them in the codex?
Sorry, I missed that you do not have access to our issues.
| @@ -0,0 +1,87 @@ | |||
| # SQL command to get those stats | |||
| Needs to be run via the Galaxy Admin Stats Account | |||
| Needs to be run via the Galaxy Admin Stats Account | |
| Needs to be run via the Galaxy Admin Stats Account | |
| For people who do not have access to the Galaxy Admin Stats Account: | |
| 1. Reach out to an admin | |
| 2. Provide them with the SQL queries (or a link to this page) | |
| Once the files have been generated: | |
| 1. Create a new folder in the [folder sources/data/usage_stats of the galaxy_codex repository](https://github.com/galaxyproject/galaxy_codex/tree/main/sources/data/usage_stats) | |
| 2. Name the folder 'usage_stats_YYYY.MM.DD' | |
| 3. Within this folder, create a subfolder indicating the instance where the data are coming from (eu, fr, org, or org.au) | |
| 4. Drop the csv file(s) in the appropriate folder |
Great! I've tried to include a short protocol to explain that to people coming to this page (until we get an automated process) |
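The folder layout from the protocol above can be sketched in a few lines of Python. This is only an illustration: `usage_stats_dir` and `KNOWN_INSTANCES` are hypothetical names, not part of galaxy_codex, and the sketch assumes the date in the folder name is the day the files were generated.

```python
from datetime import date
from pathlib import Path

# Instance labels listed in the protocol (eu, fr, org, or org.au).
KNOWN_INSTANCES = {"eu", "fr", "org", "org.au"}


def usage_stats_dir(root: Path, instance: str, day: date) -> Path:
    """Return <root>/usage_stats_YYYY.MM.DD/<instance> (hypothetical helper)."""
    if instance not in KNOWN_INSTANCES:
        raise ValueError(f"unknown instance: {instance!r}")
    # date objects accept strftime codes in format specs.
    return root / f"usage_stats_{day:%Y.%m.%d}" / instance


if __name__ == "__main__":
    # Assumed root: the usage_stats folder inside a galaxy_codex checkout.
    target = usage_stats_dir(Path("sources/data/usage_stats"), "eu", date.today())
    print(target)
```

Calling `target.mkdir(parents=True, exist_ok=True)` on the result would create both the dated folder and the instance subfolder before the csv file(s) are copied in.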
| date_trunc('month', CURRENT_DATE) AS snapshot_date | ||
| FROM job j | ||
| WHERE j.create_time BETWEEN (date_trunc('month', CURRENT_DATE) - INTERVAL '5 years') | ||
| AND date_trunc('month', CURRENT_DATE) |
| AND date_trunc('month', CURRENT_DATE) | |
| AND date_trunc('month', CURRENT_DATE) AND j.state != 'deleted' |
| date_trunc('month', CURRENT_DATE) AS snapshot_date | ||
| FROM job j | ||
| WHERE j.create_time <= date_trunc('month', CURRENT_DATE) | ||
| GROUP BY tool_name |
| GROUP BY tool_name | |
| AND j.state != 'deleted' | |
| GROUP BY tool_name |
| user_id | ||
| FROM job | ||
| WHERE create_time BETWEEN (date_trunc('month', CURRENT_DATE) - INTERVAL '5 years') | ||
| AND date_trunc('month', CURRENT_DATE) |
| AND date_trunc('month', CURRENT_DATE) | |
| AND date_trunc('month', CURRENT_DATE) AND state != 'deleted' |
| user_id | ||
| FROM job | ||
| WHERE create_time <= date_trunc('month', CURRENT_DATE) | ||
| GROUP BY tool_name, user_id |
| GROUP BY tool_name, user_id | |
| AND state != 'deleted' | |
| GROUP BY tool_name, user_id |
Since users ask for this more and more, we should try again to automate it: https://github.com/usegalaxy-eu/issues/issues/510
These queries should be generic, so that admins do not need to modify them and can just run them monthly or every 3 months or so. With this change, each query always collects data up to the first day of the month in which it is run, either for the preceding 5 years or for all time.
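To make the 5-year window concrete, here is a minimal Python sketch of the bounds that `date_trunc('month', CURRENT_DATE)` and `date_trunc('month', CURRENT_DATE) - INTERVAL '5 years'` evaluate to. The function names are hypothetical and exist only for illustration; the real computation happens in the SQL.

```python
from datetime import date
from typing import Tuple


def month_start(today: date) -> date:
    """Mirror date_trunc('month', CURRENT_DATE): first day of today's month."""
    return today.replace(day=1)


def five_year_window(today: date) -> Tuple[date, date]:
    """Return (month start minus 5 years, month start), as in the BETWEEN clause."""
    end = month_start(today)
    # Shifting the year directly is safe: the day is always 1, so Feb 29 never occurs.
    start = end.replace(year=end.year - 5)
    return start, end


if __name__ == "__main__":
    start, end = five_year_window(date.today())
    print(f"jobs created between {start} and {end}")
```

Because the bounds depend only on the month in which the query runs, running it on any day of the same month gives the same window, which is what lets admins schedule it without editing dates.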