feat: add wallet_details.last_screened_at#231
Conversation
In preparation to periodically screen all wallets for sanctions, I am enhancing the current wallet management flow to record the time when we last performed sanction screening for each address. This will enable the on-schedule screening worker to pick wallet address based on how long they were not screened. Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
|
@juliangruber My work on periodical screening creates naming & DB schema conflicts with your PR #224. I decided to start with this small pull request to allow us to resolve the conflicts quickly, so that I can continue with the bigger chunk of my work in a way that will not create any more conflicts (hopefully 🤞🏻). |
Thank you! Since #224 is going to be a long living PR, I plan to update it whenever there is a new commit landing on #main |
| @@ -0,0 +1,4 @@ | |||
| ALTER TABLE wallet_details | |||
| ADD COLUMN screened_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT 0; | |||
There was a problem hiding this comment.
| ADD COLUMN screened_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT 0; | |
| ADD COLUMN last_screened_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT 0; |
more explicit
There was a problem hiding this comment.
| ADD COLUMN screened_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT 0; | |
| ADD COLUMN last_screened_at TIMESTAMP WITH TIME ZONE; |
And instead of 0 special value for never screened, let's use NULL instead?
There was a problem hiding this comment.
I don't have strong on naming but +1 to making it nullable.
There was a problem hiding this comment.
I did consider last_screened_at, I am fine to use it.👍🏻
And instead of 0 special value for never screened, let's use NULL instead?
Do you know how the NULL values are sorted? I want to periodically run two queries:
- "Select 10 addresses that were least recently screened", so that the worker can screen them again
- "Select the oldest
last_screened_atvalue", I'll visualise it in Grafana & create an alert to let us know if we are not able to screen all addresses frequently enough.
According to ChatGPT, NULL is treated as the smallest value, so I guess we can use it as the default value. If we don't mind requiring every developer working with this table to understand the ordering of the NULL value in SQL. 🤓 🤔
SELECT day FROM demo ORDER BY day DESC;
2025-08-01
2025-07-01
NULLSELECT day FROM demo ORDER BY day ASC;
NULL
2025-07-01
2025-08-01
Thoughts?
There was a problem hiding this comment.
I added two commits to address both proposals:
LGTY now?
There was a problem hiding this comment.
I did consider
last_screened_at, I am fine to use it.👍🏻And instead of 0 special value for never screened, let's use NULL instead?
Do you know how the NULL values are sorted? I want to periodically run two queries:
- "Select 10 addresses that were least recently screened", so that the worker can screen them again
- "Select the oldest
last_screened_atvalue", I'll visualise it in Grafana & create an alert to let us know if we are not able to screen all addresses frequently enough.According to ChatGPT, NULL is treated as the smallest value, so I guess we can use it as the default value. If we don't mind requiring every developer working with this table to understand the ordering of the NULL value in SQL. 🤓 🤔
SELECT day FROM demo ORDER BY day DESC;
2025-08-01
2025-07-01
NULL
SELECT day FROM demo ORDER BY day ASC;
NULL
2025-07-01
2025-08-01Thoughts?
Would you consider filtering the values where last_screened_at is not NULL for these queries?
Maybe we could have add created_at column so that we could track duration between wallet creation and first screening?
There was a problem hiding this comment.
Would you consider filtering the values where
last_screened_atis notNULLfor these queries?
Isn't that making the queries more complex? At that point, I find it simpler to set NOT NULL DEFAULT 0.
In the queries:
- When picking wallets to screen, I want to list "not screened at all" among the first candidates, so I don't want to filter out NULL values.
- Now that I am thinking about monitoring how often we screen addresses, I think I will want two numbers: how many addresses were not screened at all (
last_screened_at_is NULL or 0, depending on the DB schema), and what's the oldest screening result in our DB (the oldestlast_screened_atvalue that's not NULL or 0).
Maybe we could have add
created_atcolumn so that we could track duration between wallet creation and first screening?
I don't see a need for such a metric right now, TBH.
Also, for withCDN datasets, we screen the address before adding it to our database.
There was a problem hiding this comment.
Isn't that making the queries more complex? At that point, I find it simpler to set NOT NULL DEFAULT 0.
In the queries:
When picking wallets to screen, I want to list "not screened at all" among the first candidates, so I don't want to filter out NULL values.
Now that I am thinking about monitoring how often we screen addresses, I think I will want two numbers: how many addresses were not screened at all (last_screened_at_ is NULL or 0, depending on the DB schema), and what's the oldest screening result in our DB (the oldest last_screened_at value that's not NULL or 0).
Gotcha! I guess either NULL or 0 will work but I'm not sure which is more performant. I'd assume in the background they're handled similarly by the db engine. Maybe describing the query is worth the shot before deciding between the two.
I don't see a need for such a metric right now, TBH.
Also, for withCDN datasets, we screen the address before adding it to our database.
Fair point. Since we scan every address before inserting it to the table the default column value does not matter as much.
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
wallet_details.screened_atwallet_details.last_screened_at
In preparation to periodically screen all wallets for sanctions, I am enhancing the current wallet management flow to record the time when we last performed sanction screening for each address. This will enable the on-schedule screening worker to pick wallet address based on how long they were not screened.
Links: