Skip to content

feat: add wallet_details.last_screened_at#231

Merged
juliangruber merged 4 commits intomainfrom
periodic-screening
Aug 26, 2025
Merged

feat: add wallet_details.last_screened_at#231
juliangruber merged 4 commits intomainfrom
periodic-screening

Conversation

@bajtos
Copy link
Contributor

@bajtos bajtos commented Aug 22, 2025

In preparation to periodically screen all wallets for sanctions, I am enhancing the current wallet management flow to record the time when we last performed sanction screening for each address. This will enable the on-schedule screening worker to pick wallet address based on how long they were not screened.

Links:

In preparation to periodically screen all wallets for sanctions,
I am enhancing the current wallet management flow to record the time
when we last performed sanction screening for each address. This will
enable the on-schedule screening worker to pick wallet address based on
how long they were not screened.

Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
@bajtos
Copy link
Contributor Author

bajtos commented Aug 22, 2025

@juliangruber My work on periodical screening creates naming & DB schema conflicts with your PR #224. I decided to start with this small pull request to allow us to resolve the conflicts quickly, so that I can continue with the bigger chunk of my work in a way that will not create any more conflicts (hopefully 🤞🏻).

@juliangruber
Copy link
Member

@juliangruber My work on periodical screening creates naming & DB schema conflicts with your PR #224. I decided to start with this small pull request to allow us to resolve the conflicts quickly, so that I can continue with the bigger chunk of my work in a way that will not create any more conflicts (hopefully 🤞🏻).

Thank you! Since #224 is going to be a long living PR, I plan to update it whenever there is a new commit landing on #main

@@ -0,0 +1,4 @@
ALTER TABLE wallet_details
ADD COLUMN screened_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ADD COLUMN screened_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT 0;
ADD COLUMN last_screened_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT 0;

more explicit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ADD COLUMN screened_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT 0;
ADD COLUMN last_screened_at TIMESTAMP WITH TIME ZONE;

And instead of 0 special value for never screened, let's use NULL instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have strong on naming but +1 to making it nullable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did consider last_screened_at, I am fine to use it.👍🏻

And instead of 0 special value for never screened, let's use NULL instead?

Do you know how the NULL values are sorted? I want to periodically run two queries:

  • "Select 10 addresses that were least recently screened", so that the worker can screen them again
  • "Select the oldest last_screened_at value", I'll visualise it in Grafana & create an alert to let us know if we are not able to screen all addresses frequently enough.

According to ChatGPT, NULL is treated as the smallest value, so I guess we can use it as the default value. If we don't mind requiring every developer working with this table to understand the ordering of the NULL value in SQL. 🤓 🤔

SELECT day FROM demo ORDER BY day DESC;
2025-08-01
2025-07-01
NULL

SELECT day FROM demo ORDER BY day ASC;
NULL
2025-07-01
2025-08-01

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added two commits to address both proposals:

LGTY now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did consider last_screened_at, I am fine to use it.👍🏻

And instead of 0 special value for never screened, let's use NULL instead?

Do you know how the NULL values are sorted? I want to periodically run two queries:

  • "Select 10 addresses that were least recently screened", so that the worker can screen them again
  • "Select the oldest last_screened_at value", I'll visualise it in Grafana & create an alert to let us know if we are not able to screen all addresses frequently enough.

According to ChatGPT, NULL is treated as the smallest value, so I guess we can use it as the default value. If we don't mind requiring every developer working with this table to understand the ordering of the NULL value in SQL. 🤓 🤔

SELECT day FROM demo ORDER BY day DESC;
2025-08-01
2025-07-01
NULL
SELECT day FROM demo ORDER BY day ASC;
NULL
2025-07-01
2025-08-01

Thoughts?

Would you consider filtering the values where last_screened_at is not NULL for these queries?

Maybe we could have add created_at column so that we could track duration between wallet creation and first screening?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you consider filtering the values where last_screened_at is not NULL for these queries?

Isn't that making the queries more complex? At that point, I find it simpler to set NOT NULL DEFAULT 0.

In the queries:

  • When picking wallets to screen, I want to list "not screened at all" among the first candidates, so I don't want to filter out NULL values.
  • Now that I am thinking about monitoring how often we screen addresses, I think I will want two numbers: how many addresses were not screened at all (last_screened_at_ is NULL or 0, depending on the DB schema), and what's the oldest screening result in our DB (the oldest last_screened_at value that's not NULL or 0).

Maybe we could have add created_at column so that we could track duration between wallet creation and first screening?

I don't see a need for such a metric right now, TBH.

Also, for withCDN datasets, we screen the address before adding it to our database.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that making the queries more complex? At that point, I find it simpler to set NOT NULL DEFAULT 0.

In the queries:

When picking wallets to screen, I want to list "not screened at all" among the first candidates, so I don't want to filter out NULL values.
Now that I am thinking about monitoring how often we screen addresses, I think I will want two numbers: how many addresses were not screened at all (last_screened_at_ is NULL or 0, depending on the DB schema), and what's the oldest screening result in our DB (the oldest last_screened_at value that's not NULL or 0).

Gotcha! I guess either NULL or 0 will work but I'm not sure which is more performant. I'd assume in the background they're handled similarly by the db engine. Maybe describing the query is worth the shot before deciding between the two.

I don't see a need for such a metric right now, TBH.

Also, for withCDN datasets, we screen the address before adding it to our database.

Fair point. Since we scan every address before inserting it to the table the default column value does not matter as much.

bajtos added 2 commits August 25, 2025 10:38
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
@bajtos bajtos requested a review from juliangruber August 25, 2025 08:39
@bajtos bajtos changed the title feat: add wallet_details.screened_at feat: add wallet_details.last_screened_at Aug 25, 2025
@juliangruber juliangruber merged commit 5337714 into main Aug 26, 2025
6 checks passed
@juliangruber juliangruber deleted the periodic-screening branch August 26, 2025 07:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants