use -p threads flag by jurraca · Pull Request #97 · asmap/kartograf

jurraca · 2025-10-21T16:25:09Z

Use the -p threads flag from rpki-client 9.6.

This is a draft of a PR that is a breaking change, since previous versions of rpki-client do not have this flag, and this should not be merged until we have a strategy for that.
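One possible strategy for the compatibility question is to gate the flag on the detected rpki-client version. This is a minimal sketch under assumptions, not what the PR implements: the helper name `supports_threads_flag` is hypothetical, and the shape of the version string (a dotted "major.minor" number somewhere in the text) is assumed rather than verified. Only the 9.6 threshold comes from the description above.

```python
import re

# First rpki-client release with the -p flag, per the PR description.
MIN_THREADS_VERSION = (9, 6)


def supports_threads_flag(version_output):
    """Return True if the version string reports rpki-client >= 9.6.

    Hypothetical helper: assumes the string contains a dotted
    "major.minor" number, e.g. "rpki-client 9.6".
    """
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if match is None:
        # Unknown version: be conservative and use the old code path.
        return False
    version = (int(match.group(1)), int(match.group(2)))
    return version >= MIN_THREADS_VERSION
```

Comparing tuples rather than floats keeps a version such as 9.10 ordered after 9.6.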

The changes simplify the code, but we lose the valid_until and valid_since information that was previously used to select the most likely useful announcement. Thus our "smallest ASN wins" heuristic is doing more work than previously.
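To illustrate what the "smallest ASN wins" tie-break amounts to once the validity timestamps are gone, here is a minimal sketch. The function name `lowest_asn_wins` and the `(prefix, asn)` input shape are assumptions made for illustration and are not taken from kartograf's code. The prefix and ASNs are documentation-reserved values.

```python
def lowest_asn_wins(roas):
    """Collapse (prefix, asn) pairs to one ASN per prefix, keeping the lowest.

    Illustrative only: with no valid_since/valid_until data available,
    the numeric ASN is the sole tie-breaker between conflicting ROAs.
    """
    best = {}
    for prefix, asn in roas:
        if prefix not in best or asn < best[prefix]:
            best[prefix] = asn
    return best


# Two valid ROAs cover the same prefix; the lower ASN is kept.
example = [("192.0.2.0/24", 64500), ("192.0.2.0/24", 64496)]
result = lowest_asn_wins(example)
```

The tie-break is deterministic but arbitrary, which is why relying on it for more prefixes changes the output without necessarily improving it.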

On average I get a runtime for parsing of about 45 seconds compared to 3-4 minutes previously.

@fjahr

jurraca · 2025-11-06T19:39:57Z

kartograf/rpki/fetch.py

                                 "-n",
                                 "-d",
                                 context.data_dir_rpki_cache,
+                                 "-p 16",


should derive thread count from OS

You could look at how core selects the number of script check worker threads: https://github.com/bitcoin/bitcoin/blob/master/src/node/chainstatemanager_args.cpp#L55

jurraca · 2025-11-06T19:40:18Z

kartograf/rpki/parse.py

-                continue
-
-            # We are only interested in ROAs
-            if roa['type'] != "roa":


they're all ROAs, iiuc

jurraca · 2025-11-06T19:40:30Z

kartograf/rpki/parse.py

                continue

-            # We are only interested in valid ROAs
-            if roa['validation'] != "OK":


only valid ROAs are output

fjahr · 2025-11-21T23:18:02Z

kartograf/rpki/fetch.py

    tal_options = [item for path in data_tals(context) for item in ('-t', path)]

+    cpu_count = os.cpu_count()
+    threads = cpu_count if cpu_count else 4


Since #98 is merged now, please put this threads count stuff into some util function that is shared between the two different places.
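A shared helper along these lines might look like the sketch below. The name `get_threads` matches a later commit title in this PR ("use new get_threads in merge.py"), but the body is an assumption based on the two lines quoted above. `threads_args` is a hypothetical helper, and passing the flag and its value as separate argv elements is likewise an assumption.

```python
import os


def get_threads(default=4):
    """Return the host CPU count, or `default` if it cannot be determined."""
    cpu_count = os.cpu_count()  # may return None on some platforms
    return cpu_count if cpu_count else default


def threads_args(threads):
    """Build the argv fragment for the -p flag (hypothetical helper)."""
    return ["-p", str(threads)]


# Example: extend an existing argument list with the thread option.
args = ["-n"] + threads_args(get_threads())
```

Keeping the fallback value in one place means both callers get the same default without repeating the `os.cpu_count()` check.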

Removes test clauses which are no longer relevant, and test cases which no longer apply.

jurraca · 2025-12-02T16:19:38Z

Looking at the CI reproduction failure, this is what I was somewhat concerned about when using expiry instead of the valid_since and valid_until checks.
Running the CI input data on this branch and on master produces a diff of 22185, of which 6 are definitely not due to the "lower ASN wins" heuristic. The rest were assigned differently on master because of the valid_until or valid_since heuristic, or the "lower ASN wins" heuristic.

(the six outliers, here the "older" file is the master branch file)

27.124.87.0/24 AS41239 # was AS9341
27.124.88.0/24 AS41239 # was AS9341
103.162.141.0/24 AS141668 # was AS141083
103.166.234.0/24 AS141963 # was AS140407
119.2.64.0/19 AS131704 # was AS17450
202.47.66.0/24 AS150228 # was AS138106
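A comparison like this can be reproduced with a small script. The sketch below assumes each output file holds one `prefix ASN` pair per line, with optional `#` comments. That format and the helper names `load_map` and `changed_assignments` are assumptions, not kartograf's actual tooling. The sample values are taken from the first two outliers above.

```python
def load_map(lines):
    """Parse 'prefix ASN' lines into a dict, skipping blanks and '#' comments."""
    mapping = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        prefix, asn = line.split()
        mapping[prefix] = asn
    return mapping


def changed_assignments(old, new):
    """Return {prefix: (old_asn, new_asn)} for prefixes whose ASN differs."""
    shared = old.keys() & new.keys()
    return {p: (old[p], new[p]) for p in shared if old[p] != new[p]}


master = load_map(["27.124.87.0/24 AS9341", "27.124.88.0/24 AS9341"])
branch = load_map(["27.124.87.0/24 AS41239", "27.124.88.0/24 AS41239"])
changes = changed_assignments(master, branch)
```

This only reports prefixes present in both files; prefixes added or dropped between runs would need a separate check.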

All the ROAs are valid and so it's not that we're issuing bad ROAs, but this big of a change between two runs is difficult to justify, especially since our current heuristics are more precise than the ones in this branch.

fjahr · 2025-12-03T14:48:20Z

Yeah, good point @jurraca. I've been thinking about this again, and when I look at the current run times on my machine, it's already very fast, usually just over 10min. Squeezing out another 1-2min would still be great, but I agree with you: I am not sure it is worth the hassle with the versioning and the (perceived) lower quality of the data due to the missing information. I would suggest we put this on pause for the moment and get back to it when either we have another, even better reason to introduce a breaking change or we find a solution that gives us a more consistent result (however that might happen). I am interested in going deeper into the internals of rpki-client to understand what is happening there, but I won't be getting to it before January.

Does that sound good to you? I am also not so keen on going through the breaking change at the same time as the chances of getting the embedding done are increasing. Any additional confusion about the data and our process might be detrimental if we continue to make progress in the same way we did these last few weeks. But if we have this progress we might get the embedding soon and then we can revisit this when the process is more established and more people can weigh in on the trade-offs here...

jurraca · 2025-12-03T16:04:09Z

Agreed on all points. Would rather focus on stability now, and keep this branch parked until we're ready to revisit.

jurraca force-pushed the threads-flag branch 2 times, most recently from 2e82c92 to 9d8ca64 on October 21, 2025 16:35

jurraca mentioned this pull request Nov 6, 2025

rpki-client 9.6 backwards compatibility #103

Open

jurraca commented Nov 6, 2025

fjahr reviewed Nov 21, 2025

jurraca force-pushed the threads-flag branch 2 times, most recently from 4d6548e to baa764b on November 23, 2025 14:39

jurraca marked this pull request as ready for review December 2, 2025 13:25

jurraca added 4 commits December 2, 2025 14:26

use -p flag, remove batching

f502efc

fix tests

c5be453

Removes test clauses which are no longer relevant, and test cases which no longer apply.

get cpu_count from host

dc403a6

use new get_threads in merge.py

fbeb1ee

jurraca force-pushed the threads-flag branch from baa764b to fbeb1ee on December 2, 2025 13:26

jurraca marked this pull request as draft December 2, 2025 16:19

jurraca wants to merge 4 commits into asmap:master from jurraca:threads-flag