|
ACK 4b79e70. Impl makes sense, worth testing in a coordinated run with a few people. I don't think we should leave it to the user to set a timeframe. The timespan won't affect the ability to all get the same hash, so let's try to settle on one that works. 10 mins feels right but let's test. I see some "Error: max connections" in my debug.log but I'm not sure which hosts they were for, or whether those hosts were successfully connected to later. run logs: --- Start Kartograf --- Kartograf version: 0.4.9 --- Fetching RPKI --- Downloaded TAL for AFRINIC to /home/base/code/asmap/kartograf/data/1745504214/rpki/tals/afrinic.tal, file hash: 2838ef30ea27ce5705abf5f5adb131d8c35b1f50858338a2f3c84bb207c2fa35 --- Validating RPKI --- Validating RPKI ROAs --- Parsing RPKI --- Parsing 299929 ROAs --- Sorting results --- ...finished in 0:00:08.463013 --- Finishing Kartograf --- The SHA-256 hash of the result file is: 925d24b58fe078bf30cd68d900b0efdda67f33b00e363bb3e365c0dd438c409c |
|
also would be nice to update the context test with these changes. |
8676065 to 59f9355
Done, also dealt with the remaining warmup naming. |
|
I'm getting a "Connection refused" error when testing this PR: --- Start Kartograf ---
Kartograf version: 0.4.9
Using rpki-client version 9.5 (recommended).
The epoch for this run is: 1745934109 (2025-04-29 13:41:49 UTC, local: 2025-04-29 10:41:49 -03)
--- Fetching RPKI ---
Downloaded TAL for AFRINIC to /Users/brunogarcia/projects/kartograf/data/1745934109/rpki/tals/afrinic.tal, file hash: 2838ef30ea27ce5705abf5f5adb131d8c35b1f50858338a2f3c84bb207c2fa35
Downloaded TAL for APNIC to /Users/brunogarcia/projects/kartograf/data/1745934109/rpki/tals/apnic.tal, file hash: 472e551f7c551c2e999e582b7c9437d3bee4900fe53afff62aeb28d4940ade94
Downloaded TAL for ARIN to /Users/brunogarcia/projects/kartograf/data/1745934109/rpki/tals/arin.tal, file hash: 1f8bdb03bcc30a3b8e11fd9a87102fba250c22137a3c8baa9c81b139cb412639
Downloaded TAL for LACNIC to /Users/brunogarcia/projects/kartograf/data/1745934109/rpki/tals/lacnic.tal, file hash: d44bb9394ab009c8b53e5efebf2a1c9450bab61a27efe00de5a3e4587a3a2f6a
Downloaded TAL for RIPE to /Users/brunogarcia/projects/kartograf/data/1745934109/rpki/tals/ripe.tal, file hash: 59ca27ef93f23682749fcefe7c6d70fbc723343549ff9e4d3996acaff79817fb
Downloading RPKI Data, this may take a while.
RPKI sync #1
...took 594 seconds
Downloaded RPKI Data, hash sum: 8e740838bac13251321415b52c5b7c548503c4e5e68161e561b584db0e48695e
...finished in 0:10:56.212893
--- Fetching IRR ---
Downloading afrinic.db.gz
Downloaded afrinic.db.gz, file hash: 3fc8e0920190e7b6aac236c04227ed5b733ec33055f2c1ca34c8d7c020af2aa4
Downloading apnic.db.route.gz
Downloaded apnic.db.route.gz, file hash: 389086ef85b708e4e3ae954a7e1bbf8bfbd14c377ae4c532090a2c1db9ef2382
Downloading apnic.db.route6.gz
Downloaded apnic.db.route6.gz, file hash: e6c048303a472348abe9ac8edcec2de72c9f1510247bdda4c474aaaaaba7f918
Downloading arin.db.gz
Traceback (most recent call last):
File "/Users/brunogarcia/projects/kartograf/./run", line 11, in <module>
main()
File "/Users/brunogarcia/projects/kartograf/kartograf/cli.py", line 104, in main
Kartograf.map(args)
File "/Users/brunogarcia/projects/kartograf/kartograf/kartograf.py", line 65, in map
fetch_irr(context)
File "/Users/brunogarcia/projects/kartograf/kartograf/timed.py", line 10, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/brunogarcia/projects/kartograf/kartograf/irr/fetch.py", line 41, in fetch_irr
with FTP(host) as ftp:
^^^^^^^^^
File "/Users/brunogarcia/.pyenv/versions/3.11.11/lib/python3.11/ftplib.py", line 121, in __init__
self.connect(host)
File "/Users/brunogarcia/.pyenv/versions/3.11.11/lib/python3.11/ftplib.py", line 158, in connect
self.sock = socket.create_connection((self.host, self.port), self.timeout,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/brunogarcia/.pyenv/versions/3.11.11/lib/python3.11/socket.py", line 863, in create_connection
raise exceptions[0]
File "/Users/brunogarcia/.pyenv/versions/3.11.11/lib/python3.11/socket.py", line 848, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused |
|
That's a new one, we should at least handle that correctly. Did it happen consistently over a few tries? |
Yes. I just tried twice on this PR and it happened on both runs, then I tried again on master and it happened there too :( |
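One way the refused connection could be handled instead of crashing with a traceback is sketched below. This is only an illustration, not the fix that went into kartograf: the helper name `connect_ftp`, the retry count, and the delay are all assumptions, and the factory and sleep parameters exist only so the behavior can be exercised without a network.

```python
import time
from ftplib import FTP, all_errors


def connect_ftp(host, retries=3, delay=5.0, timeout=30,
                ftp_factory=FTP, sleep=time.sleep):
    """Try to open an FTP connection a few times.

    Returns the connection object, or None if every attempt failed.
    Hypothetical helper; kartograf's real fetch code may look different.
    """
    for attempt in range(1, retries + 1):
        try:
            # FTP(host, timeout=...) connects immediately when a host is given.
            return ftp_factory(host, timeout=timeout)
        except all_errors as err:
            # all_errors includes OSError, so ConnectionRefusedError lands here.
            print(f"FTP connection to {host} failed "
                  f"(attempt {attempt}/{retries}): {err}")
            if attempt < retries:
                sleep(delay)
    return None
```

A caller could then skip that source with a clear message when `connect_ftp(host)` returns `None`, rather than aborting the whole run.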
|
I can reproduce this. The FTP address for ARIN is still |
It worked fine. --- Start Kartograf ---
Kartograf version: 0.4.9
Using rpki-client version 9.5 (recommended).
The epoch for this run is: 1745947362 (2025-04-29 17:22:42 UTC, local: 2025-04-29 14:22:42 -03)
--- Fetching RPKI ---
Downloaded TAL for AFRINIC to /Users/brunogarcia/projects/kartograf/data/1745947362/rpki/tals/afrinic.tal, file hash: 2838ef30ea27ce5705abf5f5adb131d8c35b1f50858338a2f3c84bb207c2fa35
Downloaded TAL for APNIC to /Users/brunogarcia/projects/kartograf/data/1745947362/rpki/tals/apnic.tal, file hash: 472e551f7c551c2e999e582b7c9437d3bee4900fe53afff62aeb28d4940ade94
Downloaded TAL for ARIN to /Users/brunogarcia/projects/kartograf/data/1745947362/rpki/tals/arin.tal, file hash: 1f8bdb03bcc30a3b8e11fd9a87102fba250c22137a3c8baa9c81b139cb412639
Downloaded TAL for LACNIC to /Users/brunogarcia/projects/kartograf/data/1745947362/rpki/tals/lacnic.tal, file hash: d44bb9394ab009c8b53e5efebf2a1c9450bab61a27efe00de5a3e4587a3a2f6a
Downloaded TAL for RIPE to /Users/brunogarcia/projects/kartograf/data/1745947362/rpki/tals/ripe.tal, file hash: 59ca27ef93f23682749fcefe7c6d70fbc723343549ff9e4d3996acaff79817fb
Downloading RPKI Data, this may take a while.
RPKI sync #1
...took 356 seconds
Downloaded RPKI Data, hash sum: 05e7d4723f43edff4bc27e74bb11d3e196811577fe283ce525efa82f3296e195
...finished in 0:06:52.498349
--- Validating RPKI ---
Validating RPKI ROAs
299797 raw RKPI ROA files found.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1200/1200 [01:06<00:00, 17.94it/s]
299797 RKPI ROAs validated and saved to /Users/brunogarcia/projects/kartograf/out/1745947362/rpki/rpki_raw.json, file hash: 86bb15a994272199a011c1a836d3df04d2f660eb438e78a85c66c82b78334bed
...finished in 0:01:31.313057
--- Parsing RPKI ---
Parsing 299797 ROAs
Result entries written: 606995
Duplicates found: 86294
Invalids found: 2205
Incompletes: 0
Non-ROA files: 0
...finished in 0:00:18.223953
--- Sorting results ---
...finished in 0:00:04.889057
--- Finishing Kartograf ---
The SHA-256 hash of the result file is: 08e954ad74f716ada921ffb833ac8136513d66c60721289346f7ac542b181f52
Total runtime: 0:08:46.937830 |
Oh wow, interesting. Good we caught this here and didn't run into this in an actual run 😅 It appears ARIN has removed support for FTP very recently: https://www.arin.net/blog/2025/02/10/ftp-retirement/ It's kind of annoying but we'll have to switch over to https download I guess. It seems that we can just do that for all RIRs though and the change is rather minimal, I have opened a pull here: #80 |
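For illustration, an HTTPS download that also produces the file hash printed in the logs could look roughly like the sketch below. It is not the code from #80; the function name `download_file` and its parameters are assumptions, and no real RIR URL is implied.

```python
import hashlib
import urllib.request
from pathlib import Path


def download_file(url, dest, timeout=60):
    """Stream `url` to `dest` and return the SHA-256 hex digest of the data.

    Hypothetical helper: the caller supplies the URL, nothing is hard-coded.
    """
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    sha = hashlib.sha256()
    with urllib.request.urlopen(url, timeout=timeout) as resp, \
            open(dest, "wb") as out:
        # Read in chunks so large database dumps are never held fully in memory.
        while chunk := resp.read(1 << 16):
            sha.update(chunk)
            out.write(chunk)
    return sha.hexdigest()
```

Because `urllib.request.urlopen` accepts any supported URL scheme, the same function can serve every RIR as long as each one publishes its dump over HTTPS.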
Sorry for being a bit late here, how does tomorrow 4pm UTC work for you both @brunoerg @jurraca , i.e.: (skipping IRR for now so we don't depend on the other PR here) If that doesn't work for you or you see this too late please propose a new time, thanks! :) |
Works for me. |
works for me |
|
Is 1746021600 4pm UTC? |
Oh, sorry, I guess the tool I used didn't use UTC but my local timezone instead (UTC+2). Let's use the actual UTC 4pm: 1746028800. |
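To avoid this kind of timezone mix-up, the epoch can be computed with an explicit UTC timezone. The snippet below is a small sketch (the helper name `epoch_at` is made up here); it reproduces both values from this thread for 2025-04-30, 16:00.

```python
from datetime import datetime, timedelta, timezone


def epoch_at(year, month, day, hour, minute=0, tz=timezone.utc):
    """Return the Unix epoch for a wall-clock time in the given timezone."""
    return int(datetime(year, month, day, hour, minute, tzinfo=tz).timestamp())


# 16:00 interpreted as UTC: the value we actually want.
print(epoch_at(2025, 4, 30, 16))  # 1746028800

# 16:00 interpreted as UTC+2: the value that was posted by mistake.
print(epoch_at(2025, 4, 30, 16, tz=timezone(timedelta(hours=2))))  # 1746021600
```

The two results differ by exactly 7200 seconds, which matches the two-hour offset.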
|
success logs--- Start Kartograf ---Kartograf version: 0.4.9 --- Fetching RPKI --- Downloaded TAL for AFRINIC to /home/base/code/asmap/kartograf/data/1746028800/rpki/tals/afrinic.tal, file hash: 2838ef30ea27ce5705abf5f5adb131d8c35b1f50858338a2f3c84bb207c2fa35 --- Fetching Routeviews pfx2as --- Downloading from https://publicdata.caida.org/datasets/routing/routeviews-prefix2as/2025/04/routeviews-rv2-20250428-0200.pfx2as.gz --- Validating RPKI --- --- Parsing RPKI --- Parsing 299962 ROAs --- Parsing Routeviews pfx2as --- Unzipping /home/base/code/asmap/kartograf/data/1746028800/collectors/routeviews_pfx2asn_ip4.txt.gz --- Merging Routeviews and base data --- Parse base file to dictionary --- Sorting results --- ...finished in 0:00:15.606272 --- Finishing Kartograf --- The SHA-256 hash of the result file is: d722edb130bfa2135606223f12f3f146d37f16ca6ff1d51559c44f79a21e4938 |
|
logs--- Start Kartograf ---Kartograf version: 0.4.9 --- Fetching RPKI --- Downloaded TAL for AFRINIC to /Users/brunogarcia/projects/kartograf/data/1746028800/rpki/tals/afrinic.tal, file hash: 2838ef30ea27ce5705abf5f5adb131d8c35b1f50858338a2f3c84bb207c2fa35 --- Fetching Routeviews pfx2as --- Downloading from https://publicdata.caida.org/datasets/routing/routeviews-prefix2as/2025/04/routeviews-rv2-20250429-1200.pfx2as.gz --- Validating RPKI --- Validating RPKI ROAs --- Parsing RPKI --- Parsing 58873 ROAs --- Parsing Routeviews pfx2as --- Unzipping /Users/brunogarcia/projects/kartograf/data/1746028800/collectors/routeviews_pfx2asn_ip4.txt.gz --- Merging Routeviews and base data --- Parse base file to dictionary --- Sorting results --- ...finished in 0:00:08.516794 --- Finishing Kartograf --- The SHA-256 hash of the result file is: c3095a48081ba668f79cd3997fcbf43cba90a4c9bcbe752f26185587920ff8aa |
|
so your first RPKI sync took |
|
|
My result hash is That's a bit disappointing but it's also wild to me how big the differences are in download times. For me they were all as quick as I am used to (all <200sec) and so I also had three syncs. For @jurraca the syncs were slower and that resulted in just two syncs. The second sync again took almost as long as the first, which is throwing off my calculation quite a bit. I will need to lower the threshold that decides whether we run the second sync right away or wait until the end of the warmup period. I'll set it at 50% instead of 70%. At least that's a learning I think. But @brunoerg 's results are crazy to me, I have never seen a sync take this long and get that little data :D I only took a brief look at the debug log, there are a few things that stand out to me but I can't really make sense of it yet. In validation I see several Also a lot of these "file has vanished" like this: And then at the end of the download section: So all we can say for sure is something has clearly prevented the usual sync of the big RIR repos via RRDP. Was there something different about the connection compared to your previous usage of kartograf @brunoerg ? Since it seems to happen across the board I would rather put my money on some local issue this time... |
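The threshold decision described above could be expressed roughly as follows. This is a guess at the rule, not the code in this PR: the function name, the comparison against the warmup length, and the 600 second warmup in the usage note are assumptions made only to illustrate the 70% to 50% change.

```python
def sync_again_immediately(first_sync_seconds, warmup_seconds, threshold=0.5):
    """Decide whether to start the second sync right after the first one.

    Assumed rule: if the first sync used less than `threshold` of the warmup
    period, sync again right away; otherwise wait until the warmup ends.
    """
    if warmup_seconds <= 0:
        raise ValueError("warmup_seconds must be positive")
    return first_sync_seconds < threshold * warmup_seconds
```

Under this reading, with a 600 second warmup, a 356 second first sync would have triggered an immediate second sync at the old 70% threshold (limit 420 seconds) but not at 50% (limit 300 seconds).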
|
It's the same connection as always, nothing different. Perhaps I could try to use a VPN to see if anything changes. |
Would definitely be interesting to know if using a VPN changes something... |
Cool, I'll try it tomorrow. Do you think we can run it again at same time tomorrow? |
Works for me, I should be online for the irc meeting anyway. I hope I am getting this one correct on first try 😅 |
|
I used VPN, set to Germany, result:
logs--- Start Kartograf --- Kartograf version: 0.4.9 --- Fetching RPKI --- Downloaded TAL for AFRINIC to /Users/brunogarcia/projects/kartograf/data/1746115200/rpki/tals/afrinic.tal, file hash: 2838ef30ea27ce5705abf5f5adb131d8c35b1f50858338a2f3c84bb207c2fa35 --- Fetching Routeviews pfx2as --- The page at https://publicdata.caida.org/datasets/routing/routeviews-prefix2as/2025/05/ couldn't be fetched. Trying the previous month. --- Validating RPKI --- Validating RPKI ROAs --- Parsing RPKI --- Parsing 0 ROAs --- Parsing Routeviews pfx2as --- Unzipping /Users/brunogarcia/projects/kartograf/data/1746115200/collectors/routeviews_pfx2asn_ip4.txt.gz --- Merging Routeviews and base data --- Parse base file to dictionary --- Sorting results --- ...finished in 0:00:09.496421 --- Finishing Kartograf --- The SHA-256 hash of the result file is: 0f9b344e96f7615f9888bfc6cf4641392a1609043274194273cfa767de51f0c1 |
|
Unfortunately I messed up this time and let my computer go to sleep at exactly the wrong time because I got distracted. I got a result that looks alright but it's not really a good test. (logs) @brunoerg looks like you were having issues again but different this time: instead of one long connection with limited results there were now three tries, but all very short and with no results at all. I guess using a VPN made the situation even worse :-/ |
Yes, noticed the same, it didn't help. |
|
Hm, sorry for dropping the ball on this a bit here but I am still a bit unsure what to do. Our tests didn't work but @brunoerg 's problem doesn't seem to be related to this change and I am pretty hopeful that this change should improve things. We should do another run soon I think. Should we merge it before that or do you think we should do more testing first @jurraca @brunoerg ? |
|
yea unsure what to do as well. Probably good to do another run with us three first. im free next week. |
|
Yes, better to do another run again. I'm free. |
|
@jurraca @brunoerg cool, if you see this in time let's try tomorrow (Tuesday, 10th of June) at 6pm CEST/4pm GMT. If we miss it let me know and we can repeat it on Thursday same time. I have also pushed a rebase of this PR since there were some fixes and this also means we can use IRR again. |
|
Logs: ➜ kartograf git:(78) ./run map -rv -irr -w 1749571200 --- Start Kartograf --- Kartograf version: 0.4.9 --- Fetching RPKI --- Downloaded TAL for AFRINIC to /Users/brunogarcia/projects/kartograf/data/1749571200/rpki/tals/afrinic.tal, file hash: 2838ef30ea27ce5705abf5f5adb131d8c35b1f50858338a2f3c84bb207c2fa35 --- Fetching IRR --- Downloading afrinic.db.gz --- Fetching Routeviews pfx2as --- Downloading from https://publicdata.caida.org/datasets/routing/routeviews-prefix2as/2025/06/routeviews-rv2-20250608-1200.pfx2as.gz --- Validating RPKI --- Validating RPKI ROAs --- Parsing RPKI --- Parsing 308165 ROAs --- Parsing IRR --- Extracting afrinic.db.gz --- Merging RPKI and IRR data --- Parse base file to dictionary --- Parsing Routeviews pfx2as --- Unzipping /Users/brunogarcia/projects/kartograf/data/1749571200/collectors/routeviews_pfx2asn_ip4.txt.gz --- Merging Routeviews and base data --- Parse base file to dictionary --- Sorting results --- ...finished in 0:00:12.188039 --- Finishing Kartograf --- The SHA-256 hash of the result file is: 5ccb49d4989989bfe84bdb053b563c4f5325bb91a04d0862bf35fcd72c0603c7 |
|
I got stuck in transport and missed it, sorry guys. my bad :/ |
|
I got Mine and @brunoerg 's numbers seem to match on IRR and RV but I got about 500 more entries from RPKI... (Logs) I would say let's give it one more shot on Thursday and then discuss: I guess it's a win that there was no massive difference in the numbers between @brunoerg and me like we have seen in the official runs previously. But if we get no matches at all across several runs between the three of us, that might be evidence that this lowers the probability of exact matches and we might have to rethink the approach. |
|
my run failed on the IRR fetching step. I've never seen it crash from a refused connection, will have to fix. logs: --- Start Kartograf --- Kartograf version: 0.4.9 --- Fetching RPKI --- Downloaded TAL for AFRINIC to /home/base/code/asmap/kartograf/data/1749744000/rpki/tals/afrinic.tal, file hash: 2838ef30ea27ce5705abf5f5adb131d8c35b1f50858338a2f3c84bb207c2fa35 --- Fetching IRR --- Downloading afrinic.db.gz |
|
I got Logs |
Looks like this is the issue that #80 fixed. Did you pull the latest rebase that I pushed 3 days ago? If not, the fix is not included in your local branch and that caused the crash. |
Sorry, I missed it. |
|
I got logs--- Start Kartograf ---Kartograf version: 0.4.9 --- Fetching RPKI --- Downloaded TAL for AFRINIC to /home/base/code/asmap/kartograf/data/1750867200/rpki/tals/afrinic.tal, file hash: 2838ef30ea27ce5705abf5f5adb131d8c35b1f50858 Downloaded RPKI Data, hash sum: b4868c33a88eac74bbe40d9749413909cb2c3e112a628294954dd36c09269cde --- Fetching IRR --- Downloading afrinic.db.gz --- Fetching Routeviews pfx2as --- Downloading from https://publicdata.caida.org/datasets/routing/routeviews-prefix2as/2025/06/routeviews-rv2-20250623-1800.pfx2as.gz --- Validating RPKI --- Validating RPKI ROAs --- Parsing RPKI --- Parsing 315305 ROAs --- Parsing IRR --- Extracting afrinic.db.gz --- Merging RPKI and IRR data --- Parse base file to dictionary --- Parsing Routeviews pfx2as --- Unzipping /home/base/code/asmap/kartograf/data/1750867200/collectors/routeviews_pfx2asn_ip4.txt.gz --- Merging Routeviews and base data --- Parse base file to dictionary --- Sorting results --- ...finished in 0:00:20.293948 --- Finishing Kartograf --- The SHA-256 hash of the result file is: 4729f40d910aa2da80e9acc2ae87b96768b82e5a4fc1e89e9271d21f226e12ce |
|
I got logs--- Start Kartograf ---Kartograf version: 0.4.9 --- Fetching RPKI --- Downloaded TAL for AFRINIC to /Users/brunogarcia/projects/kartograf/data/1750867200/rpki/tals/afrinic.tal, file hash: 2838ef30ea27ce5705abf5f5adb131d8c35b1f50858338a2f3c84bb207c2fa35 --- Fetching IRR --- Downloading afrinic.db.gz --- Fetching Routeviews pfx2as --- Downloading from https://publicdata.caida.org/datasets/routing/routeviews-prefix2as/2025/06/routeviews-rv2-20250623-1800.pfx2as.gz --- Validating RPKI --- Validating RPKI ROAs --- Parsing RPKI --- Parsing 316834 ROAs --- Parsing IRR --- Extracting afrinic.db.gz --- Merging RPKI and IRR data --- Parse base file to dictionary --- Parsing Routeviews pfx2as --- Unzipping /Users/brunogarcia/projects/kartograf/data/1750867200/collectors/routeviews_pfx2asn_ip4.txt.gz --- Merging Routeviews and base data --- Parse base file to dictionary --- Sorting results --- ...finished in 0:00:11.203262 --- Finishing Kartograf --- The SHA-256 hash of the result file is: 31efa1b06aff97be78cacc1ff6d489b68d79ac89a68d1463bc0a2aca7a3869d5 |
|
No match. @jurraca 's and my runs seem to be similar (3 syncs) and we are close in the numbers but still diverge. @brunoerg had just one long sync and he also had more results than @jurraca and me, which I am not sure how to interpret... (logs) Honestly, I think we should focus on something else for now. We should do some test rounds of #82 but I would also like to do a collaborative run in the meantime without these improvements because it has already been a couple of months since the last one. I will prepare a release for that. |
This implements the idea suggested here: #69 (comment) (Number 2 at the bottom).
In the case of a collaborative run (using the wait feature), kartograf will use the following behavior:
Rationale:
Open questions:
This is what the output currently looks like:
(In this particular run the second sync took longer than the first, which is surprising. It was on a fast connection, so download speed may simply not be the bottleneck. In most tests, though, the first sync took longer.)