discard rejected mavlink frames entirely#508
Conversation
When a mavlink 2 frame fails CRC validation, the parser advanced by one byte and resumed scanning for MAV_STX_V2. This allowed a marker byte inside the rejected frame payload to be treated as a new packet start. Instead, consume the whole candidate frame before looking for the next packet. Do the same for unsupported incompatibility flags so rejected mavlink 2 frames are skipped consistently. Signed-off-by: Onur Özkan <work@onurozkan.dev>
When a mavlink 1 frame fails CRC validation, the parser advanced by one byte and resumed scanning for MAV_STX. This allowed a marker byte inside the rejected frame payload to be treated as a new packet start. Instead, consume the whole candidate frame before looking for the next packet so rejected mavlink 1 frames are skipped consistently. Signed-off-by: Onur Özkan <work@onurozkan.dev>
Signed-off-by: Onur Özkan <work@onurozkan.dev>
This test no longer makes sense with the recent packet handling update. Signed-off-by: Onur Özkan <work@onurozkan.dev>
|
This has the issue that any "rouge" stx causes a frame detection and then interprets a garbage byte as frame length to skip. It is conceivable that a data stream cuts of the beginning of the first message, then a stx byte that is inside of a periodicly sent message could cause a skip over the stx at the next message start so that the next stx found is once again inside of the payload and a large number of messages get skipped. |
Signed-off-by: Onur Özkan <work@onurozkan.dev>
As far as I remember there's no escape literal on mavlink specs so I wonder how other implementations avoid this. |
|
By the way I would rather accept potential over skipping than a packet-in-packet vulnerability. In both cases the input is already malformed so this seems like a tradeoff where we have to choose which failure mode is less harmful. If the data is malformed, skipping it sounds acceptable to me (at least compared to packet-in-packet). |
|
This doesn't really fix packet-in-packet, you could still send a large packet with a header claiming a length of 1 byte and then have an embedded packet at the appropriate offset. |
|
We already have that bad outcome on malformed input. :) I will check other implementations later today to see how they solve this. |
Signed-off-by: Onur Özkan <work@onurozkan.dev>
|
I checked mavsdk and they use the same length delimited parsing approach. In practice the length field would not be useful for framing if the parser ignored it after validation failed (which doesn't make any sense to have that information in the mav spec).
Also, I added a regression test that I think disproves your claim here, please correct me if I am missing something. |
This test would also succeed if we just return an error on crc fail and don't change the parser advancement. |
| 0, // fake message id byte 0 | ||
| 0, // fake message id byte 1 | ||
| 0, // fake message id byte 2 | ||
| 0, // fake payload byte 0 |
There was a problem hiding this comment.
I think this is missing a 2 fake crc byte
There was a problem hiding this comment.
If we do that then we provide a full frame but what we are trying to do here is packet-in-packet injection.
Signed-off-by: Onur Özkan <work@onurozkan.dev>
Improved the coverage in 728cff2. |
|
I think if we don't want to jump back we could potentially remove the whole I also would prefer if some form of the agnostic recv test would stay, just a version where v1 and v2 messages are interleaved, no grabage data in between. |
|
We can't know whether they were real packets at all. The easiest way to verify that would be to write a similar test using the same file with other mavlink libraries. |
They all pass the crc check so its rather unlikely that they are not real. |
|
I need to review reports generated from this log file by rust-mavlink and other mavlink implementations and then compare them. I'm not sure when I will be able to allocate time for this. |
I think I can look into it further tomorrow. |
Thanks! |
|
Luckily I had a bit of time to review the C implementation this morning. The main difference compared to our implementation, here, the C version only skips the incompat flags on incompat flags, where our version skips the packet frame based on the length. So if the missing packets are caused by this path my understanding is that the log producer is not doing a great work on generating messages. I don't know if it's exported from ardupilot or px4. Either way I need to test this against a PX4 log before changing the behavior. A stable producer should not lose packets here unless it is emitting unsupported incompatibility flags or malformed frames. |
|
I may be blindsided, but sincerely, I don't see a solution that doesn't open a window for other security/robustness problems. I believe we are dealing with a fundamental limitation of the protocol, and perhaps the best answer we can have is to document it and recommend signing, which would solve the PIP until the attacker is able to forge the signing. Then, the next solution was already given by Goodspeed et al.: encryption, which anyone can implement in a layer around/above this protocol. |
|
There is no single solution indeed. We just need to pick one that ends with less damage than the other. |
|
The .tlog file format seems to insert 8 bytes of timestamp in between each message so it does not really contain a pure mavlink message stream. There are 4 Mav2 STX in those bytes. In the current impl that does not impact the parsing (all messages are found, no false ones are detected) but on a sufficiently large file the inc flags would be 0 or 1 and crc would (randomly) be correct. c_library_v2 only finds 1366 messages compared to 1426 in master and 1374 in this PR. It does abort parsing on an unknown inc flag but does not walk back the parser in that case. This causes it to skip a bit of data since it detects a few more STX bytes that dont belong to a real message. |
Right, but be aware that we'll (possibly) be swapping 1 known for N unknown problems, and the solution to this problem exists without code change. |
We are not trying something new. The approach we are taking is used by other libraries under this organization. Current problem is packet-in-packet injection and we are putting our best efforts to avoid this by switching to "may skip more messages than we need". In both cases the input is malformed, but still, skipping is much safer than accepting malformed packets. This is the trade-off and like I said, is already followed by other libraries in this organization. In fact, I don't think any mavlink library is doing what this one does (continiously looking for new packets in a single frame/packet). Did you read the e-mail we (all the rust-mavlink maintainers) received recently? I assume you haven't checked that yet. |
Now I see it. Well, "may skip more messages than we need" is also known as a DoS attack, which will be a known vulnerability: an attacker could use the current PoC idea to perform a DoS instead of injecting messages, which I agree is potentially less harmful. I also agree that the right move is to adhere to what's been practiced in the other parsers. |
Thanks for testing. Unless C library is considered unreliable parser, then this PR doesn't seem to lose any packets? Btw, like I mentioned earlier, skipping incompat flags on incompat flags (not the entire frame) would increase the number of messages found in the test file. But then I would like to know why we find so many more messages than the C version. 😁 Maybe we should just copy C state machine into Rust and sync with their approach entirely. |
It's basically random, and just depends on which STX bytes are encountered as the start of messages. |
What do you mean by random? Having random behaviour in a parser sounds terrible. |
|
No the parsing itself is not random, but it is not predictable which parser will find more messages, which messages it will find, and there are messages that are only found by either one parser. |
Yeah sure, it depens on the implementation of the parser. That's not a concern. Ideally, from a reliable message producer, current PR should never miss any packet. If that ever happens it's likely a bug on the message source or it's a malformed data (in this case there is no good end, it's just the less "worst end" we can have which is skipping frames). P.s: Users who are concerned about system security or who require maximum input stability should enable signing to ensure their data remains secure and reliable. |
|
Any concerns? |
|
I would prefer actual parity with an existing parser. If there are immedeate concerns this seems ok but the file input test should maybe be adjusted with a file that contains a valid mavlink stream. Since the input stream does not jump back anymore we could also possbile get some performance improvements by removing the code for that at some point. |
|
Based on your comparison with C parser, this didn't lose any packet right? We can improve the test coverage later I think. We are already extremely behind in terms of test coverage so I wouldn't block this PR with that. |
|
I it still loses some packets that C finds, but just finds more that C misses. It would be possible to construct an example where C finds more packets. |
|
We could add a CI action that reads a mavlink stream using both rust-mavlink and the C mavlink implementation, then compares the results and fails if rust-mavlink misses a packet that the C implementation successfully parses. I am not sure when I will be able to work on that but it would likely provide more value than a file based test that just asserts arbitrary numbers (which are unconfirmed). |

Fixes packet-in-packet parsing for rejected mavlink frames.
Mavlink frames are length delimited so bytes inside a frame payload must remain part of that frame even if validation fails.
Previously mavlink frame (for both v1 and v2) failed CRC validation and the parser advanced by one byte and resumed scanning for the next
MAV_STX/MAV_STX_V2. This allowed a marker byte inside the rejected frame payload to be treated as a new packet start.Now rejected frames are consumed entirely before parsing resumes to prevent payload bytes from being decoded as new separate packets.