While using the JsonSerde, we noticed that it was extremely slow at parsing our log files: it took about 40-45 min to parse about 5 GB of uncompressed data (140 MB compressed with xz -6).
This was especially odd because custom code could read a file in full (including JsonPath parsing) in less than 2 min on a single machine using 4 threads.
That made the JsonSerde unusable for us, since we have more than 1 PB of uncompressed data.
The primary problem is that this library uses a very old version of JsonPath.
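
To put the timings quoted above in perspective, here is a rough back-of-the-envelope sketch that scales them linearly to the full data set. It assumes decimal units (1 PB = 1,000,000 GB), treats "more than 1 PB" as exactly 1 PB, and takes the "less than 2 min" figure as an upper bound of 2 min. The variable and function names are ours, chosen for illustration only, and the two timings were not necessarily measured with the same degree of parallelism.

```python
# Figures quoted in the text; the decimal unit conversion is an assumption.
SAMPLE_GB = 5.0                 # uncompressed size of the sample
SERDE_MINUTES = (40.0, 45.0)    # observed JsonSerde range for the sample
CUSTOM_MINUTES = 2.0            # upper bound for the custom reader (4 threads)
TOTAL_GB = 1_000_000.0          # 1 PB, taken as a lower bound on the data set


def minutes_for_total(sample_minutes: float) -> float:
    """Scale the time for one 5 GB sample linearly to the full data set."""
    return sample_minutes * (TOTAL_GB / SAMPLE_GB)


def to_days(minutes: float) -> float:
    """Convert minutes to days."""
    return minutes / (60 * 24)


# Sequential processing time in days at each observed rate.
serde_days = tuple(to_days(minutes_for_total(m)) for m in SERDE_MINUTES)
custom_days = to_days(minutes_for_total(CUSTOM_MINUTES))

# How many times slower the JsonSerde was than the custom reader.
slowdown = tuple(m / CUSTOM_MINUTES for m in SERDE_MINUTES)

print(f"JsonSerde: {serde_days[0]:,.0f} to {serde_days[1]:,.0f} days")
print(f"Custom reader: {custom_days:,.0f} days")
print(f"Slowdown: {slowdown[0]:.1f}x to {slowdown[1]:.1f}x")
```

Under these assumptions the gap is at least 20x. Real jobs would run in parallel across a cluster, so the absolute numbers matter less than the ratio.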