Json parsing is extremely slow #2

Description

@pradeepg26

When using the JsonSerde, we noticed that it was extremely slow at parsing our log files. It took about 40-45 min to parse about 5 GB of uncompressed data (140 MB when compressed with xz -6).
This was especially odd since our own custom code could read a file fully (including JsonPath parsing) in less than 2 min on a single machine using 4 threads.

This was obviously not usable for us, since we have more than 1 PB of uncompressed data.

The primary problem is that this library uses a very old version of JsonPath.
