Modify autodetection to score possible input types, rather than just an if/else #9

@toofishes

For example, an input of two spaces gets picked up as hexadecimal because that format is checked first, but it is most likely just meant to be plain ASCII.

Some sort of scoring system might be useful. On a scale of 0 to 1:

  • 0.0 - definitely not this format. For example, it can't really be hex if it has a W in it.
  • between 0 and 1 - likelihood of this format. abc123, abc 123, and ab c1 23 can all be interpreted as either hex or ASCII. Intuition says the one with pairs is more likely hex than the others; find a way to capture this algorithmically.
  • 1.0 - can only be this format. (not sure if this is really possible)
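
A minimal sketch of what such scoring could look like, in Python. Everything here is an assumption made for illustration: the function names (`score_hex`, `score_ascii`, `detect`), the weights 0.5, 0.4, and 0.6, and the choice to treat whitespace-only input as "not hex". It only shows the shape of the idea: each format returns a score, 0.0 rules the format out, and input grouped in pairs scores higher as hex.

```python
import string

HEX_DIGITS = set(string.hexdigits)
PRINTABLE = set(string.printable)


def score_hex(text: str) -> float:
    """Hypothetical hex scorer: 0.0 rules hex out, higher means more likely."""
    tokens = text.split()
    if not tokens:
        return 0.0  # empty or whitespace-only input contains no hex digits
    if any(ch not in HEX_DIGITS for tok in tokens for ch in tok):
        return 0.0  # e.g. a "W" means it cannot be hex
    # Assumed weighting: valid hex digits alone are ambiguous (0.5);
    # tokens of exactly two digits look like bytes and raise the score.
    paired = sum(1 for tok in tokens if len(tok) == 2) / len(tokens)
    return 0.5 + 0.4 * paired


def score_ascii(text: str) -> float:
    """Hypothetical ASCII scorer: a flat, assumed score for printable input."""
    if not text or any(ch not in PRINTABLE for ch in text):
        return 0.0
    return 0.6


SCORERS = {"hex": score_hex, "ascii": score_ascii}


def detect(text: str):
    """Return the best-scoring format name, or None if every score is 0.0."""
    scores = {name: fn(text) for name, fn in SCORERS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0.0 else None


for sample in ["  ", "abc123", "abc 123", "ab c1 23", "W123"]:
    print(repr(sample), detect(sample))
```

With these assumed weights, the two-space input and the unpaired inputs fall back to ASCII, while ab c1 23 is detected as hex. The real weights would need tuning against actual inputs.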

CyberChef's "magic" mode might also be a source of inspiration; it does some scoring of its own.
