Ugly exit when content unexpected #9

@nstr10

I've been using this for a while and appreciate your work! One small gripe I have is that, because Malwr.com is often not in a usable state, I get a lot of errors from your library.

data = {
  'math_captcha_field': eval(re.findall(pattern, req.content)[0]),
  'math_captcha_question': soup.find('input', {'name': 'math_captcha_question'})['value'],
  'csrfmiddlewaretoken': soup.find('input', {'name': 'csrfmiddlewaretoken'})['value'],
  'share': 'on' if share else 'off',  # share by default
  'analyze': 'on' if analyze else 'off',  # analyze by default
  'private': 'on' if private else 'off'  # private by default
}

The most common error is IndexError: list index out of range. The code runs a regular expression over the response to a request whose status we never checked, and then assumes the result has at least one element. I was going to submit a pull request, but realized there are several potential ways to fix this:

  • Check that status_code is 200 when the request is made, else have a clean exit path that reports the HTTP status received
  • Wrap the above code in a try/except block and handle errors accordingly (less ideal, failure cause will be less apparent)
  • Check that re.findall(pattern, req.content) isn't just [] before trying to access an index of it (this should really be done regardless!)
  • Potentially many other options
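
As a rough illustration of the third option, here is a minimal sketch of checking the re.findall result before indexing it. The helper name first_match, the exception class, and the captcha pattern are all hypothetical and not taken from the library; the actual pattern it uses is not shown in this issue.

```python
import re


class UnexpectedContentError(Exception):
    """Raised when the page does not contain the expected markup."""


def first_match(pattern, text):
    """Return the first regex match in text, or raise a descriptive error."""
    matches = re.findall(pattern, text)
    if not matches:  # re.findall returns [] when nothing matches
        raise UnexpectedContentError(
            "expected pattern not found in response "
            "({} characters received)".format(len(text))
        )
    return matches[0]


# Hypothetical pattern and page content, for illustration only.
CAPTCHA_PATTERN = r"(\d+ [-+*] \d+) ="

print(first_match(CAPTCHA_PATTERN, "<label>3 + 4 =</label>"))  # prints "3 + 4"
```

Raising a dedicated exception keeps the failure cause visible to the caller, which addresses the concern about a bare try/except hiding it.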

Let me know what you think - I'm happy to write the solution, but wouldn't want to waste my time in case you think of a better fix than any of those I've listed.

P.S. I think the requests library's response object also has an ok property, so you could just check whether req.ok is true or false rather than mucking about with status codes.
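
A minimal sketch of what that check might look like. The function name describe_failure is hypothetical, and the example uses stand-in objects rather than a live request; a real requests response exposes the same ok and status_code attributes. Note that ok is true for any status code below 400, so it is slightly broader than testing for exactly 200.

```python
from types import SimpleNamespace


def describe_failure(response):
    """Return None if the response is usable, else a message with the HTTP status."""
    # On a requests response, ok is True when status_code is below 400.
    if response.ok:
        return None
    return "request failed with HTTP status {}".format(response.status_code)


# Stand-in objects for illustration only.
good = SimpleNamespace(ok=True, status_code=200)
bad = SimpleNamespace(ok=False, status_code=503)

print(describe_failure(good))  # prints None
print(describe_failure(bad))   # prints "request failed with HTTP status 503"
```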
