read functions should return insightful errors #342

The `read_cpt` and `read_bore` functions use some "automagical" logic to infer the content of the `file` argument. The user can pass an object of type `io.BytesIO | Path | str`, and with `engine="auto"` the content type is inferred automatically. This can result in confusing errors when erroneous input is provided.

Some examples:

### Providing a non-existing path results in `XMLSyntaxError`
Input:
```python
from pygef import read_cpt
read_cpt(file="non/existing/file.gef")
```
Response:
```
lxml.etree.XMLSyntaxError: Start tag expected, '<' not found, line 1, column 1
```
The expectation is to get a `FileNotFoundError`.

### Providing a non-existing path and `engine="gef"` results in `ValueError`
Input:
```python
from pygef import read_cpt
read_cpt(file="non/existing/file.gef", engine="gef")
```
Response:
```
ValueError: The selected gef file is not a cpt. Check the REPORTCODE or the PROCEDURECODE.
```
The expectation is to get a `FileNotFoundError`.
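
A minimal sketch of how both path cases could fail early, before any parser sees the input. `open_source` is a hypothetical helper, not part of pygef, and it assumes that a `str` argument is always meant as a path. If the library also accepts raw file content as a `str`, the check would need to distinguish the two.

```python
from __future__ import annotations

import io
from pathlib import Path


def open_source(file: io.BytesIO | Path | str) -> io.BytesIO:
    """Hypothetical helper: return the input as an in-memory byte stream.

    Assumption: a str is always a path, never raw file content.
    """
    if isinstance(file, io.BytesIO):
        # Already a stream, nothing to check on disk.
        return file

    path = Path(file)
    if not path.is_file():
        # Fail here, so the caller never sees a parser error for a missing file.
        raise FileNotFoundError(f"No such file: '{path}'")

    return io.BytesIO(path.read_bytes())
```

With a check like this in front of both engines, `engine="auto"` and `engine="gef"` would report a missing file in the same way.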

### Providing an erroneous gef file results in `XMLSyntaxError`, while the same file is parsed when `engine="gef"` is forced
Input:
```python
from pygef import read_cpt
read_cpt(file="path/to/erroneous.GEF")
```
Response:
```
lxml.etree.XMLSyntaxError: Start tag expected, '<' not found, line 1, column 1
```
Input with `engine="gef"` forced:
```python
from pygef import read_cpt
read_cpt(file="path/to/erroneous.GEF", engine="gef")
```
Response:
```
CPTData(bro_id=None, research_report_date=None, ...
```
The expectation is an error stating that the gef file is invalid, and this response should be the same regardless of the value of `engine`.
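
One possible way to avoid the misleading `XMLSyntaxError` is to inspect the content before choosing a parser. The sketch below is only an illustration: `infer_engine` and `InvalidFileError` are hypothetical names, not pygef API. It assumes that XML content starts with `<` and that GEF content starts with a `#` header line (such as `#GEFID`). It only covers engine selection; validating the GEF headers themselves would be a separate step.

```python
from __future__ import annotations


class InvalidFileError(ValueError):
    """Hypothetical error for content that looks like neither XML nor GEF."""


def infer_engine(content: bytes) -> str:
    """Guess the parser from the first non-whitespace byte of the content."""
    # Drop an optional UTF-8 byte order mark, then leading whitespace.
    head = content.removeprefix(b"\xef\xbb\xbf").lstrip()[:1]

    if head == b"<":
        return "xml"
    if head == b"#":
        # Assumption: GEF files start with a '#' header line.
        return "gef"

    raise InvalidFileError(
        "Could not infer the file type: expected XML (starting with '<') "
        "or GEF (starting with '#')."
    )
```

Usage, under the same assumptions:

```python
infer_engine(b"<?xml version='1.0'?><root/>")  # selects the XML parser
infer_engine(b"#GEFID= 1, 1, 0\r\n")           # selects the GEF parser
infer_engine(b"not a cpt file")                # raises InvalidFileError
```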
