Calling infer_types on an empty DataFrame fails in type_infer/rule_based/core.py (lines 33–94) because population_size is 0. The logging statement at line 41 divides by population_size, which raises a ZeroDivisionError.
Guarding that division is not enough, because the subsequent identifier pass also breaks: get_identifier_description is called with an empty column and immediately accesses data[0], which raises an IndexError on empty input.
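The two failures can be shown in isolation, without type_infer installed. This is only an illustrative sketch: sample_percentage and first_value are hypothetical stand-ins for the arithmetic in the log message and for the data[0] access, not the library's actual code, and a plain list is assumed for the empty column.

```python
# Hypothetical stand-ins, NOT type_infer code: they only mirror the two
# operations described above so the failures can be seen side by side.

def sample_percentage(sample_size, population_size):
    # Same arithmetic as the log message: divides by population_size.
    return round(sample_size * 100 / population_size, 1)


def first_value(data):
    # Same access pattern as described for the identifier pass: data[0].
    return data[0]


def failure_modes_on_empty_input():
    """Run both operations on empty input and collect the exception names."""
    errors = []
    for call in (lambda: sample_percentage(0, 0), lambda: first_value([])):
        try:
            call()
        except (ZeroDivisionError, IndexError) as exc:
            errors.append(type(exc).__name__)
    return errors


print(failure_modes_on_empty_input())
```

In the real call path the first error masks the second, which is why the reproduction only ever shows the ZeroDivisionError.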
Steps To Reproduce
import pandas as pd
from type_infer.api import infer_types
df = pd.DataFrame()
print(infer_types(df))
Output:
INFO:type_infer-21891:Analyzing a sample of 0
Traceback (most recent call last):
File "/Users/apple/Desktop/type_infer/./issue_test/main.py", line 5, in <module>
print(infer_types(df))
^^^^^^^^^^^^^^^
File "/Users/apple/Desktop/type_infer/type_infer/api.py", line 38, in infer_types
return engine.infer(data)
^^^^^^^^^^^^^^^^^^
File "/Users/apple/Desktop/type_infer/type_infer/rule_based/core.py", line 41, in infer
f'from a total population of {population_size}, this is equivalent to {round(sample_size * 100 / population_size, 1)}% of your data.') # noqa
~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
ZeroDivisionError: division by zero
Expected Output:
Empty input should be handled gracefully. infer_types should either return an empty (or explicitly invalid) TypeInformation, or raise a ValueError explaining that type_infer cannot run on an empty DataFrame.
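One possible shape for the ValueError option, as a rough sketch rather than a proposed patch: ensure_non_empty and the guarded sample_percentage below are hypothetical helper names, and where such checks would actually live in api.py or core.py is left open.

```python
# Hypothetical helpers sketching the "raise a ValueError" option.
# Names and placement are assumptions, not existing type_infer code.

def ensure_non_empty(n_rows: int, n_columns: int) -> None:
    """Reject empty input early with a clear message."""
    if n_rows == 0 or n_columns == 0:
        raise ValueError(
            "type_infer cannot run on an empty DataFrame "
            f"(rows={n_rows}, columns={n_columns})"
        )


def sample_percentage(sample_size: int, population_size: int) -> float:
    """Percentage for the log message, safe when the population is empty."""
    if population_size == 0:
        return 0.0
    return round(sample_size * 100 / population_size, 1)
```

With pandas, the check could be called as ensure_non_empty(*df.shape) before the engine runs, since DataFrame.shape is a (rows, columns) tuple. Failing early like this would also keep the identifier pass from ever seeing an empty column.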