`Processor.process()` takes `metadata`, which is used to directly initialize `SysOutputInfo`. However, these are essentially different data (in particular, "metadata" $\subset$ `SysOutputInfo`, but not $=$), and the current implementation blurs the distinction between them.
The most significant abuse of this behavior is that `FileLoaderMetadata` is implicitly converted into `SysOutputInfo`. This shouldn't work without an explicit conversion:

    metadata = dataclasses.asdict(data.metadata)
    metadata.update(
        {
            "task_name": TaskType.named_entity_recognition.value,
        }
    )
    processor = get_processor_class(TaskType.named_entity_recognition)()

To this end, we need:
- A struct defining the system metadata.
- A change to `Processor` so that it takes the system metadata, not a dict.
- Either:
  - A conversion method between the system metadata and `FileLoaderReturn`/`SysOutputInfo`, or
  - The system metadata included as a direct member of `FileLoaderReturn`/`SysOutputInfo`.
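
A minimal sketch of the first option (a dedicated struct plus an explicit conversion) might look like the following. Everything here is an assumption for illustration: `SystemMetadata`, its fields, `from_metadata`, and the `process()` signature are hypothetical, and the `SysOutputInfo` and `Processor` classes below are simplified stand-ins, not the real ExplainaBoard classes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SystemMetadata:
    """Hypothetical struct: only what the caller supplies about a system."""

    task_name: str
    dataset_name: Optional[str] = None
    metric_names: tuple = ()


@dataclass
class SysOutputInfo:
    """Simplified stand-in: the metadata fields plus computed results."""

    task_name: str
    dataset_name: Optional[str] = None
    metric_names: list = field(default_factory=list)
    # Filled in by the processor; deliberately not part of SystemMetadata.
    results: dict = field(default_factory=dict)

    @classmethod
    def from_metadata(cls, metadata: SystemMetadata) -> "SysOutputInfo":
        """Explicit, named conversion instead of unpacking an arbitrary dict."""
        return cls(
            task_name=metadata.task_name,
            dataset_name=metadata.dataset_name,
            metric_names=list(metadata.metric_names),
        )


class Processor:
    """Stand-in processor whose process() accepts the struct, not a dict."""

    def process(self, metadata: SystemMetadata, sys_output: list) -> SysOutputInfo:
        if not isinstance(metadata, SystemMetadata):
            raise TypeError(
                f"expected SystemMetadata, got {type(metadata).__name__}"
            )
        info = SysOutputInfo.from_metadata(metadata)
        # Placeholder for real analysis: record only the number of examples.
        info.results["num_examples"] = len(sys_output)
        return info
```

With this shape, passing a plain dict (or a `FileLoaderMetadata` dumped through `dataclasses.asdict`) fails loudly at the call site, and any conversion from loader metadata has to be written out as its own named step.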
The test snippet quoted above (the `metadata.update(...)` call) is from ExplainaBoard/integration_tests/ner_test.py, lines 148 to 154 in 4cec0a0.