Add system metadata class #575

@odashi

Processor.process() takes metadata, which is used directly to initialize SysOutputInfo. However, these are essentially different data (in particular, "metadata" $\subset$ SysOutputInfo, but not $=$), and the current implementation causes some confusion around this:

The most significant abuse of this behavior is that FileLoaderMetadata is implicitly converted into SysOutputInfo. This shouldn't work without an explicit conversion:

metadata = dataclasses.asdict(data.metadata)
metadata.update(
    {
        "task_name": TaskType.named_entity_recognition.value,
    }
)
processor = get_processor_class(TaskType.named_entity_recognition)()

To this end, we need:

  • A struct defining the system metadata.
  • A change to Processor so that it takes the system metadata, not a dict.
  • Either:
    • A conversion method between system metadata and FileLoaderReturn/SysOutputInfo, or
    • System metadata included as a direct member of FileLoaderReturn/SysOutputInfo.
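
A rough sketch of what this could look like follows. Everything in it is an assumption for illustration, not the existing code: `SystemMetadata` and `to_system_metadata()` are hypothetical names, the field lists are guesses, and `FileLoaderMetadata`, `SysOutputInfo`, and `Processor` are simplified stand-ins for the real classes. It shows both options from the list above: an explicit conversion from the loader metadata, and the system metadata held as a direct member of `SysOutputInfo`.

```python
import dataclasses
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class SystemMetadata:
    """Hypothetical struct holding only the data that describes the system."""

    task_name: str
    system_name: Optional[str] = None
    dataset_name: Optional[str] = None


@dataclass
class FileLoaderMetadata:
    """Simplified stand-in for the loader-side metadata (fields are assumed)."""

    system_name: Optional[str] = None
    dataset_name: Optional[str] = None
    # Loader-only data that should not leak into the system metadata.
    custom_features: Dict[str, Any] = field(default_factory=dict)

    def to_system_metadata(self, task_name: str) -> SystemMetadata:
        # Explicit conversion: only the shared fields are copied over.
        return SystemMetadata(
            task_name=task_name,
            system_name=self.system_name,
            dataset_name=self.dataset_name,
        )


@dataclass
class SysOutputInfo:
    """Simplified stand-in that holds the system metadata as a direct member."""

    system_metadata: SystemMetadata
    results: Dict[str, Any] = field(default_factory=dict)


class Processor:
    """Simplified stand-in; the real process() computes analysis results."""

    def process(
        self, metadata: SystemMetadata, sys_output: List[Dict[str, Any]]
    ) -> SysOutputInfo:
        # Reject plain dicts (and anything else) instead of converting silently.
        if not isinstance(metadata, SystemMetadata):
            raise TypeError(
                f"expected SystemMetadata, got {type(metadata).__name__}"
            )
        return SysOutputInfo(system_metadata=metadata)


# Usage: the conversion is now a visible, explicit step.
# The values below are placeholders.
loader_metadata = FileLoaderMetadata(system_name="my_system", dataset_name="my_dataset")
metadata = loader_metadata.to_system_metadata(task_name="named-entity-recognition")
info = Processor().process(metadata, sys_output=[])

# The old dict-based path fails loudly rather than being accepted implicitly.
try:
    Processor().process(dataclasses.asdict(loader_metadata), sys_output=[])
except TypeError as exc:
    print(f"rejected: {exc}")
```

With this shape, loader-only fields such as `custom_features` never reach `SysOutputInfo` unless someone copies them deliberately, and a type checker can flag call sites that still pass a dict.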
