twinkle dataset preprocessing takes a very long time #235

Description

@120L021602

Checklist

  • I have searched existing issues, and this is a new feature request.

Feature Request Description

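When training with twinkle, dataset preprocessing takes a very long time. I have 220,000 training samples and enabled 8 processes for preprocessing. The displayed throughput is a little over 3 samples per second, and the total processing time comes to roughly 30 hours, which is considerably slower than ms-swift.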
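For reference, the figures above can be cross-checked with a little arithmetic. The sketch below is not part of twinkle or ms-swift; the helper names `estimated_hours` and `required_rate` are hypothetical and only relate sample count, throughput, and wall-clock time, assuming a constant rate for the whole run.

```python
def estimated_hours(num_samples: int, samples_per_second: float) -> float:
    """Wall-clock hours needed to process num_samples at a constant rate."""
    if samples_per_second <= 0:
        raise ValueError("samples_per_second must be positive")
    return num_samples / samples_per_second / 3600


def required_rate(num_samples: int, hours: float) -> float:
    """Average samples per second implied by finishing num_samples in the given hours."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return num_samples / (hours * 3600)


# 220,000 samples at exactly 3 samples/s
print(round(estimated_hours(220_000, 3.0), 1))  # 20.4

# Average rate implied by a 30-hour total
print(round(required_rate(220_000, 30), 2))  # 2.04
```

Both reported numbers are approximate, so the gap between roughly 20 hours at 3 samples/s and the roughly 30-hour estimate may simply mean the displayed rate is not constant over the run. Either way, the average throughput is on the order of 2 to 3 samples per second across 8 processes.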

Pull Request

No response

Metadata

Assignees

No one assigned

Labels

enhancement: New feature or request
