1 change: 1 addition & 0 deletions README.md
@@ -46,6 +46,7 @@ mindspore versions.
 | mindocr | mindspore |
 |:-------:|:-----------:|
 | main | master |
+| 0.5 | 2.5.0 |
 | 0.4 | 2.3.0/2.3.1 |
 | 0.3 | 2.2.10 |
 | 0.1 | 1.8 |
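
The table in this hunk pairs each MindOCR release with the MindSpore version it targets. A minimal Python sketch of how a setup script might check a pair against that table is given here. The dictionary is copied from the hunk; `COMPATIBLE_MINDSPORE` and `is_compatible` are hypothetical names for illustration and are not part of MindOCR. The sketch also assumes that an entry such as `1.8` covers patch releases like `1.8.1`, which the README does not state.

```python
# Pairs copied from the README compatibility table in the hunk above.
COMPATIBLE_MINDSPORE = {
    "main": ("master",),
    "0.5": ("2.5.0",),
    "0.4": ("2.3.0", "2.3.1"),
    "0.3": ("2.2.10",),
    "0.1": ("1.8",),
}


def is_compatible(mindocr_version: str, mindspore_version: str) -> bool:
    """Return True if the pair is listed in the table (hypothetical helper).

    Assumption: a listed version also matches its patch releases,
    e.g. "1.8" matches "1.8.1" but not "1.80".
    """
    allowed = COMPATIBLE_MINDSPORE.get(mindocr_version, ())
    return any(
        mindspore_version == listed or mindspore_version.startswith(listed + ".")
        for listed in allowed
    )
```

For example, `is_compatible("0.5", "2.5.0")` holds for the row this PR adds, while `is_compatible("0.5", "2.3.1")` does not.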
1 change: 1 addition & 0 deletions README_CN.md
@@ -46,6 +46,7 @@ MindOCR is a [MindSpore](https://www.mindspore.cn/)-based OCR
 | mindocr | mindspore |
 |:-------:|:-----------:|
 | main | master |
+| 0.5 | 2.5.0 |
 | 0.4 | 2.3.0/2.3.1 |
 | 0.3 | 2.2.10 |
 | 0.1 | 1.8 |
34 changes: 16 additions & 18 deletions configs/cls/mobilenetv3/README.md
@@ -31,25 +31,11 @@ Currently we support the 0 and 180 degree classification. You can update the par

 </div>

+## Requirements

-## Results
-
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-MobileNetV3 is pretrained on ImageNet. For text direction classification task, we further train MobileNetV3 on RCTW17, MTWI and LSVT datasets.
-
-Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode
-<div align="center">
-
-| **model name** | **cards** | **batch size** | **img/s** | **accuracy** | **config** | **weight** |
-|----------------|-----------|----------------|-----------|--------------|-----------------------------------------------------|------------------------------------------------|
-| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
-</div>
-
-
-
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start

@@ -128,6 +114,18 @@ Please set the checkpoint path to the arg `ckpt_load_path` in the `eval` section
 python tools/eval.py -c configs/cls/mobilenetv3/cls_mv3.yaml
 ```

+## Performance
+
+MobileNetV3 is pretrained on ImageNet. For text direction classification task, we further train MobileNetV3 on RCTW17, MTWI and LSVT datasets.
+
+Experiments are tested on ascend 910* with mindspore 2.5.0 graph mode
+<div align="center">
+
+| **model name** | **cards** | **batch size** | **img/s** | **accuracy** | **config** | **weight** |
+|----------------|-----------|----------------|-----------|--------------|-----------------------------------------------------|------------------------------------------------|
+| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
+</div>
+
 ## References

 <!--- Guideline: Citation format GB/T 7714 is suggested. -->
31 changes: 16 additions & 15 deletions configs/cls/mobilenetv3/README_CN.md
@@ -31,22 +31,11 @@ MobileNetV3[[1](#参考文献)] was released in 2019. This version combines V1's deep

 </div>

+### Requirements

-## Results
-
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-MobileNetV3 is pretrained on ImageNet. We further train it on the RCTW17, MTWI and LSVT datasets for the text direction classification task.
-
-Experiments are tested on ascend 910* in graph mode with mindspore 2.3.1
-<div align="center">
-
-| **model name** | **cards** | **batch size per card** | **img/s** | **accuracy** | **config** | **weight** |
-|-------------|--------|------------|-----------|---------|----------------------|------------------------------------------------------------------------------------------|
-| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
-</div>
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |


 ## Quick Start
@@ -128,6 +117,18 @@ model:
 python tools/eval.py -c configs/cls/mobilenetv3/cls_mv3.yaml
 ```

+## Performance
+
+MobileNetV3 is pretrained on ImageNet. We further train it on the RCTW17, MTWI and LSVT datasets for the text direction classification task.
+
+Experiments are tested on ascend 910* in graph mode with mindspore 2.5.0
+<div align="center">
+
+| **model name** | **cards** | **batch size per card** | **img/s** | **accuracy** | **config** | **weight** |
+|-------------|--------|------------|-----------|---------|----------------------|------------------------------------------------------------------------------------------|
+| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
+</div>
+
 ## References

 <!--- Guideline: Citation format GB/T 7714 is suggested. -->
6 changes: 2 additions & 4 deletions configs/det/dbnet/README.md
@@ -56,13 +56,11 @@ combination of these two modules leads to scale-robust feature fusion.
 DBNet++ performs better in detecting text instances of diverse scales, especially for large-scale text instances where
 DBNet may generate inaccurate or discrete bounding boxes.

-
 ## Requirements

 | mindspore | ascend driver | firmware | cann toolkit/kernel |
 |:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start

@@ -290,7 +288,7 @@ msrun --worker_num=2 --local_worker_num=2 python tools/train.py --config configs
 # Based on verification,binding cores usually results in performance acceleration.Please configure the parameters and run.
 msrun --bind_core=True --worker_num=2 --local_worker_num=2 python tools/train.py --config configs/det/dbnet/db_r50_icdar15.yaml
 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).

 The training result (including checkpoints, per-epoch performance and curves) will be saved in the directory parsed by the arg `ckpt_save_dir` in yaml config file. The default directory is `./tmp_det`.

5 changes: 2 additions & 3 deletions configs/det/dbnet/README_CN.md
@@ -45,8 +45,7 @@ DBNet++ performs better at detecting text of different sizes, especially for text of size

 | mindspore | ascend driver | firmware | cann toolkit/kernel |
 |:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start

@@ -271,7 +270,7 @@ msrun --worker_num=2 --local_worker_num=2 python tools/train.py --config configs
 # Based on verification, binding cores speeds up training in most cases. Please configure the parameters and run.
 msrun --bind_core=True --worker_num=2 --local_worker_num=2 python tools/train.py --config configs/det/dbnet/db_r50_icdar15.yaml
 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).

 The training results (including checkpoints, per-epoch performance and curves) will be saved in the directory set by the `ckpt_save_dir` arg in the yaml config file. The default directory is `./tmp_det`.

2 changes: 1 addition & 1 deletion configs/det/dbnet/README_CN_PP-OCRv3.md
@@ -338,7 +338,7 @@ msrun --worker_num=4 --local_worker_num=4 python tools/train.py --config configs
 # Based on verification, binding cores speeds up training in most cases. Please configure the parameters and run.
 msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/det/dbnet/db_mobilenetv3_ppocrv3.yaml
 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).


 * Single-card training
10 changes: 9 additions & 1 deletion configs/det/east/README.md
@@ -29,6 +29,14 @@ EAST uses regression for the position and rotation angle of the text box, enabli
 4. **Text detection branch**:
 After determining the location and size of the text region, EAST further classifies these regions as text or non-text areas. For this purpose, a fully convolutional text branch is employed for binary classification of the text areas.

+
+## Requirements
+
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
+
+
 ## Quick Start

 ### Installation
@@ -128,7 +136,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
 # Based on verification,binding cores usually results in performance acceleration.Please configure the parameters and run.
 msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/det/east/east_r50_icdar15.yaml
 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).

 The training result (including checkpoints, per-epoch performance and curves) will be saved in the directory parsed by the arg `ckpt_save_dir` in yaml config file. The default directory is `./tmp_det`.
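
The `msrun` invocations quoted in this hunk differ only in the `--bind_core=True` flag. As a rough sketch, the command line could be assembled programmatically as follows. `build_msrun_command` is a hypothetical helper, not a MindOCR or MindSpore API, and it uses only the flags that appear in the README. Setting `--local_worker_num` equal to `--worker_num` mirrors the README's examples and assumes that all workers run on a single machine.

```python
import shlex


def build_msrun_command(config_path: str, worker_num: int, bind_core: bool = False) -> list:
    """Assemble an msrun launch command from the flags shown in the README.

    Hypothetical helper for illustration only.
    """
    cmd = ["msrun"]
    if bind_core:
        # The README reports that core binding usually improves performance.
        cmd.append("--bind_core=True")
    cmd += [
        f"--worker_num={worker_num}",
        # Assumption: single-machine run, so every worker is local.
        f"--local_worker_num={worker_num}",
        "python",
        "tools/train.py",
        "--config",
        config_path,
    ]
    return cmd


# Render the 8-card EAST example as a shell-ready string.
print(shlex.join(build_msrun_command("configs/det/east/east_r50_icdar15.yaml", 8, bind_core=True)))
```

Returning a list rather than a string lets the result be passed directly to `subprocess.run` without shell quoting concerns.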

4 changes: 2 additions & 2 deletions configs/det/east/README_CN.md
@@ -33,7 +33,7 @@ The overall architecture of EAST is shown in Figure 1 and consists of the following stages:

 | mindspore | ascend driver | firmware | cann toolkit/kernel |
 |:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start

@@ -132,7 +132,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
 # Based on verification, binding cores speeds up training in most cases. Please configure the parameters and run.
 msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/det/east/east_r50_icdar15.yaml
 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).

 The training results (including checkpoints, per-epoch performance and curves) will be saved in the directory set by the `ckpt_save_dir` arg in the yaml config file. The default directory is `./tmp_det`.

4 changes: 2 additions & 2 deletions configs/det/psenet/README.md
@@ -25,7 +25,7 @@ The overall architecture of PSENet is presented in Figure 1. It consists of mult

 | mindspore | ascend driver | firmware | cann toolkit/kernel |
 |:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start

@@ -156,7 +156,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
 msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/det/psenet/pse_r152_icdar15.yaml

 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).


 The training result (including checkpoints, per-epoch performance and curves) will be saved in the directory parsed by the arg `ckpt_save_dir` in yaml config file. The default directory is `./tmp_det`.
4 changes: 2 additions & 2 deletions configs/det/psenet/README_CN.md
@@ -25,7 +25,7 @@ The overall architecture of PSENet is shown in Figure 1 and consists of the following stages:

 | mindspore | ascend driver | firmware | cann toolkit/kernel |
 |:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start

@@ -155,7 +155,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
 # Based on verification, binding cores speeds up training in most cases. Please configure the parameters and run.
 msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/det/psenet/pse_r152_icdar15.yaml
 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).

 The training results (including checkpoints, per-epoch performance and curves) will be saved in the directory set by the `ckpt_save_dir` arg in the yaml config file. The default directory is `./tmp_det`.

50 changes: 19 additions & 31 deletions configs/kie/vi_layoutxlm/README.md
@@ -40,43 +40,20 @@ After obtaining αij from the original self-attention layer, considering the lar
 <em> Figure 1. LayoutXLM(LayoutLMv2) architecture [<a href="#References">1</a>] </em>
 </p>

-## Results
-<!--- Guideline:
-Table Format:
-- Model: model name in lower case with _ seperator.
-- Context: Training context denoted as {device}x{pieces}-{MS mode}, where mindspore mode can be G - graph mode or F - pynative mode with ms function. For example, D910x8-G is for training on 8 pieces of Ascend 910 NPU using graph mode.
-- Top-1 and Top-5: Keep 2 digits after the decimal point.
-- Params (M): # of model parameters in millions (10^6). Keep 2 digits after the decimal point
-- Recipe: Training recipe/configuration linked to a yaml config file. Use absolute url path.
-- Download: url of the pretrained model weights. Use absolute url path.
--->
-
-### Accuracy
-
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-According to our experiments, the performance and accuracy evaluation([Model Evaluation](#33-Model-Evaluation)) results of training ([Model Training](#32-Model-Training)) on the XFUND Chinese dataset are as follows:
-
-
-Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode
-<div align="center">
-
-| **model name** | **cards** | **batch size** | **img/s** | **hmean** | **config** | **weight** |
-|----------------|-----------|----------------|-----------|-----------|-----------------------------------------------------|------------------------------------------------|
-| LayoutXLM | 1 | 8 | 73.26 | 90.34% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
-| VI-LayoutXLM | 1 | 8 | 110.6 | 93.31% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
-</div>
-
+## Requirements
+
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start
-### Preparation

-#### Installation
+### Installation

 Please refer to the [installation instruction](https://github.com/mindspore-lab/mindocr#installation) in MindOCR.

+### Dataset preparation
+
 #### Dataset Download

 [The XFUND dataset](https://github.com/doc-analysis/XFUND) is used as the experimental dataset. The XFUND dataset is a multilingual dataset proposed by Microsoft for the Knowledge-Intensive Extraction (KIE) task. It consists of seven datasets, each containing 149 training samples and 50 validation samples.
@@ -168,7 +145,18 @@ Recognition results are as shown in the image, and the image is saved as`inferen
 <em> example_ser.jpg </em>
 </p>

+## Performance

+According to our experiments, the performance of evaluation on the XFUND Chinese dataset are as follows:
+
+Experiments are tested on ascend 910* with mindspore 2.5.0 graph mode
+<div align="center">
+
+| **model name** | **cards** | **batch size** | **img/s** | **hmean** | **config** | **weight** |
+|----------------|-----------|----------------|-----------|-----------|-----------------------------------------------------|------------------------------------------------|
+| LayoutXLM | 1 | 8 | 73.26 | 90.34% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
+| VI-LayoutXLM | 1 | 8 | 110.6 | 93.31% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
+</div>

 ## References
 <!--- Guideline: Citation format GB/T 7714 is suggested. -->