diff --git a/README.md b/README.md
index 55ff82209..6df373911 100644
--- a/README.md
+++ b/README.md
@@ -46,6 +46,7 @@ mindspore versions.
| mindocr | mindspore |
|:-------:|:-----------:|
| main | master |
+| 0.5 | 2.5.0 |
| 0.4 | 2.3.0/2.3.1 |
| 0.3 | 2.2.10 |
| 0.1 | 1.8 |
diff --git a/README_CN.md b/README_CN.md
index 5b0de827c..fd6158d4b 100644
--- a/README_CN.md
+++ b/README_CN.md
@@ -46,6 +46,7 @@ MindOCR是一个基于[MindSpore](https://www.mindspore.cn/) 框架开发的OCR
| mindocr | mindspore |
|:-------:|:-----------:|
| main | master |
+| 0.5 | 2.5.0 |
| 0.4 | 2.3.0/2.3.1 |
| 0.3 | 2.2.10 |
| 0.1 | 1.8 |
diff --git a/configs/cls/mobilenetv3/README.md b/configs/cls/mobilenetv3/README.md
index 7f801b4dc..45a867ac1 100644
--- a/configs/cls/mobilenetv3/README.md
+++ b/configs/cls/mobilenetv3/README.md
@@ -31,25 +31,11 @@ Currently we support the 0 and 180 degree classification. You can update the par
+## Requirements
-## Results
-
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-MobileNetV3 is pretrained on ImageNet. For text direction classification task, we further train MobileNetV3 on RCTW17, MTWI and LSVT datasets.
-
-Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode
-
-
-| **model name** | **cards** | **batch size** | **img/s** | **accuracy** | **config** | **weight** |
-|----------------|-----------|----------------|-----------|--------------|-----------------------------------------------------|------------------------------------------------|
-| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
-
-
-
-
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## Quick Start
@@ -128,6 +114,18 @@ Please set the checkpoint path to the arg `ckpt_load_path` in the `eval` section
python tools/eval.py -c configs/cls/mobilenetv3/cls_mv3.yaml
```
+## Performance
+
+MobileNetV3 is pretrained on ImageNet. For the text direction classification task, we further train MobileNetV3 on the RCTW17, MTWI and LSVT datasets.
+
+Experiments are tested on Ascend 910* with MindSpore 2.5.0 in graph mode.
+
+
+| **model name** | **cards** | **batch size** | **img/s** | **accuracy** | **config** | **weight** |
+|----------------|-----------|----------------|-----------|--------------|-----------------------------------------------------|------------------------------------------------|
+| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
+
+
## References
diff --git a/configs/cls/mobilenetv3/README_CN.md b/configs/cls/mobilenetv3/README_CN.md
index 95cf12e58..89de45bf4 100644
--- a/configs/cls/mobilenetv3/README_CN.md
+++ b/configs/cls/mobilenetv3/README_CN.md
@@ -31,22 +31,11 @@ MobileNetV3[[1](#参考文献)]于2019年发布,这个版本结合了V1的deep
+### 配套版本
-## 实验结果
-
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-MobileNetV3在ImageNet上预训练。另外,我们进一步在RCTW17、MTWI和LSVT数据集上进行了文字方向分类任务的训练。
-
-在采用图模式的ascend 910*上实验结果,mindspore版本为2.3.1
-
-
-| **模型名称** | **卡数** | **单卡批量大小** | **img/s** | **准确率** | **配置** | **权重** |
-|-------------|--------|------------|-----------|---------|----------------------|------------------------------------------------------------------------------------------|
-| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
-
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## 快速上手
@@ -128,6 +117,18 @@ model:
python tools/eval.py -c configs/cls/mobilenetv3/cls_mv3.yaml
```
+## 性能表现
+
+MobileNetV3在ImageNet上预训练。另外,我们进一步在RCTW17、MTWI和LSVT数据集上进行了文字方向分类任务的训练。
+
+在采用图模式的ascend 910*上的实验结果如下,mindspore版本为2.5.0。
+
+
+| **模型名称** | **卡数** | **单卡批量大小** | **img/s** | **准确率** | **配置** | **权重** |
+|-------------|--------|------------|-----------|---------|----------------------|------------------------------------------------------------------------------------------|
+| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
+
+
## 参考文献
diff --git a/configs/det/dbnet/README.md b/configs/det/dbnet/README.md
index 62b0f34fb..32f830304 100644
--- a/configs/det/dbnet/README.md
+++ b/configs/det/dbnet/README.md
@@ -56,13 +56,11 @@ combination of these two modules leads to scale-robust feature fusion.
DBNet++ performs better in detecting text instances of diverse scales, especially for large-scale text instances where
DBNet may generate inaccurate or discrete bounding boxes.
-
## Requirements
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## Quick Start
@@ -290,7 +288,7 @@ msrun --worker_num=2 --local_worker_num=2 python tools/train.py --config configs
# Based on verification,binding cores usually results in performance acceleration.Please configure the parameters and run.
msrun --bind_core=True --worker_num=2 --local_worker_num=2 python tools/train.py --config configs/det/dbnet/db_r50_icdar15.yaml
```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).
The training result (including checkpoints, per-epoch performance and curves) will be saved in the directory parsed by the arg `ckpt_save_dir` in yaml config file. The default directory is `./tmp_det`.
diff --git a/configs/det/dbnet/README_CN.md b/configs/det/dbnet/README_CN.md
index d42e8579e..c94d967f5 100644
--- a/configs/det/dbnet/README_CN.md
+++ b/configs/det/dbnet/README_CN.md
@@ -45,8 +45,7 @@ DBNet++在检测不同尺寸的文本方面表现更好,尤其是对于尺寸
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## 快速上手
@@ -271,7 +270,7 @@ msrun --worker_num=2 --local_worker_num=2 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=2 --local_worker_num=2 python tools/train.py --config configs/det/dbnet/db_r50_icdar15.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html)。
训练结果(包括checkpoint、每个epoch的性能和曲线图)将被保存在yaml配置文件的`ckpt_save_dir`参数配置的路径下,默认为`./tmp_det`。
diff --git a/configs/det/dbnet/README_CN_PP-OCRv3.md b/configs/det/dbnet/README_CN_PP-OCRv3.md
index 073c20041..3a4f771ca 100644
--- a/configs/det/dbnet/README_CN_PP-OCRv3.md
+++ b/configs/det/dbnet/README_CN_PP-OCRv3.md
@@ -338,7 +338,7 @@ msrun --worker_num=4 --local_worker_num=4 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/det/dbnet/db_mobilenetv3_ppocrv3.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html)。
* 单卡训练
diff --git a/configs/det/east/README.md b/configs/det/east/README.md
index 30a057264..81d8de830 100644
--- a/configs/det/east/README.md
+++ b/configs/det/east/README.md
@@ -29,6 +29,14 @@ EAST uses regression for the position and rotation angle of the text box, enabli
4. **Text detection branch**:
After determining the location and size of the text region, EAST further classifies these regions as text or non-text areas. For this purpose, a fully convolutional text branch is employed for binary classification of the text areas.
+
+## Requirements
+
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
+
+
## Quick Start
### Installation
@@ -128,7 +136,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
# Based on verification,binding cores usually results in performance acceleration.Please configure the parameters and run.
msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/det/east/east_r50_icdar15.yaml
```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).
The training result (including checkpoints, per-epoch performance and curves) will be saved in the directory parsed by the arg `ckpt_save_dir` in yaml config file. The default directory is `./tmp_det`.
diff --git a/configs/det/east/README_CN.md b/configs/det/east/README_CN.md
index aa6fd3f26..aa81c7476 100644
--- a/configs/det/east/README_CN.md
+++ b/configs/det/east/README_CN.md
@@ -33,7 +33,7 @@ EAST的整体架构图如图1所示,包含以下阶段:
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## 快速上手
@@ -132,7 +132,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/det/east/east_r50_icdar15.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).
训练结果(包括checkpoint、每个epoch的性能和曲线图)将被保存在yaml配置文件的`ckpt_save_dir`参数配置的路径下,默认为`./tmp_det`。
diff --git a/configs/det/psenet/README.md b/configs/det/psenet/README.md
index fee0cd2d8..556590769 100644
--- a/configs/det/psenet/README.md
+++ b/configs/det/psenet/README.md
@@ -25,7 +25,7 @@ The overall architecture of PSENet is presented in Figure 1. It consists of mult
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## Quick Start
@@ -156,7 +156,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/det/psenet/pse_r152_icdar15.yaml
```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).
The training result (including checkpoints, per-epoch performance and curves) will be saved in the directory parsed by the arg `ckpt_save_dir` in yaml config file. The default directory is `./tmp_det`.
diff --git a/configs/det/psenet/README_CN.md b/configs/det/psenet/README_CN.md
index f53e2e67e..d47c9c986 100644
--- a/configs/det/psenet/README_CN.md
+++ b/configs/det/psenet/README_CN.md
@@ -25,7 +25,7 @@ PSENet的整体架构图如图1所示,包含以下阶段:
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## 快速上手
@@ -155,7 +155,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/det/psenet/pse_r152_icdar15.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html)。
训练结果(包括checkpoint、每个epoch的性能和曲线图)将被保存在yaml配置文件的`ckpt_save_dir`参数配置的路径下,默认为`./tmp_det`。
diff --git a/configs/kie/vi_layoutxlm/README.md b/configs/kie/vi_layoutxlm/README.md
index 35f896bd5..07b6b0fda 100644
--- a/configs/kie/vi_layoutxlm/README.md
+++ b/configs/kie/vi_layoutxlm/README.md
@@ -40,43 +40,20 @@ After obtaining αij from the original self-attention layer, considering the lar
Figure 1. LayoutXLM(LayoutLMv2) architecture [1]
-## Results
-
-
-### Accuracy
-
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-According to our experiments, the performance and accuracy evaluation([Model Evaluation](#33-Model-Evaluation)) results of training ([Model Training](#32-Model-Training)) on the XFUND Chinese dataset are as follows:
-
-
-Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode
-
-
-| **model name** | **cards** | **batch size** | **img/s** | **hmean** | **config** | **weight** |
-|----------------|-----------|----------------|-----------|-----------|-----------------------------------------------------|------------------------------------------------|
-| LayoutXLM | 1 | 8 | 73.26 | 90.34% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
-| VI-LayoutXLM | 1 | 8 | 110.6 | 93.31% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
-
-
+## Requirements
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## Quick Start
-### Preparation
-#### Installation
+### Installation
+
Please refer to the [installation instruction](https://github.com/mindspore-lab/mindocr#installation) in MindOCR.
+### Dataset preparation
+
#### Dataset Download
[The XFUND dataset](https://github.com/doc-analysis/XFUND) is used as the experimental dataset. The XFUND dataset is a multilingual dataset proposed by Microsoft for the Knowledge-Intensive Extraction (KIE) task. It consists of seven datasets, each containing 149 training samples and 50 validation samples.
@@ -168,7 +145,18 @@ Recognition results are as shown in the image, and the image is saved as`inferen
example_ser.jpg
+## Performance
+
+According to our experiments, the evaluation results on the XFUND Chinese dataset are as follows:
+
+Experiments are tested on Ascend 910* with MindSpore 2.5.0 in graph mode.
+
+| **model name** | **cards** | **batch size** | **img/s** | **hmean** | **config** | **weight** |
+|----------------|-----------|----------------|-----------|-----------|-----------------------------------------------------|------------------------------------------------|
+| LayoutXLM | 1 | 8 | 73.26 | 90.34% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
+| VI-LayoutXLM | 1 | 8 | 110.6 | 93.31% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
+
## References
diff --git a/configs/kie/vi_layoutxlm/README_CN.md b/configs/kie/vi_layoutxlm/README_CN.md
index 7174e7e43..af866caf8 100644
--- a/configs/kie/vi_layoutxlm/README_CN.md
+++ b/configs/kie/vi_layoutxlm/README_CN.md
@@ -35,30 +35,19 @@ Encoder concat视觉embedding和文本embedding到一个统一的序列,并与
图1. LayoutXLM(LayoutLMv2)架构图 [1]
-## 评估结果
+### 配套版本
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-根据我们的实验,在XFUND中文数据集上训练的([模型评估](#33-模型评估))结果如下:
-
-在采用图模式的ascend 910*上实验结果,mindspore版本为2.3.1
-
-
-| **模型名称** | **卡数** | **单卡批量大小** | **img/s** | **hmean** | **配置** | **权重** |
-|--------------|--------|------------|-----------|-----------|--------------------------------------------------|---------------------------------------------------------------------------------------------------|
-| LayoutXLM | 1 | 8 | 73.26 | 90.34% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
-| VI-LayoutXLM | 1 | 8 | 110.6 | 93.31% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
-
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
+## 快速开始
+### 安装
-## 快速开始
-### 环境及数据准备
+环境安装教程请参考MindOCR的 [安装指南](https://github.com/mindspore-lab/mindocr#installation)。
-#### 安装
-环境安装教程请参考MindOCR的 [installation instruction](https://github.com/mindspore-lab/mindocr#installation).
+### 数据准备
#### 数据集下载
这里使用[XFUND数据集](https://github.com/doc-analysis/XFUND)做为实验数据集。 XFUN数据集是微软提出的一个用于KIE任务的多语言数据集,共包含七个数据集,每个数据集包含149张训练集和50张验证集
@@ -115,7 +104,6 @@ cd ..
}
```
-
### 模型评估
若要评估已训练模型的准确性,可以使用`eval.py`。请在yaml配置文件的`eval`部分将参数`ckpt_load_path`设置为模型checkpoint的文件路径,然后运行:
@@ -124,7 +112,6 @@ cd ..
python tools/eval.py --config configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh.yaml
```
-
### 模型推理
若要使用已训练的模型进行推理,可使用`tools/infer/text/predict_ser.py`进行推理并将结果进行可视化展示。
@@ -150,8 +137,18 @@ python tools/infer/text/predict_ser.py --rec_algorithm CRNN_CH --image_dir {dir
example_ser.jpg
+## 性能表现
+根据我们的实验,在XFUND中文数据集上训练的模型评估结果如下:
+
+在采用图模式的ascend 910*上的实验结果如下,mindspore版本为2.5.0。
+
+| **模型名称** | **卡数** | **单卡批量大小** | **img/s** | **hmean** | **配置** | **权重** |
+|--------------|--------|------------|-----------|-----------|--------------------------------------------------|---------------------------------------------------------------------------------------------------|
+| LayoutXLM | 1 | 8 | 73.26 | 90.34% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
+| VI-LayoutXLM | 1 | 8 | 110.6 | 93.31% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
+
## 参考文献
diff --git a/configs/layout/layoutlmv3/README.md b/configs/layout/layoutlmv3/README.md
index 6ff871001..727cd3de3 100644
--- a/configs/layout/layoutlmv3/README.md
+++ b/configs/layout/layoutlmv3/README.md
@@ -26,18 +26,20 @@ The representation of image vectors typically relies on CNN-extracted feature gr
Figure 1. LayoutLMv3 architecture [1]
-## Quick Start
+## Requirements
+
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
-### Preparation
+## Quick Start
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-| 2.4.0 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC3.beta1 |
+### Installation
-#### Installation
Please refer to the [installation instruction](https://github.com/mindspore-lab/mindocr#installation) in MindOCR.
+### Dataset preparation
+
#### PubLayNet Dataset Preparation
PubLayNet is a dataset for document layout analysis. It contains images of research papers and articles and annotations for various elements in a page such as "text", "list", "figure" etc in these research paper images. The dataset was obtained by automatically matching the XML representations and the content of over 1 million PDF articles that are publicly available on PubMed Central.
@@ -74,15 +76,6 @@ python tools/param_converter_from_torch.py \
```bash
python tools/eval.py --config configs/layout/layoutlmv3/layoutlmv3_publaynet.yaml
```
-The evaluation results on the public benchmark dataset (PublayNet) are as follows:
-
-Experiments are tested on ascend 910* with mindspore 2.3.1 pynative mode
-
-
-| **model name** | **cards** | **batch size** | **img/s** | **map** | **config** |
-|----------------|-----------|----------------|-----------|---------|----------------------------------------------------------------------------------------------------------------|
-| LayoutLMv3 | 1 | 1 | 2.7 | 94.3% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/layout/layoutlmv3/layoutlmv3_publaynet.yaml) |
-
### Model Inference
@@ -99,6 +92,17 @@ layout_res.png (Model inference visualization results)
layout_results.txt (Model inference text results)
+## Performance
+
+The evaluation results on the public benchmark dataset (PublayNet) are as follows:
+
+Experiments are tested on Ascend 910* with MindSpore 2.5.0 in pynative mode.
+
+
+| **model name** | **cards** | **batch size** | **img/s** | **map** | **config** |
+|----------------|-----------|----------------|-----------|---------|----------------------------------------------------------------------------------------------------------------|
+| LayoutLMv3 | 1 | 1 | 2.7 | 94.3% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/layout/layoutlmv3/layoutlmv3_publaynet.yaml) |
+
## References
diff --git a/configs/layout/layoutlmv3/README_CN.md b/configs/layout/layoutlmv3/README_CN.md
index 8de56ff78..0c982740a 100644
--- a/configs/layout/layoutlmv3/README_CN.md
+++ b/configs/layout/layoutlmv3/README_CN.md
@@ -29,18 +29,19 @@ LayoutLMv3 还应用了文本——图像多模态 Transformer 架构来学习
图1. LayoutLMv3架构图 [1]
+### 配套版本
-## 快速开始
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
-### 环境及数据准备
+## 快速上手
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-| 2.4.0 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC3.beta1 |
+### 安装
-#### 安装
-环境安装教程请参考MindOCR的 [installation instruction](https://github.com/mindspore-lab/mindocr#installation).
+环境安装教程请参考MindOCR的 [安装指南](https://github.com/mindspore-lab/mindocr#installation)。
+
+### 数据准备
#### PubLayNet数据集准备
@@ -78,15 +79,6 @@ python tools/param_converter_from_torch.py \
```bash
python tools/eval.py --config configs/layout/layoutlmv3/layoutlmv3_publaynet.yaml
```
-在公开基准数据集(PublayNet)上的-评估结果如下:
-
-在采用动态图模式的ascend 910*上实验结果,mindspore版本为2.3.1
-
-
-| **model name** | **cards** | **batch size** | **img/s** | **map** | **config** |
-|----------------|-----------|----------------|-----------|---------|----------------------------------------------------------------------------------------------------------------|
-| LayoutLMv3 | 1 | 1 | 2.7 | 94.3% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/layout/layoutlmv3/layoutlmv3_publaynet.yaml) |
-
### 模型推理
@@ -103,6 +95,17 @@ layout_res.png (模型推理可视化结果)
layout_results.txt (模型推理文本结果)
+## 性能表现
+
+在公开基准数据集(PublayNet)上的评估结果如下:
+
+在采用动态图模式的ascend 910*上的实验结果如下,mindspore版本为2.5.0。
+
+
+| **model name** | **cards** | **batch size** | **img/s** | **map** | **config** |
+|----------------|-----------|----------------|-----------|---------|----------------------------------------------------------------------------------------------------------------|
+| LayoutLMv3 | 1 | 1 | 2.7 | 94.3% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/layout/layoutlmv3/layoutlmv3_publaynet.yaml) |
+
## 参考文献
diff --git a/configs/layout/yolov8/README.md b/configs/layout/yolov8/README.md
index b39ef8ac3..72310a3fe 100644
--- a/configs/layout/yolov8/README.md
+++ b/configs/layout/yolov8/README.md
@@ -17,40 +17,24 @@ In order to adapt to the layout analysis task, we have made some improvements to

-## Results
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-### Accuracy
-
-According to our experiment, the evaluation results on the public benchmark dataset (PublayNet) are as follows:
-
-Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode
-
-
-| **model name** | **cards** | **batch size** | **ms/step** | **img/s** | **map** | **config** | **weight** |
-|----------------|-----------|----------------|---------------|-----------|---------|-----------------------------------------------------|------------------------------------------------|
-| YOLOv8 | 4 | 16 | 284.93| 56.15 | 94.4% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/layout/yolov8/yolov8n.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/yolov8/yolov8n-4b9e8004.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/yolov8/yolov8n-2a1f68ab.mindir) |
-
-
-**Notes:**
-- To reproduce the result on other contexts, please ensure the global batch size is the same.
-- The models are trained from scratch without any pre-training. For more dataset details of training and evaluation, please refer to [PubLayNet Dataset Preparation](#3.1.2 PubLayNet Dataset Preparation) section.
-- The input Shapes of MindIR of YOLOv8 is (1, 3, 800, 800).
+## Requirements
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## Quick Start
-### Preparation
-#### Installation
+### Installation
Please refer to the [installation instruction](https://github.com/mindspore-lab/mindocr#installation) in MindOCR.
+### Dataset preparation
+
#### PubLayNet Dataset Preparation
PubLayNet is a dataset for document layout analysis. It contains images of research papers and articles and annotations for various elements in a page such as "text", "list", "figure" etc in these research paper images. The dataset was obtained by automatically matching the XML representations and the content of over 1 million PDF articles that are publicly available on PubMed Central.
-#### Check YAML Config Files
+### Check YAML Config Files
Apart from the dataset setting, please also check the following important args: `system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_path`, `eval.ckpt_load_path`, `eval.dataset.dataset_path`, `eval.loader.batch_size`. Explanations of these important args:
```yaml
@@ -107,7 +91,7 @@ msrun --worker_num=4 --local_worker_num=4 python tools/train.py --config configs
# Based on verification,binding cores usually results in performance acceleration.Please configure the parameters and run.
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/layout/yolov8/yolov8n.yaml
```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).
* Standalone Training
@@ -129,6 +113,23 @@ To evaluate the accuracy of the trained model, you can use `eval.py`. Please set
python tools/eval.py --config configs/layout/yolov8/yolov8n.yaml
```
+## Performance
+
+According to our experiments, the evaluation results on the public benchmark dataset (PublayNet) are as follows:
+
+Experiments are tested on Ascend 910* with MindSpore 2.5.0 in graph mode.
+
+
+| **model name** | **cards** | **batch size** | **ms/step** | **img/s** | **map** | **config** | **weight** |
+|----------------|-----------|----------------|---------------|-----------|---------|-----------------------------------------------------|------------------------------------------------|
+| YOLOv8 | 4 | 16 | 284.93| 56.15 | 94.4% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/layout/yolov8/yolov8n.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/yolov8/yolov8n-4b9e8004.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/yolov8/yolov8n-2a1f68ab.mindir) |
+
+
+**Notes:**
+- To reproduce the result in other contexts, please ensure the global batch size is the same.
+- The models are trained from scratch without any pre-training. For more details on the training and evaluation datasets, please refer to the [quick start](#quick-start) section.
+- The input shape of the exported YOLOv8 MindIR is (1, 3, 800, 800).
+
## MindSpore Lite Inference
To inference with MindSpot Lite on Ascend 310, please refer to the tutorial [MindOCR Inference](../../../docs/en/inference/inference_tutorial.md). In short, the whole process consists of the following steps:
diff --git a/configs/layout/yolov8/README_CN.md b/configs/layout/yolov8/README_CN.md
index 69fb6ba18..0e3ca7abb 100644
--- a/configs/layout/yolov8/README_CN.md
+++ b/configs/layout/yolov8/README_CN.md
@@ -20,34 +20,19 @@ YOLOv8 是Ultralytics的YOLO的最新版本。作为一种前沿、最先进(SOT

+### 配套版本
-## 评估结果
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
+## 快速上手
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+### 安装
-根据我们的实验,在公开基准数据集(PublayNet)上的-评估结果如下:
+环境安装教程请参考MindOCR的 [安装指南](https://github.com/mindspore-lab/mindocr#installation)。
-在采用图模式的ascend 910*上实验结果,mindspore版本为2.3.1
-
-
-| **模型名称** | **卡数** | **单卡批量大小** | **ms/step** | **img/s** | **map** | **配置** | **权重** |
-|----------|--------|------------|---------------|-----------|---------|-----------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| YOLOv8 | 4 | 16 | 284.93| 56.15 | 94.4% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/layout/yolov8/yolov8n.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/yolov8/yolov8n-4b9e8004.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/yolov8/yolov8n-2a1f68ab.mindir) |
-
-
-**注意:**
-- 如需在其他环境配置重现训练结果,请确保全局批量大小与原配置文件保持一致。
-- 模型都是从头开始训练的,无需任何预训练。关于训练和测试数据集的详细介绍,请参考[PubLayNet数据集准备](#3.1.2 PubLayNet数据集准备)章节。
-- YOLOv8的MindIR导出时的输入Shape均为(1, 3, 800, 800)。
-
-## 快速开始
-### 环境及数据准备
-
-#### 安装
-环境安装教程请参考MindOCR的 [installation instruction](https://github.com/mindspore-lab/mindocr#installation).
+### 数据准备
#### PubLayNet数据集准备
@@ -64,7 +49,8 @@ python tools/dataset_converters/convert.py \
下载完成后,可以使用上述MindOCR提供的脚本将数据转换为YOLOv8输入格式的数据类型。
-#### 检查配置文件
+### 配置说明
+
除了数据集的设置,请同时重点关注以下变量的配置:`system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_path`, `eval.ckpt_load_path`, `eval.dataset.dataset_path`, `eval.loader.batch_size`。说明如下:
```yaml
@@ -121,7 +107,7 @@ msrun --worker_num=4 --local_worker_num=4 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/layout/yolov8/yolov8n.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).
* 单卡训练
@@ -143,6 +129,22 @@ python tools/train.py --config configs/layout/yolov8/yolov8n.yaml
python tools/eval.py --config configs/layout/yolov8/yolov8n.yaml
```
+## 性能表现
+
+根据我们的实验,在公开基准数据集(PublayNet)上的评估结果如下:
+
+在采用图模式的ascend 910*上实验结果,mindspore版本为2.5.0
+
+
+| **模型名称** | **卡数** | **单卡批量大小** | **ms/step** | **img/s** | **map** | **配置** | **权重** |
+|----------|--------|------------|---------------|-----------|---------|-----------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| YOLOv8 | 4 | 16 | 284.93| 56.15 | 94.4% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/layout/yolov8/yolov8n.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/yolov8/yolov8n-4b9e8004.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/yolov8/yolov8n-2a1f68ab.mindir) |
+
+
+**注意:**
+- 如需在其他环境配置重现训练结果,请确保全局批量大小与原配置文件保持一致。
+- 模型都是从头开始训练的,无需任何预训练。关于训练和测试数据集的详细介绍,请参考[快速上手](#快速上手)章节。
+- YOLOv8的MindIR导出时的输入Shape均为(1, 3, 800, 800)。
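上表各项指标之间的换算关系,可以用下面的示意代码验证(仅作说明:此处假设表中 img/s 为单卡吞吐,即单卡批量大小除以单步耗时;全局批量大小为卡数与单卡批量大小之积):

```python
# 示意:校验上表中 YOLOv8 吞吐数据的换算关系(数值取自上表)
cards = 4             # 卡数
batch_size = 16       # 单卡批量大小
ms_per_step = 284.93  # 单步耗时(毫秒)

# 假设:img/s 为单卡吞吐 = 单卡批量大小 / 单步耗时(秒)
imgs_per_s = batch_size / (ms_per_step / 1000)
print(round(imgs_per_s, 2))  # 56.15,与上表一致

# 在其他环境复现训练结果时,需保持全局批量大小一致
global_batch_size = cards * batch_size
print(global_batch_size)  # 64
```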
## MindSpore Lite 推理
diff --git a/configs/rec/abinet/README.md b/configs/rec/abinet/README.md
index 27975d702..75a5b87f2 100644
--- a/configs/rec/abinet/README.md
+++ b/configs/rec/abinet/README.md
@@ -22,15 +22,18 @@ Linguistic knowledge is of great benefit to scene text recognition. However, how
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:-------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## Quick Start
-### Preparation
-#### Installation
+### Installation
+
Please refer to the [installation instruction](https://github.com/mindspore-lab/mindocr#installation) in MindOCR.
+### Dataset preparation
+
#### Dataset Download
+
Please download the LMDB dataset for training and evaluation from
- `training` contains two datasets: [MJSynth (MJ)](https://pan.baidu.com/s/1mgnTiyoR8f6Cm655rFI4HQ) and [SynthText (ST)](https://pan.baidu.com/s/1mgnTiyoR8f6Cm655rFI4HQ)
- `evaluation` contains several benchmarking datasets, which are [IIIT](http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K.html), [SVT](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset), [IC13](http://rrc.cvc.uab.es/?ch=2), [IC15](http://rrc.cvc.uab.es/?ch=4), [SVTP](http://openaccess.thecvf.com/content_iccv_2013/papers/Phan_Recognizing_Text_with_2013_ICCV_paper.pdf), and [CUTE](http://cs-chan.com/downloads_CUTE80_dataset.html).
@@ -91,8 +94,9 @@ Here we used the datasets under `train/` folders for **train**. After training,
- [SVT](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset): 2.4 MB, 647 samples
- [SVTP](http://openaccess.thecvf.com/content_iccv_2013/papers/Phan_Recognizing_Text_with_2013_ICCV_paper.pdf): 1.8 MB, 645 samples
+### Update yaml config file
-**Data configuration for model training**
+#### Data configuration for model training
To reproduce the training of model, it is recommended that you modify the configuration yaml as follows:
@@ -115,7 +119,7 @@ eval:
...
```
-**Data configuration for model evaluation**
+#### Data configuration for model evaluation
We use the dataset under `evaluation/` as the benchmark dataset. On **each individual dataset** (e.g. CUTE80, IC13_857, etc.), we perform a full evaluation by setting the dataset's directory to the evaluation dataset. This way, we get a list of the corresponding accuracies for each dataset, and then the reported accuracies are the average of these values.
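The averaging scheme described above can be sketched in a few lines (an illustrative snippet; the dataset names and accuracy values below are placeholders, not measured results):

```python
# Sketch: the reported accuracy is the plain (unweighted) average of the
# per-dataset accuracies. Values here are placeholders for illustration.
per_dataset_acc = {
    "CUTE80": 0.90,
    "IC13_857": 0.95,
    "SVT": 0.92,
}

reported_acc = sum(per_dataset_acc.values()) / len(per_dataset_acc)
print(round(reported_acc, 4))  # 0.9233
```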
@@ -144,7 +148,7 @@ eval:
...
```
-By running `tools/eval.py` as noted in section [Model Evaluation](#33-model-evaluation) with the above config yaml, you can get the accuracy performance on dataset CUTE80.
+By running `tools/eval.py` as noted in section [Model Evaluation](#model-evaluation) with the above config yaml, you can get the accuracy performance on dataset CUTE80.
2. Evaluate on multiple datasets under the same folder
@@ -167,7 +171,6 @@ data_lmdb_release/
then you can evaluate on each dataset by modifying the config yaml as follows, and execute the script `tools/benchmarking/multi_dataset_eval.py`.
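Conceptually, `tools/benchmarking/multi_dataset_eval.py` repeats the single-dataset evaluation once per subfolder. The sketch below only builds one command per dataset without running anything; the `--opt eval.dataset.data_dir=...` override flag is hypothetical and stands in for the yaml edit described above:

```python
# Sketch: one evaluation pass per benchmark subfolder under evaluation/.
# The --opt override flag is hypothetical; in practice you edit the yaml.
datasets = ["CUTE80", "IC13_857", "IC15_1811", "IIIT5k_3000", "SVT", "SVTP"]

commands = [
    ["python", "tools/eval.py",
     "--config", "configs/rec/abinet/abinet_resnet45_en.yaml",
     "--opt", f"eval.dataset.data_dir=evaluation/{name}"]
    for name in datasets
]

print(len(commands))    # 6
print(commands[0][-1])  # eval.dataset.data_dir=evaluation/CUTE80
```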
-
#### Check YAML Config Files
Apart from the dataset setting, please also check the following important args: `system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_root`, `train.dataset.data_dir`, `train.dataset.label_file`,
`eval.ckpt_load_path`, `eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.dataset.label_file`, `eval.loader.batch_size`. Explanations of these important args:
@@ -229,7 +232,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
# Based on verification, binding cores usually results in performance acceleration. Please configure the parameters and run.
msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/rec/abinet/abinet_resnet45_en.yaml
```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).
ABINet training requires loading a pre-trained model; the pre-trained weights are
available at [abinet_pretrain_en.ckpt](https://download.mindspore.cn/toolkits/mindocr/abinet/abinet_pretrain_en-821ca20b.ckpt). Add the path of the pre-trained weights to `model.pretrained` in `configs/rec/abinet/abinet_resnet45_en.yaml`.
diff --git a/configs/rec/abinet/README_CN.md b/configs/rec/abinet/README_CN.md
index 656a26973..3dd8ebb3b 100644
--- a/configs/rec/abinet/README_CN.md
+++ b/configs/rec/abinet/README_CN.md
@@ -22,17 +22,18 @@ Modeling for Scene Text Recognition](https://arxiv.org/pdf/2103.06495)
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:-------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## 快速开始
-### 环境及数据准备
-#### 安装
+### 安装
环境安装教程请参考MindOCR的 [installation instruction](https://github.com/mindspore-lab/mindocr#installation).
+### 数据准备
+
+#### 数据集下载
-#### Dataset Download
请下载LMDB数据集用于训练和评估
- `training` 包含两个数据集: [MJSynth (MJ)](https://pan.baidu.com/s/1mgnTiyoR8f6Cm655rFI4HQ) 和 [SynthText (ST)](https://pan.baidu.com/s/1mgnTiyoR8f6Cm655rFI4HQ)
- `evaluation` 包含几个基准数据集,它们是[IIIT](http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K.html), [SVT](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset), [IC13](http://rrc.cvc.uab.es/?ch=2), [IC15](http://rrc.cvc.uab.es/?ch=4), [SVTP](http://openaccess.thecvf.com/content_iccv_2013/papers/Phan_Recognizing_Text_with_2013_ICCV_paper.pdf), 和 [CUTE](http://cs-chan.com/downloads_CUTE80_dataset.html).
@@ -94,8 +95,9 @@ data_lmdb_release/
- [SVT](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset): 2.4 MB, 647 samples
- [SVTP](http://openaccess.thecvf.com/content_iccv_2013/papers/Phan_Recognizing_Text_with_2013_ICCV_paper.pdf): 1.8 MB, 645 samples
+### 配置说明
-**模型训练的数据配置**
+#### 模型训练的数据配置
如欲重现模型的训练,建议修改配置yaml如下:
@@ -121,7 +123,8 @@ eval:
...
```
-**模型评估的数据配置**
+#### 模型评估的数据配置
+
我们使用 `evaluation/` 下的数据集作为基准数据集。在**每个单独的数据集**(例如 CUTE80、IC03_860 等)上,我们通过将数据集的目录设置为评估数据集来执行完整评估。这样,我们就得到了每个数据集对应精度的列表,然后报告的精度是这些值的平均值。
如要重现报告的评估结果,您可以:
@@ -148,7 +151,7 @@ eval:
# label_file: # 验证或评估数据集的标签文件路径,将与`dataset_root`拼接形成完整的验证或评估数据的标签文件路径。当数据集为LMDB格式时无需配置
...
```
-通过使用上述配置 yaml 运行 [模型评估](#33-模型评估) 部分中所述的`tools/eval.py`,您可以获得数据集 CUTE80 的准确度性能。
+通过使用上述配置 yaml 运行 [模型评估](#模型评估) 部分中所述的`tools/eval.py`,您可以获得数据集 CUTE80 的准确度性能。
2.对同一文件夹下的多个数据集进行评估
@@ -186,7 +189,9 @@ eval:
# label_file: # 验证或评估数据集的标签文件路径,将与`dataset_root`拼接形成完整的验证或评估数据的标签文件路径。当数据集为LMDB格式时无需配置
...
```
+
#### 检查配置文件
+
除了数据集的设置,请同时重点关注以下变量的配置:`system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_root`, `train.dataset.data_dir`, `train.dataset.label_file`,
`eval.ckpt_load_path`, `eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.dataset.label_file`, `eval.loader.batch_size`。说明如下:
@@ -247,7 +252,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/rec/abinet/abinet_resnet45_en.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).
ABINet模型训练时需要加载预训练模型,预训练模型的权重来自[abinet_pretrain_en.ckpt](https://download.mindspore.cn/toolkits/mindocr/abinet/abinet_pretrain_en-821ca20b.ckpt),需要在`configs/rec/abinet/abinet_resnet45_en.yaml`中的`model.pretrained`字段填入预训练权重的路径。
diff --git a/configs/rec/crnn/README.md b/configs/rec/crnn/README.md
index 04aab6248..0ace0ce0a 100644
--- a/configs/rec/crnn/README.md
+++ b/configs/rec/crnn/README.md
@@ -25,16 +25,18 @@ As shown in the architecture graph (Figure 1), CRNN firstly extracts a feature s
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:-------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## Quick Start
-### Preparation
-#### Installation
+### Installation
+
Please refer to the [installation instruction](https://github.com/mindspore-lab/mindocr#installation) in MindOCR.
+### Dataset preparation
+
#### Dataset Download
+
Please download the LMDB dataset for training and evaluation from [here](https://www.dropbox.com/sh/i39abvnefllx2si/AAAbAYRvxzRp3cIE5HzqUw3ra?dl=0) (ref: [deep-text-recognition-benchmark](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here)). There are several zip files:
- `data_lmdb_release.zip` contains the **entire** datasets including training data, validation data and evaluation data.
- `training/` contains two datasets: [MJSynth (MJ)](http://www.robots.ox.ac.uk/~vgg/data/text/) and [SynthText (ST)](https://academictorrents.com/details/2dba9518166cbd141534cbf381aa3e99a087e83c)
@@ -108,8 +110,9 @@ Here we used the datasets under `training/` folders for **training**, and the un
- [SVT](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset): 2.4 MB, 647 samples
- [SVTP](http://openaccess.thecvf.com/content_iccv_2013/papers/Phan_Recognizing_Text_with_2013_ICCV_paper.pdf): 1.8 MB, 645 samples
+### Update yaml config file
-**Data configuration for model training**
+#### Data configuration for model training
To reproduce the training of model, it is recommended that you modify the configuration yaml as follows:
@@ -132,7 +135,7 @@ eval:
...
```
-**Data configuration for model evaluation**
+#### Data configuration for model evaluation
We use the dataset under `evaluation/` as the benchmark dataset. On **each individual dataset** (e.g. CUTE80, IC03_860, etc.), we perform a full evaluation by setting the dataset's directory to the evaluation dataset. This way, we get a list of the corresponding accuracies for each dataset, and then the reported accuracies are the average of these values.
@@ -201,6 +204,7 @@ eval:
```
#### Check YAML Config Files
+
Apart from the dataset setting, please also check the following important args: `system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_root`, `train.dataset.data_dir`, `train.dataset.label_file`,
`eval.ckpt_load_path`, `eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.dataset.label_file`, `eval.loader.batch_size`. Explanations of these important args:
@@ -260,7 +264,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
# Based on verification, binding cores usually results in performance acceleration. Please configure the parameters and run.
msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/rec/crnn/crnn_resnet34.yaml
```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).
@@ -330,7 +334,7 @@ For detailed instruction of data preparation and yaml configuration, please refe
### General Purpose Chinese Models
-Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode.
+Experiments are tested on ascend 910* with mindspore 2.5.0 graph mode.
| **model name** | **backbone** | **cards** | **batch size** | **language** | **jit level** | **graph compile** | **ms/step** | **img/s** | **scene** | **web** | **document** | **recipe** | **weight** |
| :------------: | :----------: | :-------: | :------------: | :----------: | :-----------: | :---------------: | :---------: | :-------: |:---------:|:-------:|:------------:|:-------------------------------------------------------------------------------------------------:| :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
@@ -343,7 +347,7 @@ Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode.
#### Training Performance
-Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode.
+Experiments are tested on ascend 910* with mindspore 2.5.0 graph mode.
| **model name** | **backbone** | **train dataset** | **params(M)** | **cards** | **batch size** | **jit level** | **graph compile** | **ms/step** | **img/s** | **accuracy** | **recipe** | **weight** |
|:--------------:| :----------: | :---------------: | :-----------: | :-------: | :------------: | :-----------: |:-----------------:|:-----------:|:---------:|:------------:|:---------------------------------------------------------------------------------------------:|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
@@ -361,7 +365,7 @@ Detailed accuracy results for each benchmark dataset (IC03, IC13, IC15, IIIT, SV
#### Inference Performance
-Experiments are tested on ascend 310P with mindspore lite 2.3.1 graph mode.
+Experiments are tested on ascend 310P with mindspore lite 2.5.0 graph mode.
| model name | backbone | test dataset | params(M) | cards | batch size | **jit level** | **graph compile** | img/s |
| :--------: | :---------: | :----------: | :-------: | :---: | :--------: | :-----------: | :---------------: | :----: |
@@ -373,7 +377,7 @@ Experiments are tested on ascend 310P with mindspore lite 2.3.1 graph mode.
- To reproduce the result on other contexts, please ensure the global batch size is the same.
- The characters supported by model are lowercase English characters from a to z and numbers from 0 to 9. More explanation on dictionary, please refer to [Character Dictionary](#character-dictionary).
-- The models are trained from scratch without any pre-training. For more dataset details of training and evaluation, please refer to [Dataset Download & Dataset Usage](#dataset-download) section.
+- The models are trained from scratch without any pre-training. For more dataset details of training and evaluation, please refer to the [Dataset preparation](#dataset-preparation) section.
- The input Shapes of MindIR of CRNN_VGG7 and CRNN_ResNet34_vd are both (1, 3, 32, 100).
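The default character set mentioned in the notes (digits 0 to 9 plus lowercase a to z) can be sketched as follows. Whether an extra blank symbol is appended for CTC decoding depends on the actual dictionary file, so that part is an assumption here:

```python
import string

# Sketch: the default recognition charset described in the notes above.
charset = list(string.digits + string.ascii_lowercase)
print(len(charset))  # 36 character classes

# Assumption: CTC-based decoding (as in CRNN) adds one blank symbol on top
# of the charset; this is not read from the actual dictionary file.
num_classes = len(charset) + 1
print(num_classes)  # 37
```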
diff --git a/configs/rec/crnn/README_CN.md b/configs/rec/crnn/README_CN.md
index 246427ff7..e1d393c61 100644
--- a/configs/rec/crnn/README_CN.md
+++ b/configs/rec/crnn/README_CN.md
@@ -26,17 +26,20 @@
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## 快速开始
-### 环境及数据准备
-#### 安装
+### 安装
+
环境安装教程请参考MindOCR的 [installation instruction](https://github.com/mindspore-lab/mindocr#installation).
+### 数据准备
+
#### 数据集下载
+
LMDB格式的训练及验证数据集可以从[这里](https://www.dropbox.com/sh/i39abvnefllx2si/AAAbAYRvxzRp3cIE5HzqUw3ra?dl=0) (出处: [deep-text-recognition-benchmark](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here))下载。链接中的文件包含多个压缩文件,其中:
- `data_lmdb_release.zip` 包含了**完整**的一套数据集,有训练集(training/),验证集(validation/)以及测试集(evaluation)。
- `training.zip` 包括两个数据集,分别是 [MJSynth (MJ)](http://www.robots.ox.ac.uk/~vgg/data/text/) 和 [SynthText (ST)](https://academictorrents.com/details/2dba9518166cbd141534cbf381aa3e99a087e83c)
@@ -109,7 +112,9 @@ data_lmdb_release/
- [SVT](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset): 2.4 MB, 647 samples
- [SVTP](http://openaccess.thecvf.com/content_iccv_2013/papers/Phan_Recognizing_Text_with_2013_ICCV_paper.pdf): 1.8 MB, 645 samples
-**模型训练的数据配置**
+### 配置说明
+
+#### 模型训练的数据配置
如欲重现模型的训练,建议修改配置yaml如下:
@@ -132,7 +137,7 @@ eval:
...
```
-**模型评估的数据配置**
+#### 模型评估的数据配置
我们使用 `evaluation/` 下的数据集作为基准数据集。在**每个单独的数据集**(例如 CUTE80、IC03_860 等)上,我们通过将数据集的目录设置为评估数据集来执行完整评估。这样,我们就得到了每个数据集对应精度的列表,然后报告的精度是这些值的平均值。
@@ -201,6 +206,7 @@ eval:
```
#### 检查配置文件
+
除了数据集的设置,请同时重点关注以下变量的配置:`system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_root`, `train.dataset.data_dir`, `train.dataset.label_file`,
`eval.ckpt_load_path`, `eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.dataset.label_file`, `eval.loader.batch_size`。说明如下:
@@ -260,7 +266,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/rec/crnn/crnn_resnet34.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).
* 单卡训练
@@ -331,7 +337,7 @@ Mindocr内置了一部分字典,均放在了 `mindocr/utils/dict/` 位置,
### 通用泛化中文模型
-在采用图模式的ascend 910*上实验结果,mindspore版本为2.3.1
+在采用图模式的ascend 910*上实验结果,mindspore版本为2.5.0
| **model name** | **backbone** | **cards** | **batch size** | **language** | **jit level** | **graph compile** | **ms/step** | **img/s** | **scene** | **web** | **document** | **recipe** | **weight** |
| :------------: | :----------: | :-------: | :------------: | :----------: | :-----------: | :---------------: | :---------: | :-------: |:---------:|:-------:|:------------:|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------:| :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
@@ -344,7 +350,7 @@ Mindocr内置了一部分字典,均放在了 `mindocr/utils/dict/` 位置,
#### 训练性能表现
-在采用图模式的ascend 910*上实验结果,mindspore版本为2.3.1
+在采用图模式的ascend 910*上实验结果,mindspore版本为2.5.0
| **model name** | **backbone** | **train dataset** | **params(M)** | **cards** | **batch size** | **jit level** | **graph compile** | **ms/step** | **img/s** | **accuracy** | **recipe** | **weight** |
|:--------------:| :----------: | :---------------: | :-----------: | :-------: | :------------: | :-----------: |:-----------------:|:-----------:|:---------:|:------------:|:---------------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
@@ -364,7 +370,7 @@ Mindocr内置了一部分字典,均放在了 `mindocr/utils/dict/` 位置,
#### 推理性能表现
-在采用图模式的ascend 310P上实验结果,mindspore lite版本为2.3.1
+在采用图模式的ascend 310P上实验结果,mindspore lite版本为2.5.0
| model name | backbone | test dataset | params(M) | cards | batch size | **jit level** | **graph compile** | img/s |
| :--------: | :---------: | :----------: | :-------: | :---: | :--------: | :-----------: | :---------------: | :----: |
@@ -376,7 +382,7 @@ Mindocr内置了一部分字典,均放在了 `mindocr/utils/dict/` 位置,
### 注意
- 如需在其他环境配置重现训练结果,请确保全局批量大小与原配置文件保持一致。
- 模型所能识别的字符都是默认的设置,即所有英文小写字母a至z及数字0至9,详细请看[字符词典](#字符词典)
-- 模型都是从头开始训练的,无需任何预训练。关于训练和测试数据集的详细介绍,请参考[数据集下载及使用](#数据集下载)章节。
+- 模型都是从头开始训练的,无需任何预训练。关于训练和测试数据集的详细介绍,请参考[数据准备](#数据准备)章节。
- CRNN_VGG7和CRNN_ResNet34_vd的MindIR导出时的输入Shape均为(1, 3, 32, 100)。
diff --git a/configs/rec/master/README.md b/configs/rec/master/README.md
index 20e6aac81..eb8768d32 100644
--- a/configs/rec/master/README.md
+++ b/configs/rec/master/README.md
@@ -22,20 +22,17 @@ Attention-based scene text recognizers have gained huge success, which leverages
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:-------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## Quick Start
-### Preparation
-
-#### Installation
+### Installation
Please refer to the [installation instruction](https://github.com/mindspore-lab/mindocr#installation) in MindOCR.
-#### Dataset Preparation
+### Dataset Preparation
-##### MJSynth, validation and evaluation dataset
+#### MJSynth, validation and evaluation dataset
Part of the LMDB dataset for training and evaluation can be downloaded from [here](https://www.dropbox.com/sh/i39abvnefllx2si/AAAbAYRvxzRp3cIE5HzqUw3ra?dl=0) (ref: [deep-text-recognition-benchmark](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here)). There are several zip files:
- `data_lmdb_release.zip` contains the datasets including training data, validation data and evaluation data.
- `training/` contains two datasets: [MJSynth (MJ)](http://www.robots.ox.ac.uk/~vgg/data/text/) and [SynthText (ST)](https://academictorrents.com/details/2dba9518166cbd141534cbf381aa3e99a087e83c). *Here we use **MJSynth only**.*
@@ -44,7 +41,7 @@ Part of the lmdb dataset for training and evaluation can be downloaded from [her
- `validation.zip`: same as the validation/ within data_lmdb_release.zip
- `evaluation.zip`: same as the evaluation/ within data_lmdb_release.zip
-##### SynthText dataset
+#### SynthText dataset
For `SynthText`, we do not use the given LMDB dataset in `data_lmdb_release.zip`, since it only contains part of the cropped images. Please download the raw dataset from [here](https://academictorrents.com/details/2dba9518166cbd141534cbf381aa3e99a087e83c) and prepare the LMDB dataset using the following command
@@ -58,7 +55,7 @@ python tools/dataset_converters/convert.py \
```
The `ST_full` folder contains the full set of cropped SynthText images in LMDB format. Please replace the `ST` folder with the `ST_full` folder.
-##### SynthAdd dataset
+#### SynthAdd dataset
Please download the **SynthAdd** Dataset from [here](https://pan.baidu.com/s/1uV0LtoNmcxbO-0YA7Ch4dg) (code: 627x). This dataset is proposed in . Please prepare the corresponding LMDB dataset using the following command
@@ -72,7 +69,7 @@ python tools/dataset_converters/convert.py \
Please put the `SynthAdd` folder in `/training` directory.
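After the conversions above, the expected training layout can be sanity-checked with a short sketch (paths are illustrative and only assume the folder names used in this README; `ST_full` replaces the partial `ST` folder):

```python
from pathlib import Path

# Sketch: folders expected under data_lmdb_release/ after preparation.
# Paths are illustrative; only the names from this README are assumed.
root = Path("data_lmdb_release")
expected = [
    root / "training" / "MJ",
    root / "training" / "ST_full",   # replaces the partial ST folder
    root / "training" / "SynthAdd",
    root / "validation",
    root / "evaluation",
]

missing = [p.as_posix() for p in expected if not p.is_dir()]
print(missing or "layout looks complete")
```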
-#### Dataset Usage
+### Dataset Usage
Finally, the data structure should like this.
@@ -141,8 +138,9 @@ Here we used the datasets under `training/` folders for training, and the union
- [SVT](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset): 2.4 MB, 647 samples
- [SVTP](http://openaccess.thecvf.com/content_iccv_2013/papers/Phan_Recognizing_Text_with_2013_ICCV_paper.pdf): 1.8 MB, 645 samples
+### Update yaml config file
-**Data configuration for model training**
+#### Data configuration for model training
To reproduce the training of model, it is recommended that you modify the configuration yaml as follows:
@@ -163,7 +161,7 @@ eval:
...
```
-**Data configuration for model evaluation**
+#### Data configuration for model evaluation
We use the dataset under `evaluation/` as the benchmark dataset. On **each individual dataset** (e.g. CUTE80, IC03_860, etc.), we perform a full evaluation by setting the dataset's directory to the evaluation dataset. This way, we get a list of the corresponding accuracies for each dataset, and then the reported accuracies are the average of these values.
@@ -230,6 +228,7 @@ eval:
```
#### Check YAML Config Files
+
Apart from the dataset setting, please also check the following important args: `system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_root`, `train.dataset.data_dir`, `train.dataset.label_file`,
`eval.ckpt_load_path`, `eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.dataset.label_file`, `eval.loader.batch_size`. Explanations of these important args:
@@ -288,7 +287,7 @@ msrun --worker_num=4 --local_worker_num=4 python tools/train.py --config configs
# Based on verification, binding cores usually results in performance acceleration. Please configure the parameters and run.
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/rec/master/master_resnet31.yaml
```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).
@@ -346,7 +345,7 @@ According to our experiments, the evaluation results on public benchmark dataset
**Notes:**
- To reproduce the result on other contexts, please ensure the global batch size is the same.
-- The models are trained from scratch without any pre-training. For more dataset details of training and evaluation, please refer to [Dataset Download & Dataset Usage](#dataset-usage) section.
+- The models are trained from scratch without any pre-training. For more dataset details of training and evaluation, please refer to the [Dataset preparation](#dataset-preparation) section.
- The input Shapes of MindIR of MASTER is (1, 3, 48, 160).
diff --git a/configs/rec/master/README_CN.md b/configs/rec/master/README_CN.md
index fe9a672c4..938be72cf 100644
--- a/configs/rec/master/README_CN.md
+++ b/configs/rec/master/README_CN.md
@@ -23,20 +23,18 @@
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:-------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## 快速开始
-### 环境及数据准备
-
-#### 安装
+### 安装
环境安装教程请参考MindOCR的 [installation instruction](https://github.com/mindspore-lab/mindocr#installation).
-#### 数据集准备
+### 数据准备
-##### MJSynth, 验证集和测试集
+#### MJSynth, 验证集和测试集
部分LMDB格式的训练及验证数据集可以从[这里](https://www.dropbox.com/sh/i39abvnefllx2si/AAAbAYRvxzRp3cIE5HzqUw3ra?dl=0) (出处: [deep-text-recognition-benchmark](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here))下载。链接中的文件包含多个压缩文件,其中:
- `data_lmdb_release.zip` 包含了部分数据集,有训练集(training/),验证集(validation/)以及测试集(evaluation)。
- `training.zip` 包括两个数据集,分别是 [MJSynth (MJ)](http://www.robots.ox.ac.uk/~vgg/data/text/) 和 [SynthText (ST)](https://academictorrents.com/details/2dba9518166cbd141534cbf381aa3e99a087e83c)。 这里我们只使用**MJSynth**。
@@ -45,7 +43,7 @@
- `validation.zip`: 与 data_lmdb_release.zip 中的validation/ 一样。
- `evaluation.zip`: 与 data_lmdb_release.zip 中的evaluation/ 一样。
-##### SynthText dataset
+#### SynthText dataset
我们不使用`data_lmdb_release.zip`提供的`SynthText`数据, 因为它只包含部分切割下来的图片。请从[此处](https://academictorrents.com/details/2dba9518166cbd141534cbf381aa3e99a087e83c)下载原始数据, 并使用以下命令转换成LMDB格式
@@ -59,7 +57,7 @@ python tools/dataset_converters/convert.py \
```
`ST_full` 包含了所有已切割的图片,以LMDB格式储存。 请将 `ST` 文件夹换成 `ST_full` 文件夹。
-##### SynthAdd dataset
+#### SynthAdd dataset
另外请从[此处](https://pan.baidu.com/s/1uV0LtoNmcxbO-0YA7Ch4dg)(密码:627x)下载**SynthAdd**训练集. 这个训练集是由提出。请使用以下命令转换成LMDB格式
@@ -73,7 +71,7 @@ python tools/dataset_converters/convert.py \
并将转换完成的`SynthAdd`文件夹摆在`/training`里面.
-#### 数据集使用
+### 数据集使用
最终数据文件夹结构如下:
@@ -142,7 +140,9 @@ data_lmdb_release/
- [SVT](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset): 2.4 MB, 647 samples
- [SVTP](http://openaccess.thecvf.com/content_iccv_2013/papers/Phan_Recognizing_Text_with_2013_ICCV_paper.pdf): 1.8 MB, 645 samples
-**模型训练的数据配置**
+### 配置说明
+
+#### 模型训练的数据配置
如欲重现模型的训练,建议修改配置yaml如下:
@@ -163,7 +163,7 @@ eval:
...
```
-**模型评估的数据配置**
+#### 模型评估的数据配置
我们使用 `evaluation/` 下的数据集作为基准数据集。在**每个单独的数据集**(例如 CUTE80、IC03_860 等)上,我们通过将数据集的目录设置为评估数据集来执行完整评估。这样,我们就得到了每个数据集对应精度的列表,然后报告的精度是这些值的平均值。
@@ -231,6 +231,7 @@ eval:
```
#### 检查配置文件
+
除了数据集的设置,请同时重点关注以下变量的配置:`system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_root`, `train.dataset.data_dir`, `train.dataset.label_file`,
`eval.ckpt_load_path`, `eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.dataset.label_file`, `eval.loader.batch_size`。说明如下:
@@ -289,7 +290,7 @@ msrun --worker_num=4 --local_worker_num=4 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/rec/master/master_resnet31.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).
@@ -349,7 +350,7 @@ Table Format:
**注意:**
- 如需在其他环境配置重现训练结果,请确保全局批量大小与原配置文件保持一致。
-- 模型都是从头开始训练的,无需任何预训练。关于训练和测试数据集的详细介绍,请参考[数据集下载及使用](#环境及数据准备)章节。
+- 模型都是从头开始训练的,无需任何预训练。关于训练和测试数据集的详细介绍,请参考[数据准备](#数据准备)章节。
- Master的MindIR导出时的输入Shape均为(1, 3, 48, 160)。
diff --git a/configs/rec/rare/README.md b/configs/rec/rare/README.md
index 6b35218bc..235a393cb 100644
--- a/configs/rec/rare/README.md
+++ b/configs/rec/rare/README.md
@@ -22,16 +22,18 @@ Recognizing text in natural images is a challenging task with many unsolved prob
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:-------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## Quick Start
-### Preparation
-#### Installation
+### Installation
+
Please refer to the [installation instruction](https://github.com/mindspore-lab/mindocr#installation) in MindOCR.
+### Dataset preparation
+
#### Dataset Download
+
Please download the lmdb dataset for training and evaluation from [here](https://www.dropbox.com/sh/i39abvnefllx2si/AAAbAYRvxzRp3cIE5HzqUw3ra?dl=0) (ref: [deep-text-recognition-benchmark](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here)). There are several zip files:
- `data_lmdb_release.zip` contains the **entire** dataset, including training, validation and evaluation data.
- `training/` contains two datasets: [MJSynth (MJ)](http://www.robots.ox.ac.uk/~vgg/data/text/) and [SynthText (ST)](https://academictorrents.com/details/2dba9518166cbd141534cbf381aa3e99a087e83c)
@@ -105,8 +107,9 @@ Here we used the datasets under `training/` folders for training, and the union
- [SVT](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset): 2.4 MB, 647 samples
- [SVTP](http://openaccess.thecvf.com/content_iccv_2013/papers/Phan_Recognizing_Text_with_2013_ICCV_paper.pdf): 1.8 MB, 645 samples
+### Update yaml config file
-**Data configuration for model training**
+#### Data configuration for model training
To reproduce the training of model, it is recommended that you modify the configuration yaml as follows:
@@ -127,7 +130,7 @@ eval:
...
```
-**Data configuration for model evaluation**
+#### Data configuration for model evaluation
We use the dataset under `evaluation/` as the benchmark dataset. On **each individual dataset** (e.g. CUTE80, IC03_860, etc.), we perform a full evaluation by setting the dataset's directory to the evaluation dataset. This way, we get a list of the corresponding accuracies for each dataset, and then the reported accuracies are the average of these values.
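The averaging step described above can be sketched as follows (a minimal illustration; the dataset names and accuracy values here are hypothetical placeholders, not measured results):

```python
# Average the per-dataset accuracies to obtain the single reported benchmark score.
# The dataset names and accuracy values below are illustrative only.
per_dataset_acc = {
    "CUTE80": 0.885,
    "IC03_860": 0.951,
    "IC13_857": 0.942,
    "IIIT5k_3000": 0.937,
}

# Reported accuracy is the plain mean over the individual benchmark datasets.
reported_acc = sum(per_dataset_acc.values()) / len(per_dataset_acc)
print(f"average accuracy: {reported_acc:.4f}")
```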
@@ -194,6 +197,7 @@ eval:
```
#### Check YAML Config Files
+
Apart from the dataset setting, please also check the following important args: `system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_root`, `train.dataset.data_dir`, `train.dataset.label_file`,
`eval.ckpt_load_path`, `eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.dataset.label_file`, `eval.loader.batch_size`. Explanations of these important args:
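A minimal sketch of where these args live in the yaml config may help orientation (the paths and values below are placeholders for illustration, not the shipped defaults):

```yaml
system:
  distribute: True                     # True for distributed training, False for standalone
  val_while_train: True                # run validation during training
common:
  batch_size: 512                      # per-card batch size; keep the global batch size consistent
train:
  ckpt_save_dir: ./tmp_rec             # directory where checkpoints are saved
  dataset:
    dataset_root: /path/to/data_lmdb_release   # placeholder path
    data_dir: training/                # training data dir, relative to dataset_root
eval:
  ckpt_load_path: ./tmp_rec/best.ckpt  # checkpoint to evaluate
  dataset:
    dataset_root: /path/to/data_lmdb_release   # placeholder path
    data_dir: evaluation/              # evaluation data dir, relative to dataset_root
  loader:
    batch_size: 64                     # evaluation batch size
```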
@@ -251,7 +255,7 @@ msrun --worker_num=4 --local_worker_num=4 python tools/train.py --config configs
# Based on verification, binding cores usually results in performance acceleration. Please configure the parameters and run.
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/rec/rare/rare_resnet34.yaml
```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).
* Standalone Training
@@ -308,7 +312,7 @@ According to our experiments, the evaluation results on public benchmark dataset
**Notes:**
- To reproduce the result on other contexts, please ensure the global batch size is the same.
- The characters supported by model are lowercase English characters from a to z and numbers from 0 to 9. More explanation on dictionary, please refer to [Character Dictionary](#character-dictionary).
-- The models are trained from scratch without any pre-training. For more dataset details of training and evaluation, please refer to [Dataset Download & Dataset Usage](#dataset-usage) section.
+- The models are trained from scratch without any pre-training. For more dataset details of training and evaluation, please refer to [Dataset preparation](#dataset-preparation) section.
- The input Shapes of MindIR of RARE is (1, 3, 32, 100) and it is for Ascend only.
## Character Dictionary
diff --git a/configs/rec/rare/README_CN.md b/configs/rec/rare/README_CN.md
index c4eace91f..575e2d307 100644
--- a/configs/rec/rare/README_CN.md
+++ b/configs/rec/rare/README_CN.md
@@ -23,16 +23,18 @@
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:-------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## 快速开始
-### 环境及数据准备
-#### 安装
+### 安装
+
环境安装教程请参考MindOCR的 [installation instruction](https://github.com/mindspore-lab/mindocr#installation).
+### 数据准备
+
#### 数据集下载
+
LMDB格式的训练及验证数据集可以从[这里](https://www.dropbox.com/sh/i39abvnefllx2si/AAAbAYRvxzRp3cIE5HzqUw3ra?dl=0) (出处: [deep-text-recognition-benchmark](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here))下载。链接中的文件包含多个压缩文件,其中:
- `data_lmdb_release.zip` 包含了**完整**的一套数据集,有训练集(training/),验证集(validation/)以及测试集(evaluation/)。
- `training.zip` 包括两个数据集,分别是 [MJSynth (MJ)](http://www.robots.ox.ac.uk/~vgg/data/text/) 和 [SynthText (ST)](https://academictorrents.com/details/2dba9518166cbd141534cbf381aa3e99a087e83c)
@@ -105,7 +107,9 @@ data_lmdb_release/
- [SVT](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset): 2.4 MB, 647 samples
- [SVTP](http://openaccess.thecvf.com/content_iccv_2013/papers/Phan_Recognizing_Text_with_2013_ICCV_paper.pdf): 1.8 MB, 645 samples
-**模型训练的数据配置**
+### 配置说明
+
+#### 模型训练的数据配置
如欲重现模型的训练,建议修改配置yaml如下:
@@ -126,7 +130,7 @@ eval:
...
```
-**模型评估的数据配置**
+#### 模型评估的数据配置
我们使用 `evaluation/` 下的数据集作为基准数据集。在**每个单独的数据集**(例如 CUTE80、IC03_860 等)上,我们通过将数据集的目录设置为评估数据集来执行完整评估。这样,我们就得到了每个数据集对应精度的列表,然后报告的精度是这些值的平均值。
@@ -194,6 +198,7 @@ eval:
```
#### 检查配置文件
+
除了数据集的设置,请同时重点关注以下变量的配置:`system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_root`, `train.dataset.data_dir`, `train.dataset.label_file`,
`eval.ckpt_load_path`, `eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.dataset.label_file`, `eval.loader.batch_size`。说明如下:
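以下给出这些配置项在yaml中的一个最小示例,便于对照定位(其中路径与数值仅为占位示意,并非默认配置):

```yaml
system:
  distribute: True                     # 分布式训练设为True,单卡设为False
  val_while_train: True                # 训练时同步进行验证
common:
  batch_size: 512                      # 单卡批量大小;请保持全局批量大小一致
train:
  ckpt_save_dir: ./tmp_rec             # checkpoint保存目录
  dataset:
    dataset_root: /path/to/data_lmdb_release   # 占位路径
    data_dir: training/                # 训练数据目录,相对于dataset_root
eval:
  ckpt_load_path: ./tmp_rec/best.ckpt  # 待评估的checkpoint
  dataset:
    dataset_root: /path/to/data_lmdb_release   # 占位路径
    data_dir: evaluation/              # 评估数据目录,相对于dataset_root
  loader:
    batch_size: 64                     # 评估批量大小
```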
@@ -251,7 +256,7 @@ msrun --worker_num=4 --local_worker_num=4 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/rec/rare/rare_resnet34.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).
* 单卡训练
@@ -308,7 +313,7 @@ Table Format:
**注意:**
- 如需在其他环境配置重现训练结果,请确保全局批量大小与原配置文件保持一致。
- 模型所能识别的字符都是默认的设置,即所有英文小写字母a至z及数字0至9,详细请看[字符词典](#字符词典)
-- 模型都是从头开始训练的,无需任何预训练。关于训练和测试数据集的详细介绍,请参考[数据集下载及使用](#数据集下载)章节。
+- 模型都是从头开始训练的,无需任何预训练。关于训练和测试数据集的详细介绍,请参考[数据准备](#数据准备)章节。
- RARE的MindIR导出时的输入Shape均为(1, 3, 32, 100),只能在昇腾卡上使用。
## 字符词典
diff --git a/configs/rec/robustscanner/README.md b/configs/rec/robustscanner/README.md
index c8ff32e3a..3098d0653 100644
--- a/configs/rec/robustscanner/README.md
+++ b/configs/rec/robustscanner/README.md
@@ -25,16 +25,18 @@ Overall, the RobustScanner model consists of an encoder and a decoder. The encod
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:-------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## Quick Start
-### Preparation
-#### Installation
+### Installation
+
Please refer to the [installation instruction](https://github.com/mindspore-lab/mindocr#installation) in MindOCR.
+### Dataset preparation
+
#### Dataset Download
+
The datasets used for training and validation in this work follow those used by mmocr and PaddleOCR to reproduce the RobustScanner algorithm. We are very grateful to mmocr and PaddleOCR for improving the reproducibility efficiency of this repository.
The details of the dataset are as follows:
@@ -136,8 +138,9 @@ data/
```
Here, we use the datasets under the `training/` folder for training and the datasets under the `evaluation/` folder for model evaluation. For convenience of storage and usage, all data is in the lmdb format.
+### Update yaml config file
-**Data configuration for model training**
+#### Data configuration for model training
To reproduce the training of model, it is recommended that you modify the configuration yaml as follows:
@@ -158,7 +161,7 @@ eval:
...
```
-**Data configuration for model evaluation**
+#### Data configuration for model evaluation
We use the dataset under `evaluation/` as the benchmark dataset. On **each individual dataset** (e.g. CUTE80, IC13_1015, etc.), we perform a full evaluation by setting the dataset's directory to the evaluation dataset. This way, we get a list of the corresponding accuracies for each dataset, and then the reported accuracies are the average of these values.
@@ -224,6 +227,7 @@ eval:
```
#### Check YAML Config Files
+
Apart from the dataset setting, please also check the following important args: `system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_root`, `train.dataset.data_dir`,
`eval.ckpt_load_path`, `eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.loader.batch_size`. Explanations of these important args:
@@ -280,7 +284,7 @@ msrun --worker_num=4 --local_worker_num=4 python tools/train.py --config configs
# Based on verification, binding cores usually results in performance acceleration. Please configure the parameters and run.
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/rec/robustscanner/robustscanner_resnet31.yaml
```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).
* Standalone Training
@@ -340,7 +344,7 @@ Note: In addition to using the MJSynth (partial) and SynthText (partial) text re
**Notes:**
- To reproduce the result on other contexts, please ensure the global batch size is the same.
- The model uses an English character dictionary, en_dict90.txt, consisting of 90 characters including digits, common symbols, and upper and lower case English letters. More explanation on dictionary, please refer to [Character Dictionary](#character-dictionary).
-- The models are trained from scratch without any pre-training. For more dataset details of training and evaluation, please refer to [Dataset Download & Dataset Usage](#dataset-usage) section.
+- The models are trained from scratch without any pre-training. For more dataset details of training and evaluation, please refer to [Dataset preparation](#dataset-preparation) section.
- The input Shapes of MindIR of RobustScanner is (1, 3, 48, 160) and it is for Ascend only.
diff --git a/configs/rec/robustscanner/README_CN.md b/configs/rec/robustscanner/README_CN.md
index 10fe3ba37..2dc9ab0bb 100644
--- a/configs/rec/robustscanner/README_CN.md
+++ b/configs/rec/robustscanner/README_CN.md
@@ -25,16 +25,19 @@ RobustScanner是具有注意力机制的编码器-解码器文字识别算法,
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:-------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## 快速开始
-### 环境及数据准备
-#### 安装
+### 安装
+
环境安装教程请参考MindOCR的 [installation instruction](https://github.com/mindspore-lab/mindocr#installation).
+### 数据准备
+
#### 数据集下载
+
本RobustScanner的训练和验证所使用的数据集参考了mmocr和PaddleOCR复现该算法时所用的数据集,在此非常感谢mmocr和PaddleOCR,提高了本repo的复现效率。
数据集细节如下:
@@ -63,6 +66,7 @@ RobustScanner是具有注意力机制的编码器-解码器文字识别算法,
- `SynthText800K_shuffle_xxx_xxx.zip`: 1_200共5个zip文件,包含SynthText数据集中随机挑选的240万个样本。
- 验证集
- `testing_lmdb.zip`: 包含了评估模型使用的CUTE80, icdar2013, icdar2015, IIIT5k, SVT, SVTP六个数据集。
+
#### 数据集使用
数据文件夹按照如下结构进行解压:
@@ -134,7 +138,9 @@ data/
```
在这里,我们使用 `training/` 文件夹下的数据集进行训练,并使用 `evaluation/` 下的数据集来进行模型的验证和评估。为方便存储和使用,所有数据均为lmdb格式
-**模型训练的数据配置**
+### 配置说明
+
+#### 模型训练的数据配置
如欲重现模型的训练,建议修改配置yaml如下:
@@ -155,7 +161,7 @@ eval:
...
```
-**模型评估的数据配置**
+#### 模型评估的数据配置
我们使用 `evaluation/` 下的数据集作为基准数据集。在**每个单独的数据集**(例如 CUTE80、IC13_1015 等)上,我们通过将数据集的目录设置为评估数据集来执行完整评估。这样,我们就得到了每个数据集对应精度的列表,然后报告的精度是这些值的平均值。
@@ -222,6 +228,7 @@ eval:
```
#### 检查配置文件
+
除了数据集的设置,请同时重点关注以下变量的配置:`system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_root`, `train.dataset.data_dir`,
`eval.ckpt_load_path`, `eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.loader.batch_size`。说明如下:
@@ -282,7 +289,7 @@ msrun --worker_num=4 --local_worker_num=4 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/rec/robustscanner/robustscanner_resnet31.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).
* 单卡训练
@@ -342,7 +349,7 @@ Table Format:
**注意:**
- 如需在其他环境配置重现训练结果,请确保全局批量大小与原配置文件保持一致。
- 模型使用90个字符的英文字典en_dict90.txt,其中有数字,常用符号以及大小写的英文字母,详细请看[字符词典](#字符词典)
-- 模型都是从头开始训练的,无需任何预训练。关于训练和测试数据集的详细介绍,请参考[数据集下载及使用](#数据集下载)章节。
+- 模型都是从头开始训练的,无需任何预训练。关于训练和测试数据集的详细介绍,请参考[数据准备](#数据准备)章节。
- RobustScanner的MindIR导出时的输入Shape均为(1, 3, 48, 160)。
diff --git a/configs/rec/svtr/README.md b/configs/rec/svtr/README.md
index 818c6cec1..42f865411 100644
--- a/configs/rec/svtr/README.md
+++ b/configs/rec/svtr/README.md
@@ -24,7 +24,7 @@ Dominant scene text recognition models commonly contain two building blocks, a v
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:-------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## Quick Start
@@ -269,7 +269,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
# Based on verification, binding cores usually results in performance acceleration. Please configure the parameters and run.
msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/rec/svtr/svtr_tiny_8p.yaml
```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).
@@ -334,7 +334,7 @@ For detailed instruction of data preparation and yaml configuration, please refe
### General Purpose Chinese Models
-Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode.
+Experiments are tested on ascend 910* with mindspore 2.5.0 graph mode.
| **model name** | **cards** | **batch size** | **languages** | **jit level** | **graph compile** | **ms/step** | **img/s** | **scene** | **web** | **document** | **recipe** | **weight** |
| :------------: | :-------: | :------------: | :-----------: | :-----------: | :---------------: | :---------: | :-------: |:---------:|:-------:| :----------: | :--------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
@@ -342,7 +342,7 @@ Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode.
### Specific Purpose Models
-Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode.
+Experiments are tested on ascend 910* with mindspore 2.5.0 graph mode.
| **model name** | **backbone** | **train dataset** | **params(M)** | **cards** | **batch size** | **jit level** | **graph compile** | **ms/step** | **img/s** | **accuracy** | **recipe** | **weight** |
|:--------------:|:------------:|:--------------------:|:-------------:|:---------:|:--------------:| :-----------: |:-----------------:|:-----------:|:---------:|:------------:|:----------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
diff --git a/configs/rec/svtr/README_CN.md b/configs/rec/svtr/README_CN.md
index 600d27aa8..e335faeec 100644
--- a/configs/rec/svtr/README_CN.md
+++ b/configs/rec/svtr/README_CN.md
@@ -23,19 +23,18 @@
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:-------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## 快速开始
-### 环境及数据准备
-#### 安装
+### 安装
+
环境安装教程请参考MindOCR的 [installation instruction](https://github.com/mindspore-lab/mindocr#installation).
-#### 数据集准备
+### 数据准备
+
+#### MJSynth, 验证集和测试集
-##### MJSynth, 验证集和测试集
部分LMDB格式的训练及验证数据集可以从[这里](https://www.dropbox.com/sh/i39abvnefllx2si/AAAbAYRvxzRp3cIE5HzqUw3ra?dl=0) (出处: [deep-text-recognition-benchmark](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here))下载。链接中的文件包含多个压缩文件,其中:
- `data_lmdb_release.zip` 包含了部分数据集,有训练集(training/),验证集(validation/)以及测试集(evaluation/)。
- `training.zip` 包括两个数据集,分别是 [MJSynth (MJ)](http://www.robots.ox.ac.uk/~vgg/data/text/) 和 [SynthText (ST)](https://academictorrents.com/details/2dba9518166cbd141534cbf381aa3e99a087e83c)。 这里我们只使用**MJSynth**。
@@ -44,7 +43,7 @@
- `validation.zip`: 与 data_lmdb_release.zip 中的validation/ 一样。
- `evaluation.zip`: 与 data_lmdb_release.zip 中的evaluation/ 一样。
-##### SynthText数据集
+#### SynthText数据集
我们不使用`data_lmdb_release.zip`提供的`SynthText`数据, 因为它只包含部分切割下来的图片。请从[此处](https://academictorrents.com/details/2dba9518166cbd141534cbf381aa3e99a087e83c)下载原始数据, 并使用以下命令转换成LMDB格式
@@ -122,7 +121,9 @@ data_lmdb_release/
- [SVT](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset): 2.4 MB, 647 samples
- [SVTP](http://openaccess.thecvf.com/content_iccv_2013/papers/Phan_Recognizing_Text_with_2013_ICCV_paper.pdf): 1.8 MB, 645 samples
-**模型训练的数据配置**
+### 配置说明
+
+#### 模型训练的数据配置
如欲重现模型的训练,建议修改配置yaml如下:
@@ -143,7 +144,7 @@ eval:
...
```
-**模型评估的数据配置**
+#### 模型评估的数据配置
我们使用 `evaluation/` 下的数据集作为基准数据集。在**每个单独的数据集**(例如 CUTE80、IC03_860 等)上,我们通过将数据集的目录设置为评估数据集来执行完整评估。这样,我们就得到了每个数据集对应精度的列表,然后报告的精度是这些值的平均值。
@@ -210,6 +211,7 @@ eval:
```
#### 检查配置文件
+
除了数据集的设置,请同时重点关注以下变量的配置:`system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_root`, `train.dataset.data_dir`, `train.dataset.label_file`,
`eval.ckpt_load_path`, `eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.dataset.label_file`, `eval.loader.batch_size`。说明如下:
@@ -268,7 +270,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/rec/svtr/svtr_tiny_8p.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).
@@ -333,7 +335,7 @@ Mindocr内置了一部分字典,均放在了 `mindocr/utils/dict/` 位置,
### 通用泛化中文模型
-在采用图模式的ascend 910*上实验结果,mindspore版本为2.3.1
+在采用图模式的ascend 910*上实验结果,mindspore版本为2.5.0
| **model name** | **cards** | **batch size** | **languages** | **jit level** | **graph compile** | **ms/step** | **img/s** | **scene** | **web** | **document** | **recipe** | **weight** |
| :------------: | :-------: | :------------: | :-----------: | :-----------: | :---------------: | :---------: | :-------: |:---------:|:-------:| :----------: | :--------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
@@ -342,7 +344,7 @@ Mindocr内置了一部分字典,均放在了 `mindocr/utils/dict/` 位置,
### 细分领域模型
-在采用图模式的ascend 910*上实验结果,mindspore版本为2.3.1
+在采用图模式的ascend 910*上实验结果,mindspore版本为2.5.0
| **model name** | **backbone** | **train dataset** | **params(M)** | **cards** | **batch size** | **jit level** | **graph compile** | **ms/step** | **img/s** | **accuracy** | **recipe** | **weight** |
|:--------------:|:------------:|:--------------------:|:-------------:|:---------:|:--------------:| :-----------: |:-----------------:|:-----------:|:---------:|:------------:|:----------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
@@ -359,7 +361,7 @@ Mindocr内置了一部分字典,均放在了 `mindocr/utils/dict/` 位置,
**注意:**
- 如需在其他环境配置重现训练结果,请确保全局批量大小与原配置文件保持一致。
- 模型所能识别的字符都是默认的设置,即所有英文小写字母a至z及数字0至9,详细请看[字符词典](#字符词典)
-- 模型都是从头开始训练的,无需任何预训练。关于训练和测试数据集的详细介绍,请参考[数据集准备](#数据集准备)章节。
+- 模型都是从头开始训练的,无需任何预训练。关于训练和测试数据集的详细介绍,请参考[数据准备](#数据准备)章节。
- SVTR的MindIR导出时的输入Shape均为(1, 3, 64, 256)。
## 参考文献
diff --git a/configs/rec/svtr/README_CN_PP-OCRv3.md b/configs/rec/svtr/README_CN_PP-OCRv3.md
index 2a62eb6be..abe39d1e0 100644
--- a/configs/rec/svtr/README_CN_PP-OCRv3.md
+++ b/configs/rec/svtr/README_CN_PP-OCRv3.md
@@ -301,7 +301,7 @@ msrun --worker_num=4 --local_worker_num=4 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/rec/svtr/svtr_ppocrv3_ch.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).
diff --git a/configs/rec/visionlan/README.md b/configs/rec/visionlan/README.md
index ddf957dc5..847e69828 100644
--- a/configs/rec/visionlan/README.md
+++ b/configs/rec/visionlan/README.md
@@ -36,7 +36,7 @@ While in the test stage, MLM is not used. Only the backbone and VRM are used for
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:-------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## Quick Start
@@ -193,7 +193,7 @@ msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/rec/visionlan/visionlan_resnet45_LF_2.yaml
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/rec/visionlan/visionlan_resnet45_LA.yaml
```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).
The training result (including checkpoints, per-epoch performance and curves) will be saved in the directory parsed by the arg `ckpt_save_dir` in yaml config file. The default directory is `./tmp_visionlan`.
diff --git a/configs/rec/visionlan/README_CN.md b/configs/rec/visionlan/README_CN.md
index dcf461fbb..4352c9e73 100644
--- a/configs/rec/visionlan/README_CN.md
+++ b/configs/rec/visionlan/README_CN.md
@@ -26,15 +26,11 @@
但在测试阶段,MLM不被使用。只有骨干网络和VRM被用于预测。
-
## 配套版本
-
| mindspore | ascend driver | firmware | cann toolkit/kernel |
|:----------:|:--------------:|:-------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## 快速入门
@@ -42,7 +38,7 @@
请参考[MindOCR中的安装说明](https://github.com/mindspore-lab/mindocr#installation)。
-### 数据集准备
+### 数据准备
**训练集**
@@ -115,7 +111,7 @@ datasets
└── SynText
```
-### 更新yaml配置文件
+### 配置说明
如果数据集放置在`./datasets`目录下,则无需更改yaml配置文件`configs/rec/visionlan/visionlan_L*.yaml`中的`train.dataset.dataset_root`。
否则,请相应地更改以下字段:
@@ -188,7 +184,7 @@ msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/rec/visionlan/visionlan_resnet45_LF_2.yaml
msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/rec/visionlan/visionlan_resnet45_LA.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).
训练结果(包括checkpoints、每个阶段的性能和loss曲线)将保存在yaml配置文件中由参数`ckpt_save_dir`解析的目录中。默认目录为`./tmp_visionlan`。
@@ -250,7 +246,7 @@ python tools/benchmarking/multi_dataset_eval.py --config $yaml_file --opt eval.d
- 训练数据集:`MJ+ST`代表两个合成数据集SynthText(800k)和MJSynth的组合。
- 要在其他训练环境中重现结果,请确保全局批量大小相同。
-- 这些模型是从头开始训练的,没有任何预训练。有关训练和评估的更多数据集详细信息,请参阅[数据集准备](#数据集准备)部分。
+- 这些模型是从头开始训练的,没有任何预训练。有关训练和评估的更多数据集详细信息,请参阅[数据准备](#数据准备)部分。
- VisionLAN的MindIR导出时的输入Shape均为(1, 3, 64, 256)。
diff --git a/configs/table/README.md b/configs/table/README.md
index 93ff72d16..200cd8387 100644
--- a/configs/table/README.md
+++ b/configs/table/README.md
@@ -22,27 +22,11 @@ Through this approach, TableMaster is able to simultaneously learn and predict t
Figure 1. Overall TableMaster architecture [1]
-## Results
+## Requirements
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-### PubTabNet
-
-Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode
-
-
-| **model name** | **cards** | **batch size** | **ms/step** | **img/s** | **accuracy** | **config** | **weight** |
-|----------------|-----------|----------------|-------------|-----------|--------------|-----------------------------------------------------|------------------------------------------------|
-| TableMaster | 8 | 10 | 268 | 296 | 77.49% | [yaml](table_master.yaml) | [ckpt](https://download-mindspore.osinfra.cn/toolkits/mindocr/tablemaster/table_master-78bf35bb.ckpt) |
-
-
-
-
-#### Notes:
-- The training time of EAST is highly affected by data processing and varies on different machines.
-- The input_shape for exported MindIR in the link is `(1,3,480,480)`.
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## Quick Start
@@ -160,7 +144,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
# Based on verification, binding cores usually results in performance acceleration. Please configure the parameters and run.
msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/table/table_master.yaml
```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).
The training result (including checkpoints, per-epoch performance and curves) will be saved in the directory parsed by the arg `ckpt_save_dir` in yaml config file. The default directory is `./tmp_table`.
@@ -172,3 +156,19 @@ To evaluate the accuracy of the trained model, you can use `eval.py`. Please set
``` shell
python tools/eval.py --config configs/table/table_master.yaml
```
+
+## Performance
+
+### PubTabNet
+
+Experiments are tested on ascend 910* with mindspore 2.5.0 graph mode
+
+
+| **model name** | **cards** | **batch size** | **ms/step** | **img/s** | **accuracy** | **config** | **weight** |
+|----------------|-----------|----------------|-------------|-----------|--------------|-----------------------------------------------------|------------------------------------------------|
+| TableMaster | 8 | 10 | 268 | 296 | 77.49% | [yaml](table_master.yaml) | [ckpt](https://download-mindspore.osinfra.cn/toolkits/mindocr/tablemaster/table_master-78bf35bb.ckpt) |
+
+
+#### Notes:
+- The training time of TableMaster is highly affected by data processing and varies on different machines.
+- The input_shape for exported MindIR in the link is `(1,3,480,480)`.
diff --git a/configs/table/README_CN.md b/configs/table/README_CN.md
index 108189316..8a5d42bba 100644
--- a/configs/table/README_CN.md
+++ b/configs/table/README_CN.md
@@ -21,26 +21,11 @@ TableMaster是一种用于表格识别的模型,其独特之处在于能够同
图1. TableMaster整体架构图 [1]
-## 实验结果
-
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-### PubTabNet
-
-在采用图模式的ascend 910*上实验结果,mindspore版本为2.3.1
-
-
-| **模型名称** | **卡数** | **单卡批量大小** | **ms/step** | **img/s** | **准确率** | **配置** | **权重** |
-|-------------|--------|------------|-------------|-----------|---------|---------------------------|-------------------------------------------------------------------------------------------------------|
-| TableMaster | 8 | 10 | 268 | 296 | 77.49% | [yaml](table_master.yaml) | [ckpt](https://download-mindspore.osinfra.cn/toolkits/mindocr/tablemaster/table_master-78bf35bb.ckpt) |
-
-
-#### 注释:
-- TableMaster的训练时长受数据处理部分和不同运行环境的影响非常大。
-- 链接中MindIR导出时的输入Shape为`(1,3,480,480)` 。
+### 配套版本
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
## 快速上手
@@ -153,7 +138,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/table/table_master.yaml
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).
训练结果(包括checkpoint、每个epoch的性能和曲线图)将被保存在yaml配置文件的`ckpt_save_dir`参数配置的路径下,默认为`./tmp_table`。
@@ -164,3 +149,19 @@ msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py
``` shell
python tools/eval.py --config configs/table/table_master.yaml
```
+
+## 性能表现
+
+### PubTabNet
+
+在采用图模式的ascend 910*上实验结果,mindspore版本为2.5.0
+
+
+| **模型名称** | **卡数** | **单卡批量大小** | **ms/step** | **img/s** | **准确率** | **配置** | **权重** |
+|-------------|--------|------------|-------------|-----------|---------|---------------------------|-------------------------------------------------------------------------------------------------------|
+| TableMaster | 8 | 10 | 268 | 296 | 77.49% | [yaml](table_master.yaml) | [ckpt](https://download-mindspore.osinfra.cn/toolkits/mindocr/tablemaster/table_master-78bf35bb.ckpt) |
+
+
+#### 注释:
+- TableMaster的训练时长受数据处理部分和不同运行环境的影响非常大。
+- 链接中MindIR导出时的输入Shape为`(1,3,480,480)` 。
diff --git a/examples/license_plate_detection_and_recognition/README.md b/examples/license_plate_detection_and_recognition/README.md
index 8ffacc56a..8e3e5fef8 100644
--- a/examples/license_plate_detection_and_recognition/README.md
+++ b/examples/license_plate_detection_and_recognition/README.md
@@ -252,7 +252,7 @@ msrun --worker_num=2 --local_worker_num=2 python tools/train.py --config configs
# Based on verification, binding cores usually results in performance acceleration. Please configure the parameters and run.
msrun --bind_core=True --worker_num=2 --local_worker_num=2 python tools/train.py --config configs/det/dbnet/db_r50_ccpd.yaml --device_target Ascend/GPU
```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).
## Test
diff --git a/examples/license_plate_detection_and_recognition/README_CN.md b/examples/license_plate_detection_and_recognition/README_CN.md
index 87dd178d9..792e5d37f 100644
--- a/examples/license_plate_detection_and_recognition/README_CN.md
+++ b/examples/license_plate_detection_and_recognition/README_CN.md
@@ -252,7 +252,7 @@ msrun --worker_num=2 --local_worker_num=2 python tools/train.py --config configs
# 经验证,绑核在大部分情况下有性能加速,请配置参数并运行
msrun --bind_core=True --worker_num=2 --local_worker_num=2 python tools/train.py --config configs/det/dbnet/db_r50_ccpd.yaml --device_target Ascend/GPU
```
-**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**注意:** 有关 msrun 配置的更多信息,请参考[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).
## 测试