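We aim to establish a unified benchmark for training and evaluating scene text detection and recognition models. Based on this benchmark, we introduce OpenOCR, a general OCR system that balances accuracy and efficiency. This repository also serves as the official codebase of the OCR team from the FVL Laboratory, Fudan University.
We sincerely welcome researchers to recommend OCR-related algorithms and to point out any factual errors or code bugs. Upon receiving suggestions, we will promptly evaluate them and carefully reproduce them. We look forward to working with you to advance OpenOCR and to keep contributing to the OCR community!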
🔥OpenOCR: A general OCR system with accuracy and efficiency
🔥SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition (ICCV 2025)
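Note: OpenOCR supports inference with both ONNX and PyTorch, and the two environments are independent of each other. ONNX inference does not require PyTorch to be installed, and vice versa.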
pip install openocr-python
pip install onnxruntime
from openocr import OpenOCR
onnx_engine = OpenOCR(backend='onnx', device='cpu')
img_path = '/path/img_path or /path/img_file'
result, elapse = onnx_engine(img_path)
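Because the ONNX and PyTorch environments are independent, it can be useful to check which runtime is actually installed before constructing an engine. The following is a minimal sketch using only the standard library. `detect_backends` is a hypothetical helper, not part of OpenOCR, and the labels it returns are descriptive only; they are not claimed to be valid values for the `backend` argument.

```python
from importlib.util import find_spec


def detect_backends(finder=find_spec):
    """Return labels of the inference runtimes that can be imported.

    Hypothetical helper, not part of OpenOCR. `finder` is injectable so the
    function can be exercised without installing either runtime.
    """
    # Map a descriptive label to the top-level module that provides it.
    candidates = {"onnx": "onnxruntime", "pytorch": "torch"}
    return [label for label, module in candidates.items()
            if finder(module) is not None]


if __name__ == "__main__":
    # Prints a list such as ['onnx'], depending on the local environment.
    print(detect_backends())
```

Using `find_spec` avoids importing the runtimes themselves, so the check stays cheap even when PyTorch is present.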
Python >= 3.7
conda create -n openocr python==3.8
conda activate openocr
# Install the GPU version
conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=11.8 -c pytorch -c nvidia
# Or the CPU version
conda install pytorch torchvision torchaudio cpuonly -c pytorch
Install OpenOCR:
pip install openocr-python
Usage example:
from openocr import OpenOCR
engine = OpenOCR()
img_path = '/path/img_path or /path/img_file'
result, elapse = engine(img_path)
# Server mode
# engine = OpenOCR(mode='server')
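The `img_path` placeholder above stands for either an image folder or a single image file. If you want to enumerate the images yourself, for example to log or time each one, a small helper along these lines may do. This is a sketch under stated assumptions: `list_images` and the suffix set are hypothetical and not part of OpenOCR, and the library may apply different filtering internally.

```python
from pathlib import Path

# Assumed set of image suffixes; adjust to your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(path):
    """Return image paths for a single file or for every image in a folder."""
    p = Path(path)
    if p.is_file():
        # A single file is returned as-is, without suffix filtering.
        return [p]
    if p.is_dir():
        # Non-recursive scan, sorted for a reproducible processing order.
        return sorted(f for f in p.iterdir()
                      if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES)
    raise FileNotFoundError(f"No such file or directory: {path}")
```

Each returned path can then be converted with `str(...)` and passed to the engine one at a time.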
git clone https://github.com/Topdu/OpenOCR.git
cd OpenOCR
pip install -r requirements.txt
wget https://github.com/Topdu/OpenOCR/releases/download/develop0.0.1/openocr_det_repvit_ch.pth
wget https://github.com/Topdu/OpenOCR/releases/download/develop0.0.1/openocr_repsvtr_ch.pth
# Server recognition model
# wget https://github.com/Topdu/OpenOCR/releases/download/develop0.0.1/openocr_svtrv2_ch.pth
Usage commands:
# End-to-end OCR system: detection + recognition (pass either an image folder or a single image file)
python tools/infer_e2e.py --img_path=/path/img_path or /path/img_file
# Detection model only
python tools/infer_det.py --c ./configs/det/dbnet/repvit_db.yml --o Global.infer_img=/path/img_path or /path/img_file
# Recognition model only
python tools/infer_rec.py --c ./configs/rec/svtrv2/repsvtr_ch.yml --o Global.infer_img=/path/img_path or /path/img_file
pip install onnx
python tools/toonnx.py --c configs/rec/svtrv2/repsvtr_ch.yml --o Global.device=cpu
python tools/toonnx.py --c configs/det/dbnet/repvit_db.yml --o Global.device=cpu
pip install onnxruntime
# End-to-end OCR system
python tools/infer_e2e.py --img_path=/path/img_path or /path/img_file --backend=onnx --device=cpu
# Detection model
python tools/infer_det.py --c ./configs/det/dbnet/repvit_db.yml --o Global.backend=onnx Global.device=cpu Global.infer_img=/path/img_path or /path/img_file
# Recognition model
python tools/infer_rec.py --c ./configs/rec/svtrv2/repsvtr_ch.yml --o Global.backend=onnx Global.device=cpu Global.infer_img=/path/img_path or /path/img_file
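In the commands above, `/path/img_path or /path/img_file` is a placeholder for one path, so the word `or` should not be typed literally. When driving these scripts from Python, building the argument list explicitly avoids that mistake and also handles paths containing spaces. The sketch below uses only the flags and `Global.*` keys that appear in the commands above; `build_infer_cmd` itself is a hypothetical helper, not part of the repository.

```python
import sys


def build_infer_cmd(script, config, infer_img, backend=None, device=None):
    """Build an argv list for tools/infer_det.py or tools/infer_rec.py.

    `backend` and `device` are optional; when omitted, no override is passed
    and the value from the config file is used.
    """
    overrides = []
    if backend is not None:
        overrides.append(f"Global.backend={backend}")
    if device is not None:
        overrides.append(f"Global.device={device}")
    # Exactly one path: an image folder or a single image file.
    overrides.append(f"Global.infer_img={infer_img}")
    return [sys.executable, script, "--c", config, "--o", *overrides]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.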
pip install gradio==4.20.0
wget https://github.com/Topdu/OpenOCR/releases/download/develop0.0.1/OCR_e2e_img.tar
tar xf OCR_e2e_img.tar
# Launch the demo
python demo_gradio.py
Method | Venue | Training | Evaluation | Contributor |
---|---|---|---|---|
CRNN | TPAMI 2016 | ✅ | ✅ | |
ASTER | TPAMI 2019 | ✅ | ✅ | pretto0 |
NRTR | ICDAR 2019 | ✅ | ✅ | |
SAR | AAAI 2019 | ✅ | ✅ | pretto0 |
MORAN | PR 2019 | ✅ | ✅ | |
DAN | AAAI 2020 | ✅ | ✅ | |
RobustScanner | ECCV 2020 | ✅ | ✅ | pretto0 |
AutoSTR | ECCV 2020 | ✅ | ✅ | |
SRN | CVPR 2020 | ✅ | ✅ | pretto0 |
SEED | CVPR 2020 | ✅ | ✅ | |
ABINet | CVPR 2021 | ✅ | ✅ | YesianRohn |
VisionLAN | ICCV 2021 | ✅ | ✅ | YesianRohn |
PIMNet | ACM MM 2021 | TODO | | |
SVTR | IJCAI 2022 | ✅ | ✅ | |
PARSeq | ECCV 2022 | ✅ | ✅ | |
MATRN | ECCV 2022 | ✅ | ✅ | |
MGP-STR | ECCV 2022 | ✅ | ✅ | |
LPV | IJCAI 2023 | ✅ | ✅ | |
MAERec(Union14M) | ICCV 2023 | ✅ | ✅ | |
LISTER | ICCV 2023 | ✅ | ✅ | |
CDistNet | IJCV 2024 | ✅ | ✅ | YesianRohn |
BUSNet | AAAI 2024 | ✅ | ✅ | |
DCTC | AAAI 2024 | TODO | | |
CAM | PR 2024 | ✅ | ✅ | |
OTE | CVPR 2024 | ✅ | ✅ | |
CFF | IJCAI 2024 | TODO | | |
DPTR | ACM MM 2024 | | | fd-zs |
VIPTR | ACM CIKM 2024 | TODO | | |
IGTR | TPAMI 2025 | ✅ | ✅ | |
SMTR | AAAI 2025 | ✅ | ✅ | |
CPPD | TPAMI 2025 | ✅ | ✅ | |
FocalSVTR-CTC | AAAI 2025 | ✅ | ✅ | |
SVTRv2 | ICCV 2025 | ✅ | ✅ | |
ResNet+Trans-CTC | | ✅ | ✅ | |
ViT-CTC | | ✅ | ✅ | |
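Yiming Lei (pretto0), Xingsong Ye (YesianRohn), and Shuai Zhao (fd-zs) from the FVL Laboratory, Fudan University, completed most of the algorithm reproduction work under the guidance of Professor Zhineng Chen (personal homepage). Many thanks for their contributions.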
Under development
Under development
If you find our work helpful for your research, please cite:
@inproceedings{Du2024SVTRv2,
title={SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition},
author={Yongkun Du and Zhineng Chen and Hongtao Xie and Caiyan Jia and Yu-Gang Jiang},
booktitle={ICCV},
year={2025}
}
This codebase is built on PaddleOCR, PytorchOCR, and MMOCR. Thanks for their excellent work!