EADST

Code for SPIE paper - CEIR

CEIR

This project accompanies the SPIE paper "Novel Receipt Recognition with Deep Learning Algorithms". In the paper, we propose a novel end-to-end receipt recognition system for capturing effective information from receipts (CEIR).

CEIR code and results have been made available at: CEIR code

CEIR system demo is available at: CEIR Demo

CEIR has three parts: preprocessing, detection, and recognition.

Introduction

In preprocessing, the image is converted to grayscale and its gradient is computed with the Sobel operator. The outline of the receipt area is then determined by morphological transformations with an elliptical kernel.

In text detection, a modified connectionist text proposal network (CTPN) locates the text regions. The PyTorch implementation of detection is based on CTPN.

In text recognition, a convolutional recurrent neural network (CRNN) extracts the characters from the receipt. Its weights are updated using connectionist temporal classification (CTC) with maximum entropy regularization as the loss function. The PyTorch implementation of recognition is based on CRNN and ENESCTC.
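A network trained with a CTC loss emits one label per time step, including a special blank label, so its output has to be collapsed into a string. The source does not say how CEIR decodes; the sketch below shows greedy decoding, one common choice. The function name, the blank index of 0, and the alphabet layout are assumptions, and the per-frame argmax is assumed to have been taken already.

```python
def ctc_greedy_decode(frame_labels, alphabet):
    """Collapse repeated labels, then drop blanks (label 0 is assumed to be the blank)."""
    blank = 0
    chars = []
    previous = blank
    for label in frame_labels:
        # Emit a character only when the label is not blank and differs from the previous frame.
        if label != blank and label != previous:
            chars.append(alphabet[label - 1])  # label k maps to alphabet[k - 1]
        previous = label
    return "".join(chars)
```

A blank between two identical labels keeps them as separate characters, which is how doubled characters survive the collapse step.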

We validate our system on the Scanned Receipts Optical Character Recognition and Information Extraction (SROIE) dataset.

Dependency

Python 3.6.3

  1. torch==1.4
  2. torchvision
  3. opencv-python
  4. lmdb

Prediction

  1. Download the pre-trained model from Google Drive and put the file in the ./detection/output/ folder.

  2. Rename the input image in the CEIR folder to demo.jpg.

  3. Run python ceir_crop.py for stage 1.
  4. Run python ceir_detect.py for stage 2.
  5. Run python ceir_recognize.py for stage 3.

  6. The result will be saved in ./result/.
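The three stage scripts above can also be chained from a single Python script. This is a small convenience sketch rather than part of the repository: `STAGES`, `build_commands`, and `run_pipeline` are hypothetical names, and the script is assumed to be run from the CEIR folder.

```python
import subprocess
import sys

# Stage scripts, in the order given in the steps above.
STAGES = ["ceir_crop.py", "ceir_detect.py", "ceir_recognize.py"]


def build_commands(python=sys.executable):
    """Return one command (as an argument list) per stage."""
    return [[python, script] for script in STAGES]


def run_pipeline(runner=subprocess.run):
    """Run the stages in order, stopping at the first failure."""
    for command in build_commands():
        # check=True raises CalledProcessError if a stage exits with a non-zero status.
        runner(command, check=True)
```

Calling `run_pipeline()` runs stage 1 through stage 3. Because a failing stage raises an exception, later stages never run on missing intermediate output.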

Training

  1. Put the dataset in ./dataset/train/image and ./dataset/train/label.

  2. Preprocess parameters can be changed in ./preprocess/crop.py.

  3. For detection, set the configuration in ./detection/config.py, then run python train.py in the detection folder.

  4. For recognition, change trainroot and other parameters in train.sh, then run sh train.sh.
