Data Augmentation for Handwriting Recognition

I read some data augmentation papers this week.

Data augmentation methods fall into three main categories:

  • Spatial transformation
  • Color change
  • Information deletion
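
To make the three categories concrete, here is a minimal sketch on a tiny grayscale image stored as nested lists. The function names and the choice of operations (a shift, an intensity scale, a block erase) are my own illustration and are not taken from any of the papers below.

```python
def shift_right(img, dx, fill=255):
    """Spatial transformation: move every row dx pixels to the right."""
    dx = max(0, min(dx, len(img[0])))
    return [[fill] * dx + row[:len(row) - dx] for row in img]

def scale_intensity(img, factor):
    """Color change: scale pixel intensities and clip them to 0..255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

def erase_block(img, top, left, size, fill=0):
    """Information deletion: overwrite a square block with a fill value."""
    out = [row[:] for row in img]  # copy so the input stays untouched
    for y in range(top, min(top + size, len(out))):
        for x in range(left, min(left + size, len(out[0]))):
            out[y][x] = fill
    return out

img = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
shifted = shift_right(img, 1, fill=0)
brighter = scale_intensity(img, 3.0)
erased = erase_block(img, 1, 1, 2)
```

In practice each operation would be applied with random parameters per training sample.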

Here are four papers: three use spatial transformation and one uses information deletion.

  1. Bhunia, Ayan & Das, Abhirup & Bhunia, Ankan & Perla, Sai & Roy, Partha. (2019). Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning. 10.1109/CVPR.2019.00490.

    • They propose the Adversarial Feature Deformation Module (AFDM), inspired by Spatial Transformer Networks (STN).

      • Localisation Network: uses Generative Adversarial Networks (GANs) to generate the transformation matrix.
      • Grid Generator: transforms the feature maps with that matrix.
      • Sampler: updates the weights based on the relative positions of neighboring points.

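
The grid generator and sampler steps of an STN-style module can be sketched as follows. This is only an illustration under simplifying assumptions: the transformation is a hand-picked 2x3 affine matrix `theta` on normalized coordinates rather than one predicted by a localisation network, and the adversarial training of AFDM is not reproduced. The function names are hypothetical.

```python
import math

def _normalized(k, n):
    """Map index k in 0..n-1 to a coordinate in [-1, 1]."""
    return -1.0 + 2.0 * k / (n - 1) if n > 1 else 0.0

def affine_grid(theta, height, width):
    """Grid generator: for each output pixel, compute the source coordinate."""
    grid = []
    for i in range(height):
        y = _normalized(i, height)
        row = []
        for j in range(width):
            x = _normalized(j, width)
            xs = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            ys = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((xs, ys))
        grid.append(row)
    return grid

def bilinear_sample(feature, grid):
    """Sampler: read the feature map at each grid point by bilinear
    interpolation over the four neighbors; outside points count as zero."""
    h, w = len(feature), len(feature[0])
    out = []
    for grid_row in grid:
        out_row = []
        for xs, ys in grid_row:
            # Convert normalized coordinates back to pixel coordinates.
            px = (xs + 1.0) * (w - 1) / 2.0
            py = (ys + 1.0) * (h - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            fx, fy = px - x0, py - y0
            value = 0.0
            for yy, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
                for xx, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
                    if 0 <= yy < h and 0 <= xx < w:
                        value += wy * wx * feature[yy][xx]
            out_row.append(value)
        out.append(out_row)
    return out

feature = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
warped = bilinear_sample(feature, affine_grid(identity, 3, 3))
```

With the identity matrix the output equals the input. Changing the last column of `theta` shifts the sampling positions.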

  2. Luo, Canjie & Zhu, Yuanzhi & Jin, Lianwen & Wang, Yongpan. (2020). Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition.

    • They combine moving least squares with a learnable agent to augment data.

      Code

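
To show what a moving least squares deformation does to a single point, here is a minimal sketch. It uses the affine variant of MLS for brevity, which may differ from the variant used in the paper, and the control point movements are hand-picked rather than predicted by a learnable agent. The function name `mls_affine` and its parameters are my own.

```python
def mls_affine(v, p, q, alpha=1.0, eps=1e-8):
    """Deform point v, given control points p that are moved to q.

    Needs at least three control points that are not collinear.
    """
    vx, vy = v
    # Closer control points get larger weights: w_i = 1 / |p_i - v|^(2*alpha).
    w = [1.0 / (((px - vx) ** 2 + (py - vy) ** 2) ** alpha + eps) for px, py in p]
    total = sum(w)
    # Weighted centroids of the original and the moved control points.
    pcx = sum(wi * pt[0] for wi, pt in zip(w, p)) / total
    pcy = sum(wi * pt[1] for wi, pt in zip(w, p)) / total
    qcx = sum(wi * pt[0] for wi, pt in zip(w, q)) / total
    qcy = sum(wi * pt[1] for wi, pt in zip(w, q)) / total
    # Accumulate A = sum w * ph^T ph and B = sum w * ph^T qh (2x2 each).
    a00 = a01 = a11 = 0.0
    b00 = b01 = b10 = b11 = 0.0
    for wi, (px, py), (qx, qy) in zip(w, p, q):
        phx, phy = px - pcx, py - pcy
        qhx, qhy = qx - qcx, qy - qcy
        a00 += wi * phx * phx
        a01 += wi * phx * phy
        a11 += wi * phy * phy
        b00 += wi * phx * qhx
        b01 += wi * phx * qhy
        b10 += wi * phy * qhx
        b11 += wi * phy * qhy
    # M = A^-1 B is the best weighted affine map for this point.
    det = a00 * a11 - a01 * a01
    m00 = (a11 * b00 - a01 * b10) / det
    m01 = (a11 * b01 - a01 * b11) / det
    m10 = (a00 * b10 - a01 * b00) / det
    m11 = (a00 * b11 - a01 * b01) / det
    dx, dy = vx - pcx, vy - pcy
    return (dx * m00 + dy * m10 + qcx, dx * m01 + dy * m11 + qcy)

controls = [(0, 0), (10, 0), (0, 10), (10, 10)]
moved = [(0, 0), (10, 0), (0, 10), (12, 11)]  # only one corner is dragged
new_point = mls_affine((8, 8), controls, moved)
```

Applying the function to every pixel coordinate gives a smooth warp of the whole image.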

  3. C. Wigington, S. Stewart, B. Davis, B. Barrett, B. Price and S. Cohen, "Data Augmentation for Recognition of Handwritten Words and Lines Using a CNN-LSTM Network," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 639-645, doi: 10.1109/ICDAR.2017.110.

    • They reshape the characters using a normal distribution whose parameters come from the normalization step.

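
One way to picture a distortion driven by a normal distribution is to place control points on a regular grid and jitter each one with Gaussian noise. The sketch below only produces the (source, destination) control point pairs. The names `spacing` and `sigma` are hypothetical parameters; how the paper derives them from its normalization step is not reproduced here.

```python
import random

def perturbed_grid(height, width, spacing, sigma, rng=None):
    """Return (source, destination) control point pairs for a grid warp.

    Each grid point is displaced independently in x and y by N(0, sigma).
    """
    rng = rng or random.Random()
    pairs = []
    for y in range(0, height + 1, spacing):
        for x in range(0, width + 1, spacing):
            dst = (x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            pairs.append(((x, y), dst))
    return pairs

# Illustrative values: a 32 x 128 line image with a control point every 16 px.
pairs = perturbed_grid(32, 128, spacing=16, sigma=1.5, rng=random.Random(0))
```

The pairs can then be passed to any image warping routine that interpolates between control points.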

  4. Pengguang Chen. GridMask data augmentation. arXiv preprint arXiv:2001.04086, 2020.

    • They use a grid-mask strategy that masks out a regular grid of blocks in the image.
    • GridMask is applied to handwriting recognition in this paper: GridMask Based Data Augmentation for Bengali Handwritten Grapheme Classification

      Code

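
The grid-mask idea can be sketched on a nested-list image as below. This is a simplified illustration: the parameter names (`d` for the grid period, `mask_ratio` for the masked fraction of each period, and the two offsets) are my own and do not necessarily match the paper's notation or its randomization scheme.

```python
def grid_mask(img, d, mask_ratio, offset_x=0, offset_y=0, fill=0):
    """Mask one square block per d x d grid cell.

    The block side is round(d * mask_ratio), with a minimum of 1 pixel.
    """
    side = max(1, round(d * mask_ratio))
    out = []
    for y, row in enumerate(img):
        new_row = []
        for x, p in enumerate(row):
            # A pixel is masked when it falls in the first `side` rows and
            # columns of its grid cell, measured from the offsets.
            in_block = (y - offset_y) % d < side and (x - offset_x) % d < side
            new_row.append(fill if in_block else p)
        out.append(new_row)
    return out

img = [[1] * 4 for _ in range(4)]
masked = grid_mask(img, d=2, mask_ratio=0.5)
```

Because the masked blocks are spread evenly, no single region of the character is removed entirely.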
