EADST

Data Augmentation for Handwriting Recognition

I read some data augmentation papers this week.

Data augmentation methods fall into three main categories:

  • Spatial transformation
  • Color change
  • Information deletion
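
As a rough illustration of the three categories, here is a minimal NumPy sketch with one toy operation per category. The function names (`shift_right`, `adjust_brightness`, `erase_block`) are made up for this example and do not come from any of the papers below; real pipelines use richer transforms.

```python
import numpy as np


def shift_right(image, pixels):
    """Spatial transformation: move the content `pixels` (>= 0) columns to the right."""
    out = np.zeros_like(image)
    out[:, pixels:] = image[:, : image.shape[1] - pixels]
    return out


def adjust_brightness(image, factor):
    """Color change: scale the intensities and clip back to the uint8 range."""
    scaled = image.astype(float) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)


def erase_block(image, top, left, size):
    """Information deletion: zero out one square block."""
    out = image.copy()
    out[top:top + size, left:left + size] = 0
    return out


if __name__ == "__main__":
    img = np.full((4, 6), 100, dtype=np.uint8)
    print(shift_right(img, 2))
    print(adjust_brightness(img, 1.5))
    print(erase_block(img, 1, 1, 2))
```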

Here are four papers: three use spatial transformation and one uses information deletion.

  1. Bhunia, Ayan & Das, Abhirup & Bhunia, Ankan & Perla, Sai & Roy, Partha. (2019). Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning. 10.1109/CVPR.2019.00490.

    • They propose the Adversarial Feature Deformation Module (AFDM), inspired by Spatial Transformer Networks (STN).

      • Localisation Network: uses Generative Adversarial Networks (GANs) to generate the transformation matrix.
      • Grid Generator: transforms the feature maps with the matrix.
      • Sampler: updates the weights based on the relative positions of neighboring points.

      image
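
To make the grid generator and sampler steps concrete, here is a minimal NumPy sketch of the STN-style pipeline. It is not the AFDM code: the localisation network is left out and the matrix `theta` is supplied by hand, the transform is assumed to be a 2x3 affine matrix, and the names `affine_grid` and `bilinear_sample` are my own. Coordinates outside the feature map are clamped to the border, which is also an assumption.

```python
import numpy as np


def affine_grid(theta, height, width):
    """Grid generator: for every output pixel, compute the source
    coordinates (x, y) in the normalized range [-1, 1]."""
    ys, xs = np.meshgrid(
        np.linspace(-1.0, 1.0, height),
        np.linspace(-1.0, 1.0, width),
        indexing="ij",
    )
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (H, W, 3)
    return coords @ np.asarray(theta, dtype=float).T         # (H, W, 2)


def bilinear_sample(feature, grid):
    """Sampler: read a 2-D feature map at the grid positions, weighting
    the four neighboring pixels by their relative distance."""
    h, w = feature.shape
    # Convert normalized coordinates to pixel coordinates, clamped to the border.
    x = np.clip((grid[..., 0] + 1.0) * (w - 1) / 2.0, 0, w - 1)
    y = np.clip((grid[..., 1] + 1.0) * (h - 1) / 2.0, 0, h - 1)
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    wx, wy = x - x0, y - y0
    top = feature[y0, x0] * (1 - wx) + feature[y0, x0 + 1] * wx
    bottom = feature[y0 + 1, x0] * (1 - wx) + feature[y0 + 1, x0 + 1] * wx
    return top * (1 - wy) + bottom * wy


if __name__ == "__main__":
    fmap = np.arange(12, dtype=float).reshape(3, 4)
    flip = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # mirror along the x axis
    print(bilinear_sample(fmap, affine_grid(flip, 3, 4)))
```

Because every step is a weighted sum, the same computation written in an autodiff framework lets gradients flow back to whatever produces `theta`.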

  2. Luo, Canjie & Zhu, Yuanzhi & Jin, Lianwen & Wang, Yongpan. (2020). Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition.

    • They combine moving least squares with a learnable agent to augment data.

      Code

      image
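
For intuition about the moving least squares part, here is a small NumPy sketch that deforms 2-D points so that a set of control points moves to new positions. Several things are assumptions on my side: it uses the affine variant of moving least squares (which variant the paper uses is not stated here), the control point targets are given by hand instead of by the learnable agent, and the name `mls_affine_deform` and its parameters are hypothetical.

```python
import numpy as np


def mls_affine_deform(points, src_ctrl, dst_ctrl, alpha=1.0, eps=1e-8):
    """Move `points` (N, 2) so that src_ctrl (K, 2) lands on dst_ctrl (K, 2).

    For each point a separate affine map is fitted by weighted least
    squares, with weights that decay with distance to the control points.
    """
    points = np.asarray(points, dtype=float)
    p = np.asarray(src_ctrl, dtype=float)
    q = np.asarray(dst_ctrl, dtype=float)
    out = np.empty_like(points)
    for n, v in enumerate(points):
        dist2 = np.sum((p - v) ** 2, axis=1)
        w = 1.0 / (dist2 ** alpha + eps)          # closer control points weigh more
        p_star = w @ p / w.sum()                  # weighted centroids
        q_star = w @ q / w.sum()
        p_hat, q_hat = p - p_star, q - q_star
        a = (p_hat * w[:, None]).T @ p_hat        # 2x2 weighted moment matrix
        b = (p_hat * w[:, None]).T @ q_hat        # 2x2 weighted cross moments
        m = np.linalg.pinv(a) @ b                 # best local affine matrix
        out[n] = (v - p_star) @ m + q_star
    return out


if __name__ == "__main__":
    src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    dst = src.copy()
    dst[3] = [1.2, 1.1]                           # drag one corner outward
    print(mls_affine_deform([[0.5, 0.5], [0.9, 0.9]], src, dst))
```

To warp an image with this, one would apply the function to the pixel grid and resample; the loop over points is kept for readability, not speed.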

  3. C. Wigington, S. Stewart, B. Davis, B. Barrett, B. Price and S. Cohen, "Data Augmentation for Recognition of Handwritten Words and Lines Using a CNN-LSTM Network," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 639-645, doi: 10.1109/ICDAR.2017.110.

    • They reshape the characters using a normal distribution whose parameters come from the normalization step.

      image
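
A toy version of this idea, assuming a much simpler setup than the paper: control positions along the width of a line image are perturbed with normally distributed noise, and the columns are resampled accordingly. In the paper the distribution parameters come from the normalization step; here `sigma` and `spacing` are hand-picked values, and `random_column_warp` is a name I made up.

```python
import numpy as np


def random_column_warp(image, spacing=8, sigma=1.5, rng=None):
    """Stretch and squeeze an image horizontally with Gaussian-perturbed
    control positions (nearest-neighbor resampling of columns)."""
    rng = np.random.default_rng() if rng is None else rng
    w = image.shape[1]
    ctrl = np.linspace(0, w - 1, num=max(2, w // spacing + 1))
    moved = ctrl + rng.normal(0.0, sigma, size=ctrl.shape)
    moved = np.sort(np.clip(moved, 0, w - 1))  # keep the mapping monotonic
    moved[0], moved[-1] = 0, w - 1             # pin both ends of the line
    # Source column for every output column, by piecewise-linear interpolation.
    src = np.interp(np.arange(w), ctrl, moved)
    src = np.clip(np.rint(src), 0, w - 1).astype(int)
    return image[:, src]


if __name__ == "__main__":
    line = np.arange(4 * 32).reshape(4, 32)
    warped = random_column_warp(line, rng=np.random.default_rng(0))
    print(warped.shape)
```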

  4. Pengguang Chen. GridMask data augmentation. arXiv preprint arXiv:2001.04086, 2020.

    • They use a grid mask strategy that masks out blocks of the image arranged in a regular grid.
    • It is applied to handwriting recognition in this paper: GridMask Based Data Augmentation for Bengali Handwritten Grapheme Classification

      Code

      image
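
The masking idea can be sketched in a few lines of NumPy. This is a simplified version written for illustration, not the released GridMask code: there is no rotation, the offset is fixed instead of random, and the parameter names `d`, `ratio`, and `offset` are my own choices.

```python
import numpy as np


def grid_mask(image, d=8, ratio=0.5, offset=(0, 0)):
    """Zero out square blocks placed on a regular grid.

    d      : side length of one grid unit in pixels
    ratio  : fraction of each unit (per axis) that is always kept
    offset : (row, column) shift of the grid
    """
    h, w = image.shape[:2]
    keep = int(round(d * ratio))
    rows = (np.arange(h) + offset[0]) % d
    cols = (np.arange(w) + offset[1]) % d
    # A pixel is dropped only when it falls in the masked part on both axes.
    drop = (rows[:, None] >= keep) & (cols[None, :] >= keep)
    out = image.copy()
    out[drop] = 0
    return out


if __name__ == "__main__":
    img = np.ones((16, 16), dtype=np.uint8)
    print(grid_mask(img, d=8, ratio=0.5))
```

With `d=8` and `ratio=0.5`, each 8x8 unit loses one 4x4 block, so a quarter of the pixels are removed while every row and column still keeps some content.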
