Papers

Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

intro: Google. Ian J. Goodfellow
arxiv: https://arxiv.org/abs/1312.6082

End-to-End Text Recognition with Convolutional Neural Networks

Word Spotting and Recognition with Embedded Attributes

paper: http://ieeexplore.ieee.org.sci-hub.org/xpl/articleDetails.jsp?arnumber=6857995&filter%3DAND%28p_IS_Number%3A6940341%29

Reading Text in the Wild with Convolutional Neural Networks

arxiv: http://arxiv.org/abs/1412.1842
homepage: http://www.robots.ox.ac.uk/~vgg/publications/2016/Jaderberg16/
demo: http://zeus.robots.ox.ac.uk/textsearch/#/search/
code: http://www.robots.ox.ac.uk/~vgg/research/text/

Deep structured output learning for unconstrained text recognition

intro: “propose an architecture consisting of a character sequence CNN and an N-gram encoding CNN which act on an input image in parallel and whose outputs are utilized along with a CRF model to recognize the text content present within the image.”
arxiv: http://arxiv.org/abs/1412.5903

Deep Features for Text Spotting

Reading Scene Text in Deep Convolutional Sequences

arxiv: http://arxiv.org/abs/1506.04395

DeepFont: Identify Your Font from An Image

arxiv: http://arxiv.org/abs/1507.03196

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

intro: Convolutional Recurrent Neural Network
arxiv: http://arxiv.org/abs/1507.05717
github: https://github.com/bgshih/crnn
github: https://github.com/meijieru/crnn.pytorch

Recursive Recurrent Nets with Attention Modeling for OCR in the Wild

arxiv: http://arxiv.org/abs/1603.03101

Writer-independent Feature Learning for Offline Signature Verification using Deep Convolutional Neural Networks

arxiv: http://arxiv.org/abs/1604.00974

DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images

arxiv: http://arxiv.org/abs/1605.07314

End-to-End Interpretation of the French Street Name Signs Dataset

End-to-End Subtitle Detection and Recognition for Videos in East Asian Languages via CNN Ensemble with Near-Human-Level Performance

arxiv: https://arxiv.org/abs/1611.06159

Smart Library: Identifying Books in a Library using Richly Supervised Deep Scene Text Reading

arxiv: https://arxiv.org/abs/1611.07385

Improving Text Proposals for Scene Images with Fully Convolutional Networks

intro: Universitat Autonoma de Barcelona (UAB) & University of Florence
intro: International Conference on Pattern Recognition (ICPR) - DLPR (Deep Learning for Pattern Recognition) workshop
arxiv: https://arxiv.org/abs/1702.05089

Scene Text Eraser

arxiv: https://arxiv.org/abs/1705.02772

Attention-based Extraction of Structured Information from Street View Imagery

intro: University College London & Google Inc
arxiv: https://arxiv.org/abs/1704.03549
github: https://github.com/tensorflow/models/tree/master/attention_ocr

STN-OCR: A single Neural Network for Text Detection and Text Recognition

arxiv: https://arxiv.org/abs/1707.08831
github(MXNet): https://github.com/Bartzi/stn-ocr

Text Detection

Object Proposals for Text Extraction in the Wild

intro: ICDAR 2015
arxiv: http://arxiv.org/abs/1509.02317
github: https://github.com/lluisgomez/TextProposals

Text-Attentional Convolutional Neural Networks for Scene Text Detection

arxiv: http://arxiv.org/abs/1510.03283

Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network

arxiv: http://arxiv.org/abs/1603.09423

Synthetic Data for Text Localisation in Natural Images

intro: CVPR 2016
project page: http://www.robots.ox.ac.uk/~vgg/data/scenetext/
arxiv: http://arxiv.org/abs/1604.06646
paper: http://www.robots.ox.ac.uk/~vgg/data/scenetext/gupta16.pdf
github: https://github.com/ankush-me/SynthText

Scene Text Detection via Holistic, Multi-Channel Prediction

arxiv: http://arxiv.org/abs/1606.09002

Detecting Text in Natural Image with Connectionist Text Proposal Network

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1609.03605
github(Caffe): https://github.com/tianzhi0549/CTPN
demo: http://textdet.com/

TextBoxes: A Fast Text Detector with a Single Deep Neural Network

intro: AAAI 2017
arxiv: https://arxiv.org/abs/1611.06779
github(Caffe): https://github.com/MhLiao/TextBoxes

Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection

intro: CVPR 2017
intro: F-measure 70.64%, outperforming the existing state-of-the-art method with F-measure 63.76%
arxiv: https://arxiv.org/abs/1703.01425

Detecting Oriented Text in Natural Images by Linking Segments

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1703.06520
github(Tensorflow): https://github.com/dengdan/seglink

Deep Direct Regression for Multi-Oriented Scene Text Detection

arxiv: https://arxiv.org/abs/1703.08289

Cascaded Segmentation-Detection Networks for Word-Level Text Spotting

arxiv: https://arxiv.org/abs/1704.00834

Text-Detection-using-py-faster-rcnn-framework

github: https://github.com/jugg1024/Text-Detection-with-FRCN

WordFence: Text Detection in Natural Images with Border Awareness

intro: ICIP 2017
arxiv: https://arxiv.org/abs/1705.05483

SSD-text detection: Text Detector

intro: A modified SSD model for text detection
github: https://github.com/oyxhust/ssd-text_detection

R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection

intro: Samsung R&D Institute China
arxiv: https://arxiv.org/abs/1706.09579

R-PHOC: Segmentation-Free Word Spotting using CNN

intro: ICDAR 2017
arxiv: https://arxiv.org/abs/1707.01294

Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks

arxiv: https://arxiv.org/abs/1707.03985

EAST: An Efficient and Accurate Scene Text Detector

Text Recognition

Sequence to sequence learning for unconstrained scene text recognition

intro: master's thesis
arxiv: http://arxiv.org/abs/1607.06125

Drawing and Recognizing Chinese Characters with Recurrent Neural Network

arxiv: https://arxiv.org/abs/1606.06539

Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition

intro: correct rates: Dataset-CASIA 97.10% and Dataset-ICDAR 97.15%
arxiv: https://arxiv.org/abs/1610.02616

Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition

arxiv: https://arxiv.org/abs/1610.04057

Visual attention models for scene text recognition

arxiv: https://arxiv.org/abs/1706.01487

Breaking Captcha

Using deep learning to break a Captcha system

intro: “Using Torch code to break simplecaptcha with 92% accuracy”
blog: https://deepmlblog.wordpress.com/2016/01/03/how-to-break-a-captcha-system/
github: https://github.com/arunpatala/captcha

Breaking reddit captcha with 96% accuracy

I’m not a human: Breaking the Google reCAPTCHA

paper: https://www.blackhat.com/docs/asia-16/materials/asia-16-Sivakorn-Im-Not-a-Human-Breaking-the-Google-reCAPTCHA-wp.pdf

Neural Net CAPTCHA Cracker

Recurrent neural networks for decoding CAPTCHAS

Reading irctc captchas with 95% accuracy using deep learning

github: https://github.com/arunpatala/captcha.irctc

End-to-End OCR: A CNN-Based Implementation

blog: http://blog.xlvector.net/2016-05/mxnet-ocr-cnn/

I Am Robot: (Deep) Learning to Break Semantic Image CAPTCHAs

intro: Automatically solves 70.78% of image reCaptcha challenges while requiring only 19 seconds per challenge. Applied to the Facebook image captcha, it achieves an accuracy of 83.5%.
paper: http://www.cs.columbia.edu/~polakis/papers/sivakorn_eurosp16.pdf

SimGAN-Captcha

intro: Solve captcha without manually labeling a training set
github: https://github.com/rickyhan/SimGAN-Captcha

Handwriting Recognition

High Performance Offline Handwritten Chinese Character Recognition Using GoogLeNet and Directional Feature Maps

arxiv: http://arxiv.org/abs/1505.04925
github: https://github.com/zhongzhuoyao/HCCR-GoogLeNet

Recognize your handwritten numbers

blog: https://medium.com/@o.kroeger/recognize-your-handwritten-numbers-3f007cbe46ff#.jllz62xgu

Handwritten Digit Recognition using Convolutional Neural Networks in Python with Keras

blog: http://machinelearningmastery.com/handwritten-digit-recognition-using-convolutional-neural-networks-python-keras/

MNIST Handwritten Digit Classifier

github: https://github.com/karandesai-96/digit-classifier

How to Recognize a Handwritten Digit Dataset with a Convolutional Neural Network (CNN)?

blog: http://www.cnblogs.com/charlotte77/p/5671136.html

LeNet – Convolutional Neural Network in Python

blog: http://www.pyimagesearch.com/2016/08/01/lenet-convolutional-neural-network-in-python/

Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention

arxiv: http://arxiv.org/abs/1604.03286

MLPaint: the Real-Time Handwritten Digit Recognizer

Training a Computer to Recognize Your Handwriting

blog: https://medium.com/@annalyzin/training-a-computer-to-recognize-your-handwriting-24b808fb584#.gd4pb9jk2

Using TensorFlow to create your own handwriting recognition engine

Building a Deep Handwritten Digits Classifier using Microsoft Cognitive Toolkit

Hand Writing Recognition Using Convolutional Neural Networks

intro: This CNN-based model for recognition of handwritten digits attains a validation accuracy of 99.2% after training for 12 epochs. It is trained on the MNIST dataset on Kaggle.
github: https://github.com/ayushoriginal/HandWritingRecognition-CNN

Design of a Very Compact CNN Classifier for Online Handwritten Chinese Character Recognition Using DropWeight and Global Pooling

intro: The compact model is only 0.57 MB, and its performance decreases by only 0.91%.
arxiv: https://arxiv.org/abs/1705.05207

Plate Recognition

Reading Car License Plates Using Deep Convolutional Neural Networks and LSTMs

arxiv: http://arxiv.org/abs/1601.05610

Number plate recognition with Tensorflow

blog: http://matthewearl.github.io/2016/05/06/cnn-anpr/
github(Deep ANPR): https://github.com/matthewearl/deep-anpr

end-to-end-for-plate-recognition

github: https://github.com/szad670401/end-to-end-for-chinese-plate-recognition

Segmentation-free Vehicle License Plate Recognition using ConvNet-RNN

intro: International Workshop on Advanced Image Technology (IWAIT 2017), January 8-10, 2017, Penang, Malaysia
arxiv: https://arxiv.org/abs/1701.06439

License Plate Detection and Recognition Using Deeply Learned Convolutional Neural Networks

arxiv: https://arxiv.org/abs/1703.07330
api: https://www.sighthound.com/products/cloud

Adversarial Generation of Training Examples for Vehicle License Plate Recognition

arxiv: https://arxiv.org/abs/1707.03124

Blogs

Applying OCR Technology for Receipt Recognition

Hacking MNIST in 30 lines of Python

Optical Character Recognition Using One-Shot Learning, RNN, and TensorFlow

blog: https://blog.altoros.com/optical-character-recognition-using-one-shot-learning-rnn-and-tensorflow.html

Creating a Modern OCR Pipeline Using Computer Vision and Deep Learning

blog: https://blogs.dropbox.com/tech/2017/04/creating-a-modern-ocr-pipeline-using-computer-vision-and-deep-learning/

Projects

ocropy: Python-based tools for document analysis and OCR

github: https://github.com/tmbdev/ocropy

Extracting text from an image using Ocropus

blog: http://www.danvk.org/2015/01/09/extracting-text-from-an-image-using-ocropus.html

CLSTM: A small C++ implementation of LSTM networks, focused on OCR

github: https://github.com/tmbdev/clstm

OCR text recognition using tensorflow with attention

github: https://github.com/pannous/caffe-ocr
github: https://github.com/pannous/tensorflow-ocr

Digit Recognition via CNN: digital meter numbers detection

github(caffe): https://github.com/SHUCV/digit

Attention-OCR: Visual Attention based OCR

github: https://github.com/da03/Attention-OCR

umaru: An OCR system based on Torch, using LSTM/GRU-RNN and CTC, with reference to the works of rnnlib and clstm

github: https://github.com/edward-zhu/umaru

Tesseract.js: Pure Javascript OCR for 62 Languages

homepage: http://tesseract.projectnaptha.com/
github: https://github.com/naptha/tesseract.js

DeepHCCR: Offline Handwritten Chinese Character Recognition based on GoogLeNet and AlexNet (With CaffeModel)

github: https://github.com/chongyangtao/DeepHCCR

deep ocr: make a better Chinese character recognition OCR than Tesseract

github: https://github.com/JinpengLI/deep_ocr

Practical Deep OCR for scene text using CTPN + CRNN

github: https://github.com/AKSHAYUBHAT/DeepVideoAnalytics/blob/master/notebooks/OCR/readme.md

Videos

LSTMs for OCR

youtube: https://www.youtube.com/watch?v=5vW8faXvnrc

Resources

Deep Learning for OCR

github: https://github.com/hs105/Deep-Learning-for-OCR

Scene Text Localization & Recognition Resources

intro: A curated list of resources dedicated to scene text localization and recognition
github: https://github.com/chongyangtao/Awesome-Scene-Text-Recognition

Scene Text Localization & Recognition Resources

intro: A collection of papers and resources on image text localization and recognition
github: https://github.com/whitelok/image-text-localization-recognition/blob/master/README.zh-cn.md

awesome-ocr: A curated list of promising OCR resources

github: https://github.com/wanghaisheng/awesome-ocr