Multilingual Automatic Document Classification Analysis and Translation (MADCAT)

The goal of the Multilingual Automatic Document Classification Analysis and Translation (MADCAT) program is to automatically convert foreign-language text images into English transcripts, thus eliminating the need for linguists and analysts while automatically providing relevant, distilled, actionable information to military command and personnel in a timely fashion.

Program Manager: Dr. David Doermann


The content below has been generated by organizations that are partially funded by DARPA; the views and conclusions contained therein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA or the U.S. Government.

Last updated: November 13, 2015

LDC: Annotation Trees: LDC's Customizable, Extensible, Scalable Annotation Infrastructure
(Team BBN) Columbia University, Safaba Translation Solutions, George Washington University: Reranking with Linguistic and Semantic Features for Arabic Optical Character Recognition
BBN: Applying Discriminatively Optimized Feature Transform for HMM-based Off-Line Handwriting Recognition
BBN: Statistical Machine Translation as a Language Model for Handwriting Recognition
BBN: Detecting Near-Duplicate Document Images Using Interest Point Matching
BBN: Document Recognition and Translation System for Unconstrained Arabic Documents
BBN: Local Segmentation of Touching Characters Using Contour Based Shape Decomposition
BBN: Detecting OOV Names in Arabic Handwritten Data
BBN: Arabic Text Recognition Using a Script-Independent Methodology: A Unified HMM-Based Approach for Machine-Printed and Handwritten Text
BBN: James-Stein Type Center Pixel Weights for Non-Local Means Image Denoising
BBN: Probabilistic Non-Local Means
BBN: Preprocessing Issues in Arabic OCR
BBN: Model Based Table Cell Detection and Content Extraction from Degraded Document Images
BBN: Handwritten Arabic Text Recognition Using Deep Belief Networks
BBN: Ensemble of Biased Learners for Offline Arabic Handwriting Recognition
BBN: Machine Learning in Handwritten Arabic Text Recognition
BBN: Scene Text Detection via Integrated Discrimination of Component Appearance and Consensus
BBN: Using Models for Automatic Classification and Segmentation of Tables in Noisy Offline Images
BBN: Table Form Detection and Segmentation Using an Iterative Registration Algorithm
BBN: Confusion Network-Based Recurrent Neural Network Language Modeling for Chinese OCR Error Detection
BBN: Text Classification via iVector Based Feature Representation
BBN: Boost Accuracy Using iVector Based Combination Approach
BBN: Applications of Recurrent Neural Network Language Model in Offline Handwriting Recognition and Word Spotting
BBN: Sentence Boundary Detection for Arabic Image Documents and its Effects on Machine Translation
BBN: Progress in the Raytheon BBN Arabic Offline Handwriting Recognition System
BBN: Unsupervised Classification of Structurally Similar Document Images
BBN: Document Image Quality Assessment: A Brief Survey
BBN: A Robust Table Registration Method for Batch Table OCR Processing
BBN: Integrating Natural Language Processing with Image Document Analysis: What We Learned from Two Real World Applications
BBN: Text Detection and Recognition in Natural Scenes and Consumer Videos
(Team LDC) University of Maryland: Linguistic Resources for Handwriting Recognition and Translation Evaluation
(Team LDC) University of Maryland: Linguistic Resources for the 2013 NIST Open Handwriting Recognition and Translation Evaluation
BBN: Page Rule-Line Removal Using Linear Subspaces in Monochromatic Handwritten Arabic Documents
BBN: Voronoi++: A Dynamic Page Segmentation Approach Based on Voronoi and Docstrum Features
BBN: Improvements in BBN's HMM-Based Offline Arabic Handwriting Recognition System
BBN: Unsupervised HMM Adaptation Using Page Style Clustering
BBN: Stochastic Segment Modeling for Offline Handwriting Recognition
BBN: A Steerable Directional Local Profile Technique for Extraction of Handwritten Arabic Text Lines
BBN: Gabor Features for Offline Arabic Handwriting Recognition
BBN: The BBN Document Analysis Service: A Platform for Multilingual Document Translation
BBN: Context-Aware and Content-Based Dynamic Voronoi Page Segmentation
BBN: Handwritten Arabic Text Line Segmentation Using Affinity Propagation
BBN: Improvements in HMM Adaptation for Handwriting Recognition Using Writer Identification and Duration Adaptation
BBN: Stochastic Segment Model Adaptation for Offline Handwriting Recognition
BBN: Consensus Network Based Hypotheses Combination for Arabic Offline Handwriting Recognition
BBN: Removing Rule-Lines from Binary Handwritten Arabic Document Images Using Directional Local Profile
BBN: Shape-DNA: Effective Character Restoration and Enhancement for Arabic Text Documents
BBN: Shape Codebook Based Handwritten and Machine Printed Text Zone Extraction
BBN: Stroke-Like Pattern Noise Removal in Binary Document Images
BBN: Segmentation of Handwritten Textlines in Presence of Touching Components
BBN: Template Based Segmentation of Touching Components in Handwritten Text Lines
BBN: Fast Rule-Line Removal Using Integral Images and Support Vector Machines
BBN: Document Image Classification and Labeling Using Multiple Instance Learning
BBN: Graph Clustering-Based Ensemble Method for Handwritten Text Line Segmentation
BBN: OCR-Driven Writer Identification and Adaptation in an HMM Handwriting Recognition System
BBN: Handwritten and Typewritten Text Identification and Recognition Using Hidden Markov Models
BBN: Text Extraction from Video Using Conditional Random Fields
BBN: Image Enhancement for Degraded Binary Document Images
BBN: No-Reference Image Quality Assessment Based on Visual Codebook
BBN: Automated Image Quality Assessment for Camera-Captured OCR
BBN: Reranking with Linguistic and Semantic Features for Arabic Optical Character Recognition
BBN: Some Recent Advances in Offline Handwriting Recognition
BBN: Learning Features for Predicting OCR Accuracy
BBN: Logo Retrieval in Document Images
BBN: Learning Text-line Segmentation Using Codebooks and Graph Partitioning
BBN: Learning Document Structure for Retrieval and Classification
BBN: Combining Preference and Absolute Judgements in a Crowd-Sourced Setting