Commit ae84a8ed authored by Tom Breuel

Merge branch 'master' of github.com:tmbdev/ocropy

parents 3b843f5c 7eac431d
OLD
JUNK
.hg
book/
temp/
models/
@@ -13,3 +12,5 @@ build/
*.os
*.a
*.so
.~*.vue
doc/.ipynb_checkpoints/
syntax: glob
*.cmodel
MODELS/*.cmodel
.*
*~
*.o
*.so
*.a
*.err
*.log
*.os
*.pyc
*.png
*.jpg
[0-9]
_*
book
book-*
unlv
unlv-*
Volume-*
Volume_*
*[0-9][0-9][0-9][0-9]*
TAGS
build
*.db
OLD
*.tgz
*.zip
html
apidocs
JUNK
OLD
models*/*
*.orig
temp/
temp.*
*.temp
@@ -45,4 +45,5 @@ install:
script:
- mkdir ../test_folder
- cd ../test_folder
- ../ocropy/run-test
- ../ocropy/tests/run-unit
- ../ocropy/run-test-ci
------------------------
| Project Announcements
|:-----------------------
| The text line recognizer has been ported to C++ and is now a separate project, the CLSTM project, available here: https://github.com/tmbdev/clstm
| Please welcome @zuphilip and @kba as additional project maintainers. @tmb is busy developing new DNN models for document analysis (among other things). (10/15/2016)
------------------------
ocropy
======
[![Build Status](https://travis-ci.org/tmbdev/ocropy.svg)](https://travis-ci.org/tmbdev/ocropy)
[![CircleCI](https://circleci.com/gh/UB-Mannheim/ocropy/tree/pull%2F4.svg?style=svg)](https://circleci.com/gh/UB-Mannheim/ocropy/tree/pull%2F4)
[![license](https://img.shields.io/github/license/tmbdev/ocropy.svg)](https://github.com/tmbdev/ocropy/blob/master/LICENSE)
[![Wiki](https://img.shields.io/badge/wiki-11%20pages-orange.svg)](https://github.com/tmbdev/ocropy/wiki)
[![Join the chat at https://gitter.im/tmbdev/ocropy](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/tmbdev/ocropy?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
OCRopus is a collection of document analysis programs, not a turn-key OCR system.
In order to apply it to your documents, you may need to do some image preprocessing,
and possibly also train new models.
@@ -21,9 +20,6 @@ trace by default since it seems to confuse too many users).
Installing
----------
[![Build Status](https://travis-ci.org/tmbdev/ocropy.svg)](https://travis-ci.org/tmbdev/ocropy)
[![Join the chat at https://gitter.im/tmbdev/ocropy](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/tmbdev/ocropy?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
To install OCRopus dependencies system-wide:
$ sudo apt-get install $(cat PACKAGES)
@@ -31,14 +27,15 @@ To install OCRopus dependencies system-wide:
$ mv en-default.pyrnn.gz models/
$ sudo python setup.py install
Alternatively, dependencies can be installed into a [Python Virtual Environment]
(http://docs.python-guide.org/en/latest/dev/virtualenvs/):
Alternatively, dependencies can be installed into a
[Python Virtual Environment](http://docs.python-guide.org/en/latest/dev/virtualenvs/):
$ virtualenv ocropus_venv/
$ source ocropus_venv/bin/activate
$ pip install -r requirements.txt
$ wget -nd http://www.tmbdev.net/en-default.pyrnn.gz
$ mv en-default.pyrnn.gz models/
$ python setup.py install
An additional method using [Conda](http://conda.pydata.org/) is also possible:
@@ -97,6 +94,14 @@ suitable for training OCRopus with synthetic data.
## Roadmap
------------------------
| Project Announcements
|:-----------------------
| The text line recognizer has been ported to C++ and is now a separate project, the CLSTM project, available here: https://github.com/tmbdev/clstm
| New GPU-capable text line recognizers and deep-learning based layout analysis methods are in the works and will be published as separate projects some time in 2017.
| Please welcome @zuphilip and @kba as additional project maintainers. @tmb is busy developing new DNN models for document analysis (among other things). (10/15/2016)
------------------------
A lot of excellent packages have become available for deep learning, vision, and GPU computing over the last few years.
At the same time, it has become feasible now to address problems like layout analysis and text line following
through attentional and reinforcement learning mechanisms. I (@tmb) am planning on developing new software using these
......
machine:
python:
version: 2.7.12
environment:
# Set matplotlib backend to the non-interactive antigrain image lib
MPLBACKEND: Agg
dependencies:
pre:
# 'models' folder is cached, don't download twice
- cd models && wget -nc http://www.tmbdev.net/en-default.pyrnn.gz
# Pipe to cat to hide the progress bars
- pip install -r requirements.txt|cat
cache_directories:
- models
test:
override:
- PATH=$PWD:$PATH ./run-test-ci
<html><head><title>workflow.vue</title><script src="http://ajax.googleapis.com/ajax/libs/jquery/1.3.1/jquery.min.js" type="text/javascript"></script><script type="text/javascript">
jQuery.noConflict();
</script>
<script src="http://vue.tufts.edu/htmlexport-includes/jquery.maphilight.min.js" type="text/javascript"></script><script src="http://vue.tufts.edu/htmlexport-includes/v3/tooltip.min.js" type="text/javascript"></script><script type="text/javascript">jQuery(function() {jQuery.fn.maphilight.defaults = {
fill: false,
fillColor: '000000',
fillOpacity: 0.2,
stroke: true,
strokeColor: '282828',
strokeOpacity: 1,
strokeWidth: 4,
fade: true,
alwaysOn: false
}
jQuery('.example2 img').maphilight();
});
</script>
<style type="text/css">
#tooltip{
position:absolute;
border:1px solid #333;
background:#f7f5d1;
padding:2px 5px;
color:#333;
display:none;
}
</style>
</head><body>
<div class="example2"><img class="map" src="workflow.png" width="959.0" height="626.0" usemap="#vuemap"><map name="vuemap"> <area id="node0" shape="rect" coords="162,312,293,365"></area>
<area href="https://github.com/tmbdev/ocropy/wiki/Compute-errors-and-confusions" target="_blank" id="node1" shape="rect" coords="718,316,849,369"></area>
<area href="https://github.com/tmbdev/ocropy/wiki/Page-Segmentation" target="_blank" id="node2" shape="rect" coords="241,215,385,268"></area>
<area href="https://github.com/tmbdev/ocropy/wiki/Working-with-Ground-Truth" target="_blank" id="node3" shape="rect" coords="482,310,627,363"></area>
<area id="node4" shape="rect" coords="490,113,621,166"></area>
<area id="node5" shape="rect" coords="715,505,846,558"></area>
<area id="node6" shape="rect" coords="15,212,146,265"></area>
<area id="node7" shape="rect" coords="489,215,620,268"></area>
<area id="node8" shape="rect" coords="716,420,847,473"></area>
<area id="node9" shape="rect" coords="652,112,801,165"></area>
<area href="https://github.com/tmbdev/ocropy/wiki/Working-with-Ground-Truth" target="_blank" id="node10" shape="rect" coords="476,419,632,472"></area>
<area id="node11" shape="rect" coords="42,163,118,186"></area>
<area id="node12" shape="rect" coords="716,586,777,609"></area>
<area id="node13" shape="rect" coords="801,586,860,609"></area>
<area href="https://github.com/tmbdev/hocr-tools#hocr-pdf" target="_blank" id="node14" shape="rect" coords="492,17,623,70"></area>
<area id="node15" shape="rect" coords="707,231,799,254"></area>
<area id="node16" shape="rect" coords="660,33,820,56"></area>
<area id="node17" shape="rect" coords="870,319,942,357"></area>
<area id="node18" shape="rect" coords="888,435,934,458"></area>
</map></div></body></html>
\ No newline at end of file
You can find a list of available pre-trained models for ocropy here:
https://github.com/tmbdev/ocropy/wiki/Models
Copy the models into this directory to use them.
@@ -14,8 +14,10 @@ german = u"ÄäÖöÜüß"
french = u"ÀàÂâÆæÇçÉéÈèÊêËëÎîÏïÔôŒœÙùÛûÜüŸÿ"
turkish = u"ĞğŞşıſ"
greek = u"ΑαΒβΓγΔδΕεΖζΗηΘθΙιΚκΛλΜμΝνΞξΟοΠπΡρΣσςΤτΥυΦφΧχΨψΩω"
portuguese = u"ÁÃÌÍÒÓÕÚáãìíòóõú"
telugu = u" ఁంఃఅఆఇఈఉఊఋఌఎఏఐఒఓఔకఖగఘఙచఛజఝఞటఠడఢణతథదధనపఫబభమయరఱలళవశషసహఽాిీుూృౄెేైొోౌ్ౘౙౠౡౢౣ౦౧౨౩౪౫౬౭౮౯"
default = ascii+xsymbols+german+french
default = ascii+xsymbols+german+french+portuguese
european = default+turkish+greek
......
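The chars.py change above extends the `default` alphabet with Portuguese accented characters by plain string concatenation. A minimal sketch of the scheme (charsets abridged from the module; `ascii` and `xsymbols` are omitted here for brevity):

```python
# -*- coding: utf-8 -*-
# Abridged sketch: chars.py builds alphabets by concatenating
# per-language character strings. Only three charsets shown.
german = u"ÄäÖöÜüß"
french = u"ÀàÂâÆæÇçÉéÈèÊêËëÎîÏïÔôŒœÙùÛûÜüŸÿ"
portuguese = u"ÁÃÌÍÒÓÕÚáãìíòóõú"

default = german + french + portuguese
print(u"ã" in default)  # True
print(u"ß" in default)  # True
```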
@@ -45,8 +45,11 @@ class CenterNormalizer:
def dewarp(self,img,cval=0,dtype=dtype('f')):
assert img.shape==self.shape
h,w = img.shape
padded = vstack([cval*ones((h,w)),img,cval*ones((h,w))])
center = self.center+h
# The actual image img is embedded into a larger image by
# adding vertical space on top and at the bottom (padding)
hpadding = self.r # this is large enough
padded = vstack([cval*ones((hpadding,w)),img,cval*ones((hpadding,w))])
center = self.center + hpadding
dewarped = [padded[center[i]-self.r:center[i]+self.r,i] for i in range(w)]
dewarped = array(dewarped,dtype=dtype).T
return dewarped
......
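The dewarp fix above pads with `self.r` rows instead of the full image height `h`, which is all the extraction window `center[i]-r : center[i]+r` can ever reach. The padding step in isolation can be sketched like this (hypothetical helper name; `dewarp` itself and the centerline estimate are not reproduced):

```python
import numpy as np

def pad_vertically(img, hpadding, cval=0):
    # Embed img in a taller image by stacking cval-filled rows
    # above and below, mirroring the padding step in dewarp().
    h, w = img.shape
    pad = cval * np.ones((hpadding, w))
    return np.vstack([pad, img, pad])

line = np.zeros((4, 6))
padded = pad_vertically(line, hpadding=3)
print(padded.shape)  # (10, 6)
```

After padding, every row index `center[i] + hpadding ± r` is guaranteed to stay in bounds.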
@@ -197,12 +197,13 @@ def select_regions(binary,f,min=0,nbest=100000):
scores = [f(o) for o in objects]
best = argsort(scores)
keep = zeros(len(objects)+1,'i')
for i in best[-nbest:]:
if scores[i]<=min: continue
keep[i+1] = 1
# print(scores, best[-nbest:], keep)
# print(sorted(list(set(labels.ravel()))))
# print(sorted(list(set(keep[labels].ravel()))))
if nbest > 0:
for i in best[-nbest:]:
if scores[i]<=min: continue
keep[i+1] = 1
# print scores,best[-nbest:],keep
# print sorted(list(set(labels.ravel())))
# print sorted(list(set(keep[labels].ravel())))
return keep[labels]
@checks(SEGMENTATION)
......
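The new `nbest > 0` guard in select_regions matters because of how Python slicing treats a zero offset: `best[-0:]` is identical to `best[0:]`, so without the guard `nbest=0` would keep every region instead of none. A minimal demonstration:

```python
import numpy as np

scores = [5, 2, 9]
best = np.argsort(scores)   # indices sorted by ascending score

print(best[-2:])       # the two best-scoring indices
print(best[-0:])       # NOT empty: -0 == 0, so this is the whole array
print(len(best[-0:]))  # 3
```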
@@ -52,11 +52,11 @@ for fname,e,t,m in sorted(outputs):
total += t
missing += m
print("errors %8d"%errs)
print("missing %8d"%missing)
print("total %8d"%total)
print("err %8.3f %%"%(errs*100.0/total))
print("errnomiss %8.3f %%"%((errs-missing)*100.0/total))
if args.erroronly:
print(errs * 1.0 / total)
if not args.erroronly:
print("errors %8d"%errs)
print("missing %8d"%missing)
print("total %8d"%total)
print("err %8.3f %%"%(errs*100.0/total))
print("errnomiss %8.3f %%"%((errs-missing)*100.0/total))
print(errs * 1.0 / total)
@@ -25,67 +25,78 @@ from ocrolib import psegutils,morph,sl
from ocrolib.exceptions import OcropusException
from ocrolib.toplevel import *
parser = argparse.ArgumentParser()
parser = argparse.ArgumentParser(add_help=False)
# error checking
parser.add_argument('-n','--nocheck',action="store_true",
group_error_checking = parser.add_argument_group('error checking')
group_error_checking.add_argument('-n','--nocheck',action="store_true",
help="disable error checking on inputs")
parser.add_argument('-z','--zoom',type=float,default=0.5,
help='zoom for page background estimation, smaller=faster, default: %(default)s')
parser.add_argument('--gray',action='store_true',
help='output grayscale lines as well, default: %(default)s')
parser.add_argument('-q','--quiet',action='store_true',
help='be less verbose, default: %(default)s')
# limits
parser.add_argument('--minscale',type=float,default=12.0,
group_error_checking.add_argument('--minscale',type=float,default=12.0,
help='minimum scale permitted, default: %(default)s')
parser.add_argument('--maxlines',type=float,default=300,
group_error_checking.add_argument('--maxlines',type=float,default=300,
help='maximum # lines permitted, default: %(default)s')
# scale parameters
parser.add_argument('--scale',type=float,default=0.0,
group_scale = parser.add_argument_group('scale parameters')
group_scale.add_argument('--scale',type=float,default=0.0,
help='the basic scale of the document (roughly, xheight) 0=automatic, default: %(default)s')
parser.add_argument('--hscale',type=float,default=1.0,
group_scale.add_argument('--hscale',type=float,default=1.0,
help='non-standard scaling of horizontal parameters, default: %(default)s')
parser.add_argument('--vscale',type=float,default=1.0,
group_scale.add_argument('--vscale',type=float,default=1.0,
help='non-standard scaling of vertical parameters, default: %(default)s')
# line parameters
parser.add_argument('--threshold',type=float,default=0.2,
group_line = parser.add_argument_group('line parameters')
group_line.add_argument('--threshold',type=float,default=0.2,
help='baseline threshold, default: %(default)s')
parser.add_argument('--noise',type=int,default=8,
group_line.add_argument('--noise',type=int,default=8,
help="noise threshold for removing small components from lines, default: %(default)s")
parser.add_argument('--usegauss',action='store_true',
group_line.add_argument('--usegauss',action='store_true',
help='use gaussian instead of uniform, default: %(default)s')
# column parameters
parser.add_argument('--maxseps',type=int,default=2,
group_column = parser.add_argument_group('column parameters')
group_column.add_argument('--maxseps',type=int,default=0,
help='maximum black column separators, default: %(default)s')
parser.add_argument('--sepwiden',type=int,default=10,
group_column.add_argument('--sepwiden',type=int,default=10,
help='widen black separators (to account for warping), default: %(default)s')
parser.add_argument('-b','--blackseps',action="store_true",
help="also check for black column separators")
# Obsolete parameter for 'also check for black column separators'
# which can now be triggered simply by a positive maxseps value.
group_column.add_argument('-b','--blackseps',action="store_true",
help=argparse.SUPPRESS)
# whitespace column separators
parser.add_argument('--maxcolseps',type=int,default=2,
group_column.add_argument('--maxcolseps',type=int,default=3,
help='maximum # whitespace column separators, default: %(default)s')
parser.add_argument('--csminaspect',type=float,default=1.1,
help='minimum aspect ratio for column separators')
parser.add_argument('--csminheight',type=float,default=10,
group_column.add_argument('--csminheight',type=float,default=10,
help='minimum column height (units=scale), default: %(default)s')
# wait for input after everything is done
parser.add_argument('-p','--pad',type=int,default=3,
# Obsolete parameter for the 'minimum aspect ratio for column separators'
# used in the obsolete function compute_colseps_morph
group_column.add_argument('--csminaspect',type=float,default=1.1,
help=argparse.SUPPRESS)
# output parameters
group_output = parser.add_argument_group('output parameters')
group_output.add_argument('--gray',action='store_true',
help='output grayscale lines as well, default: %(default)s')
group_output.add_argument('-p','--pad',type=int,default=3,
help='padding for extracted lines, default: %(default)s')
parser.add_argument('-e','--expand',type=int,default=3,
group_output.add_argument('-e','--expand',type=int,default=3,
help='expand mask for grayscale extraction, default: %(default)s')
parser.add_argument('-Q','--parallel',type=int,default=0,
# other parameters
group_others = parser.add_argument_group('others')
group_others.add_argument('-q','--quiet',action='store_true',
help='be less verbose, default: %(default)s')
group_others.add_argument('-Q','--parallel',type=int,default=0,
help="number of CPUs to use")
parser.add_argument('-d','--debug',action="store_true")
group_others.add_argument('-d','--debug',action="store_true")
group_others.add_argument("-h", "--help", action="help", help="show this help message and exit")
# input files
parser.add_argument('files',nargs='+')
args = parser.parse_args()
args.files = ocrolib.glob_all(args.files)
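The argparse restructuring above relies on three features: `add_argument_group` (sections the `--help` output), `add_help=False` with an explicit `-h` entry (so the help option itself can live inside a chosen group), and `argparse.SUPPRESS` (keeps obsolete flags parseable but hidden). A minimal sketch with one representative option per group:

```python
import argparse

parser = argparse.ArgumentParser(add_help=False)

group_column = parser.add_argument_group('column parameters')
group_column.add_argument('--maxseps', type=int, default=0,
                          help='maximum black column separators, default: %(default)s')
# Obsolete flag kept for backwards compatibility; SUPPRESS hides it from --help.
group_column.add_argument('-b', '--blackseps', action='store_true',
                          help=argparse.SUPPRESS)

group_others = parser.add_argument_group('others')
group_others.add_argument('-h', '--help', action='help',
                          help='show this help message and exit')

args = parser.parse_args(['-b', '--maxseps', '2'])
print(args.maxseps, args.blackseps)  # 2 True
```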
@@ -211,15 +222,22 @@ def compute_colseps_conv(binary,scale=1.0):
seps = maximum_filter(seps,(int(2*scale),1))
DSAVE("3seps",seps)
# select only the biggest column separators
seps = morph.select_regions(seps,sl.dim0,min=args.csminheight*scale,nbest=args.maxcolseps+1)
seps = morph.select_regions(seps,sl.dim0,min=args.csminheight*scale,nbest=args.maxcolseps)
DSAVE("4seps",seps)
return seps
def compute_colseps(binary,scale):
"""Computes column separators either from vertical black lines or whitespace."""
print_info("considering at most %g whitespace column separators" % args.maxcolseps)
colseps = compute_colseps_conv(binary,scale)
DSAVE("colwsseps",0.7*colseps+0.3*binary)
if args.blackseps:
if args.blackseps and args.maxseps == 0:
# simulate old behaviour of blackseps when the default value
# for maxseps was 2, but only when the maxseps-value is still zero
# and not set manually to a non-zero value
args.maxseps = 2
if args.maxseps > 0:
print_info("considering at most %g black column separators" % args.maxseps)
seps = compute_separators_morph(binary,scale)
DSAVE("colseps",0.7*seps+0.3*binary)
#colseps = compute_colseps_morph(binary,scale)
@@ -227,7 +245,7 @@ def compute_colseps(binary,scale):
binary = minimum(binary,1-seps)
return colseps,binary
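The backwards-compatibility logic in compute_colseps can be written as a small pure function (hypothetical name, for illustration): the obsolete `--blackseps` flag is mapped onto `--maxseps=2`, but only when `--maxseps` was left at its new default of 0, so an explicit user value always wins.

```python
def effective_maxseps(maxseps, blackseps):
    # Simulate the old behaviour of --blackseps (whose era had a
    # default maxseps of 2), but never override an explicit value.
    if blackseps and maxseps == 0:
        return 2
    return maxseps

print(effective_maxseps(0, True))   # 2  (obsolete flag honoured)
print(effective_maxseps(3, True))   # 3  (explicit value wins)
print(effective_maxseps(0, False))  # 0  (no black separators)
```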
################################################################
### Text Line Finding.
@@ -283,7 +301,7 @@ def compute_line_seeds(binary,bottom,top,colseps,scale):
seeds,_ = morph.label(seeds)
return seeds
################################################################
### The complete line segmentation process.
@@ -326,7 +344,7 @@ def compute_segmentation(binary,scale):
segmentation = llabels*binary
return segmentation
################################################################
### Processing each file.
......
@@ -137,14 +137,6 @@ def rdistort(image,distort=3.0,dsigma=10.0,cval=0):
if args.debug_show:
ion(); gray()
def bounding_box(a):
a = array(a>0,'i')
l = measurements.find_objects(a)
if len(l)<1: return (0,0,0,0)
ys,xs = l[0]
# y0,x0,y1,x1
return (ys.start,xs.start,ys.stop,xs.stop)
base = args.base
print("base", base)
os.system("rm -rf "+base)
......
@@ -10,7 +10,7 @@ from scipy import stats
import multiprocessing
import ocrolib
parser = argparse.ArgumentParser("""
Image binarization using non-linear processing.
@@ -45,7 +45,7 @@ args.files = ocrolib.glob_all(args.files)
if len(args.files)<1:
parser.print_help()
sys.exit(0)
def print_info(*objs):
print("INFO: ", *objs, file=sys.stdout)
@@ -63,16 +63,6 @@ def check_page(image):
if w>10000: return "line too wide for a page image %s"%(image.shape,)
return None
def estimate_scale(binary):
objects = binary_objects(binary)
bysize = sorted(objects,key=A)
scalemap = zeros(binary.shape)
for o in bysize:
if amax(scalemap[o])>0: continue
scalemap[o] = A(o)**0.5
scale = median(scalemap[(scalemap>3)&(scalemap<100)])
return scale
def estimate_skew_angle(image,angles):
estimates = []
for a in angles:
@@ -85,17 +75,6 @@ def estimate_skew_angle(image,angles):
_,a = max(estimates)
return a
def select_regions(binary,f,min=0,nbest=100000):
labels,n = measurements.label(binary)
objects = measurements.find_objects(labels)
scores = [f(o) for o in objects]
best = argsort(scores)
keep = zeros(len(objects)+1,'B')
for i in best[-nbest:]:
if scores[i]<=min: continue
keep[i+1] = 1
return keep[labels]
def H(s): return s[0].stop-s[0].start
def W(s): return s[1].stop-s[1].start
def A(s): return W(s)*H(s)
......