International Journal of Electrical and Computer Engineering (IJECE)
Vol. 6, No. 4, August 2016, pp. 1647~1653
ISSN: 2088-8708, DOI: 10.11591/ijece.v6i4.10553
Journal homepage: http://iaesjournal.com/online/index.php/IJECE
A Zone Based Approach for Classification and Recognition of Telugu Handwritten Characters
N. Shoba Rani, Sanjay Kumar Verma, Anitta Joseph
Department of Computer Science, Amrita School of Arts and Sciences, Mysuru Campus, Amrita Vishwa Vidyapeetham, Amrita University, Karnataka, India
Article Info

Article history:
Received Mar 18, 2016
Revised May 21, 2016
Accepted Jun 6, 2016

ABSTRACT

Realizing high accuracy and efficiency in South Indian character recognition systems is one of the principal goals that must be pursued repeatedly in order to promote the use of optical character recognition (OCR) for South Indian languages like Telugu. The process of character recognition comprises pre-processing, segmentation, feature extraction, classification and recognition. The feature extraction stage is meant for uniquely characterizing each character image for the purpose of classifying it. The selection of a feature extraction algorithm is critical for any image processing application, and most of the time it depends directly on the type of image objects that have to be identified. For optical technologies like South Indian OCR, the feature extraction technique plays a vital role in the accuracy of recognition because of the huge character sets. In this work we mainly focus on evaluating the performance of various feature extraction techniques with respect to Telugu character recognition systems and analyze their efficiency and accuracy in recognizing the Telugu character set.
Keyword:
K-nearest neighboring
Statistical features
Zoning
Copyright © 2016 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
N. Shoba Rani,
Department of Computer Science,
Amrita Vishwa Vidyapeetham University, Mysuru Campus,
#114, 7th cross Bogadi 2nd stage, Mysuru - 570026.
Email: n.shobha1985@gmail.com
1. INTRODUCTION
The evolution of any technology relies on the reliability and efficiency of the outcomes it generates. These qualities can be built into optical technologies [1] only when the internal procedures defined to perform the processing are consistent with the type of input data. Especially for technologies like South Indian OCR packages, this is a dominant factor that strongly influences the accuracy of the system. South Indian languages like Telugu have a very wide character set of around 436 distinct characters, which includes vowels, consonants, and single and multi-conjunct vowel-consonant clusters; however, the dataset excludes the infrequently occurring characters [2]. Identifying and selecting unique features for the recognition of each character of such a wide character set increases the chances of erroneous outcomes, which may make the OCR unreliable.
Recognition of handwritten characters is very difficult because writing style differs from person to person. The large number of characters, the presence of half characters and some easily confused characters make the recognition process even more complex. In this work we take data from many users and find that the writing style of every user is different, so the recognition of the characters is very difficult. The objective of this work is to recognize Telugu characters by using some feature extraction techniques.
2. RELATED WORK
Various feature extraction techniques are used to extract features that are unique to a particular character. These techniques are broadly categorized into directional and zone-wise features. The feature extraction technique based on directional features [3] comprises the identification of features like starting point and intersection point locations, distinguishing individual line segments, labeling line segment information and line type normalization.
Techniques like zoning [4] involve the computation of directional features with respect to every zone, from which a fixed-size feature vector is derived and sent as input to the classifier. Features like line segment length, line segment direction and intersection points are computed from the feature vector consisting of the information of all zones. Other kinds of techniques used for character recognition include the transition features [5], which are computed with respect to the locations in the image where a transition from background to foreground pixels happens. Lakshmi et al. [6] provided some novel ideas for extracting features using K-means, with results similar to auto-encoding techniques, and also employed an SVM classifier. Mallikarjun Hangarge et al. [7] proposed an algorithm using a diagonal feature extraction scheme for recognizing off-line handwritten characters: every character image of size 90x60 pixels is divided into 54 equal zones, each of size 10x10 pixels, and features are extracted from the pixels of each zone by moving along the diagonals of its respective 10x10 pixels. Pai et al. [8] proposed a technique for recognition using the Karhunen-Loeve transformation and the topographic feature maps obtained through weight sharing in the system. Sarkar et al. [9] present a system which automatically separates the scripts of handwritten words from a document written in Bangla or Devanagari mixed with Roman script. In this script separation technique they extract the text lines and words from document pages using a script-independent Neighboring Component Analysis technique. For the script separation they designed a Multi Layer Perceptron (MLP) based classifier, trained with 8 different word-level holistic features. For the system evaluation they prepared two equal-sized datasets, one with Bangla and Roman scripts and the other with Devanagari and Roman scripts.
Kandula Venkata Reddy [10] describes two techniques for identifying handwritten characters: active character detection (ACR) and contour algorithms. These two techniques can be implemented by using fuzzy logic, and the approach combines pattern detection, artificial neural networks and fuzzy logic. The unknown character to be tested for identification is also converted to an image and compared with a standard image, and is thereby recognized by using the fuzzy logic generators.
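As a small illustration of the transition features [5] mentioned above, the following sketch lists the column positions in one binary pixel row where the value changes from background (0) to foreground (1). The function name and the convention that a row starting with ink does not count as a transition are assumptions made for this example; they are not taken from the cited work.

```python
def transition_locations(row):
    """Column indices where a pixel row switches from background (0) to ink (1).

    Assumption: a stroke that already touches column 0 is not counted,
    because no background pixel precedes it inside the row.
    """
    return [x for x in range(1, len(row)) if row[x - 1] == 0 and row[x] == 1]


# One scan line crossing two strokes.
scan_line = [0, 1, 1, 0, 0, 1, 0]
locations = transition_locations(scan_line)
```

Applying the same function to every row (or column) of a character image gives a set of transition positions that can be summarised into a fixed-length vector.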
3. PROPOSED METHODOLOGY
The proposed methodology for feature extraction and classification is accomplished in two stages. Stage one involves the computation of Hu-moment features and statistical features, and the classification of the features into various classes is performed in stage two. The block diagram of the proposed system is depicted in Figure 1.
Figure 1. Block diagram of Character recognition
Initially the proposed algorithm assumes as input the segmented characters from the document image. The present work employs handwritten character samples that are synthetically generated from various users. Figure 2 shows instances of a few of the character samples that are considered for experimentation.
Figure 2. Samples of Handwritten Characters
The feature computation and classification of characters are discussed in subsections 3.1 and 3.2.
3.1. Feature Computation
The feature computation is performed on the pre-processed input samples. Each input sample or character is initially divided into 9 zones. The segmentation of a character image into zones is presented in Figure 3.
Figure 3. Division of character in Zones
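The division into nine zones can be sketched as follows. This is a minimal illustration, assuming a 3x3 grid with zones numbered Z1 to Z9 in row-major order and a binary image stored as a list of rows; the grid layout, the function name `split_into_zones` and the toy image are assumptions made for this example.

```python
def split_into_zones(image, rows=3, cols=3):
    """Split a 2-D image (list of equal-length rows) into rows*cols zones.

    Zones are returned in row-major order, Z1..Z9 for the default 3x3 grid.
    """
    height, width = len(image), len(image[0])
    # Zone boundaries are spread as evenly as integer division allows.
    row_edges = [height * r // rows for r in range(rows + 1)]
    col_edges = [width * c // cols for c in range(cols + 1)]
    zones = []
    for r in range(rows):
        for c in range(cols):
            zone = [row[col_edges[c]:col_edges[c + 1]]
                    for row in image[row_edges[r]:row_edges[r + 1]]]
            zones.append(zone)
    return zones


# A 6x6 toy "character": 1 = ink, 0 = background.
sample = [
    [0, 1, 1, 1, 1, 0],
    [1, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [0, 1, 1, 1, 1, 0],
]
zones = split_into_zones(sample)
```

Each element of `zones` is itself a small image, so any per-zone feature can be computed on it independently.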
Each zone is further subjected to the process of feature computation. In the proposed work Hu-moments and statistical features are employed as features. If Z1, Z2, Z3, Z4, ..., Z9 represent the nine zones, then the Hu features computed are given by equation (1).
$$Hu = \{Hu\_f(Z_1),\ Hu\_f(Z_2),\ \ldots,\ Hu\_f(Z_9)\} \qquad (1)$$

where each $Hu\_f(Z_i)$, $i = 1, 2, 3, \ldots, 9$, is further composed of seven features and is given by equation (2).

$$Hu\_f(Z_i) = \{m_1 Z_i,\ m_2 Z_i,\ m_3 Z_i,\ m_4 Z_i,\ \ldots,\ m_7 Z_i\} \qquad (2)$$
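The seven values per zone can be sketched in plain Python as below. This assumes that the seven features are the standard Hu invariant moments built from normalised central moments; the function name `hu_moments`, the toy zone and the choice to return zeros for a zone without ink are assumptions made for this example.

```python
def hu_moments(zone):
    """Return the seven Hu invariant moments of a 2-D intensity grid."""
    # Raw moment m00 is the total mass; m10/m00 and m01/m00 give the centroid.
    m00 = sum(v for row in zone for v in row)
    if m00 == 0:
        return [0.0] * 7  # assumption: an empty zone contributes zeros
    xc = sum(x * v for row in zone for x, v in enumerate(row)) / m00
    yc = sum(y * v for y, row in enumerate(zone) for v in row) / m00

    def eta(p, q):
        # Central moment mu_pq, normalised for scale invariance.
        mu = sum(((x - xc) ** p) * ((y - yc) ** q) * v
                 for y, row in enumerate(zone) for x, v in enumerate(row))
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03          # recurring sums
    c, d = n30 - 3 * n12, 3 * n21 - n03  # recurring differences
    return [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ]


# Toy zone: an L-shaped stroke (1 = ink, 0 = background).
zone = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
features = hu_moments(zone)  # seven values for this one zone
```

Because the moments are invariant to translation, scale and rotation, the same stroke drawn slightly shifted or turned inside a zone yields nearly the same seven values, which is a plausible reason for using them on handwriting.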
Thus we have nine zones and seven features from each zone, leading to a total of 9*7 = 63 Hu-moment features. Similarly, the statistical features, namely the centroid of each zone and the entropy of each zone, are given by equation (3).
$$St = \{St\_f(Z_1),\ St\_f(Z_2),\ \ldots,\ St\_f(Z_9)\} \qquad (3)$$

where each $St\_f(Z_i) = \{cZ_i,\ eZ_i\}$, in which $cZ_i$ and $eZ_i$ represent the centroid and entropy features of each zone, and $St$ denotes the statistical feature set.
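A minimal sketch of the two statistical features for one zone is given below. It assumes a binary zone, Shannon entropy of the pixel-value histogram, and a centroid taken as the centre of mass of the ink pixels. The text counts the centroid as a single feature per zone but does not state how the position is reduced to one value, so the sketch simply returns the (row, column) pair; the function names are hypothetical.

```python
import math
from collections import Counter


def zone_entropy(zone):
    """Shannon entropy (in bits) of the pixel-value histogram of a zone."""
    pixels = [v for row in zone for v in row]
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def zone_centroid(zone):
    """(row, col) centre of mass of the ink pixels, or None for an empty zone."""
    mass = sum(v for row in zone for v in row)
    if mass == 0:
        return None
    row_c = sum(y * v for y, row in enumerate(zone) for v in row) / mass
    col_c = sum(x * v for row in zone for x, v in enumerate(row)) / mass
    return (row_c, col_c)


# Toy zone: the right half is ink, the left half is background.
zone = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
entropy = zone_entropy(zone)    # half ink, half background
centroid = zone_centroid(zone)  # centre of the ink block
```

A zone that is entirely ink or entirely background has zero entropy, so the entropy feature mainly separates zones crossed by a stroke from zones that are uniform.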
The feature computation is as depicted in Figure 4. Thus for each character a total of 81 features is obtained: the 9*7 = 63 Hu-moment features plus 9*2 = 18 statistical features. These features are forwarded to the classification stage for the recognition of the character.
Figure 4. Feature computation
3.2. Classification and Recognition
The classification in the present work is performed using KNN and SVM classifiers. Let C1, C2, C3, ..., Cn represent the classes; each classifier is applied to every feature set. The classification of features into the various classes is as given in Figure 5.
Figure 5. Classification of feature computed
The proposed work considers the Telugu handwritten vowels as the classes. There are 16 vowels in total, which are shown in Figure 6.
Figure 6. Vowel set
The entire dataset considered for classification comes from 100 users, of which 70% is used for training and 30% is used for testing. The training set is composed of 70 user references of 10 vowels; the training matrix representation of the features is as shown in Figure 7.
Figure 7. Training matrix representation
Here V1, V2, V3, ..., V70 represent the character references considered for the training set, and in the same way a test set of size 30*81 is computed for each vowel, consisting of three references each. There are 16 class labels, and a label matrix of dimension 70*1 is created. Each row holds the label of the corresponding row instance of the training matrix. The matrix representation of the labels is as shown in Figure 8.
Figure 8. Class label matrix representation
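The 70/30 partition of the 100 users described above can be sketched as follows. The user identifiers 1 to 100, the fixed random seed and the function name `split_users` are assumptions made for this example; the text does not say how users were assigned to the two sets.

```python
import random


def split_users(user_ids, train_fraction=0.7, seed=0):
    """Shuffle user ids reproducibly and split them into train and test lists."""
    ids = list(user_ids)
    random.Random(seed).shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


# Hypothetical user ids 1..100 standing in for the 100 writers.
train_users, test_users = split_users(range(1, 101))
```

With 81 features per sample, stacking one sample per training user gives a 70*81 training matrix, and the remaining 30 users give the 30*81 test matrix.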
The KNN classifier and the SVM classifier are employed on the training set for recognition of the class labels. The outcome of recognition is represented in Figure 9 and Figure 10 with respect to both the KNN and SVM classifiers.
Figure 9. KNN classifier
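The nearest-neighbor step can be sketched as below. This is only an illustration, assuming Euclidean distance and a majority vote over k = 3 neighbors; the value of k, the distance measure and the two-dimensional toy vectors (standing in for the 81-dimensional feature vectors) are assumptions, since the text does not specify them.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Pair every training vector's distance to the query with its label.
    distances = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D feature vectors for two hypothetical vowel classes "a" and "i".
train_features = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_labels = ["a", "a", "a", "i", "i", "i"]
predicted = knn_predict(train_features, train_labels, [0.2, 0.1])
```

The classifier needs no training phase beyond storing the training matrix and its label column, which matches the matrix representation used above.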
The percentages per true class include the true positive rates (TPR) and false negative rates (FNR). The accuracy (AC) is the proportion of the total number of predictions that were correct. It is determined using equation (4).
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (4)$$

where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives.

The recall or true positive rate (TPR) is the proportion of positive cases that were correctly identified, as calculated using equation (5).

$$\text{True Positive Rate} = \frac{TP}{TP + FN} \qquad (5)$$

The false negative rate (FNR) is the proportion of positive cases that were incorrectly classified as negative, as calculated using equation (6).

$$\text{False Negative Rate} = \frac{FN}{TP + FN} \qquad (6)$$
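For a multi-class confusion matrix, the three measures can be computed per class as in the following sketch. It assumes that row i holds the samples whose true class is i and column j the samples predicted as class j, so the diagonal holds the correct predictions; the function name and the toy two-class matrix are assumptions made for this example and are not results from this work.

```python
def classification_rates(confusion):
    """Return overall accuracy and per-class (TPR, FNR) from a square matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    per_class = []
    for i, row in enumerate(confusion):
        positives = sum(row)  # TP + FN for class i
        if positives == 0:
            per_class.append((0.0, 0.0))  # class absent from the test set
            continue
        tp = row[i]
        per_class.append((tp / positives, (positives - tp) / positives))
    return correct / total, per_class


# Toy matrix: 10 samples per class, 3 misclassified in total.
toy = [[8, 2],
       [1, 9]]
accuracy, rates = classification_rates(toy)
```

For every class the two rates sum to one, since each positive sample is either recognized correctly or missed.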
Here the confusion matrix for KNN shows an accuracy of 76.47%, and the overall error is shown in red. When the same feature extraction is performed on the training data as well as on the test data (new characters), the two are directly proportional, as represented in Figure 10.
Figure 10. KNN Scatter plot of train data
The SVM classifier analyzes the new character test set data against the training data and tries to find the nearest match. How two different handwritten characters are similar with respect to the nearest match is represented in Figure 11.
Figure 11. SVM classifier
4. EXPERIMENTAL RESULT
We collected characters from several handwritten Telugu documents. The number of characters in the testing set is 51. All the characters are collected in a systematic manner from
handwritten pages. We have performed the feature extraction techniques on each zone of the character image and collected the features of each zone into the feature vector. When we get a new character image, we try to identify the character with respect to KNN and SVM, as represented in Figure 12.
Figure 12. KNN and SVM classifier
We use two classifiers here, KNN and SVM, and they yield different accuracies. We have tested 51 characters with both KNN and SVM. The accuracy is represented in Figure 12.
5. CONCLUSIONS
The proposed algorithm applies feature extraction techniques to increase accuracy and performance, particularly for the Telugu language. The outcome depends upon the algorithms that are used to extract the features and classify the character. Present systems use widely varied techniques for extraction and classification, which differ in accuracy. In this project we consider scanned images as input in order to perform zoning, compute statistical features (mean, entropy, standard deviation), and apply K-nearest neighbor and Support Vector Machine classifiers to get the best character recognition result. As a future enhancement, we want to implement multiple classifications.
REFERENCES
[1] A. Negi, et al., “An OCR system for Telugu,” Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on. IEEE, 2001.
[2] H. Swethalakshmi, et al., “Online handwritten character recognition of Devanagari and Telugu Characters using support vector machines,” Tenth International workshop on Frontiers in handwriting recognition, Suvisoft, 2006.
[3] M. Blumenstein, et al., “A novel feature extraction technique for the recognition of segmented handwritten characters,” Document Analysis and Recognition, Proceedings. Seventh International Conference on. IEEE, 2003.
[4] O. D. Trier, et al., “Feature extraction methods for character recognition-a survey,” Pattern recognition, vol/issue: 29(4), pp. 641-662, 1996.
[5] P. D. Gader, et al., “Handwritten word recognition with character and inter-character neural networks,” Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, vol/issue: 27(1), pp. 158-164, 1997.
[6] C. V. Lakshmi and C. Patvardhan, “An optical character recognition system for printed Telugu text,” Pattern analysis and applications, vol/issue: 7(2), pp. 190-204, 2004.
[7] M. Hangarge, et al., “Statistical texture features based handwritten and printed text classification in south indian documents,” arXiv preprint arXiv:1303.3087, 2013.
[8] N. R. Pai and S. K. Vijaykumar, “Design and implementation of optical character recognition using template matching for multi fonts/size,” International Journal of Research in Engineering and Technology, vol/issue: 4(2), 2015.
[9] R. Sarkar, et al., “Word level script identification from Bangla and Devanagri handwritten texts mixed with Roman script,” arXiv preprint arXiv:1002.4007, 2010.
[10] K. V. Reddy, et al., “Hand Written Character Detection by Using Fuzzy Logic Techniques,” International Journal of Emerging Technology and Advanced Engineering, vol/issue: 3(3), 2013.
Classifier   Total Data sets   Total Testing Image   Recognized Correctly   Unrecognized
KNN          212               51                    41                     10
SVM          212               51                    39                     12
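The counts reported above (41 and 39 correctly recognized out of 51 test images for KNN and SVM respectively) can be turned into recognition rates with a short calculation. The dictionary layout and variable names are chosen for this example only.

```python
# (recognized correctly, total testing images) per classifier, from the table.
results = {"KNN": (41, 51), "SVM": (39, 51)}

# Recognition rate in percent: correct / tested * 100.
accuracy = {name: 100 * correct / tested
            for name, (correct, tested) in results.items()}

for name, value in accuracy.items():
    print(f"{name}: {value:.2f}%")
# prints:
# KNN: 80.39%
# SVM: 76.47%
```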