TELKOM
NIKA
, Vol. 11, No. 10, Octobe
r 2013, pp. 5
741 ~ 5
748
ISSN: 2302-4
046
5741
Re
cei
v
ed Ap
ril 15, 2013; Revi
sed
Jul
y
1, 2013; Accept
ed Jul
y
15, 2
013
Identifying Overlapping Communities in
Directed
Networks Via T
r
iangles
Qing
y
u
Zou,
Fu Liu*, Tao Hou, Yihan J
i
ang
Coll
eg
e of Co
mmunicati
on E
ngi
neer
in
g, Jili
n Un
iv
ersit
y
, C
han
gch
un 13
0
000, Jil
i
n, Chi
n
a
*Corres
p
o
ndi
n
g
author, e-ma
i
l
: q
y
z
ou1
1@m
a
ils.jl
u
.ed
u
.cn, liufu@
jlu.
edu.c
n
*
A
b
st
r
a
ct
A lot of com
p
lex system
s
in
nat
ure
and soc
i
ety can
be r
e
presented
as the form of network. T
he
sma
ll-sca
le su
bnets top
o
lo
gic
a
l featur
es are
vital to
un
derst
and th
e dyn
a
m
ics and fu
nctio
n
of the n
e
tw
orks.
T
r
iangl
es co
mprise
d of thre
e no
des
are t
he si
mple
st s
ubn
et in th
e
netw
o
rk. Base
d on
the tri
a
n
g
l
e
distrib
u
tion
of
the co
mplex
n
e
tw
ork, w
e
present
a n
o
ve
l
ap
proac
h to
detect ov
erla
p
p
in
g co
mmunit
y
structure in
dir
e
cted n
e
tw
orks. Different fro
m
pr
ev
i
ous st
udi
es focus
ed
on gr
ou
pin
g
n
odes, o
u
r
met
h
o
d
defines comm
unities
as gr
oup
s of links r
a
ther than nodes s
o
that n
odes
naturally belong to more than
one
community. It can id
entify a
suitab
le n
u
m
b
e
r of over
l
a
p
p
i
ng co
mmu
n
iti
e
s w
i
thout any
prior kn
ow
led
g
e
abo
ut the c
o
mmu
n
ity. W
e
ev
alu
a
t
ed
our
ap
proac
h o
n
sev
e
ral r
eal-
net
works. Experim
ental results prov
e
that the alg
o
rit
h
m pr
op
ose
d
is efficient for det
ecting ov
erl
a
ppi
ng co
mmu
n
i
t
ies in dir
e
cted
netw
o
rks.
Ke
y
w
ords
: dir
e
cted netw
o
rk, triang
les, overl
app
ing c
o
mmu
nities, li
nk si
mil
a
rity, partition
dens
ity
Copy
right
©
2013 Un
ive
r
sita
s Ah
mad
Dah
l
an
. All rig
h
t
s r
ese
rved
.
1. Introduc
tion
In orde
r to sh
ed light on th
e stru
ctu
r
e, d
y
nam
ic a
nd robu
st of com
p
lex system
s
in nature
and
so
ciety
they h
a
ve
bee
n
rep
r
e
s
ente
d
a
s
n
e
tworks,
in
whi
c
h nod
es
symb
olize the
comp
one
nts of
system an
d
links con
n
e
c
ting
n
ode
s d
enote the
rel
a
tionship
s
be
tween th
em [
1
-3
]
.
Comm
unity structu
r
e is o
n
e
of the most
important
fe
ature of man
y
complex ne
tworks [
4
-6
]
,
with
whi
c
h ca
n
re
veal
topol
ogi
cal relatio
n
sh
ips between
system
elem
ents
and
re
prese
n
t fun
c
tio
n
[
7
,
8
]
. Therefore
detecting th
e comm
unity stru
ctur
e i
n
the netwo
rk has b
een
attracte
d mu
ch
attention in re
cent years [
9
-
1
5
]
.
The
clu
s
terin
g
algo
rithm i
s
a
cla
s
s of
pattern
re
cog
n
ition metho
d
widely u
s
e
d
in many
fields [16-18]. By now there
are mainly two ki
nd
s of cl
usteri
ng alg
o
rithms have b
een propo
se
d
to
detect
comm
unities i
n
co
mplex net
works, o
ne i
s
o
p
t
imization
alg
o
rithm, an
d t
he othe
r i
s
t
h
e
hiera
r
chi
c
al
clusteri
ng m
e
thod [19]. O
n
e app
ro
ach
o
f
the first
sch
e
me i
s
ba
se
d on
a mea
s
ure
calle
d betwe
enne
ss. It calculate
s
on
e o
f
several me
as
u
r
e
s
[20, 21] of the flow of traffic across
the lin
ks of
a net
work
an
d then
re
mo
ves the
mo
st
traffic li
nks f
r
om th
e n
e
twork. T
w
o
ot
her
related
algo
ri
thms u
s
e
d
to
identify links for re
moval
are fluid
-
flo
w
and
cu
rre
nt-flow an
alogi
es
[22]. A different class of o
p
timization al
gorithm
s is th
e method
s b
a
se
d on info
rmation-th
eoretic
idea
s, su
ch
as the mini
m
u
m de
scriptio
n l
ength met
hod
s of Ro
svall and Bergstro
m [23]. The
basi
c
ide
a
is
to define a q
uantity that is high
for goo
d division
s of
a netwo
rk a
nd low for
ba
d
one
s, and th
en search th
e divisio
n
wi
th the high
e
s
t sco
r
e thro
ugh all th
e
possibl
e cases.
Variou
s different mea
s
u
r
e
s
for
cal
c
ulati
ng sco
r
e
s
ha
ve been
pro
p
o
se
d, su
ch
a
s
the li
kelih
o
od-
based m
e
a
s
u
r
es an
d oth
e
rs [24], b
u
t the mo
st wi
del
y use
d
me
asure i
s
th
e mo
dularity [15].
The
hiera
r
chi
c
al cl
usteri
ng al
gorithms in
clud
e
agglom
erativ
e and divi
sive
method
s to find communit
y
stru
cture in
n
e
tworks. Th
e
y
first comput
e the
stre
ngt
h of lin
k bet
ween e
a
ch p
a
ir node
s
ba
sed
on
different prop
erties,
su
ch
as lin
k betwe
enne
ss [25
], link cl
uste
rin
g
coeffici
ent [26], information
centrality [27], similarity ba
sed
on rand
o
m
wal
ks
[2
8], clu
s
terin
g
ce
ntrality [29], and so on. Th
en,
mergi
ng the two no
de
s wit
h
the highe
st
stren
g
th
of link repeate
d
ly (aggl
o
m
erative method),
or
removing the link
with the
lowes
t
s
t
rength repe
atedly
(divisive met
hod
s), the p
a
r
tition re
sult
s of
the netwo
rks are obtaine
d
.
Nearly all of these
meth
ods a
r
e ba
se
d on the pro
pertie
s
of no
des
and a
s
sum
e
d
each n
ode
b
e
long
s to
onl
y one
comm
unity
. Yong-Y
eol Ahn
et al
[30] and
T.
S.
Evans et al [31] reinvent comm
unitie
s
as gr
oup
s of links in undirected net
wo
rks a
nd sh
ow
that
the quality of a link pa
rtitio
n can b
e
eval
uated by
the
modula
r
ity of its co
rre
sp
on
ding line g
r
ap
h
.
Ho
wever, ma
ny of the networks
that we
would li
ke to study are di
re
cted, and a n
ode may belo
n
g
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 23
02-4
046
TELKOM
NIKA
Vol. 11, No
. 10, Octobe
r 2013 : 574
1 –
5748
5742
to several
co
mmunitie
s
, includi
ng the Wo
rld Wid
e
Web, food
webs, many bi
ologi
cal networks,
and even so
me
soci
al
n
e
tworks.
Th
e common
e
st ap
pr
oa
ch
to d
e
tecting
comm
unities in
dire
cted
netwo
rks h
a
s
be
en
simp
ly to ignore
the link
dire
ction
s
an
d a
pply algo
rith
ms d
e
sig
ned
for
undirecte
d
ne
tworks [3
2]. It is cle
a
r that
we a
r
e throwi
ng away a go
od deal
of informatio
n abo
ut
our n
e
two
r
k’
s stru
cture i
n
formatio
n that could
allow
us to ma
ke a
more a
c
cura
te determin
a
tion
of the commu
nities if discarding the directions of links.
In this pape
r, a new alg
o
ri
thm base
d
o
n
triangle di
stribution is p
r
opo
sed to de
tect the
overlap
p
ing
community
structure in
di
re
cted
netwo
rks
.
We
c
o
ns
ide
r
a
c
o
mmunity to
b
e
a set o
f
clo
s
ely inte
rrelated li
nks
rather than
a
set of
nod
es
with m
any lin
ks bet
wee
n
t
hem. Th
en, u
s
ing
hiera
r
chi
c
al cl
usteri
ng with
a similarity b
e
twee
n links to build a den
drog
ram
whe
r
e ea
ch leaf is a
link fro
m
the
origin
al net
work
and
bran
che
s
rep
r
e
s
e
n
t link
comm
unities. Th
e l
i
nk d
end
rog
r
am
provide
s
a
ri
ch hi
era
r
chy
of stru
cture
,
but to
obtain the mo
st
relevant
co
mmunitie
s
it is
necessa
ry to
determi
ne th
e be
st level a
t
which to
cut the tree. Fo
r this pu
rpo
s
e,
we int
r
odu
ce
a
new p
a
rtition
density ba
se
d on link d
e
n
s
ity insi
de
co
mmunitie
s
. Computing p
a
rtition density at
each level
of
the lin
k
den
drog
ram
allo
ws u
s
to
pick the
be
st le
vel to cut. We compa
r
e
d
the
perfo
rman
ce
of our
algo
rithm with th
ree
su
cces
sf
ul method
s:
cliqu
e
pe
rcol
ation [33], link
partition [31], and modul
arity spectral o
p
timizati
on [3
4] with three
real
-net
works includi
ng Ge
ne
netwo
rk, Em
ail netwo
rk
a
nd Metaboli
c
gene n
e
two
r
k.
Clique pe
rcolation is the
most pro
m
in
ent
overlap
p
ing
communitie
s
i
dentifying alg
o
rithm in
und
irecte
d net
wo
rks, lin
k pa
rtition is the fi
rst
detectin
g
overlap
p
ing
co
mmunitie
s
a
l
gor
ithm ba
sed on link prope
rty and modul
ari
t
y
maximizatio
n
can be g
ene
ralized in a p
r
inci
pled fa
sh
ion to inco
rpo
r
ate inform
ation co
ntained
in
link directio
n
s
. The appli
c
ation to re
a
l
-network
s show that ou
r method
wo
rks effe
ctively in
detectin
g
ove
r
lappi
ng com
m
unities in di
recte
d
networks.
2. Rese
arch
Metho
d
2.1. Communit
y
A comm
unity co
nsi
s
ted
of
node
s
and li
n
k
s bet
we
e
n
these n
ode
s i
s
p
a
rt of th
e
netwo
rk
with a few ti
es
with the rest of the
syst
em.
Althou
gh
no comm
on
definition has been ag
reed
upon, it is
widely a
c
cep
t
ed that a
community
sh
ould h
a
ve
more i
n
terna
l
than external
con
n
e
c
tion
s [21, 35]. The
node
s in th
e
same
comm
unity often h
a
ve com
m
on
prop
ertie
s
a
nd
den
sely intercon
ne
cted
co
mpared
with the re
st of
the
netwo
rk. It is noted that t
w
o
comm
unit
i
es
may
overla
p each
othe
r while
a nod
e can conn
ect wi
th different
co
mmunitie
s
si
multaneo
usly
[4].
In Figure 1, an example
of a directed
network
wit
h
commu
nitie
s
is sh
own. There are th
ree
comm
unitie
s
in this net
wo
rk, de
noted
by circle, sq
uare, p
entag
on and tri
a
n
g
le, re
spe
c
tively.
Nod
e
of
pent
agon
is a
co
mmon
nod
e
sin
c
e it
sh
oul
d bel
ong
to t
he
circle
co
m
m
unity as we
ll as
the triangle
community.
Figure 1. Example network sho
w
ing
com
m
unity stru
cture. The n
o
d
e
s of this net
work a
r
e divi
ded
into three g
r
o
ups, no
de pe
ntagon i
s
the comm
on no
d
e
of both the circle an
d tria
ngle
comm
unitie
s
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
ISSN:
2302-4
046
Identifying O
v
erlap
p
ing
Co
mm
unities in Dire
cted
Net
w
orks via T
r
i
angle
s
(Qi
n
g
y
u Zou
)
5743
2.2. Triangle Vertex
es We
ightiness
In gene
ral, t
he
simple
bu
ilding bl
ocks
of com
p
lex n
e
tworks i
s
n
o
t a lin
k b
u
t
a sm
all
stru
cture of
several
node
s calle
d motif
[36]. Netw
o
r
k motifs a
r
e
small subg
rap
h
s that
ca
n
be
found in
a net
work
statisti
cally signifi
can
t
ly mo
re often
than in
ran
d
o
mize
d net
works. Amo
ng
the
possibl
e motifs, the si
mple
st one is th
e tri
angle
whi
c
h
repre
s
e
n
ts the
basi
c
u
n
it of transitivity an
d
redu
nda
ncy i
n
a grap
h, se
e Figure 2.
Figure 2. List of all 13 types of triangle
s
As sho
w
n in
Figure 2, there are 13 tri
a
n
g
le
ca
se
s at most, inclu
d
i
ng 39 vertex
es, in an
arbitrary dire
cted net
wo
rk.
We compa
r
e
all th
ree vert
exes on
e an
other for
ea
ch triangl
e
T
i
and
merg
e the
code
of verte
x
es h
ad th
e
sam
e
pl
ac
e
.
Then, the
r
e are 3
0
sp
ecial
vertexe
s
for
triangle
s
, en
coded from 1 t
o
30 in Fig
u
re 2. We a
s
si
gn differe
nt weights
w
i
to di
fferent vertex
es
i
, becau
se
some complex
triangle
s
co
ntain the
sim
p
le triangl
es,
such as tri
a
ngle 11
cont
ain
triangle 1. We assign hig
h
e
r wei
ghts to
the vert
exes whose are n
o
t affe
cted by other vertexes,
and lo
wer
wei
ghts to dep
en
d on other ve
rtexes. The
w
i
is cal
c
ulate
d
using a fun
c
t
i
on as follo
ws:
max
(
)
i
i
i
TC
w
TC
(1)
whe
r
e
TC
i
m
ean
s the nu
mber of vert
exes affecte
d
by vertex
i
.
We con
s
ide
r
that each verte
x
affec
t
s
it
s
e
lf.
For ins
t
ance,
for vertexes
1,
TC
1
=2, si
nce it affect
s ve
rtexes 25
and
itself; si
milarl
y,
TC
6
=3, sin
c
e
vertex 6 affected vertexe
s
17, 20
and it
self. The weig
hts of 3
0
vert
exes a
s
sh
o
w
n
in Table 1.
Table 1. The
weig
hts of 30
vertexes
Vertex Weight
Vertex Weight
Vertex Weight
Vertex
Weight Vertex Weight Vertex
Weight
No.1
0.5
No.6
0.75 No.11
1
No.16
0.25 No.21
0.25 No.26
0.25
No.2
0.5
No.7
1
No.12
1
No.17
0.25 No.22
0.25 No.27
0.25
No.3
0.75
No.8
0.75 No.13
1
No.18
0.25 No.23
0.25 No.28
0.25
No.4
0.75
No.9
1
No.14
1
No.19
0.25 No.24
0.25 No.29
0.25
No.5
0.75 No.10
0.75 No.15
0.75 No.20
0.25 No.25
0.25 No.30
0.25
2.3. Triangle Degr
ee
The nu
mbe
r
of triangle
s
th
at the node t
ouc
he
s is tri
a
ngle de
gree
of it. For a no
de
u
, as
sho
w
n Fi
gu
re
3, the triangl
e deg
ree
val
ues
of the 30
vertexes of
node
u
, are
pre
s
ente
d
in
the
Table 2.
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 23
02-4
046
TELKOM
NIKA
Vol. 11, No
. 10, Octobe
r 2013 : 574
1 –
5748
5744
Figure 3. An illustratio
n
of a small net
wo
rk
containi
ng
node
u
. The v
a
lue
s
of the 30 vertexes of
the triangle d
egre
e
are sh
own in Ta
ble
2
Table 2. Valu
es of the 30 vertex
es of the
triangle de
gree of node
u
in
F
i
g
.
3
Vertex Degree
Vertex Degree
Vertex
Degree
Vertex Degree
Vertex Degree
Vertex
Degree
No.1
0
No.6
3 No.11
0 No.16
0
No.21
0 No.26
1
No.2
2
No.7
1 No.12
2 No.17
0
No.22
0 No.27
0
No.3
0
No.8
2 No.13
0 No.18
0
No.23
0 No.28
1
No.4
3
No.9
1 No.14
1 No.19
0
No.24
0 No.29
0
No.5
0
No.10
3 No.15
0 No.20
0
No.25
0 No.30
0
2.4. Link Sim
ilarit
y
The lin
k simil
a
rity is a me
a
s
ure of cl
ose
nes
s bet
wee
n
a pai
r of lin
ks.
We limit o
u
rselves
to only conn
e
c
ted
pairs of l
i
nks (i.e.
sh
aring a
nod
e)
si
nce
it is
unlikely that
a p
a
ir of disj
oint lin
ks
are
more
si
milar to
ea
ch
other than
a
pair of
lin
ks that share a
nod
e; at the
sam
e
time
this
choi
ce i
s
m
u
ch more effici
e
n
t. For a
co
n
necte
d pai
r of
links
e
ik
a
nd
e
jk
, we
call th
e sh
ared n
o
d
e
k
a sha
r
e no
de
and
i
and
j
eq
ual nod
es. Th
e simila
rity between lin
ks is defined a
s
:
30
0
30
0
(,
)
(,
)
l
l
ik
jk
l
l
Di
j
Se
e
w
(2)
whe
r
e
S
(
e
ik
,
e
jk
) mean
s lin
k
sim
ilarity value,
D
l
(
i
,
j
) i
s
distance b
e
twe
e
n
the vertex
l
of node
s
i
an
d
j
as
:
(,
)
ll
ll
ll
ni
n
j
Di
j
w
ni
n
j
(3)
whe
r
e
n
l
(
i
) i
s
the triangl
e d
egre
e
of vert
ex
l
of node
i
,
n
l
(
i
)
∩
n
l
(
j
) m
e
ans the
num
ber of
l
tr
ia
ngle
s
sha
r
ed by no
de
i
and
j
,
n
l
(
i
)
∩
n
l
(
j
) mea
n
s the numbe
r of all
l
triangles conn
ecte
d
node
i
and
j
.
2.5. Hierarch
ical Cluste
ring
For
a given n
e
twork, we
calcul
ate
the
similari
tie
s
fo
r
all conn
ecte
d
link-p
a
irs
at first, a
nd
then use ave
r
age
-lin
kag
e
hiera
r
chi
c
al clusteri
ng [
37]
to find hierarchi
c
al
comm
unity structu
r
e.
The finding p
r
oce
s
se
s are
descri
bed in t
he followi
ng three
step
s.
Stage 1:
cal
c
ulate the li
nk
simila
rities
S
(
e
ik
,
e
jk
) for li
nk
e
ik
and
e
jk
, a
nd ea
ch
lin
k
i
s
initially
assign
ed to a
single
clu
s
ter.
Stage 2: me
rge cl
uste
rs iterativ
ely if their
similarity i
s
hig
h
e
s
t usi
ng the ave
r
a
ge lin
kag
e
function a
nd ties, whi
c
h a
r
e
commo
n, are
agglome
r
ate
d
simultan
eo
usly.
Stage 3: stop
mergin
g wh
e
n
all links bel
ong to a uniq
ue clu
s
ter.
The hi
sto
r
y o
f
the clu
s
te
rin
g
process i
s
t
hen
stored in
a de
nd
rog
r
a
m
, whi
c
h
co
n
t
ains
all
the informati
on of the hie
r
archi
c
al
com
m
unity or
ga
n
i
zation. Th
e simila
rity value at whi
c
h t
w
o
clu
s
ters merg
e is con
s
id
ered as the strength
of the merg
ed co
m
m
unity, and is en
code
d as the
height of the relevant den
drogra
m
bra
n
ch
to provide a
dditional info
rmation.
2.6. Dendrogram Partition
Hierarchi
c
al
clu
s
terin
g
m
e
thod
s repe
at
edly merg
e group
s u
n
til all elem
ents
are
membe
r
s of
a si
ngle
cl
ust
e
r. Thi
s
even
tually forc
es
highly di
sp
arate region
s
o
f
the net
wo
rk into
singl
e clu
s
ters. To find meanin
g
ful co
mmunitie
s
ra
ther than ju
st the hiera
r
chi
c
al organi
zati
on
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
ISSN:
2302-4
046
Identifying O
v
erlap
p
ing
Co
mm
unities in Dire
cted
Net
w
orks via T
r
i
angle
s
(Qi
n
g
y
u Zou
)
5745
pattern of co
mmunitie
s
, it is important
to k
now
whe
r
e to partition
the dendro
g
ram. Modul
arity
has b
een
widely use
d
for simil
a
r p
u
rpo
s
e
s
, but
is not easily defined for overl
appi
ng
comm
unitie
s
. Thus, we int
r
odu
ce
d a ne
w quantity, the partition de
nsity
D
, which measure
s
the
quality of a link partition.
For a network with
M
links and
N
n
ode
s
,
P
=[
P
1
,…,
P
c
,…,
P
k
] is a partition of the links
into
k
su
bsets. Then we def
ine the den
sit
y
,
D
c
, of community
C
is
1
2
1
(1
)
(1
)
k
cc
i
c
i
cc
i
mn
D
nn
k
n
(4)
Whe
r
e
m
c
is the numb
e
r o
f
links in
sub
s
et
P
c
,
k
is the num
ber
of sub
s
et of ne
twork,
n
c
is t
h
e
numbe
r
of no
des whi
c
h
li
n
ks of
P
c
touch, and
n
ci
is t
he nu
mbe
r
of
comm
on
no
des
between
P
c
and
P
i
. The p
a
rtition den
sit
y
,
D
, is the averag
e of
D
c
:
1
1
k
c
c
D
D
k
(5)
The maximum of
D
is 1, whe
n
every community is
a fully conne
cted cli
que a
nd ea
ch com
m
unity
is inde
pend
e
n
t.
3. Results a
nd Analy
s
is
To evaluate the perfo
rma
n
c
e of the pro
posed
metho
d
three re
al-netwo
rks con
t
aining
Gene network,
Email
n
e
twork and Me
tabolic gen
e net
wo
rk are
use
d
to b
e
the test
networks.
The main propertie
s
of them su
ch a
s
aver
a
ge d
egre
e
, avera
ge sho
r
te
st path length
and
averag
e clu
s
t
e
ring
coeffici
ent are
sho
w
n in Table 3.
Table 3. Prop
erties of real-netwo
rks
Net
w
ork nam
e
Nodes
Links
Degree
Shorte
st path len
g
th
Clustering coefficient
Gene n
e
t
w
ork
1624
3212
3.960
2.070
0.221
Email netw
o
rk
1
133
5451
19.245
3.606
0.297
Metabolic
gene network
962
2724
5.437
3.798
0.241
Mycoba
cte
r
iu
m tube
rculo
s
is i
s
a
n
extraordi
na
rily succe
ssful
pat
hoge
n that
currently
infects a
p
p
r
o
x
imately one-third of the gl
obal po
pul
ati
on [38]. In order to evalu
a
t
e our metho
d
,
we
use a
n
e
w
g
ene
regul
atory n
e
two
r
k
(G
RN)
of
Mycoba
cte
r
iu
m
tube
rculo
s
i
s
con
s
tructe
d
b
y
Sanz
et al [39]. Removing
duplicate intera
ction
s
, the
resulting T
R
N involving 1
624 no
de
s a
nd
3212 inte
ra
ct
ions, with
83
regul
atory g
ene
s co
ntrolli
ng the expre
ssi
on of 15
9
8
gene
s, so
me
main pa
ram
e
ters a
r
e li
ste
d
in Table
3
.
A
GRN m
odel represe
n
ts the mol
e
cula
r re
gulati
o
n
pro
c
e
s
s by
whi
c
h g
ene
s reg
u
late tra
n
scriptio
n
of
other
gen
es.
A gen
e X di
rectly
regul
ates
a
gene Y, if pro
t
ein encode
d by X is a transcription
al factor for Y.
The email co
mmuni
cation
netwo
rk
[40] covers
all
th
e
em
ail com
m
unication
s within a
data set of a
r
oun
d half a million email
s
. The node
s of the network are e
m
ail a
ddre
s
se
s, an
d
there i
s
a lin
k betwee
n
two
node
s if at least one
email
exists bet
we
en them. La
st
ly, this netwo
rk
con
s
i
s
ts of 11
33 nod
es a
n
d
5451 lin
ks.
E. coli is con
s
ide
r
ed
the
most
compl
e
te ava
ilabl
e p
r
oka
r
yotic
and
the TRN
of E. coli is
the be
st chara
c
te
rized
of all pro
k
aryoti
c
o
r
g
anism
s. Th
e
metaboli
c
function
al g
ene
transcription
a
l
regul
atory n
e
twork (TRN) of Esche
r
i
c
h
i
a coli is a
n
e
wly version
of the TRN
of
E.coli, whi
c
h
wa
s
do
wnlo
aded
from
the
Reg
u
lon
D
B [41], co
ntrolling m
e
tab
o
lism
ba
sed
on
function
al an
notation
s
fro
m
GeneP
rotEC [42] and G
ene Ontol
ogy
(GO)
[43].
In order to
evaluate
algo
rithm q
uality we mu
st
b
e
a
s
se
ssed i
n
a
d
i
fferent
way.
The m
o
st
comm
on met
hod is mod
u
l
a
rity, which
measures th
e relative nu
mber of intercomm
unity and
intracommu
ni
ty links. A hig
h
mod
u
larity i
ndicates th
at there
are
more intra
c
om
mu
nity links th
a
n
woul
d be expecte
d by ch
ance. Ho
wever the modul
arity measu
r
e,
Q
, is defined only for non-
intersect
com
m
unities.
Nicosia et al. [4
4]
prop
osed
prop
osed a n
e
w mo
dula
r
ity measu
r
e,
Q
ov
,
whi
c
h is
defi
ned for
dire
ct
ed net
works
with over
l
app
ing co
mmunit
i
es
stru
ctures. In a netwo
rk
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 23
02-4
046
TELKOM
NIKA
Vol. 11, No
. 10, Octobe
r 2013 : 574
1 –
5748
5746
including
n
n
ode
s and
m
links,
k
i
and
k
j
is the the numbe
r of links of
i
and
j
, res
p
ec
tively.
Modula
r
ity,
Q
ov
, was define
d
as:
,
1
ou
t
i
n
ij
OV
ij
c
i
j
i
j
c
cC
i
j
V
kk
Qr
A
s
mm
(5)
whe
r
e
r
ijc
and
s
ijc
are the portion of the contributio
n
to modula
r
ity given by comm
unity
C
beca
u
s
e
of link
l
(
i
,
j
) a
nd
A
ij
are the
terms of the
adja
c
en
cy m
a
trix.
Q
ov
=0 when all verti
c
es bel
ong to
one
comm
unity, a
nd hig
h
e
r
val
ues of
Q
ov
in
dicate
strong
er
comm
unity
stru
ctu
r
e.
We u
s
e m
odula
r
ity,
Q
ov
, here to
evaluate
som
e
well-kno
w
n
algo
rithm
s
a
nd o
u
r algo
rit
h
m o
n
real
-world
networks.
Figure 4 sh
o
w
s the m
odul
arity,
Q
ov
, of
the networks li
sted in Tabl
e 3.
4. Conclusio
n
In this pa
per,
we p
r
e
s
ente
d
a ne
w alg
o
r
ithm for d
e
te
cting ove
r
lap
p
ing
comm
un
ities in
dire
cted networks, whi
c
h p
a
rtiti
on
comm
unities
with li
nks in
stead
o
f
node
s. A n
e
w m
e
a
s
ure
of
link simila
rity based on
t
r
ia
ngle distri
buti
on
h
a
s
be
en
introduced. Using link
simil
a
rity values,
a
dend
rog
r
am
of
link
ha
s b
een co
nstruct
ed
by hier
archical
cl
uste
ri
ng meth
od. T
o
dete
r
min
e
t
h
e
best
cut
level
,
a ne
w pa
rtition d
e
n
s
ity ha
s b
een
introd
uce
d
. Mainly
contri
bution
o
f
the alg
o
rith
m
is that it can succe
s
sfully re
veal overl
appin
g
com
m
unities a
n
d
hiera
r
chies
simultan
eou
sl
y in
dire
cted network. The algo
rithm
h
a
s bee
n
appli
ed
to
server
re
al-n
etwork co
mpa
r
ed with
several
popul
ar
com
m
unity stru
ct
ure id
entify algorithm
s. Th
e
re
sults
sh
o
w
that it is rather effici
en
t to
discover the
co
mmunity
stru
ct
ure in
dire
cted
net
works.
Howe
ver its full p
o
tential rema
ins
unexplo
r
ed.
Our work ha
s prima
r
ily focu
sed on
the
highly overla
pping commu
nity structu
r
e
of
compl
e
x net
works, b
u
t a
n
existin
g
lim
itation of o
u
r algo
rithm i
s
the
relatio
n
ship b
e
twe
en
the
overlap
s
an
d hierarchi
c
al. Therefo
r
e, the
hierarchy that orga
nizes th
ese ove
r
lap
p
ing
comm
unitie
s
kee
p
up great
promi
s
e for f
u
rthe
r study.
Figure 4.
Q
ov
of each
real
-n
etwork
calcul
ated by f
our community det
ecting al
gorit
hm. O: Our
algorith
m
; C: Cliqu
e
percol
a
tion algo
rith
m; L:
Link partition algorith
m
; M: Modula
r
ity spe
c
tral
optimizatio
n algorith
m
Referen
ces
[1]
Rotun
do G, Au
sloos M. Orga
nizati
on of n
e
tw
o
r
ks
w
i
th
ta
g
ged
nod
es an
d
biase
d
li
nks: A priori
distinct
communiti
es T
he c
a
se
of
intel
lige
n
t d
e
si
gn
pr
opo
ne
nts
an
d Dar
w
i
n
ia
n evo
l
ution
d
e
fend
er
s.
Physica a-
Statistical Mec
han
ics an
d Its
Appl
icatio
ns.
2
010; 38
9(2
3
): 5479-
549
4.
[2]
Mason O, Verw
o
e
r
d
M. Grap
h the
o
r
y
a
nd
n
e
t
w
o
r
ks i
n
Bi
ol
og
y.
Iet System
s B
i
ology
.
20
07; 1(
2): 89-
119.
[3]
Leic
h
t E A, Clarkson G, She
dde
n K, et al. Larg
e
-
scal
e
structure of time
evolvi
ng citati
o
n
net
w
o
rks.
Europ
e
a
n
Phy
s
ical Jo
urna
l B.
2007; 5
9
(1): 7
5
-83.
[4]
Ne
w
m
a
n
M E J. Communiti
e
s
, modules
an
d larg
e-scal
e
s
t
ructure in n
e
tw
o
r
ks.
Nature Physics.
201
2;
8(1): 25-3
1
.
O
C
L
M
O
C
L
M
O
C
L
M
0
0.
1
0.
2
0.
3
0.
4
0.
5
0.
6
0.
7
0.
8
Gen
e
ne
tw
ork
E
m
ai
l ne
tw
o
r
k
Me
t
abo
li
c ge
ne
n
e
t
w
ork
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
ISSN:
2302-4
046
Identifying O
v
erlap
p
ing
Co
mm
unities in Dire
cted
Net
w
orks via T
r
i
angle
s
(Qi
n
g
y
u Zou
)
5747
[5]
Ne
w
m
a
n
M E J. Modularit
y and commu
n
i
t
y
structure i
n
net
w
o
rks.
P
r
ocee
din
g
s of
the Nation
al
Acade
my of Sc
ienc
es of t
he U
n
ited States of
Amer
ica
. 20
06;
103(2
3
): 857
7
-
858
2.
[6]
Doro
govtsev S
N, Goltsev A
V, Mend
es J
F
F
.
Critical
ph
e
nome
na
in c
o
mple
x
net
w
o
rk
s.
Reviews of
Moder
n Physic
s
. 2008; 80(
4): 127
5-13
35.
[7]
Kalten
bac
h H
M, Stelli
ng
J.
Modu
lar an
al
ysis
of bi
olo
g
ica
l
n
e
t
w
orks.
A
d
v Exp M
e
d Biol.
201
2; 7
36:
3-
17.
[8]
F
o
rtunato S.
Commun
i
t
y
de
tection i
n
gr
ap
hs.
Physics R
eports-R
eview
Section
of P
h
ysics L
e
tters
.
201
0; 486(
3-5)
: 75-174.
[9]
Yang B, Ji
n D, Liu J M, et al.
Hierarc
hic
a
l co
mmuni
t
y
det
ection
w
i
t
h
a
ppl
ic
ations to r
eal-
w
o
r
ld
net
w
o
r
k
analy
sis.
Data &
Know
led
ge Engi
neer
in
g
. 2013; 83: 2
0
-38.
[10]
Ochab J K, B
u
rda Z
.
Ma
xi
mal entr
o
p
y
r
and
om
w
a
lk i
n
commu
nit
y
detectio
n
.
Eur
ope
an P
h
ysica
l
Journ
a
l-Sp
eci
a
l T
opics.
201
3; 216(
1): 73-8
1
.
[11]
Lanc
ichi
netti A, Radicchi F
,
Ramasc
o J J, et
al. F
i
ndin
g
statisticall
y
signific
ant co
mmunities i
n
net
w
o
rks.
Plos One
. 201
1; 6(4
)
: e1896
1.
[12]
Ball B,
Karrer
B, Ne
w
m
an
M E J. Efficient
an
d pr
inci
pl
ed m
e
thod
for
detecti
ng c
o
mmunities
i
n
net
w
o
rks.
Phys
Rev E
. 2011; 84(3): 03
61
03-
1-03
610
3-1
3
.
[13]
Gao Z
K, Jin N D,
Ieee. De
tecting comm
u
n
it
y
structur
e i
n
compl
e
x n
e
tw
o
r
ks bas
ed
on K-me
an
s
clusteri
ng an
d data
fi
el
d
th
eor
y.
20th
C
h
in
es
e C
ontrol
a
n
d
Decisi
on
Co
nfe
r
ence. Y
anta
i
.
200
8;
441
1-
441
6.
[14]
Ne
w
m
a
n
M E J. F
i
nding c
o
m
m
unit
y
structur
e
in n
e
t
w
orks
usin
g the e
i
ge
nvectors of ma
trices.
Phys
Rev E
. 2006; 7
4
(3): 1-22.
[15]
Ne
w
m
a
n
M
E
J, Girvan M. F
i
ndi
ng
a
nd
ev
alu
a
ting
comm
unit
y
structur
e
in
net
w
o
rks.
Phys Rev E
.
200
4; 69(2): 1-
16.
[16]
Kashef R, Kam
e
l M
S. Cooper
ative cluster
i
ng
.
Pattern Recognition
. 20
10; 43(6): 23
15-
23
29.
[17]
Ne
w
m
a
n
M E J. Random Gra
phs
w
i
th C
l
uste
ring.
Physic
a
l Review
Letters
. 2009; 10
3(5).
[18]
Xu R,
W
unsch D.
Surv
e
y
of cl
usterin
g
al
gorit
hms.
Ieee T
r
an
sactions
on N
e
ural N
e
tw
orks
. 200
5; 16(
3):
645-
678.
[19]
F
agio
l
o G. Clu
stering i
n
comp
le
x dir
e
cted n
e
tw
o
r
ks.
Phys Rev E
. 2007; 76(
2).
[20]
W
ilkinso
n D M, Huberm
an B
A. A met
hod for findin
g
comm
uniti
es of relate
d gen
es.
Proc Natl Acad S
c
i
U S A.
2004; 1
01 Sup
p
l 1: 52
41-5
248.
[21]
Radicc
hi F
,
C
a
stell
ano
C,
Cecco
ni F
,
et
al. Defi
ni
ng
and
id
entif
yin
g
commun
i
ties
in n
e
t
w
orks.
Procee
din
g
s o
f
the Natio
nal
Acade
my
of Sc
ienc
es of the
United St
ates
of Amer
ica
. 2
004; 1
01(9)
:
265
8-26
63.
[22]
Z
anja
n
i A A
H, Daro
one
h A
H
.
F
i
ndin
g
com
m
uniti
es in
li
ne
ar time b
y
dev
elo
p
in
g the s
e
eds.
Phys Rev
E
. 2011; 84(
3).
[23]
Rosval
l
M, Ber
g
strom C T
.
An inform
ation-t
heor
etic frame
w
o
r
k for reso
lv
ing c
o
mmun
i
t
y
structure i
n
complex
net
w
o
rks.
Proc Natl Acad Sci U S A
. 2007; 10
4(1
8
): 7327-
73
31.
[24]
Li Z
,
Z
hang S, W
ang R-S, et al. Q
uantitativ
e function for
communit
y
d
e
tection.
Phys Rev E
. 2008;
77(3): 03
61
09-
1-03
610
9-9.
[25]
Girvan M, Ne
w
m
an M E J. Co
mmunit
y
st
ructure in s
o
cia
l
a
nd bi
ol
ogic
a
l n
e
t
w
o
r
ks.
Proce
edi
ngs of t
h
e
Natio
nal Aca
d
e
m
y of Scie
nces
of
the United
States of Amer
ica
. 200
2; 99(1
2
): 7821-
78
26.
[26]
Radicc
hi F
,
Castell
ano C, C
e
ccon
i
F
,
et al
. Definin
g
and
identif
yi
ng co
mmunities i
n
n
e
t
w
o
r
ks.
Proc
Natl Acad Sci
U S A
. 2004; 1
01(9): 26
58-
26
63.
[27]
F
o
rtunato S, Latora
V, Mar
c
hiori
M. Met
hod
to fi
nd
c
o
mmunit
y
stru
ctures b
a
se
d
on
informati
on
central
i
t
y
.
Phys
Rev E Stat N
onlin Soft Matter Phys
. 2004; 7
0
(5 Pt 2): 0561
04.
[28]
Pons P, Latap
y M. Computing
communiti
es in
large n
e
t
w
orks
using ra
nd
om w
a
lks.
Lect Not
e
s Co
mp
u
t
Sc
. 2005; 37
33
: 284-29
3.
[29]
Yang B, L
i
u
J M. Discover
i
ng Glo
b
a
l
Ne
t
w
ork
Commu
nities Bas
ed
on L
o
cal C
ent
ralities.
Ac
m
T
r
ansactio
n
s o
n
the W
e
b
. 200
8; 2(1).
[30]
Ahn Y
Y, Bagr
o
w
J P,
Le
hma
nn S.
Link
com
m
uniti
es rev
eal
multisca
le c
o
mple
xit
y
in
net
w
o
rks.
Nat
u
re
.
201
0; 466(
730
7): 761-U
7
1
1
.
[31]
Evans T
S, Lambiotte R. Li
ne
graphs
, l
i
nk p
a
r
titions, an
d ov
erla
ppi
ng com
m
uniti
es.
Phys Rev E
. 200
9;
80(1).
[32]
Rese
ndis-A
n
to
nio O, F
r
e
y
re-
G
onzal
ez J A,
Menc
h
a
ca-M
end
ez R, et a
l
. Modu
lar a
n
al
ysis
of the
transcripti
ona
l regu
lator
y
n
e
tw
o
r
k of E-coli.
T
r
ends in Gen
e
tics
. 2005; 2
1
(
1): 16-20.
[33]
Palla G, Dereny
i I, Farkas I,
et al. Unc
o
v
e
rin
g
the
over
lap
p
in
g comm
unit
y
structure
of compl
e
x
net
w
o
rks i
n
nat
ure an
d soci
et
y.
Nature
. 200
5; 435(7
0
4
3
): 81
4-81
8.
[34]
Leic
h
t E A,
Ne
w
m
a
n
M E J.
Commu
n
i
t
y
str
u
cture in dir
e
cted net
w
o
rks.
P
h
ysical Review
Letters
. 200
8;
100(
11):
1
187
03-1-
118
70
3-4
.
[35]
Lanc
ichi
netti
A, F
o
rtunato S,
Kertesz J. Detecting th
e overl
a
p
p
in
g
and h
i
erarc
h
i
c
al commu
nit
y
structure in complex
net
w
ork
s
.
New
Journal
of Physics
. 2009; 11: 1-2
0
.
[36]
Milo
R, Shen-
O
rr S, Itzkovitz S, et al.
Net
w
ork motifs: Simple
building
blocks of
com
p
lex
net
w
o
rks.
Scienc
e
. 200
2; 298(5
5
9
4
): 82
4-82
7.
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 23
02-4
046
TELKOM
NIKA
Vol. 11, No
. 10, Octobe
r 2013 : 574
1 –
5748
5748
[37]
Da
y W
E, Ed
elsbr
unn
er H.
Efficient al
gor
ithms for a
ggl
o
m
erative
hier
a
r
chical
cluster
i
ng meth
ods.
Journ
a
l of Clas
s
ificatio
n
. 198
4
;
1(1): 7-24.
[38]
W
o
rld He
alt
h
Organ
izati
on. Glob
al
tubercu
losis
control.
Geneva. R
e
port num
ber
:
W
H
O/CDS/T
B/200
0.27
5:11
79
. 2000.
[39]
Sanz J, Nav
a
rro J, Arbues
A, et al. T
h
e T
r
anscriptio
nal Reg
u
l
a
tor
y
Net
w
ork of
M
y
co
bacter
i
um
tubercu
losis.
Plos One
. 201
1; 6(7).
[40]
Guimera
R, D
ano
n
L, Di
az-
G
u
iler
a
A,
et a
l
. Self-simi
l
ar
c
o
mmuni
t
y
stru
cture
i
n
a
n
e
tw
o
r
k of
h
u
man
interactions.
Phys Rev E
. 200
3; 68(6):
06
51
03-1-
065
10
3-4
.
[41]
Serres M H,
Gosw
ami S, Riley
M.
GenProtEC:
an
u
p
d
a
t
ed a
n
d
impr
o
v
ed
an
al
ysis
o
f
functions
of
Escheric
hia co
l
i
K-12 prote
i
ns.
Nucleic Ac
ids
Research
. 20
0
4
; 32: D30
0
-D3
02.
[42]
Cons
ortium T
G O.
T
he
Gene Ontol
o
g
y
: en
ha
nceme
n
ts for 2011.
Nucleic Acids Res.
2
0
12;
40(D
a
tabas
e is
sue): D55
9
-56
4
.
[43]
Gama-Castro
S, Salg
ad
o H,
Pe
ralta-Gi
l M,
et al.
Reg
u
l
o
n
D
B versi
o
n
7.0
:
transcripti
ona
l reg
u
l
a
tion
o
f
Escheric
hia co
l
i
K-12 inte
grat
ed
w
i
thi
n
ge
ne
tic sensor
y
r
e
s
pons
e units (Gensor U
n
its).
Nucleic Acids
Research.
20
1
1
; 39: D98-D
1
0
5
.
[44]
Nicosi
a
V, Man
g
io
ni G, Carchi
olo V, et al. Ext
end
ing th
e defi
n
itio
n of modu
l
a
rit
y
to d
i
recte
d
grap
hs
w
i
t
h
over
lap
p
i
n
g commun
i
ties.
Journa
l of Statistical Mec
han
i
cs-T
heory an
d Experi
m
ent.
20
09;
P030
24.
Evaluation Warning : The document was created with Spire.PDF for Python.