TELK
OMNIKA
,
V
ol.
14,
No
.
2,
J
une
2016,
pp
.
725
733
ISSN:
1693-6930,
accredited
A
b
y
DIKTI,
Decree
No:
58/DIKTI/K
ep/2013
DOI:
10.12928/telk
omnika.v13.i1.1122
725
Resear
c
h
on
Comm
unity
Detection
Algorithm
Based
on
the
UIR-Q
Zilong
Jiang
1
,
W
ei
Dai
*2
,
Xiuf
eng
Cao
1
,
Liangc
hen
Chen
1
,
K
e
Zheng
1
,
and
Abdoula
y
e
Sidibe
1
1
School
of
Computer
Science
and
T
echnology
,
W
uhan
Univ
ersity
of
T
echnology
,
W
uh
an,
China
2
School
of
Economics
and
Management,
Hubei
P
olytechnic
Univ
ersity
,
Huangshi,
China
*
Corresponding
author
,
e-mail:
dw
eisky@163.com
Abstract
Aiming
at
the
current
prob
lems
of
comm
unity
detection
algor
ithm
in
which
user’
s
proper
ties
are
not
used;
the
comm
unity
str
ucture
is
not
stab
le
and
the
efficiency
of
the
algor
ithm
is
lo
w
,
this
paper
proposes
a
comm
unity
detection
algor
ithm
based
on
the
user
influence
.
In
ter
ms
of
the
concept
of
user
influe
nce
in
the
subject
comm
unicat
ion
and
the
P
ageRank
alg
or
ithm,
this
paper
uses
the
proper
ties
of
nodes
of
users
in
social
netw
or
ks
to
f
or
m
the
user
influence
f
actor
.
Then,
the
user
with
the
biggest
influence
is
set
as
the
initial
node
of
ne
w
comm
unity
and
the
local
modular
ity
method
is
introduced
into
detecting
the
comm
unity
str
ucture
.
Exper
iments
sho
w
that
the
impro
v
ed
algor
ithm
can
efficiently
detect
the
comm
unity
str
ucture
with
large
scale
users
and
the
results
are
stab
le
.
Theref
ore
,
this
algor
ithm
will
ha
v
e
a
wide
ap
plied
prospect.
K
e
yw
or
d:
social
netw
or
ks;
comm
unity
detection;
user
influence;
modular
ity
Cop
yright
c
2016
Univer
sitas
Ahmad
Dahlan.
All
rights
reser
ved.
1.
Intr
oduction
Comm
unity
detection
means
the
cohesiv
e
subg
roup
detected
in
the
social
netw
or
k
and
as
an
impor
tant
prob
lem
e
xisting
in
the
analysis
of
social
netw
or
k
it
is
beneficial
f
or
people
to
fur
ther
understanding
and
master
ing
the
complicated
netw
or
k
subject
in
the
research
and
making
deep
applied
research,
f
or
instance
,
individual
recommendation
[1],
compression
of
netw
or
k
with
large
scale
[2],
social
netw
or
k
e
v
olution
[3],
and
so
on.
Comm
unity
detection
algor
ithm
based
on
the
netw
or
k
str
ucture
is
a
class
of
popular
al-
gor
ithm
which
it
divides
the
social
netw
or
k
into
sub-comm
unities
with
close
connection
in
same
comm
unity
and
sparse
connection
in
diff
erent
comm
unities
using
the
relationship
betw
een
users
.
Comm
unity
detection
algor
ithm
based
on
modular
ity
is
a
classical
kind
of
comm
unity
detection
algor
ithm
based
on
the
netw
or
k
str
ucture
.
Ne
wman
and
Gir
v
an
proposed
an
e
v
aluation
function
of
netw
or
k
modular
ity
,
that
is
,
mod-
ular
ity
Q.
Q
function
is
the
diff
erence
of
real
connected
n
umber
in
a
comm
unity
and
e
xpected
connected
n
umber
in
a
comm
unity
with
r
andom
connection
and
it
descr
ibes
the
good
an
d
bad
of
detected
comm
unity
[4].
Clauset
impro
v
ed
the
f
ast
Ne
wman
algor
ithm
through
utilizing
heap
str
ucture
to
calculate
the
netw
or
k
modular
ity
and
named
it
as
CNM
algor
ithm
in
which
a
sparse
matr
ix
is
emplo
y
ed
to
e
xpress
the
diff
erence
of
modu
lar
ity
betw
een
nodes
and
Maxheap
is
used
f
or
sa
ving
the
maxim
um
modular
ity
diff
erence
.
Ev
er
y
time
,
a
maxim
um
v
alue
is
selected
from
the
Maxheap
and
combined
with
the
correlativ
e
nodes
into
a
comm
unity
,
and
then
the
sparse
matr
ix
and
heap
are
updated.
The
process
does
not
ter
minate
until
the
whole
netw
or
k
has
only
a
com-
m
unity
[5].
Chen
et
al.
brought
f
orw
ard
a
kind
of
local
comm
unity
detection
algor
ithm
called
as
LCE
algor
ithm
in
which
the
nodes
with
local
maxim
um
modular
ity
are
selected
as
the
core
nodes
,
and
the
core
nodes
are
e
xpanded
in
ter
ms
of
the
modified
local
modular
ity
R.
This
algor
ithm
can
detect
the
o
v
er
lap
of
comm
unity
and
is
good
f
or
par
allelization
without
pr
ior
kno
wledge
[6].
Hu
et
al.
proposed
the
LMDMR
algor
ithm
which
is
a
kind
of
local
comm
unity
detection
algor
ithm
based
on
e
xpanding
the
set
of
local
maxim
um
nodes
and
impro
ving
the
inde
x
R
(modified
local
modular-
ity
R).
By
estimati
ng
the
potential
of
appear
ing
in
the
same
comm
unity
betw
een
nodes
and
their
neighbor
ing
nodes
with
tw
o-le
v
el,
this
algor
ithm
can
confir
m
the
centr
ality
and
then
detect
the
Receiv
ed
September
16,
2015;
Re
vised
March
7,
2016;
Accepted
March
22,
2016
Evaluation Warning : The document was created with Spire.PDF for Python.
726
ISSN:
1693-6930
comm
unity
through
combining
the
modified
local
modular
ity
R
with
the
diffusiv
e
str
ategy
of
local
comm
unity
defined
b
y
strong
comm
unity
based
on
the
phenomenon
that
in
the
special
situation
the
inde
x
R
will
miss
a
lot
of
nodes
belonged
to
strong
comm
unity
[7].
Compare
with
the
CNM
algor
ithm,
these
mentioned
comm
unity
detection
algor
ithms
based
on
local
modular
ity
has
made
some
impro
v
ement
in
the
quality
of
comm
unity
detection
b
ut
there
are
also
some
deficiencies
in
them,
such
as
instability
of
comm
unity
str
ucture
and
the
prob
lem
of
o
v
er
lapping
comm
unity
.
Moreo
v
er
,
in
the
tr
aditional
comm
unity
detection
algor
ithms
based
on
the
netw
or
k
str
ucture
the
social
netw
or
k
is
regarded
as
the
static
topological
g
r
aph
without
con-
sider
ing
the
inter
action
f
actors
betw
een
nodes
,
which
is
no
longer
suitab
le
f
or
the
moder
n
social
netw
or
ks
such
as
microb
log,
W
eChat
and
so
on.
The
inf
or
mation
flo
w
of
nodes
(users)
in
moder
n
social
netw
or
k
is
usually
the
representativ
e
of
user
influence
.
User
influence
in
social
netw
or
k
means
the
capacity
that
in
social
netw
or
k
user
utiliz
e
the
w
a
y
of
spreading
the
inf
or
mation
or
inter
acting
with
other
people
to
influence
other
people’
thoughts
or
motiv
ate
them
to
inter
act
with
others
.
Man
y
e
xper
ts
and
scholars
ha
v
e
made
man
y
researches
on
the
user
influence
in
social
netw
or
k.
Mee
y
oung
C
.,
et
al.
made
a
deep
research
on
the
user
beha
vior
and
user
influence
.
This
research
f
ocused
on
users
three
beha
viors:
being
concer
ned,
b
eing
f
orw
arded
and
being
called
and
analyz
ed
their
respectiv
e
types
of
user
influence
[8].
Y
e
et
al.
divided
user
influence
into
three
kinds
and
fiv
e
sor
ting
methods
.
Three
kinds
of
influence
ref
er
to
influence
of
n
umber
of
f
ans
,
f
orw
arding
influence
,
replying
influence
.
And
fiv
e
sor
ting
methods
means
sor
ting
b
y
the
n
umber
of
f
ans
,
n
umber
of
replying
inf
or
mation,
n
umber
of
f
orw
arding
inf
or
mation,
n
umber
of
respondents
,
or
n
umber
of
f
orw
arders
.
In
this
research,
user
influence
with
the
largest
n
umber
of
respondents
is
regarded
as
the
most
stab
le
and
the
n
umber
of
respondents
is
used
f
or
sor
ting
user
influence
[9].
Based
on
the
inter
acte
d
relationship
betw
een
user
influence
and
the
correlation
with
the
released
inf
or
mation,
Ar
lei
Silv
a,
et
al.
used
HITS
algor
ithm
to
ob
tain
the
score
of
user
influence
and
the
correlation
with
the
released
inf
or
mation
[10].
At
present,
there
is
lac
k
of
research
on
user
influence
in
combination
with
comm
unity
detection
algor
ithm;
theref
ore
,
this
paper
proposes
a
comm
unity
detection
algor
ithm
integ
r
ating
with
user
influence
.
2.
The
resear
c
h
on
the
comm
unity
detection
algorithm
based
on
UIR-Q
2.1.
The
user
influence
factor
s
of
social
netw
ork
Because
the
personal
relationship
of
real
lif
e
is
the
basis
of
the
social
netw
or
k,
the
user’
s
influence
of
the
social
netw
or
k
is
similar
with
their
individual
influence
in
real
lif
e
.
Through
making
an
analysis
of
beha
viors
of
users
in
sina
microb
log
including
f
orw
arding
messages
,
commenting
on
message
s
and
replying
messages
,
this
paper
co
nstr
ucts
three
f
actors
to
e
v
aluate
user
influ-
ence
,
that
is
,
frequency
of
updating
microb
log,
deg
ree
of
contin
uous
attention
in
f
ans
an
d
user’
s
capacity
of
spreading
inf
or
mation.
2.1.1.
The
frequenc
y
of
updating
micr
ob
log
In
sina
microb
log
netw
or
k,
this
pa
per
considers
tw
o
f
actors:
the
frequency
of
releasing
the
messages
in
microb
log
and
the
times
of
f
orw
arding,
commenting
on
and
replying
messages
.
In
gener
al,
through
releasing
the
messages
in
microb
log
to
e
xpress
the
attitudes
or
ideas
of
e
v
ents
,
users
think
that
the
more
messages
the
y
release
in
microb
log,
the
more
ideas
the
y
can
e
xpress
and
the
y
ha
v
e
more
chance
to
influence
other
users
.
The
users
f
orw
ard
the
interesting
messages
in
microb
log
to
share
with
their
f
ans
,
which
ma
y
mak
e
their
f
ans
f
orw
ard
and
comment
on
the
messages
.
In
this
case
,
the
f
ans
of
their
f
ans
can
see
these
messages
.
Theref
ore
,
the
user
influence
can
be
quic
kly
increased
in
the
social
netw
or
k.
This
paper
will
define
the
frequency
of
updating
microb
log
as
the
total
times
of
f
orw
arding,
releasing,
commenting
on
and
replying
messages
in
microb
log
in
the
statistical
per
iod
unit
such
as
one
w
eek
f
or
users
.
TELK
OMNIKA
V
ol.
14,
No
.
2,
J
une
2016
:
725
733
Evaluation Warning : The document was created with Spire.PDF for Python.
TELK
OMNIKA
ISSN:
1693-6930
727
The
calculating
f
or
m
ula
can
be
e
xpressed
as
f
ollo
ws
.
A
i
=
n
i
T
i
(1)
In
this
f
or
m
ula,
A
i
represents
the
frequency
of
updating
microb
log
in
the
statistical
per
iod
unit
f
or
one
user
.
T
i
represents
a
statistical
per
iod.
n
i
represents
the
total
n
umber
of
f
orw
arding,
releas-
ing,
commenting
on
and
replying
messages
in
microb
log
in
the
statistical
per
iod
unit
f
or
one
user
.
2.1.2.
The
degree
of
contin
uous
attention
in
fans
The
deg
ree
of
contin
uous
attention
in
f
ans
denotes
the
deg
ree
that
f
ans
are
interested
in
the
pre
vious
released
messages
in
microb
log.
This
f
actor
is
defined
as
the
r
atio
which
the
times
of
the
user
j
f
orw
arding,
commenting
on
and
replying
messages
in
user
i’
s
microb
log
accounts
f
or
the
total
times
of
releasing
messages
in
microb
log
in
the
sta
tistical
per
iod
unit.
It
can
be
represented
as
the
f
ollo
wing
f
or
m
ula.
R
(
i;
j
)
=
R
t
(
i;
j
)
+
1
S
C
(
i
)
+
1
(2)
In
this
f
or
m
ula,
Rt
(i,
j)
represent
s
the
total
times
which
the
user
j
f
orw
ards
and
comments
on
messages
in
user
i’
s
microb
log
and
SC
(i)
represents
the
total
times
which
the
user
i
releases
and
f
orw
ards
the
messages
in
microb
log
in
the
statistical
per
iod
unit.
If
the
times
which
user
j
f
orw
ards
or
comments
on
the
messages
in
user
i’
s
microb
log
is
high
in
the
statistical
per
iod
unit,
it
is
reasonab
le
to
belie
v
e
that
user
j
will
do
the
same
thing
in
future
.
R
(i,
j)
represents
the
deg
ree
of
attention
of
user
j
to
user
i
in
the
f
or
m
of
probability
.
2.1.3.
The
user’
s
capacity
of
spreading
inf
ormation
The
concept
of
user’
s
capacity
of
spreading
inf
or
mation
can
be
defined
as
the
frequency
of
updating
microb
log
in
combination
with
the
deg
ree
of
contin
uous
attention
in
f
ans
.
S
(i,
j)
represents
the
m
ultiply
betw
een
the
user
i’
s
frequency
of
updating
microb
log
and
the
deg
ree
of
contin
uous
attention
of
user
j
to
user
i.
The
f
or
m
ula
can
be
descr
ibed
as
f
ollo
ws
.
S
(
i;
j
)
=
A
i
R
(
i;
j
)
(3)
In
this
f
or
m
ula,
the
user’
s
capacity
of
spreading
inf
or
mation
reflects
the
a
v
er
age
which
user
i
deliv
ers
the
v
olume
of
inf
or
mation
to
f
an
j
in
statistical
per
iod.
2.2.
Resear
c
h
on
user
influence
based
on
P
a
g
eRank
2.2.1.
P
a
g
eRank
algorithm
and
the
solution
of
RankSink
phenomenon
In
P
ageRank
algor
ithm
[11],
PR
v
alue
is
used
f
or
representing
the
author
ity
of
page
.
The
calculating
w
a
y
of
distr
ib
ution
of
PR
v
alue
in
P
ageRank
algor
ithm
can
be
descr
ibed
as
f
ollo
ws
.
P
R
(
v
)
=
c
X
u
2
U
(
v
)
P
R
(
u
)
N
(
u
)
(4)
In
this
f
or
m
ula,
PR(v)
means
the
PR
v
alue
of
page
v
,
U(v)
ref
ers
to
the
page
set
of
redirect-
in
v
.
u
represents
the
URL
redirecting
from
pa
ge
u
to
page
v
.
N
(u)
means
the
n
umber
of
URL
redirected
from
page
u.
In
the
research
on
the
directed
str
ucture
of
w
eb
page
,
Larr
y
P
age
and
other
researchers
f
ound
a
phenomenon
that
the
redirecting
relations
of
some
pages
could
compose
a
cycle
.
According
to
the
f
or
m
ula
(4),
the
PR
v
alue
of
page
in
this
compositiv
e
cycle
will
be
increased
all
the
time
in
the
iter
ativ
e
process
while
the
PR
v
alue
of
page
outside
this
compositiv
e
cycle
will
be
decre
ased.
Finally
,
e
xcept
f
or
the
page
of
this
compositiv
e
cycle
,
the
Research
on
Comm
unity
Detection
Algor
ithm
Based
on
the
UIR-Q
(Zilong
Jiang)
Evaluation Warning : The document was created with Spire.PDF for Python.
728
ISSN:
1693-6930
PR
v
alues
of
other
pages
are
equal
to
0.
Larr
y
P
age
called
this
phenomenon
as
the
RankSink
phenomenon.
In
order
to
eliminate
the
prob
lem,
Larr
y
P
age
introduced
t
he
damping
f
actor
into
this
algor
ithm
to
represent
the
probability
of
r
andom
access
to
the
page
and
modified
the
f
or
m
ula
(4)
with
the
f
ollo
wing
f
or
m
ula
(5).
P
R
(
v
)
=
(1
d
)
+
d
X
u
2
U
(
v
)
P
R
(
u
)
N
(
u
)
(5)
In
the
f
or
m
ula
(5),
d,
as
the
damping
f
actor
,
represents
that
when
user
clic
ks
some
page
the
redirected
URL
will
be
clic
k
ed
with
the
probability
d
and
the
page
will
redirect
other
URL
with
the
probability
from
1
to
d.
In
calculating
the
user
influence
of
social
netw
or
ks
,
there
are
tw
o
reasons
wh
y
this
paper
uses
the
P
ageRank
algor
ithm.
Firstly
,
in
the
microb
log,
users
ha
v
e
their
o
wn
relation
of
attention
and
user-f
ans
,
which
are
re-
spectiv
ely
similar
to
the
relation
of
redirect-in
and
redirect-out.
Thu
s
,
it
is
reasonab
le
to
belie
v
e
that
microb
log
and
page
ha
v
e
similar
netw
or
k
str
ucture
.
Secondly
,
e
v
aluating
a
user’
s
influence
in
t
he
social
netw
or
ks
is
similar
to
the
e
v
aluation
of
author
ity
which
w
eb
is
in
the
netw
or
k
in
the
P
ageRank
algor
ithm.
Theref
ore
,
using
the
P
ageRank
algor
ithm
to
calculate
the
user
influence
of
microb
log
is
equal
to
calculate
the
users
PR
v
alue
.
Then,
the
user
with
big
influence
can
be
f
ound
through
r
anking
the
PR
v
alue
so
that
the
inf
or
mation
prediction
can
be
made
.
2.2.2.
The
impr
o
ved
user
influence
rank
algorithm
based
on
P
a
g
eRank
In
the
P
ageRank
algor
ithm,
the
PR
v
alue
of
e
v
er
y
page
is
distr
ib
uted
to
the
redirect-
in
URL
of
page
on
a
v
er
age
.
In
order
to
a
v
oid
the
disturbance
caused
b
y
z
ombie
f
ans
and
online
w
ater
ar
m
y
and
impro
v
e
the
accur
acy
of
e
v
aluating
user
influence
in
the
analysis
of
user
influence
of
social
netw
or
ks
,
this
paper
introduces
the
user
influence
f
actors
to
impro
v
e
the
P
ageRank
algor
ithm
and
proposes
the
impro
v
ed
user
influence
r
ank
algor
ithm
based
on
P
ageRank.
The
basic
idea
of
UIR
algor
ithm
is
that
the
deg
ree
of
contin
uous
attenti
on
in
f
ans
is
regarded
as
damping
f
actor
;
when
th
e
algor
ithm
distr
ib
utes
the
v
alue
of
influence
,
user’
s
capacity
of
spreading
inf
or
mation
is
tak
en
into
consider
ation
and
users
with
strong
capacity
of
spreading
inf
or
mation
can
be
distr
ib
uted
more
UIR
v
alue
in
a
big
probability
while
those
who
ha
v
e
w
eak
capacity
can
be
distr
ib
uted
less
UIR
v
alue
in
a
big
probability
as
w
ell.
In
the
impro
v
ed
user
influence
r
ank
algor
ithm
based
on
P
ageRank,
the
f
or
m
ula
of
calculating
user
influence
r
ank
can
be
e
xpressed
as
f
ollo
ws
.
U
I
R
(
u
)
=
(1
R
(
u;
v
))
+
R
(
u;
v
)
X
v
2
E
(
u
)
B
(
u;
v
)
U
I
R
(
v
)
(6)
In
this
f
or
m
ula,
E(u)
is
the
use
u’
s
f
ans
set
and
R(u,v)
means
the
deg
ree
of
contin
uous
attention
in
f
ans
which
is
deter
mined
b
y
the
frequency
of
f
ans
f
orw
arding
messages
from
users’
microb
logs
.
B(u,v)
represents
the
user
v’
s
capacity
of
spreading
inf
or
mation
to
user
u
and
is
deter
mined
b
y
the
r
atio
of
user
u’
s
capacity
of
spreading
inf
or
mation
accounting
f
or
the
sum
of
all
f
ans
capacity
of
spreading
inf
or
mation.
The
calculating
f
or
m
ula
can
be
descr
ibed
as
f
ollo
ws
.
B
(
u;
v
)
=
S
(
u;
v
)
P
f
2
E
(
v
)
S
(
f
;
v
)
(7)
All
users’
UIR
v
alue
can
be
initializ
ed
to
1
and
after
being
repeatedly
iter
ativ
e
calculated
b
y
f
or
m
ula
(7),
the
UIR
v
alue
tends
to
con
v
ergence
.
At
that
time
,
the
UIR
algor
ithm
will
be
ter
minated
and
all
users’
UIR
v
alues
of
netw
or
k
can
be
obtained.
The
specific
algor
ithm
can
be
descr
ibed
as
f
ollo
ws
.
2.3.
The
algorithm
idea
of
comm
unity
detection
algorithm
based
on
UIR-Q
Clauset
put
f
orw
ard
comm
unity
detection
algor
ithm
CNM
based
on
local
modular
ity
[5,12].
This
algor
ithm
firstly
defines
local
modular
ity
Q
of
a
local
comm
unity
and
finds
out
all
neighbor
ing
TELK
OMNIKA
V
ol.
14,
No
.
2,
J
une
2016
:
725
733
Evaluation Warning : The document was created with Spire.PDF for Python.
TELK
OMNIKA
ISSN:
1693-6930
729
Algorithm
1
UIR
algor
ithm
Input:
List
<
Node
>
U//U
is
all
users’
set.
Output:
List
<
Node
>
U//U
is
a
set
of
users
whose
influence
v
alue
is
calculated.
BEGIN
F
or
each
u
in
U
Obtaining
the
f
ans
set
E
from
user
u
Calculating
the
updating
frequency
A
i
according
to
the
f
or
m
ula
(1).
F
or
each
e
in
E
Calculating
R(i,j)
according
to
f
or
m
ula
(2)
Calculating
S(i,j)
according
to
f
or
m
ula
(3)
Endf
or
Endf
or
While
the
UIR
v
alue
cannot
reach
the
stab
le
situation
do
f
or
u
in
U
Obtaining
the
f
ans
set
E
from
user
u
f
or
e
in
E
Calculating
user
u’
s
influence
deliv
ered
b
y
user
e
according
to
f
or
m
ula
(6)
and
updating
Endf
or
Endf
or
Endwhile
retur
n
U
ENG
BEGIN
nodes
of
this
local
comm
unity
,
and
then
calculate
the
ne
w
local
modular
ity
of
n
e
w
local
comm
unity
when
e
v
er
y
neighbor
ing
node
is
ad
ded
into
this
local
comm
unity
.
In
this
case
,
the
neighbor
ing
node
with
the
biggest
v
alue
of
ne
w
local
modular
ity
can
be
really
added
into
the
local
comm
unity
.
This
p
rocess
is
iter
ativ
ely
added
until
the
v
alue
of
Q
of
the
local
comm
unity
cannot
be
increased.
At
th
at
time
,
the
local
comm
unity
reaches
satur
ation.
The
v
alue
of
Q
can
be
defined
as
the
f
ollo
wing
f
or
m
ula.
Q
=
L
in
L
in
+
L
out
(8)
In
this
f
or
m
ula,
L
in
represents
the
n
umber
of
edges
whose
connection
happens
in
all
nodes
in
the
comm
unity
and
L
out
represents
n
umber
of
edges
whose
connection
hap
pens
betw
een
the
nodes
within
the
comm
unity
and
nodes
outside
the
comm
unity
.
The
v
alue
of
Q
represents
the
boundar
y
char
acter
istic
of
comm
unity
,
that
is
,
if
a
node
is
added
into
the
comm
unity
to
mak
e
the
v
alue
of
Q
increase
,
more
edges
are
added
into
the
comm
unity
,
which
leads
to
the
situation
that
the
edges
outside
the
comm
unity
become
sparse
.
Based
on
the
idea
of
abo
v
e-mentioned
CNM
algor
ithm
[12,
13],
the
paper
proposes
a
kind
of
comm
unity
detection
algor
ithm
(UIR-Q)
integ
r
ated
with
user
influence
.
The
basic
idea
of
comm
unity
detection
algor
ithm
based
on
UIR-Q
is
descr
ibed
as
f
ollo
ws
.
1)
Initializing
the
netw
or
k
proper
ties
.
Through
the
process
of
read-in
the
netw
or
k,
e
v
er
y
node
in
the
netw
or
k
is
assigned
to
an
ID
and
the
proper
ties
of
deg
ree
of
node
,
influence
and
the
comm
unity
label.
2)
Selecting
initial
node
.
The
UIR
v
alue
of
e
v
er
y
node
is
put
in
order
from
big
to
small
and
a
set
queue
is
obtained.
F
rom
the
set
queue
,
the
node
that
the
v
alue
of
comm
unity
label
is
n
ull
and
the
UIR
v
alue
is
maxim
um
is
selected
as
the
initial
node
of
ne
w
comm
unity
e
v
er
y
times
.
If
the
queue
is
n
ull,
step
8
is
carr
ied
out.
Otherwise
,
the
node
of
queue
v
i
with
the
maxim
um
UIR
v
alue
is
used
f
or
initializing
the
comm
unity
C
j
.
The
Q
v
alue
of
C
j
is
initializing
to
0.
3)
Finding
the
candidate
neighbor
nodes
set.
The
e
xter
nal
node
which
has
connection
with
the
nodes
of
the
C
j
is
added
into
the
candidate
neighbor
nodes
set
B
.
4)
F
or
ming
the
comm
unity
str
ucture
.
The
ne
w
local
modular
ity
q
v
is
calculated
when
e
v
er
y
node
v
in
the
set
B
are
added
into
the
comm
unity
C
j
and
the
maxim
um
q
v
is
selected.
If
q
v
>
Q
c
,
the
q
v
Research
on
Comm
unity
Detection
Algor
ithm
Based
on
the
UIR-Q
(Zilong
Jiang)
Evaluation Warning : The document was created with Spire.PDF for Python.
730
ISSN:
1693-6930
is
used
f
or
updating
the
Q
c
,
and
the
corresponding
node
v
of
is
added
into
the
comm
unity
C
j
.
The
comm
unity
label
of
n
ode
v
is
added
into
the
j
and
the
node
v
in
the
queue
is
deleted.
If
q
v
<
Q
c
,
step
6
is
carr
ied
out.
5)
Repeating
step
3)
and
step
4).
6)
Obtaining
the
comm
unity
j.
7)
Repeating
step
2),
step
3),
step
4),
step
5)
and
step
6).
8)
Obtaining
cluster
ing
result.
This
algor
ithm
can
be
descr
ibed
as
the
f
ollo
wing
algor
ithm
2.
Algorithm
2
Comm
unity
detection
algor
ithm
based
on
UIR-Q
Input:
List
<
Node
>
U
//U
is
the
set
of
all
users
Output:
List
<
Comm
unity
>
CS
//CS
is
the
set
of
detected
comm
unity
BEGIN
U=UIR(U)//Calling
algor
ithm
1
to
calculate
user
influence
Putting
U
in
order
WHILE
there
is
node
in
the
U
,
do
Initializing
the
ne
w
comm
unity
C
Finding
the
node
i
with
the
maxim
um
UIR
v
alue
from
U
and
adding
it
to
the
comm
unity
C
WHILE
there
is
no
satur
ation
in
the
C
Calculating
B
,
set
of
neighbor
ing
node
of
C
Calculating
the
ne
w
Q
v
alue
(
q
v
)
when
e
v
er
y
node
in
the
B
is
added
into
the
C
Finding
out
the
maxim
um
q
v
IF
q
v
>
Q
c
Adding
the
corresponding
node
of
q
v
into
the
C
Deleting
the
corresonding
node
of
q
v
from
U
Q
=
q
v
ELSE
C
reaches
satur
ation
ENDIF
END
WHILE
Sa
ving
the
comm
unity
C
into
the
comm
unity
set
Cs
Deleting
the
node
i
from
U
END
WHILE
RR
TURN
Cs
ENG
BEGIN
3.
Experiment
and
anal
ysis
3.1.
Har
d
ware
and
software
en
vir
onment
The
cluster
system
of
e
xper
imental
platf
or
m
consists
of
se
v
en
personal
computers
among
which
a
computer
is
used
as
master
and
the
other
six
computers
as
sla
v
ers
.
Ub
untu
12.04
is
installed
into
e
v
er
y
personal
computer
and
The
en
vironment
of
Spar
k
[14],
Hadoop
[15]
and
Y
ar
n
is
estab
lished
in
Ub
untu.
The
comm
unity
detection
algor
ithm
based
on
UIR-Q
will
be
par
alleliz
ed
under
the
en
vironment.
The
specific
configur
ation
of
hardw
are
and
softw
are
is
sho
wn
in
the
tab
le
1.
T
ab
le
1.
The
configur
ation
of
cluster
Name
Specific
configur
ation
PC
Intel
core
i3
3.2
GHz,
8G
RAM,
500G
Hard
Disk
Ether
net
100
Mb/s
Oper
ation
system
Ub
untu
12.04
L
TS
J
a
v
a
JDK
1.8
Hadoop
(Including
Y
ar
n)
Hadoop
2.2.0
Spar
k
Spar
k
1.0
T
ab
le
2.
The
compar
ison
of
diff
erent
algor
ithms
Algor
ithm
NMI
Q
ov
The
n
umber
of
comm
unities
CNM
0.735
0.729
23
LCE
0.827
0.787
37
LMDMR
0.751
0.704
33
UIR-Q
0.859
0.771
28
TELK
OMNIKA
V
ol.
14,
No
.
2,
J
une
2016
:
725
733
Evaluation Warning : The document was created with Spire.PDF for Python.
TELK
OMNIKA
ISSN:
1693-6930
731
3.2.
Dataset
of
e
xperiments
All
data
of
this
e
xper
iment
is
a
dataset
which
contains
3458
users’
sina
microb
logs
and
includes
inf
or
mation
of
microb
logs
,
the
retr
ansmission
relation
of
microb
logs
,
users’
inf
or
mation,
users’
fr
iendship
and
other
data.
3.3.
Ev
aluation
metrics
3.3.1.
Normaliz
ed
Mutual
Inf
ormation
(NMI)
Nor
maliz
ed
Mutual
Inf
or
mation
(NMI),
proposed
b
y
Lancichhetti
f
or
detecting
the
e
v
al-
uation
inde
x
of
comm
unity
,
can
be
used
f
or
eff
ectiv
e
e
v
aluati
ng
the
accur
ateness
of
detecting
o
v
er
lapping
comm
unity
[16].
NMI
represents
the
”m
utual
inf
or
mation”
betw
een
the
set
of
comm
u-
nity
detected
b
y
the
algor
ithm
and
the
set
of
real
comm
unity
str
ucture
and
its
v
alue
sho
ws
the
deg
ree
of
consistency
betw
een
these
tw
o
comm
unities
.
The
NMI
v
alue
is
usually
betw
een
0
and
1.
The
bigger
the
NMI
v
alue
is
,
the
better
the
result
of
comm
unity
detection
is
.
Its
definition
can
be
sho
wn
as
the
f
ollo
wing
f
or
m
ula.
N
M
I
=
I
(
a
;
b
)
p
H
(
a
)
H
(
b
)
(9)
in
this
f
or
m
ula,
H
(
a
)
=
P
k
(
a
)
h
n
a
h
n
log
n
a
h
n
,
H
(
b
)
=
P
k
(
b
)
l
n
b
l
n
log
n
b
l
n
,
and
I
(
a
;
b
)
=
P
h
P
l
n
h;l
n
log
(
n
h;l
n
=
(
n
a
h
n
n
b
l
n
))
.
a
,
b
respectiv
ely
denotes
the
diff
erent
comm
unity
str
uctures
.
k
(
b
)
means
the
n
umber
of
comm
unities
in
the
comm
unity
str
ucture
a
.
n
a
h
denotes
the
n
umber
of
nodes
of
the
hth
comm
unity
in
th
e
comm
unity
str
ucture
a
.
n
h;l
denotes
the
n
umber
of
nodes
which
sim
ultaneously
e
xist
in
the
h
th
comm
unity
of
the
comm
unity
str
ucture
a
and
the
l
th
comm
unity
of
the
comm
unity
str
ucture
b
.
3.3.2.
Overlapping
Modularity
(
Q
ov
)
In
order
to
con
v
enient
ly
e
v
aluate
the
o
v
er
lapping
comm
unity
str
ucture
,
Nicosia,
et
al.
proposed
the
Q
ov
function
[17],
which
is
similar
to
Q
function.
The
v
alue
of
Q
ov
function
r
anges
from
0
and
1.
The
bigger
the
v
alue
of
Q
ov
function
is
,
the
better
the
o
v
er
lapping
comm
unity
str
ucture
is
.
Its
definition
can
be
sho
wn
as
the
f
ollo
wing
f
or
m
ula.
Q
ov
=
1
m
X
c
2
C
X
i;j
2
V
(
(
C
i
;
C
j
;
C
)
A
ij
out
l
(
i;j
)
;c
k
out
i
in
l
(
i;j
)
;c
k
in
j
m
)
(10)
In
the
f
or
m
ula
(10),
m
denotes
then
umber
of
edges
in
the
g
r
aph.
(
C
i
;
C
j
;
C
)
denotes
the
affilia-
tion
relationship
betw
een
node
i
and
node
j
to
comm
unity
C
and
it
can
be
e
xpressedb
y
the
f
or
m
ula
(11).
(
C
i
;
C
j
;
C
)
=
(
1
;
i
2
C
;
j
2
C
0
;
other
w
ise
(11)
When
node
i
and
nod
e
j
belong
tothe
same
comm
unity
,
A
ij
is
equal
to
1,otherwise
,
it
is
0.
k
out
i
denotes
the
out-deg
ree
of
node
i,
that
is
,
the
n
umber
of
connected
edges
betw
een
node
i
and
the
nodes
be
y
ond
the
comm
unity
C
.
The
f
or
m
ula
of
calculating
out
l
(
i;j
)
;c
is
sho
wn
as
the
f
ollo
wing
f
or
m
ula
(12).
out
l
(
i;j
)
;c
=
P
j
2
V
F
(
/
i;c
;
/
j
;c
)
j
V
j
(12)
In
the
f
or
m
ula
(12),
/
i;c
denotes
the
strength
coefficient
of
node
i
belonging
to
comm
unity
C
and
its
v
alue
is
assigned
to
1
=n
.
n
means
the
n
umber
of
comm
unities
to
which
node
i
belongs
.
k
in
j
denotes
the
in-deg
reeof
node
j,
that
is
,
the
n
umber
of
connected
edges
betw
een
node
j
and
the
Research
on
Comm
unity
Detection
Algor
ithm
Based
on
the
UIR-Q
(Zilong
Jiang)
Evaluation Warning : The document was created with Spire.PDF for Python.
732
ISSN:
1693-6930
Figure
1.
The
comm
unities
str
ucture
Figure
2.
The
inter
nal
str
ucture
of
comm
unities
nodes
inthe
comm
unity
C
.
The
f
or
m
ula
of
calculating
in
l
(
i;j
)
;c
is
sho
wn
as
the
f
ollo
wing
f
or
m
ula
(13).
in
l
(
i;j
)
;c
=
P
i
2
V
F
(
/
i;c
;
/
j
;c
)
j
V
j
(13)
Function
F
is
sho
wn
as
the
f
ollo
wing
f
or
m
ula
(14).
F
(
/
i;c
;
/
j
;c
)
=
1
(1
+
e
f
(
/
i;c
)
)(1
+
e
f
(
/
j
;c
)
)
(14)
Function
f
is
selected
as
the
f
ollo
wing
f
or
m
ula
(15)
according
to
the
liter
ature
[27].
f
(
x
)
=
60
x
30
(15)
3.3.3.
The
n
umber
of
comm
unities
The
n
umber
of
comm
unities
denot
es
the
n
umber
of
sub-comm
unities
of
e
x
ecuting
the
comm
unity
detection
algor
ithms
on
the
dataset
of
social
netw
or
k.
3.4.
The
result
and
anal
ysis
The
result
obtained
from
the
comm
unity
detection
algor
ithm
based
on
UIR-Q
which
is
e
x
ecuted
on
the
dataset
of
sina
microb
log
is
sho
wn
in
a
visib
le
w
a
y
as
the
f
ollo
wing
figure
1,
figure
2
[18].
According
to
these
tw
o
figures
,
it
can
be
f
ound
that
the
prob
lem
of
o
v
er
lapping
comm
unity
is
impro
v
ed
and
the
fr
iends
are
roughly
e
v
enly
distr
ib
uted
around
the
user
.
The
compar
ison
betw
een
the
comm
unity
detection
algor
ithm
based
on
UIR-Q
and
the
abo
v
e-mentioned
comm
unity
detection
algor
ithms
based
on
local
modular
ity
can
be
sho
wn
as
the
tab
le
2.
According
to
the
tab
le
2,
the
NMI
v
alue
is
the
highest
in
the
comm
unity
detection
algor
ithm
based
on
UIR-Q,
while
its
Q
ov
v
alue
is
just
lo
w
er
than
LCE
algor
ithm.
In
LCE
algor
ithm,
37
comm
unities
in
cluding
unstab
le
comm
unities
are
detected.
These
unstab
le
comm
unities
should
attach
to
other
bigger
comm
unities
.
In
the
UIR-Q
algor
ithm,
there
is
a
relativ
ely
good
balance
among
NMI
v
alue
,
modular
ity
and
the
n
umber
of
comm
unities
.
4.
The
conc
lusion
and
the
future
w
ork
This
paper
proposes
an
impro
v
ed
comm
un
ity
detection
algor
ithm
based
on
user
influ-
ence
and
local
modular
ity
.
Through
utilizing
the
proper
ties
of
social
netw
or
ks
to
f
or
m
the
user
influence
f
actors
,
this
paper
emplo
ys
the
P
ageRank
algor
ithm
to
calculate
the
UIR
v
alues
of
all
users
.
The
node
with
the
maxim
um
UIR
v
alue
in
the
netw
or
k
is
used
f
or
initializing
a
comm
unity
,
and
the
user
is
selected
as
the
center
of
comm
unity
.
Then,
the
node
which
mak
es
the
v
alue
of
local
modular
ity
Q
in
ter
ms
of
the
CNM
algor
ithm
is
added
into
the
comm
unity
.
This
process
does
not
ter
minate
until
all
nodes
belong
to
one
comm
unity
or
more
comm
unities
.
Finally
,
par
alleliza-
tion
of
the
algor
ithm
is
implemented
under
Spar
k
fr
ame
w
or
k.
The
e
xper
imental
result
sho
ws
that
compared
with
the
CNM
algor
ithm,
LCE
algor
ithm
and
LMDMR
algor
ithm,
the
comm
unity
detec-
tion
algor
ithm
based
on
UIR-Q
can
obtain
good
modular
ity
and
small
n
umber
of
comm
unity
in
the
TELK
OMNIKA
V
ol.
14,
No
.
2,
J
une
2016
:
725
733
Evaluation Warning : The document was created with Spire.PDF for Python.
TELK
OMNIKA
ISSN:
1693-6930
733
social
netw
or
k
with
uncer
tain
str
ucture
and
large
scale
.
Ho
w
e
v
er
,
the
phenomenon
of
o
v
er
lapping
comm
unities
cannot
be
completely
eliminated
in
this
algor
ithm.
Theref
ore
,
the
future
w
or
k
will
be
f
ocused
on
fur
ther
impro
ving
the
quality
of
comm
unity
detection.
Ac
kno
wledg
ements
This
w
or
k
is
suppor
ted
in
par
t
b
y
Humanities
and
Social
Science
Y
outh
Fund
Project
of
Ministr
y
of
Education,
P
.
R.
C
,
No
.13YJCZH028.
Ref
erences
[1]
Lim
K
H.,
Datta
A..
F
ollo
wing
the
f
ollo
w
er
:
Detecting
comm
unities
with
common
interests
on
T
witter
.
In
Proc.
of
the
23rd
A
CM
Conf
.
on
Hyper
te
xt
and
Social
Media
,
Ne
w
Y
or
k:
A
CM
Press
.
2012;
317-318.
[2]
]Dour
isboure
Y
.,
Ger
aci
F
.,
P
elleg
r
ini
M..
Extr
action
and
classification
of
dense
comm
unities
in
the
W
eb
.
In:
Proc.
of
the
16th
Intl
Conf
.
on
W
or
ld
Wide
W
eb
,
Ne
w
Y
or
k:
A
CM
Press
.
2007;
461-470.
[3]
Asur
S
.,
P
ar
thasar
ath
y
S
.,
Ucar
D
..
An
e
v
ent-based
fr
ame
w
or
k
f
or
char
acter
izing
the
e
v
o-
lutionar
y
beha
vior
of
inter
action
g
r
aphs
.
A
CM
T
r
ans
.
on
Kno
wledge
Disco
v
er
y
from
Data
(TKDD)
.
2009;
3(4):16.
[4]
Ne
wman
MEJ
,
Gir
v
an
M..
Finding
and
e
v
aluating
comm
unity
str
ucture
in
netw
or
ks
.
Ph
ysical
Re
vie
w
E
.
2004;
69(2):026113.
[5]
Clauset
A.Ne
wman
M
E
J
,
Moore
C
..
Finding
comm
unity
str
ucture
in
v
er
y
large
netw
or
ks
.
Ph
ysical
Re
vie
w
E
.
2004;
70(6):
066111.
[6]
Chen
Q.,
F
ang
M..
An
Efficient
Algor
ithm
f
or
Comm
unity
Detection
in
Comple
x
Netw
or
ks
.
THE
6TH
sna-kdd
W
or
kshop
.
2012;
733-740.
[7]
Y
u
Hu.
Research
on
comm
unity
dectection
algor
ithms
in
comple
x
netw
or
k
based
on
local
e
xpansion.
[D].Changchun:
Jilin
Univ
ersity
,
2015.
[8]
Mee
y
oung
C
..
Measur
ing
user
influence
in
twitter
:
The
million
f
ollo
w
er
f
allacy
.
Proceedings
of
Inter
national
Conf
erence
on
W
eb
logs
and
Social
Media
.
2010;
887
-893.
[9]
Shaozhi
Y
e
,
S
.
F
elix
W
u.
Measur
ing
Mseeage
Propagation
and
Social
Influence
on
T
wit-
ter
.com.
SocInf
o
2010
.
2010;
223-228.
[10]
Ar
lei
Silv
a,Sar
a
Guimares
,
W
agner
Meir
a,Jr
.
Mohammed
Zaki.
ProfileRank:
Finding
Rele-
v
ant
Content
and
Influential
Users
Based
on
Inf
or
mation
Diffiision.
Proceedings
of
the
A
CM
Inter
national
Conf
erence
on
7t
h
W
or
kshop
on
Social
Netw
or
k
Mining
and
Analysis
.
2013;
667-773.
[11]
P
age
L.,
Br
in
S
.,
Motw
ani
R,
et
al..
The
pager
ank
citation
r
an
king:
Br
inging
order
to
the
w
eb
.
Stanf
ord
Digital
Libr
ar
ies
.
1999;
344-349.
[12]
Aaron
Clauset
A..
Finding
local
comm
unity
str
ucture
in
netw
or
ks
.
Ph
ys
.
Rec.
E
.
2005;
72(2):
26132-26137,
[13]
L
yr
ic
D
.,
Jonas
K.,
Stef
an
N.,
et
al..
Predictng
mo
vie
pr
ices
through
dynamic
social
netw
or
k
analysis
.
Procedia
Social
and
Beha
vior
al
Sciences
.
2010;
2:
6423-6433.
[14]
Spar
k.
[EB/OL].
http://spar
k.apache
.org/.
[15]
Dhr
uba
Bor
thaku.
The
Hadoop
Distr
ib
uted
File
System:
Architecture
and
Design.
Retr
ie
v
ed
from.
http://hadoop
.apache
.org/common/
.
2010.
[16]
A.
lancichineet,
S
.
F
or
tunat,
J
.
K
er
tesz.
Detecting
the
Ov
er
lapping
and
Hier
achical
Comm
u-
nity
Str
ucture
in
Comple
x
Netw
or
ks
.
Ne
w
Jour
nal
of
Ph
ysics
.
2009;
11(3):033015.
[17]
V
Nieosia,
G
mangionoi,
V
Carehiolo
,
M
Malger
i.
Extending
the
d
efinition
of
modular
ity
to
Directed
g
r
aphs
with
o
v
er
lapping
comm
unities
.
Jour
nal
of
Statistical
mechanics:
Theor
y
and
Exper
iment
.
2009;
P03024.
[18]
Shneider
man
B
,
Ar
is
A..
Netw
or
k
Visualization
b
y
Semantic
Substr
ates
.
IEEE
T
r
ansactions
on
Visualization
and
Computer
Gr
aphics
.
2006;
12(5):733-740.
Research
on
Comm
unity
Detection
Algor
ithm
Based
on
the
UIR-Q
(Zilong
Jiang)
Evaluation Warning : The document was created with Spire.PDF for Python.