J Comput Virol (2010) 6:115–122
DOI 10.1007/s11416-009-0120-x
CORRESPONDENCE
Fast virus detection by using high speed time delay neural
networks
Hazem M. El-Bakry
Received: 17 January 2007 / Revised: 10 July 2007 / Accepted: 26 March 2009 / Published online: 15 April 2009
© Springer-Verlag France 2009
Abstract  This paper presents an intelligent approach to detecting unknown malicious code by using new high speed time delay neural networks. The entire data are collected together in a long vector and then tested as one input pattern. The proposed fast time delay neural networks (FTDNNs) use cross correlation in the frequency domain between the tested data and the input weights of the neural networks. It is proved mathematically and practically that the number of computation steps required by the presented time delay neural networks is less than that needed by conventional time delay neural networks (CTDNNs). Simulation results using MATLAB confirm the theoretical computations.
H. M. El-Bakry (✉)
Faculty of Computer Science and Information Systems,
Mansoura University, Mansoura, Egypt
e-mail: helbakry20@yahoo.com

1 Introduction

Fast virus detection is very important for computer and network security. Since the appearance of the first computer virus in 1986, many new viruses have been created every year. The number of these viruses is growing rapidly, and this threatens to outpace the manual efforts of anti-virus experts in designing solutions for detecting these viruses and removing them from the computer system [1]. There are a wide variety of protection mechanisms against virus attacks, such as firewalls and antivirus tools. As the number and intensity of malware attacks rise, computer security companies, researchers and users do their best to find new solutions to thwart and defend against such assaults [25-27].

New technology exists for detecting known viruses. Programs such as Norton and McAfee's Antivirus are ubiquitous. These programs search for the executable code of known patterns. One drawback of this method is that a copy of a malicious program must be known before extracting the pattern necessary for its detection [2].

Some researchers have tried to overcome this limitation by using intelligent algorithms to detect virus code. In an early attempt, the authors in [3] analyzed several programs, evidently by hand, and identified tell-tale signs, which they subsequently used as a filter to protect new programs. IBM researchers applied neural networks to virus detection and incorporated a similar approach for detecting boot-sector viruses into IBM's Anti-Virus software [4]. Others used data mining techniques such as naïve Bayes classifiers to detect virus code [5]. However, the literature has shown that the ability of neural networks to generalize is far better than that of the Bayes classifier [6-10], owing to the more powerful learning capability of neural networks.

Recently, time delay neural networks have shown very good results in different areas such as automatic control, speech recognition, blind equalization of time-varying channels and other communication applications. The main objective of this paper is to improve the speed of time delay neural networks for fast virus detection. The purpose is to perform the testing process in the frequency domain instead of the time domain. This approach was successfully applied to sub-image detection using fast neural networks (FNNs) as proposed in [11-13]. Furthermore, it was used for fast face detection [14,15] and fast iris detection [16]. Another idea to further increase the speed of FNNs through image decomposition was suggested in [14].

FNNs for detecting a certain code in a one dimensional serial stream of sequential data were described in [17,18]. Compared with conventional neural networks, FNNs based on cross correlation between the tested data and the input weights of neural networks in the frequency domain showed a significant reduction in the number of computation steps required for certain data detection [11-20]. Here, we make use of our theory on FNNs implemented in the frequency domain to increase the speed of time delay neural networks for fast virus detection.

The idea of moving the testing process from the time domain to the frequency domain is applied to time delay neural networks. Theoretical and practical results show that the proposed FTDNNs are faster than CTDNNs. In Sect. 2, our theory on FNNs for detecting certain data in a one dimensional matrix is described. Experimental results for FTDNNs are presented in Sect. 3.
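The signature-based scanning described above can be sketched in a few lines; the signature names and byte patterns below are invented for illustration and are not taken from any real product database.

```python
# Minimal sketch of conventional signature-based scanning, the approach the
# proposed FTDNNs aim to outperform. Signatures here are illustrative only.
SIGNATURES = {
    "example-virus-a": bytes.fromhex("deadbeef"),
    "example-virus-b": bytes.fromhex("cafebabe"),
}

def scan(stream: bytes) -> list[str]:
    """Return the names of all known signatures found in the byte stream."""
    return [name for name, sig in SIGNATURES.items() if sig in stream]

infected = b"\x00\x01" + bytes.fromhex("deadbeef") + b"\x02"
print(scan(infected))
```

The limitation noted above is visible in the sketch: a pattern must already be in the dictionary before it can ever be detected.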
2 Theory of FNNs based on cross correlation in the frequency domain

Finding a certain virus in the input one dimensional matrix is a searching problem. Each position in the input matrix is tested for the presence or absence of the required virus. At each position in the input matrix, each sub-matrix is multiplied by a window of weights, which has the same size as the sub-matrix. The outputs of neurons in the hidden layer are multiplied by the weights of the output layer. When the final output is high, this means that the sub-matrix under test contains the required virus, and vice versa. Thus, we may conclude that this searching problem is a cross correlation between the matrix under test and the weights of the hidden neurons.

The convolution theorem in mathematical analysis says that a convolution of f with h is identical to the result of the following steps: let F and H be the results of the Fourier transformation of f and h in the frequency domain. Multiply F and H* in the frequency domain point by point and then transform this product into the spatial domain via the inverse Fourier transform. As a result, these cross correlations can be represented by a product in the frequency domain. Thus, by using cross correlation in the frequency domain, a speed up of an order of magnitude can be achieved during the detection process [11-18,21].

Assume that the size of the virus code is 1 × n. In the virus detection phase, a sub-matrix I of size 1 × n (sliding window) is extracted from the tested matrix, which has a size of 1 × N. Such a sub-matrix, which may be a virus code, is fed to the neural network. Let W_i be the matrix of weights between the input sub-matrix and the hidden layer. This vector has a size of 1 × n and can be represented as a 1 × n matrix. The output of each hidden neuron h(i) can be calculated as follows:

    h_i = g\left(\sum_{k=1}^{n} W_i(k)\, I(k) + b_i\right)    (1)

where g is the activation function and b(i) is the bias of each hidden neuron (i). Equation 1 represents the output of each hidden neuron for a particular sub-matrix I. It can be obtained for the whole input matrix Z as follows:

    h_i(u) = g\left(\sum_{k=-n/2}^{n/2} W_i(k)\, Z(u+k) + b_i\right)    (2)

where h_i(u) is the activity of the hidden unit (i) when the sliding window is located at position (u), and (u) \in [N - n + 1].

Eq. 2 represents a cross correlation operation. Given any two functions f and d, their cross correlation can be obtained by:

    d(x) \otimes f(x) = \sum_{n=-\infty}^{\infty} f(x+n)\, d(n)    (3)

Therefore, Eq. 2 may be written as follows [11]:

    h_i = g\left(W_i \otimes Z + b_i\right)    (4)

where h_i is the output of the hidden neuron (i).

Now, the above cross correlation can be expressed in terms of the one dimensional Fast Fourier Transform as follows [11]:

    W_i \otimes Z = F^{-1}\left(F(Z) \cdot F^{*}(W_i)\right)    (5)

Hence, by evaluating this cross correlation, a speed up ratio can be obtained compared with conventional neural networks. Also, the final output of the neural network can be evaluated as follows:

    O(u) = g\left(\sum_{i=1}^{q} W_o(i)\, h_i(u) + b_o\right)    (6)

where q is the number of neurons in the hidden layer, O(u) is the output of the neural network when the sliding window is located at position (u) in the input matrix Z, and W_o is the weight matrix between the hidden and output layers.

The complexity of cross correlation in the frequency domain can be analyzed as follows:

1. For a tested matrix of 1 × N elements, the 1D-FFT requires a number equal to N \log_2 N of complex computation steps [22]. Also, the same number of complex computation steps is required for computing the 1D-FFT of the weight matrix at each neuron in the hidden layer.

2. At each neuron in the hidden layer, the inverse 1D-FFT is computed. Therefore, q backward and (1 + q) forward transforms have to be computed. Therefore, for a given matrix under test, the total number of operations required to compute the 1D-FFT is (2q + 1) N \log_2 N.

3. The number of computation steps required by FNNs is complex and must be converted into a real version. It is known that the one dimensional Fast Fourier Transform requires (N/2) \log_2 N complex multiplications and N \log_2 N complex additions [22]. Every complex multiplication is realized by six real floating point operations and every complex addition is implemented by two real floating point operations. Therefore, the total number of computation steps required to obtain the 1D-FFT of a 1 × N matrix is:

    \rho = 6\left((N/2) \log_2 N\right) + 2\left(N \log_2 N\right)    (7)

which may be simplified to:

    \rho = 5 N \log_2 N    (8)

4. Both the input and the weight matrices should be dot multiplied in the frequency domain. Thus, a number of complex computation steps equal to qN should be considered. This means 6qN real operations will be added to the number of computation steps required by FNNs.

5. In order to perform cross correlation in the frequency domain, the weight matrix must be extended to have the same size as the input matrix. So, a number of zeros = (N - n) must be added to the weight matrix. This requires a total real number of computation steps = q(N - n) for all neurons. Moreover, after computing the FFT for the weight matrix, the conjugate of this matrix must be obtained. As a result, a real number of computation steps = qN should be added in order to obtain the conjugate of the weight matrix for all neurons. Also, a number of real computation steps equal to N is required to create the butterfly complex numbers (e^{-jk(2\pi n/N)}), where 0 < K < L. These (N/2) complex numbers are multiplied by the elements of the input matrix or by previous complex numbers during the computation of the FFT. To create a complex number requires two real floating point operations. Thus, the total number of computation steps required for FNNs becomes:

    \sigma = (2q + 1)(5 N \log_2 N) + 6qN + q(N - n) + qN + N    (9)

which can be reformulated as:

    \sigma = (2q + 1)(5 N \log_2 N) + q(8N - n) + N    (10)

6. Using a sliding window of size 1 × n for the same matrix of 1 × N pixels, q(2n - 1)(N - n + 1) computation steps are required when using CTDNNs for certain virus detection or processing (n) input data. The theoretical speed up factor \eta can be evaluated as follows:

    \eta = \frac{q(2n - 1)(N - n + 1)}{(2q + 1)(5 N \log_2 N) + q(8N - n) + N}    (11)

CTDNNs and FTDNNs are shown in Figs. 1 and 2, respectively.

Fig. 1  Classical time delay neural networks. Serial input data 1:N is taken in groups of (n) elements, shifted by a step of one element each time; dot multiplication in the time domain between the (n) input data and the weights of the hidden layer

Fig. 2  Fast time delay neural networks. Cross correlation in the frequency domain between the total (N) input data and the weights of the hidden layer

3 Experimental results of time delay neural networks for fast virus detection

First, neural networks are trained to classify virus from non-virus examples, and this is done in the time domain. In the virus detection phase, each sub-matrix (1 × n) in the incoming data (probe matrix 1 × N) is tested for the presence or absence of the virus. At each position in the incoming input matrix, each sub-matrix is multiplied by a window of weights which has the same size as the sub-matrix. This multiplication is done in the time domain. The outputs of neurons in the hidden layer are multiplied by the weights of the output layer. When the final output is high, this means that the sub-matrix under test contains a virus, and vice versa. Thus, we may conclude that this searching problem is a cross correlation in the time domain between the incoming data and the input weights of the neural networks.
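The equivalence that Eq. 5 in Sect. 2 relies on — a time-domain sliding-window evaluation of a hidden neuron versus an FFT-based cross correlation with zero-padded weights — can be checked numerically. The sizes, random data and tanh activation below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Illustrative sizes: N-length data stream, n-length window, q hidden neurons.
N, n, q = 1024, 16, 4
rng = np.random.default_rng(0)
Z = rng.standard_normal(N)          # tested data stream
W = rng.standard_normal((q, n))     # input weights of the q hidden neurons
b = rng.standard_normal(q)          # biases
g = np.tanh                         # activation function (assumed)

# CTDNN-style evaluation: slide the window over every position (Eq. 2).
direct = np.array([[g(W[i] @ Z[u:u + n] + b[i]) for u in range(N - n + 1)]
                   for i in range(q)])

# FTDNN-style evaluation (Eq. 5): W_i (x) Z = F^{-1}(F(Z) . F*(W_i)),
# with each W_i zero-padded to length N before transforming.
fast = np.empty_like(direct)
for i in range(q):
    Wp = np.zeros(N)
    Wp[:n] = W[i]
    cc = np.fft.ifft(np.fft.fft(Z) * np.conj(np.fft.fft(Wp))).real
    fast[i] = g(cc[:N - n + 1] + b[i])

assert np.allclose(direct, fast)    # both routes give the same activations
```

Because the padded weights occupy only the first n entries, the circular cross correlation returned by the inverse FFT agrees with the linear sliding-window result for all N − n + 1 valid positions.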
Time delay neural networks accept serial input data with fixed size (n). Therefore, the number of input neurons equals (n). Instead of treating (n) inputs, the proposed new approach is to collect all the incoming data together in a long vector (for example 100 × n). Then the input data is tested by time delay neural networks as a single pattern with length L (L = 100 × n). Such a test is performed in the frequency domain as described in Sect. 2. The virus inserted in the incoming data may have real or complex values in the form of a one or two dimensional array. Complex-valued neural networks have many applications in fields dealing with complex numbers, such as telecommunications, speech recognition and image processing with the Fourier transform [23,24]. Complex-valued neural networks mean that the inputs, weights, thresholds and the activation function have complex values. In this section, formulas for the speed up ratio with different types of inputs (real/complex) will be presented. Also, the speed up ratio in the case of a one and two dimensional incoming input matrix will be concluded. The operation of FNNs depends on computing the Fast Fourier Transform for both the input and weight matrices and obtaining the resulting two matrices. After performing dot multiplication of the resulting two matrices in the frequency domain, the inverse Fast Fourier Transform is calculated for the final matrix. Here, there is an excellent advantage of FNNs that should be mentioned: the Fast Fourier Transform already deals with complex numbers, so there is no change in the number of computation steps required by FNNs. Therefore, the speed up ratio in the case of complex-valued time delay neural networks can be evaluated as follows.

3.1 In case of real inputs

3.1.1 For a one dimensional input matrix

Multiplication of (n) complex-valued weights by (n) real inputs requires (2n) real operations. This produces (n) real numbers and (n) imaginary numbers. The addition of these numbers requires (2n - 2) real operations. The multiplication and addition operations are repeated (N - n + 1) times for all possible sub-matrices in the incoming input matrix. In addition, all of these procedures are repeated at each neuron in the hidden layer. Therefore, the number of computation steps required by conventional neural networks can be calculated as:

    \theta = 2q(2n - 1)(N - n + 1)    (12)

The speed up ratio in this case can be computed as follows:

    \eta = \frac{2q(2n - 1)(N - n + 1)}{(2q + 1)(5 N \log_2 N) + q(8N - n) + N}    (13)

The theoretical speed up ratio for searching short successive (n) data in a long input vector (L) using complex-valued time delay neural networks is shown in Figs. 3, 4 and 5. Also, the practical speed up ratio for manipulating matrices of different sizes (L) and different sized weight matrices (n) using a 2.7 GHz processor and MATLAB is shown in Fig. 6.

Fig. 3  A comparison between the number of computation steps required by FTDNNs and CTDNNs in the case of a real-valued one dimensional input matrix and complex-valued weight matrix (n = 400)

Fig. 4  A comparison between the number of computation steps required by FTDNNs and CTDNNs in the case of a real-valued one dimensional input matrix and complex-valued weight matrix (n = 625)

3.1.2 For a two dimensional input matrix

Multiplication of (n^2) complex-valued weights by (n^2) real inputs requires (2n^2) real operations. This produces (n^2) real numbers and (n^2) imaginary numbers. The addition of these numbers requires (2n^2 - 2) real operations. The multiplication and addition operations are repeated (N - n + 1)^2 times for all possible sub-matrices in the incoming input matrix. In addition, all of these procedures are repeated at each neuron in the hidden layer. Therefore, the number of computation steps required by conventional neural networks can be calculated as:
    \theta = 2q(2n^2 - 1)(N - n + 1)^2    (14)

The speed up ratio in this case can be computed as follows:

    \eta = \frac{2q(2n^2 - 1)(N - n + 1)^2}{(2q + 1)(5 N^2 \log_2 N^2) + q(8N^2 - n^2) + N}    (15)

The theoretical speed up ratio for detecting an (n × n) real valued submatrix in a large real valued matrix (N × N) using complex-valued time delay neural networks is shown in Figs. 7, 8 and 9. Also, the practical speed up ratio for manipulating matrices of different sizes (N × N) and different sized weight matrices (n) using a 2.7 GHz processor and MATLAB is shown in Fig. 10.

Fig. 5  A comparison between the number of computation steps required by FTDNNs and CTDNNs in the case of a real-valued one dimensional input matrix and complex-valued weight matrix (n = 900)

Fig. 6  Practical speed up ratio for time delay neural networks in the case of a one dimensional real-valued input matrix and complex-valued weights

Fig. 7  A comparison between the number of computation steps required by FTDNNs and CTDNNs in the case of a real-valued two dimensional input matrix and complex-valued weight matrix (n = 20)

Fig. 8  A comparison between the number of computation steps required by FTDNNs and CTDNNs in the case of a real-valued two dimensional input matrix and complex-valued weight matrix (n = 25)

Fig. 9  A comparison between the number of computation steps required by FTDNNs and CTDNNs in the case of a real-valued two dimensional input matrix and complex-valued weight matrix (n = 30)

Fig. 10  Practical speed up ratio for time delay neural networks in the case of a two dimensional real-valued input matrix and complex-valued weights

3.2 In case of complex inputs

3.2.1 For a one dimensional input matrix

Multiplication of (n) complex-valued weights by (n) complex inputs requires (6n) real operations. This produces (n) real numbers and (n) imaginary numbers. The addition of these numbers requires (2n - 2) real operations. Therefore,
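The operation counts above can be evaluated directly; the following sketch computes the speed up ratios of Eqs. (13) and (15) for illustrative sizes. The choice of q = 30 hidden neurons and the particular N and n values are assumptions for illustration, not values reported in the paper.

```python
import math

def speed_up_1d(N, n, q):
    # Eq. (12): CTDNN steps; Eq. (10): FTDNN steps; Eq. (13): their ratio.
    theta = 2 * q * (2 * n - 1) * (N - n + 1)
    sigma = (2 * q + 1) * 5 * N * math.log2(N) + q * (8 * N - n) + N
    return theta / sigma

def speed_up_2d(N, n, q):
    # Eq. (14): CTDNN steps for an N x N matrix; Eq. (15): the 2D ratio.
    theta = 2 * q * (2 * n**2 - 1) * (N - n + 1) ** 2
    sigma = ((2 * q + 1) * 5 * N**2 * math.log2(N**2)
             + q * (8 * N**2 - n**2) + N)
    return theta / sigma

for N in (10_000, 1_000_000, 4_000_000):
    print(f"1D, N={N}: eta = {speed_up_1d(N, n=400, q=30):.2f}")
print(f"2D, N=700: eta = {speed_up_2d(700, n=20, q=30):.2f}")
```

For these sizes the ratio stays well above one, consistent with the comparisons plotted in Figs. 3 through 10.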