Monday, June 7, 2010

Anomaly Detection

Anomaly detection is an approach for finding the odd one out, such as a malicious value. It involves learning the normal behavior, which is recorded as a profile, and then detecting deviations from that profile. Deviations are judged against a model, which supports both learning and detection. The crux of the approach is "the model".
A basic understanding of the approach can be had from the following program, which learns A, an arbitrary integer variable. In learning mode the model records that A normally lies between the minimum and maximum of the values entered, and it sets a threshold at 10% of their mean. After learning, the model switches to detection mode, where each entered value is checked in two ways. A value outside the learned range is put in the anomaly list; otherwise it is appended to the normal list. Separately, the difference between the value and the mean is compared to the threshold, and a difference greater than the threshold marks the value as anomalous. This is a very naive but working implementation of the anomaly detection approach.
The original Python implementation is here:

class learna(object):

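    def learnA(self):
        """Learn the profile of a, then switch to detection mode."""
        # call the methods on self instead of relying on a global instance
        values=self.learn()
        low=min(values)
        high=max(values)
        # float() keeps the mean from being truncated by integer division
        avg=float(sum(values))/len(values)
        print "average is",avg
        self.detect(low,high,avg)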


    def learn(self):
        print "learning mode"
        alearned=[]
        for i in range(5):
            al=int(raw_input("enter integer value for a."))
            alearned.append(al)
        lower=min(alearned)
        upper=max(alearned)
        print lower,"<",upper
        return alearned

    def detect(self,low,high,avg):
        print "running in detection mode."
        aentered=[]
        anomaly=[]
        normal=[]
        anomalous=[]
        threshold=0.1*avg
        for i in range(5):
            ae=int(raw_input("enter current integer value."))
            aentered.append(ae)
            if (ae<low or ae>high):
                anomaly.append(ae)
            else:
                normal.append(ae)
            if (abs(ae-avg)>threshold):
                anomalous.append(ae)
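        print "total values outside learned range",len(anomaly)
        print "total values within learned range",len(normal)
        print "total entered values", len(aentered)
        print "total values beyond threshold of mean",len(anomalous)


# entry point (assumed; it is not shown in the original listing)
l=learna()
l.learnA()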
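
For readers who want to try the idea without typing values at a prompt, the same model can be sketched as two plain functions. This is only a rough sketch in Python 3, not the original program: the names learn_profile and detect, the dictionary layout, and the sample values are all assumptions made for illustration, and abs() around the mean is an added guard so that a negative mean still gives a usable threshold.

```python
def learn_profile(samples):
    """Build a profile (range, mean, threshold) from the learning-mode values."""
    if not samples:
        raise ValueError("need at least one learning sample")
    avg = sum(samples) / len(samples)
    return {
        "low": min(samples),
        "high": max(samples),
        "avg": avg,
        # threshold is 10% of the mean, as in the post; abs() is an assumption
        "threshold": 0.1 * abs(avg),
    }


def detect(profile, values):
    """Apply both checks from the post and return the two lists of flagged values."""
    out_of_range = [v for v in values
                    if v < profile["low"] or v > profile["high"]]
    beyond_threshold = [v for v in values
                        if abs(v - profile["avg"]) > profile["threshold"]]
    return out_of_range, beyond_threshold


# hypothetical learning and detection inputs, chosen only for illustration
profile = learn_profile([10, 12, 11, 9, 13])
out_of_range, beyond_threshold = detect(profile, [11, 13, 20, 8, 12])
print("profile:", profile)
print("outside learned range:", out_of_range)
print("beyond threshold of mean:", beyond_threshold)
```

With these sample values the learned range is 9 to 13 and the mean is 11, so the value 13 passes the range check but is still flagged by the threshold check. The two checks can therefore disagree, which is why the program keeps separate lists for them.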
