Saturday, July 10, 2010

Malicious Javascript - blueprints

JavaScript may be a great scripting language, but it has recently been abused a lot to carry out drive-by-download attacks, which target vulnerabilities in browsers and their plugins on the client side.

Malicious JavaScript does, however, have certain structural features that can be used to detect its presence with high precision. My GSoC 2010 project aims to find and extract these features, and then to classify scripts as benign or malicious on the basis of the resulting scores. The final step is to integrate the complete solution into the low-interaction client honeypot PhoneyC.
I have extracted 9 features, all of which have been mentioned by many people in their work. So far they have been extracted from a very modest corpus of 15 benign and 10 malicious JS samples, which is not yet very broad.
The findings, expressed as graphs of feature value against file, can be found here. The graphs show the features of malicious scripts in red and those of benign scripts in blue.

1. average characters per line
2. average eval() argument length
3. string definition to string use ratio
4. # unicode characters
5. # lines in the script
6. % human readable characters
7. % white space in the script 
8. # words in the script
9. dynamic execution calls
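
To make the list above more concrete, here is a minimal Python sketch of how several of these features could be computed with plain counting and regular expressions. It is not the code from the PhoneyC branch: the function name `extract_features`, the regex patterns, the set of calls counted as "dynamic execution" and the definition of "human readable" (ASCII letters, digits and whitespace) are all assumptions made for illustration. The string definition to string use ratio is left out because it realistically needs a tokenizer rather than a regex.

```python
import re
import string

# All patterns below are illustrative assumptions, not PhoneyC's own.
# Only matches eval() calls whose argument contains no nested parentheses.
EVAL_CALL = re.compile(r"\beval\s*\(([^()]*)\)")
# Assumed set of "dynamic execution" entry points.
DYNAMIC_CALL = re.compile(
    r"\b(?:eval|setTimeout|setInterval|Function|document\.write(?:ln)?)\s*\("
)
# %uXXXX (as used with unescape) and \uXXXX escape sequences.
UNICODE_ESCAPE = re.compile(r"%u[0-9a-fA-F]{4}|\\u[0-9a-fA-F]{4}")
# Identifier-like tokens stand in for "words".
WORD = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")
# Assumed definition of "human readable" characters.
READABLE = set(string.ascii_letters + string.digits + string.whitespace)


def percent(part, whole):
    """Percentage of part in whole, 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def extract_features(source):
    """Return a dict of simple structural features for one script."""
    lines = source.splitlines()
    eval_args = [m.group(1) for m in EVAL_CALL.finditer(source)]
    total = len(source)
    return {
        "avg_chars_per_line":
            sum(len(l) for l in lines) / len(lines) if lines else 0.0,
        "avg_eval_arg_length":
            sum(len(a) for a in eval_args) / len(eval_args) if eval_args else 0.0,
        "unicode_escapes": len(UNICODE_ESCAPE.findall(source)),
        "lines": len(lines),
        "pct_human_readable": percent(sum(c in READABLE for c in source), total),
        "pct_whitespace": percent(sum(c.isspace() for c in source), total),
        "words": len(WORD.findall(source)),
        "dynamic_calls": len(DYNAMIC_CALL.findall(source)),
    }
```

Calling `extract_features(open("sample.js").read())` (with `sample.js` being any script on disk) gives one feature dictionary per file, which is the kind of per-file value that can then be plotted or handed to a classifier.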

Although the results aren't very encouraging for all of the features, some of them, such as the string definition to string use ratio, % human readable characters and % white space, offer some hope. The implementation of the feature extraction still needs to be improved, with fewer assumptions, to build a proper extractor for the classifier.
The code for the feature extractor and the classifier may be accessed in my svn branch of phoneyc under njain-anomalydetection.
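
The classification step is not described in detail here, so the following is only a sketch of one simple way it could work, not the classifier in the branch. It uses a nearest-centroid rule as a stand-in technique: average the feature vectors of each class, then label a new script by whichever average it is closer to. The function names, the choice of two features and all the numbers are made up for illustration; they are not measurements from the corpus. The model is round-tripped through `pickle` to show how it could be stored between runs.

```python
import pickle


def centroid(vectors):
    """Component-wise mean of equally long feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def train(benign_vectors, malicious_vectors):
    """Build a model holding one centroid per class."""
    return {
        "benign": centroid(benign_vectors),
        "malicious": centroid(malicious_vectors),
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(model, vector):
    """Return the label whose centroid is closest to the vector."""
    return min(model, key=lambda label: squared_distance(model[label], vector))


# Made-up vectors: [% white space, % human readable characters].
model = train(
    benign_vectors=[[20.0, 90.0], [25.0, 85.0]],
    malicious_vectors=[[2.0, 40.0], [5.0, 55.0]],
)

# Serialise and restore the model, as one would between runs.
restored = pickle.loads(pickle.dumps(model))
label = classify(restored, [4.0, 50.0])
```

With features on very different scales (for example line counts next to percentages), the vectors would need to be normalised first, otherwise the largest-valued feature dominates the distance.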

I would sincerely appreciate comments and reviews on the current feature extraction and classification work.
P.S. Truth be told, this is my first attempt at regex, pickling or, in short, programming, for that matter. All thanks to my mentor for his able guidance and constant motivation.
