Tuesday, July 27, 2010

Blogger's unusual shenanigan ? safe : malicious

A very interesting thing happened while I was writing the earlier post. I quite naively pushed the 'Publish Post' button after keying in the test JavaScript snippet. Although the script was wrapped in JS's multiline comment construct /* */, Blogger still treated it as JavaScript and started popping up all the alerts!

Yes!

That is not all. When I tried to edit the script to put a '//' comment before each line, there was no script to be found!
Blogger had consumed the script: every alert call showed its alert, and in the end the post looked empty, without any JS code.

It may sound foolish... Well, considering the fact that I am a /no{2}b/, this might make some sense...

Anyway, this also shows that anyone can easily post any damn nasty JS code onto Blogger. (Any comments?)
Is Blogger safe?

'top' property of 'Window'

The 'top' property of the window object in JavaScript behaves as follows:

   1. In a page that is not framed, it refers to the window itself ([object Window]), i.e. the same object as 'self'.
   2. In the case of frames or iframes, 'top' refers to the topmost window that contains the [i]frame, i.e. the [window].
   3. But when we do a window.open(), the 'top' of the new window refers to itself and not to the window that opened it.
    P.S. To refer to the opening window from the newly created window, there is the 'opener' property.
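
The three cases above can be sketched without a browser by modelling windows as plain objects. This is only an illustration under that assumption: makeWindow, makeFrame, and openWindow are hypothetical helpers, not browser APIs. In a real page the browser sets 'self', 'top', 'parent', and 'opener' itself.

```javascript
// Hypothetical stand-ins for browser windows; real browsers populate
// self, top, parent, and opener themselves.
function makeWindow(name) {
  const win = { name: name, opener: null };
  win.self = win;    // a window always refers to itself via self
  win.parent = win;  // with no enclosing frame, parent is the window itself
  win.top = win;     // ...and so is top
  return win;
}

// Models a frame/iframe created inside `container`.
function makeFrame(name, container) {
  const frame = makeWindow(name);
  frame.parent = container;
  frame.top = container.top; // top is the outermost window, however deep the nesting
  return frame;
}

// Models window.open(): the new window is its own top and reaches
// the window that opened it through opener.
function openWindow(name, openerWin) {
  const win = makeWindow(name);
  win.opener = openerWin;
  return win;
}

const main = makeWindow("main");
const frame = makeFrame("bottom", main);
const nested = makeFrame("nested", frame);
const popup = openWindow("teesri", frame);

console.log(main.top === main.self);  // true
console.log(frame.top === main);      // true
console.log(nested.top === main);     // true
console.log(popup.top === popup);     // true
console.log(popup.opener === frame);  // true
```

Note that a nested frame's 'top' is still the outermost window, while 'parent' only goes up one level.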

The following code explains it clearly...

/************************ Toping.js *******************************/

//var i = "i am at the top."
//function foo(j)
//{
//return eval(3*j);
//}
//alert(foo(3));
//alert("self: "+self.location.href+"\n"+"top: "+top.location.href);
//document.write();
//o = window.open("teesri.js",'win')

/************************ bottom.js ******************************/

//alert(top.i);
//alert("self: "+self.location.href+"\n"+"top: "+top.location.href);
//var foo = top.foo;
//alert("top ka foo in bottom : "+ String(foo(4)));
//window.open("teesri.html","win");

/********************** teesri.js ********************************/

//alert("self: "+self.location.href+"\n top: "+top.location.href);

It might be a trivial concept for some, but it took a long time to sink in for me!

Nothing malicious

There is nothing malicious in the last post about the 'top' property. It's just that I was unaware that Blogger does not honor multiline JavaScript comments, or perhaps I commented out the script incorrectly, which is why so many nonsensical alerts are popping up!
I sincerely regret the annoyance caused. But Blogger seems to have consumed the JS code :(
So, I am in a fix about how to remove that code...
I would appreciate suggestions on this!

Saturday, July 10, 2010

Malicious Javascript - blueprints

JavaScript might be a great scripting language, but it has recently been abused a lot to carry out drive-by download attacks. These attacks target vulnerabilities in browsers and their plugins on the client side.

There are certain features in the structure of malicious JavaScript, though, which can be used to detect its presence with high precision. My GSoC 2010 project aims to find and extract these features, and then to classify scripts as benign or malicious on the basis of the resulting scores. The final step is to integrate the complete solution into the low-interaction client honeypot, PhoneyC.
I have extracted 9 features, which have been mentioned by a lot of people in their works. These features have been extracted from a very modest corpus, not very broad yet, of 15 benign and 10 malicious JS samples.
The findings, expressed as graphs of the feature value for each file, can be found here. The graphs show the features of malicious scripts in red and those of benign scripts in blue.

1. average characters per line
2. average eval() argument length
3. string definition to string use ratio
4. # unicode characters
5. # lines in the script
6. % human readable characters
7. % white space in the script 
8. # words in the script
9. dynamic execution calls
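
To make a few of these features concrete, here is a minimal sketch of how they could be computed. It is not the extractor from my branch: the function name extractFeatures, the exact definitions (for example, which calls count as "dynamic execution"), and the sample string are all assumptions made for illustration.

```javascript
// Sketch only: the definitions below are assumptions for illustration,
// not the ones used by the project's extractor.
function extractFeatures(source) {
  const lines = source.split("\n");
  const totalChars = source.length;
  const newlineCount = lines.length - 1;

  // Every whitespace character, including newlines.
  const whitespace = (source.match(/\s/g) || []).length;

  // Assumption: a "word" is any run of non-whitespace characters.
  const words = source.split(/\s+/).filter(function (w) {
    return w.length > 0;
  });

  // Assumption: "dynamic execution calls" are calls to eval,
  // setTimeout, or setInterval.
  const dynamicCalls =
    (source.match(/\b(?:eval|setTimeout|setInterval)\s*\(/g) || []).length;

  return {
    lineCount: lines.length,
    // Newline characters are not counted as part of any line.
    avgCharsPerLine: (totalChars - newlineCount) / lines.length,
    whitespacePercent: totalChars === 0 ? 0 : (100 * whitespace) / totalChars,
    wordCount: words.length,
    dynamicCalls: dynamicCalls,
  };
}

// Hypothetical two-line sample script.
const sample = "var a = 1;\neval('a + 1');";
const features = extractFeatures(sample);
console.log(features);
```

A regex-based count like this is crude: it would also count an eval( that appears inside a string or a comment, which is one of the simplifying assumptions a proper extractor would need to drop.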

Though the results aren't very encouraging for all the features, some of them, like the string definition to use ratio, % human readable characters, and % white space, offer some hope. The feature extraction implementation needs to be improved, with few assumptions, to build a proper extractor for the classifier.
The code for the feature extractor and the classifier may be accessed in my svn branch of PhoneyC under njain-anomalydetection.

I would sincerely appreciate comments and reviews on the current work on feature extraction and classification.
P.S. Truly speaking, this is my first attempt at regex, pickling, or, in short, programming, for that matter. All thanks to my mentor for his able guidance and constant motivation.