Wednesday, August 11, 2010

Training the classifier, or handling VMware errors!!

Training the classifier doesn't seem to be as much fun as I thought it would be.
Reasons I thought it would be fun:
  1. I had found some new malicious web pages with simple Google searches!!
  2. The training lists for both the malicious and safe classes were prepared.
  3. The only task left was to pickle the features dictionary.
Reasons it became a pita:
  1. The VMware error (lack of memory) that appeared after scanning 12 URLs.
  2. The same error now recurring at every URL, even after restarting the host machine and allocating more RAM to VMware.
  3. Finally, the run overwrote the pickle file holding what it had learned so far, that is, the features of 15 URLs.
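
Point 3 at least looks avoidable. Below is a minimal sketch of how the scan loop could reload the existing pickle before adding to it, and save after every URL through a temporary file, so that a crash neither wipes earlier results nor leaves a half-written file. All the names here (`load_features`, `save_features`, `scan_urls`, `extract_features`) are hypothetical, and keying the dictionary by URL is an assumption; this is not the actual training script.

```python
import os
import pickle
import tempfile


def load_features(path):
    """Return the saved features dictionary, or an empty one if no file exists yet."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {}


def save_features(features, path):
    """Write to a temporary file first, then swap it into place, so a crash
    in the middle of a write cannot truncate the previous pickle."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        pickle.dump(features, f)
    os.replace(tmp_path, path)  # replaces the old file in one step


def scan_urls(urls, extract_features, path):
    """Scan each URL, skipping those already stored, and save after every URL.

    extract_features is a placeholder for whatever function visits a URL
    and returns its features.
    """
    features = load_features(path)  # start from earlier results, not from {}
    for url in urls:
        if url in features:
            continue  # already learned in a previous run
        features[url] = extract_features(url)
        save_features(features, path)
    return features
```

With something like this, rerunning the script after a VMware crash would pick up from the first URL that is not yet in the pickle file instead of starting over.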
I have googled for this VMware error, but haven't found any suitable solution.
I thought operating systems learned to handle batch jobs very early in their lives. This anomalous behaviour is beyond my understanding!!
Anybody with any relevant suggestions??

