At ACM CCS 2007, a new kind of CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) was presented: Asirra (Animal Species Image Recognition for Restricting Access), a HIP (Human Interactive Proof) that asks users to identify photographs of cats and dogs.

This task is difficult for computers, but people can accomplish it quickly and accurately.
In this project, our goal was to build a classifier based on the Bag of Words model to distinguish between cats and dogs.
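As an illustration of the pipeline this implies, here is a minimal Bag of Words sketch. It is not our implementation: raw dense patches stand in for DSIFT descriptors, synthetic textures stand in for the real photographs, and the vocabulary size (16) and SVM parameters are arbitrary placeholder choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def dense_descriptors(image, patch=8, step=8):
    """Flattened patches on a dense grid (a toy stand-in for DSIFT)."""
    h, w = image.shape
    descs = []
    for y in range(0, h - patch + 1, step):
        for x in range(0, w - patch + 1, step):
            descs.append(image[y:y + patch, x:x + patch].ravel())
    return np.array(descs, dtype=float)

def bow_histogram(descs, kmeans):
    """Quantize descriptors against the vocabulary; return a normalized histogram."""
    words = kmeans.predict(descs)
    hist = np.bincount(words, minlength=kmeans.n_clusters).astype(float)
    return hist / hist.sum()

# Synthetic "cat" vs "dog" images: two texture statistics stand in for photos.
cats = [rng.normal(0.3, 0.05, (32, 32)) for _ in range(20)]
dogs = [rng.normal(0.7, 0.05, (32, 32)) for _ in range(20)]
images = cats + dogs
labels = [0] * 20 + [1] * 20

# 1) Build the visual vocabulary from all descriptors.
all_descs = np.vstack([dense_descriptors(im) for im in images])
vocab = KMeans(n_clusters=16, n_init=5, random_state=0).fit(all_descs)

# 2) Represent each image as a histogram of visual words.
X = np.array([bow_histogram(dense_descriptors(im), vocab) for im in images])

# 3) Train a linear SVM on the histograms.
clf = LinearSVC(C=1.0).fit(X, labels)
print(clf.score(X, labels))
```

On real photographs, DSIFT descriptors and a much larger vocabulary would replace the toy patches and 16 clusters used here.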


We used the Oxford-IIIT Pet Dataset. The following annotations are available for every image in the dataset: (a) species and breed name; (b) a tight bounding box (ROI) around the head of the animal; and (c) a pixel-level foreground-background segmentation (trimap).
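A short sketch of how such a trimap can be used to isolate the animal from the background. The pixel convention assumed here (1 = pet, 2 = background, 3 = ambiguous border) follows the dataset's documentation, and the tiny arrays are synthetic stand-ins for a real image/trimap pair.

```python
import numpy as np

def foreground_only(image, trimap):
    """Keep only pixels the trimap marks as pet (value 1); zero out the rest.
    Assumed trimap convention: 1 = pet, 2 = background, 3 = ambiguous border."""
    mask = (trimap == 1)
    return image * mask

# Tiny synthetic example standing in for a real image/trimap pair.
image = np.arange(16, dtype=float).reshape(4, 4)
trimap = np.array([[2, 2, 2, 2],
                   [2, 1, 1, 2],
                   [2, 1, 1, 3],
                   [2, 2, 2, 2]])
masked = foreground_only(image, trimap)
print(masked)
```

The head ROI can be used analogously, by cropping the image to the annotated bounding box before computing descriptors.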

Following Parkhi (2012), we built classifiers using three different layouts.

  1. Image Layout: whole image and four quarters
  2. Image + Head Layout: whole image, four quarters, head and image without head
  3. Image + Head + Body Layout: whole image, four quarters, head, body, four quarters of body and body without head

We also added some layouts of our own. We used combinations of DSIFT and OTC descriptors.
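The spatial-layout idea above can be sketched as follows: each layout is a list of region crops, and the final image descriptor is the concatenation of one histogram per region. The code shows the simplest "Image Layout" (whole image plus four quarters); the 8-bin intensity histogram is a toy stand-in for a per-region BoW histogram.

```python
import numpy as np

def image_layout_regions(image):
    """Region crops for the 'Image Layout': the whole image plus its four quarters."""
    h, w = image.shape[:2]
    hh, hw = h // 2, w // 2
    return [
        image,            # whole image
        image[:hh, :hw],  # top-left quarter
        image[:hh, hw:],  # top-right quarter
        image[hh:, :hw],  # bottom-left quarter
        image[hh:, hw:],  # bottom-right quarter
    ]

def layout_descriptor(image, region_histogram):
    """Concatenate one histogram per region into the final image descriptor."""
    return np.concatenate([region_histogram(r) for r in image_layout_regions(image)])

def toy_hist(region):
    """8-bin intensity histogram (a stand-in for a per-region BoW histogram)."""
    hist, _ = np.histogram(region, bins=8, range=(0.0, 1.0))
    return hist / max(hist.sum(), 1)

img = np.random.default_rng(1).random((64, 64))
desc = layout_descriptor(img, toy_hist)
print(desc.shape)  # 5 regions x 8 bins
```

The richer layouts extend the region list with the head crop, the body, and their complements, lengthening the concatenated descriptor accordingly.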


First experiment: the best-performing layout was Image + Head + Body

Second experiment: parameter tuning example

Third experiment: best classifier result

Technical Details

For technical details please check out the download section.

The project was conducted at the Laboratory of Computer Graphics & Multimedia, Department of Electrical Engineering, Technion: