This piece was written and originally published by DrivenData. We sponsored and hosted its recent Naive Bees Classifier contest, and these are the exciting results.
Wild bees are important pollinators, and the spread of colony collapse disorder has only made their role more critical. Right now it takes a lot of time and effort for researchers to gather data on wild bees. Using data submitted by citizen scientists, Bee Spotter is making this process easier. However, it still requires that experts examine and identify the bee in each image. When we challenged our community to build an algorithm to pick out the genus of a bee from an image, we were shocked by the results: the winners achieved a 0.99 AUC (out of 1.00) on the held-out data!
We caught up with the top three finishers to learn about their backgrounds and how they tackled this problem. In true open data fashion, all three stood on the shoulders of giants by leveraging the pre-trained GoogLeNet model, which has performed well in the ImageNet competition, and tuning it to this task. Here's a little bit about the winners and their unique approaches.
Meet the winners!
1st Place - E.A.
Name: Eben Olson and Abhishek Thakur
Home base: New Haven, CT and Berlin, Germany
Eben's Background: I work as a research scientist at Yale University School of Medicine. My research involves building hardware and software for volumetric multiphoton microscopy. I also develop image analysis/machine learning approaches for segmentation of tissue images.
Abhishek's Background: I am a Senior Data Scientist at Searchmetrics. My interests lie in machine learning, data mining, computer vision, image analysis and retrieval and pattern recognition.
Method overview: We applied a standard technique of fine-tuning a convolutional neural network pretrained on the ImageNet dataset. This is often effective in situations like this one, where the dataset is a small collection of natural images, because the ImageNet networks have already learned general features which can be applied to the data. This pretraining regularizes the network, which has a large capacity and would quickly overfit, without learning useful features, if trained directly on the small number of images available. This allows a much larger (more powerful) network to be used than would otherwise be possible.
For more details, make sure to check out Abhishek's fantastic write-up of the competition, which includes some truly terrifying deepdream images of bees!
2nd Place - L.V.S.
Name: Vitaly Lavrukhin
Home base: Moscow, Russia
Background: I am a researcher with 9 years of experience in both industry and academia. Currently, I am working for Samsung, developing intelligent data-processing algorithms with machine learning. My previous experience was in the field of digital signal processing and fuzzy logic systems.
Method overview: I employed convolutional neural networks, since nowadays they are the best tool for computer vision tasks. The provided dataset contains only two classes and is relatively small, so to get higher accuracy, I decided to fine-tune a model pre-trained on ImageNet data. Fine-tuning almost always produces better results.
There are many publicly available pre-trained models, but some of them have licenses restricted to non-commercial academic research only (e.g., models by the Oxford VGG group), which is incompatible with the challenge rules. That is why I decided to take the open GoogLeNet model pre-trained by Sergio Guadarrama from BVLC.
One can fine-tune a whole model as-is, but I tried to modify the pre-trained model in a way that could improve its performance. Specifically, I considered parametric rectified linear units (PReLUs), proposed by Kaiming He et al. That is, I replaced all regular ReLUs in the pre-trained model with PReLUs. After fine-tuning, the model showed higher accuracy and AUC in comparison with the original ReLU-based model.
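The PReLU swap Vitaly describes is essentially a one-line change to the activation: negative inputs are scaled by a slope `a` instead of being zeroed. A small NumPy illustration (standalone functions, not the actual Caffe layer he used):

```python
import numpy as np

def relu(x):
    """Standard ReLU: zeroes all negative inputs."""
    return np.maximum(x, 0.0)

def prelu(x, a=0.25):
    """Parametric ReLU: keeps a slope `a` on the negative side.

    In a network, `a` is a trainable parameter (per channel in He et al.);
    here it is a fixed constant for illustration.
    """
    return np.where(x > 0, x, a * x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
relu(x)           # negatives become 0
prelu(x, a=0.25)  # negatives become -0.5 and -0.125 instead
```

Because the slope is learned per layer during fine-tuning, the network can decide how much signal to keep from negative activations rather than discarding it outright.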
In order to evaluate my solution and tune hyperparameters, I employed 10-fold cross-validation. Then I checked on the leaderboard which model is better: the one trained on the whole training data with hyperparameters set from the cross-validation models, or the averaged ensemble of cross-validation models. It turned out the ensemble yields higher AUC. To improve the solution further, I evaluated different sets of hyperparameters and various pre-processing techniques (including multiple image scales and resizing methods). I ended up with three groups of 10-fold cross-validation models.
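The ensembling step above can be sketched in a few lines: each of the 10 fold models scores every test image, and the ensemble prediction is the plain average of those scores. The arrays below are random stand-ins, not the actual competition outputs, and the rank-based AUC function is one common way to compute the competition metric.

```python
import numpy as np

def auc(labels, scores):
    """Probability that a random positive outscores a random negative."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

rng = np.random.default_rng(1)
n_folds, n_test = 10, 100

# Stand-in for per-fold model outputs: probability of the positive genus.
fold_probs = rng.uniform(0.0, 1.0, size=(n_folds, n_test))

ensemble = fold_probs.mean(axis=0)  # equal-weight average over the 10 models

# Example: a perfect ranking gives AUC = 1.0.
labels = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.2, 0.8, 0.9])
auc(labels, scores)  # 1.0
```

Averaging probabilities across fold models tends to cancel out each model's individual errors, which is consistent with the ensemble beating the single retrained model on the leaderboard.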
3rd Place - loweew
Name: Edward W. Lowe
Home base: Boston, MA
Background: As a Chemistry graduate student in 2007, I was drawn to GPU computing by the release of CUDA and its utility in popular molecular dynamics packages. After finishing my Ph.D. in 2008, I did a 2-year postdoctoral fellowship at Vanderbilt University, where I implemented the first GPU-accelerated machine learning framework specifically optimized for computer-aided drug design (bcl::ChemInfo), which included deep learning. I was awarded an NSF CyberInfrastructure Fellowship for Transformative Computational Science (CI-TraCS) in 2011 and continued at Vanderbilt as a Research Assistant Professor. I left Vanderbilt in 2014 to join FitNow, Inc in Boston, MA (makers of the LoseIt! mobile app), where I direct Data Science and Predictive Modeling efforts. Prior to this competition, I had no experience in anything image-related. This was a very fruitful experience for me.
Method overview: Because of the variable positioning of the bees and the quality of the photos, I oversampled the training sets using random perturbations of the images. I used randomly generated ~90/10 training/validation splits and only oversampled the training sets. This was performed 16 times (I originally intended to do 20-30, but ran out of time).
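The oversampling step can be sketched as below. The specific perturbations shown (horizontal flips and random crops) are assumptions for illustration, since the write-up does not list which ones were used; only the training images get perturbed, never the validation set.

```python
import numpy as np

rng = np.random.default_rng(42)

def random_perturbation(img, crop=192):
    """Return one randomly perturbed copy of an image array (H, W, C)."""
    out = img
    if rng.random() < 0.5:
        out = out[:, ::-1]  # horizontal flip half the time
    # Random crop: pick an offset so a crop x crop window fits inside.
    dy = rng.integers(0, img.shape[0] - crop + 1)
    dx = rng.integers(0, img.shape[1] - crop + 1)
    return out[dy:dy + crop, dx:dx + crop]

# One toy 200x200 RGB "bee photo", oversampled 16x.
img = rng.integers(0, 256, size=(200, 200, 3), dtype=np.uint8)
augmented = [random_perturbation(img) for _ in range(16)]
```

Each original image thus contributes many slightly different training examples, which helps the network tolerate the variable bee positions mentioned above.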
I used the pre-trained GoogLeNet model provided by Caffe as a starting point and fine-tuned it on the data sets. Using the last recorded accuracy for each training run, I took the top 75% of models (12 of 16) by accuracy on the validation set. These models were used to predict on the test set, and their predictions were averaged with equal weighting.
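The selection-and-averaging step maps to a few lines of NumPy; the accuracy and prediction arrays here are random stand-ins for the real outputs of the 16 training runs.

```python
import numpy as np

rng = np.random.default_rng(7)
n_runs, n_test = 16, 4

# Stand-ins: last recorded validation accuracy per run, and each run's
# predicted probabilities on the test set.
val_acc = rng.uniform(0.90, 0.99, size=n_runs)
test_preds = rng.uniform(0.0, 1.0, size=(n_runs, n_test))

keep = np.argsort(val_acc)[-12:]          # top 75% of the 16 runs
final = test_preds[keep].mean(axis=0)     # equal-weight average of their predictions
```

Dropping the weakest quarter of runs before averaging is a simple guard against including models whose random train/validation split happened to produce a poor fit.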