Made at



Halting the Spread of HIV

Emily Schuch, Data Scientist at Assembly Media

The HIV incidence rate is the number of new HIV infections in a population in a given year, expressed as a share of that population. A rate of 0.4% means that 4 out of every 1,000 people became newly infected with HIV that year.
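
As a quick illustration of the arithmetic in that definition, here is a minimal sketch. The function name `incidence_rate` and its parameters are hypothetical, chosen only for this example, and it assumes the simple "new infections divided by population" reading given above rather than any particular method used in the project.

```python
def incidence_rate(new_infections: int, population: int) -> float:
    """Return one year's new infections as a fraction of the population.

    Hypothetical helper for illustration; not taken from the project.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return new_infections / population


# 4 new infections among 1,000 people, formatted as a percentage.
rate = incidence_rate(4, 1000)
print(f"{rate:.1%}")  # prints "0.4%"
```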

view project


Brian Kim, Data Scientist at FabFitFun

The app tracks Twitter trends in volume, sentiment, and topicality for 2016 election candidates. It was built with Flask, MongoDB, D3, VADER sentiment analysis, and Gensim, running on an EC2 server.

view project

Estimating House Prices in San Francisco

Rui Chang, Lead Data Scientist at Target

This project estimates house prices from property features using publicly available data, and builds a web application for house price estimation.

view project

KenKen Solver

Ken Myers, Jr. Data Scientist at Uncommon Goods

Ken uses computer vision to solve KenKen puzzles. (Currently, this application accepts only 4x4 KenKens.) Simply upload a puzzle and get the solution.

view project

A statistical analysis of minesweeper – Placing the Mines

David Dupuis, Data Scientist Researcher at Kwanko

There are some key elements to coding the game that can and probably should be memorized as they have other practical applications in computer science.

view project

Visaurant: Reimagining the food search experience

Jeff Wen, Data Scientist at Tesla

Visaurant is a reimagination of the way users search through images that they are interested in. One prime use case for Visaurant is in sorting and filtering through food images (hence VIS-ual rest-AURANT).

view project

The (Data) Science of Binge Watching on Netflix

Jamie Fradkin, Jr. Data Scientist at BuzzFeed

Who among us hasn’t fallen victim to the addictive power of a binge-worthy Netflix show? For Jamie's final project at Metis, she chose to explore elements in popular shows that might lead you to start “binge watching” on Netflix.

view project

Investigating Worker Exploitation in California

Ash Chakraborty, Data Scientist at Credit Sesame

At a recent DataKind SF event, Ash was intrigued by the challenges of investigating wage theft and other labor violations, not just nationwide but also specifically in California and the Bay Area.

view project

End-to-End Funding Loan Predictor

Frederik Durant, Staff Member Data Innovation at Colruyt Group

Frederik delved deep into the microfinance mechanics at Kiva, looking for a practical problem to solve.

view project

NBA Matchup Analysis

Yong Cho, Data Scientist at GrubHub

Yong built an analytics tool for Vantage Sports.

view project

Chord Classification using Neural Networks

Henri Dwyer, Data Scientist at Dataiku

Henri is currently working on classifying chords from audio using neural networks.

view project


Ryan Lambert, Data Scientist at Gild

PUBmatch.co makes it easier to parse through the giant open access database PubMed by allowing you to input anything from a news article clipping to an email thread.

view project

The Science of Signing Along

Garrett Hoffman, Senior Data Scientist at StockTwits

view project

Numer.ai and Ensembles - Voting, Averaging, Rank Averaging

Andre Gatorano, Data Scientist at Blitsy

In the spirit of the Kaggle revolution, an industrious and risk-taking hedge fund put its investments in the hands of the public.
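
The three ensembling ideas named in the title can be sketched in a few lines. This is an illustrative example under stated assumptions, not the project's code: the `preds` array holds made-up probabilities from three hypothetical models, and the 0.5 voting threshold is an arbitrary choice.

```python
import numpy as np

# Hypothetical predicted probabilities: 3 models (rows) x 4 samples (columns).
preds = np.array([
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.1, 0.7, 0.3],
    [0.3, 0.4, 0.9, 0.2],
])

# Voting: turn each model's score into a 0/1 label (assumed 0.5 threshold),
# then keep the label that more than half of the models agree on.
votes = (preds > 0.5).astype(int)
majority = (votes.sum(axis=0) * 2 > votes.shape[0]).astype(int)

# Averaging: mean predicted probability per sample.
average = preds.mean(axis=0)

# Rank averaging: replace each model's scores with their within-model ranks
# (0 = lowest; ties are broken arbitrarily by argsort), average the ranks
# across models, and rescale to the range [0, 1].
ranks = preds.argsort(axis=1).argsort(axis=1)
rank_average = ranks.mean(axis=0) / (preds.shape[1] - 1)

print(majority, average, rank_average)
```

Rank averaging uses only the ordering each model assigns to the samples, so it can help when models produce scores on different scales.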

view project