About


SeeFood 2.0

Victoria Liu, Gloria Liu

tldr

Run the app with Docker after cloning this repo

Imagine a world where only 3 things existed: hotdogs, carrots, and bananas. That’s the world SeeFood 2.0 lives in. It can classify these 3 foods with 92% validation accuracy. Periscope, here we come!

What It Does

Our web app allows users to drag or upload an image, and the app classifies it as a hotdog, banana, or carrot with 92% validation accuracy and 94% training accuracy. The backend is a TensorFlow convolutional neural network trained on thousands of images of hotdogs, bananas, and carrots. To use the web app, clone the GitHub repository and build it with Docker; more detailed instructions can be found in the GitHub repository. There are also a few clickable links hidden in the web app, but you’ll have to find them yourselves!
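If you’re curious what the backend roughly does with an uploaded image, here is a minimal inference sketch. The model filename, input size, and label order are assumptions for illustration, not the exact values in our code:

```python
# Minimal inference sketch: load a saved Keras model and classify one image.
# "model.h5", the 224x224 input size, and the label order are assumptions.
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

CLASS_NAMES = ["banana", "carrot", "hotdog"]  # assumed label order

model = load_model("model.h5")

def classify(path):
    img = image.load_img(path, target_size=(224, 224))
    x = image.img_to_array(img) / 255.0   # scale pixels to [0, 1]
    x = np.expand_dims(x, axis=0)         # add a batch dimension
    probs = model.predict(x)[0]
    return CLASS_NAMES[int(np.argmax(probs))]

print(classify("hotdog.jpg"))
```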

The fictional hotdog classifier from Silicon Valley motivated us to bring this fun idea to life, and we hope that others are also curious to see a real-life hotdog classifier! It would be a dream to have others go through our code to learn more about image classification. Since the code is very portable, it can be adapted to other machine learning projects as well. For more information on how to use our web app, follow the directions in our GitHub repository.

Inspiration

When we watched the first few seasons of the TV show Silicon Valley a couple of summers ago, neither of us seriously considered pursuing computer science as a career. And although we didn’t understand much of the theory behind what the characters were building, their crazy ideas still intrigued us. One of them was Jian Yang’s SeeFood app, which classified food as either “Hot Dog” or “Not Hot Dog”. In the years since watching that episode, both of us have come to understand the impact of machine learning and computer vision, and now we are both seriously pursuing computer science. It is funny how television gimmicks stick with people: even before Creatica, we occasionally discussed how fun it would be to replicate the hotdog app, but we never had time to pursue it. So when we saw Creatica suggest it in the “Far Out Track”, we knew this hackathon would be the perfect time to achieve a years-long dream. We wanted to see if we could replicate the app and even expand it to classify multiple types of food. And most importantly, we wanted to have fun while learning more about machine learning and web app development.

Summary

First, we discussed our goals and what we could reasonably accomplish in two days. We had three goals in mind: optimize the user experience (accurate classifications, an aesthetic user interface, fast loading), write well-documented code that is clear, beautiful, and simple, and, most importantly, have fun. A brief literature search on which ML algorithms work best for image classification convinced us that a convolutional neural network with transfer learning was the way to go. Next, we prototyped the project, defining the network architecture and deciding on the kind of user interface we wanted. Then we split up the tasks: Victoria was in charge of the backend and linking it to the app, while Gloria performed image validation, documented the process, and helped with frontend tasks.
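For readers who want to see what “a convolutional neural network with transfer learning” looks like in code, here is a minimal sketch of that kind of setup: a pretrained InceptionV3 base with a small three-class head. The head size, input resolution, and optimizer are illustrative, not necessarily what we used:

```python
# Transfer-learning sketch: frozen InceptionV3 base + small 3-class head.
# The 224x224 input size, 128-unit head, and Adam optimizer are illustrative.
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

base = InceptionV3(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))
base.trainable = False  # keep the pretrained weights frozen

x = GlobalAveragePooling2D()(base.output)
x = Dense(128, activation="relu")(x)
outputs = Dense(3, activation="softmax")(x)  # hotdog / banana / carrot

model = Model(inputs=base.input, outputs=outputs)
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```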

Challenges

One of the first issues we ran into was that Gloria’s laptop could not handle downloading all the images from our script, so we did most of the technical work on Victoria’s laptop. We were lucky that one of our computers could handle machine learning, but the experience also made us cognizant that many people do not have equal access to resources in the world of tech. A solution to this particular problem could have been cloud computing and cloud storage.

When we started collecting training images, we found that a lot of existing datasets were too small or didn’t have enough variety, so we used a script to download thousands of images of hot dogs, carrots, and bananas from the Stanford Vision Lab’s ImageNet. However, many of the links to image sets, and to the images themselves, were broken. We spent hours trying to weed out the broken files by hand, but we simply couldn’t find them all among the thousands we had downloaded, and they kept crashing our code. So we wrote a script to delete any image that couldn’t be opened by PIL.Image.open (a sketch of that cleanup pass appears below), and we also validated the remaining images by making sure they could be loaded with ImageDataGenerator, which we used in preprocessing.

In the backend, the open source code we started from used an older version of TensorFlow, and some of its functions were already deprecated; we had to fix all of those calls before we could start on our own network architecture. When we tried to add a fourth class, our validation accuracy dropped significantly, so we ended up using only three classes, since we were aiming for accuracy. This is something we want to improve in the future.

In the frontend, we had trouble linking our TensorFlow backend to a web app, so we used an open source template. However, the template code needed many modifications to serve our own model.
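The cleanup script mentioned above was essentially a loop that tries to open every downloaded file with PIL and deletes anything unreadable. A sketch along those lines, assuming the images live under a data/ folder:

```python
# Sketch of the cleanup pass: delete any downloaded file PIL cannot open.
# The "data" directory layout is an assumption about how images were stored.
import os
from PIL import Image

def remove_broken_images(root="data"):
    removed = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with Image.open(path) as img:
                    img.verify()  # raises an exception if the file is corrupt
            except Exception:
                os.remove(path)
                removed += 1
    print(f"Removed {removed} unreadable images")

remove_broken_images()
```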

Finally, morale was occasionally a challenge because there seemed to be an endless number of bugs. Nevertheless, we prevailed, and we are happy that we created this app for the hackathon.

Acknowledgments

Griffin Chure’s Reproducible Website

Classification with Convolutional Neural Networks

InceptionNetV3

Deploying Keras Model with Flask