Image Visualization
At this point, we've already run our script GetData.py, which uses the Stanford Vision Lab's ImageNet to download thousands of hot dog and non-hot dog images. Now we get to the first part of machine learning: visualizing the data! Note that this notebook covers binary classification only; we haven't yet written a script for visualizing multiple classes, but we plan to do that after the Hackathon.
import os
import numpy as np
import cv2
import matplotlib.pyplot as plt
When converting the data to arrays, we will need a file path to the data, so we define it here for clarity. We chose not to use os.getcwd() because we ran into a few issues with it.
#define paths and constants
data_path = "/Users/victorialiu/git/creatica/code/data/"
TARGET_SIZE = 299
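A hard-coded absolute path only works on one machine. As a minimal sketch of one alternative that still avoids os.getcwd(), the path could be read from an environment variable, with the path above as the fallback. The variable name HOTDOG_DATA_DIR and the helper resolve_data_path are hypothetical and are not part of the original script.

```python
import os
from pathlib import Path

# Fallback taken from the cell above; HOTDOG_DATA_DIR is a hypothetical variable name.
DEFAULT_DATA_PATH = "/Users/victorialiu/git/creatica/code/data/"

def resolve_data_path(default=DEFAULT_DATA_PATH):
    """Return the data directory from HOTDOG_DATA_DIR, or the default if it is unset."""
    raw = os.environ.get("HOTDOG_DATA_DIR", default)
    # expanduser() lets the variable contain "~"; nothing here depends on the working directory
    return Path(raw).expanduser()

data_path = resolve_data_path()
print(data_path)
```

A Path object can be passed to os.path.join just like a string, so the rest of the notebook would not need to change under this assumption.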
We define a helper function that uses matplotlib to plot individual images. It converts each image to the right color order and resizes it to TARGET_SIZE x TARGET_SIZE pixels.
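img_path = os.path.join(data_path, 'test/')

def get_image(file_path):
    """Load one image from img_path, convert it to RGB, resize it, and plot it."""
    full_path = os.path.join(img_path, file_path)
    if os.path.isfile(full_path):
        # OpenCV loads images in BGR channel order
        image_bgr = cv2.imread(full_path, cv2.IMREAD_COLOR)
        if image_bgr is None:
            # cv2.imread returns None when the file cannot be decoded
            return
        # matplotlib expects RGB, so swap the channel order before plotting
        image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
        image_rgb_resized = cv2.resize(image_rgb, (TARGET_SIZE, TARGET_SIZE), interpolation=cv2.INTER_CUBIC)
        print(image_rgb_resized.shape)
        plt.imshow(image_rgb_resized)
        plt.title(file_path)
        plt.axis("off")
        plt.show()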
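The color conversion matters because cv2.imread returns pixels in BGR order, while plt.imshow interprets the last axis as RGB. The sketch below illustrates the channel swap on a made-up two-pixel image using only numpy; for a three-channel image, reversing the last axis produces the same channel order as cv2.COLOR_BGR2RGB. The pixel values are invented for illustration.

```python
import numpy as np

# A made-up 1x2 image in BGR order: one pure blue pixel and one pure red pixel.
image_bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels, giving RGB order.
image_rgb = image_bgr[..., ::-1]

print(image_rgb[0, 0].tolist())  # [0, 0, 255] -> blue, now stored as (R, G, B)
print(image_rgb[0, 1].tolist())  # [255, 0, 0] -> red, now stored as (R, G, B)
```

Without this swap, red and blue would be exchanged in the plot.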
Let's plot a few images to see what kind of pictures we're working with and whether the labels make sense. Note that this can also be run from the command line using python3 visualize_hotdogs.py, which is a Python script version of this exact notebook.
def main():
    # get non-hotdog pictures from directory
    nothotdog_list = os.listdir(os.path.join(data_path, 'test/nothotdog'))
    nothotdog_pics = [os.path.join('nothotdog/', name) for name in nothotdog_list]
    # get hotdog pictures from directory
    hotdog_list = os.listdir(os.path.join(data_path, 'test/hotdog'))
    hotdog_pics = [os.path.join('hotdog/', name) for name in hotdog_list]
    # concatenate both lists
    all_pics = nothotdog_pics + hotdog_pics
    # plot every 40th image; otherwise too many images!
    for i in range(0, len(all_pics), 40):
        print(all_pics[i])
        get_image(all_pics[i])
    return True
# if __name__ == '__main__': main()
nothotdog/76271.jpg
nothotdog/45817.jpg
nothotdog/63947.jpg
nothotdog/80315.jpg
hotdog/198641.jpg
hotdog/413426.jpg
hotdog/195251.jpg
hotdog/190809.jpg
True
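The loop in main() picks every 40th entry of the combined list. As a small sketch, the same sampling can be written with a list slice. Because os.listdir returns entries in arbitrary order, sorting the names first would make the sample reproducible between runs. The helper sample_every and the file names below are made up for illustration.

```python
def sample_every(items, step=40):
    """Return every `step`-th item of a sorted copy of `items`, starting from the first."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return sorted(items)[::step]

# Made-up relative paths standing in for the real directory listing.
all_pics = [f"nothotdog/{i:03d}.jpg" for i in range(160)]
all_pics += [f"hotdog/{i:03d}.jpg" for i in range(160)]

for pic in sample_every(all_pics):
    print(pic)
```

With 320 made-up entries and a step of 40, this prints 8 paths.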
These are some very yummy looking images. I am getting hungry just looking at them! More importantly, it looks like we have a good variety of hotdog pictures, and the labels on the images we sampled all look correct. Note that we are merely visualizing the data; this is not data snooping, because we are not using what we see to make any decisions about our models.
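Alongside eyeballing the pictures, a quick count per label can show whether the two classes are roughly balanced. This is only a sketch: count_by_label is a hypothetical helper, and it assumes relative paths of the form label/filename, like the ones printed above.

```python
from collections import Counter

def count_by_label(relative_paths):
    """Count paths per label, where the label is the part before the first '/'."""
    return Counter(path.split("/", 1)[0] for path in relative_paths)

# A few of the paths printed by main() above.
example = ["nothotdog/76271.jpg", "nothotdog/45817.jpg", "hotdog/198641.jpg"]
counts = count_by_label(example)
print(counts["nothotdog"], counts["hotdog"])  # 2 1
```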
Authors: Victoria Liu and Gloria Liu
Last modified: November 2020
Description: A script to visualize hot dogs before doing training. See what we're working with, without data snooping!