High School Student Blog: Project Bridging AI and Facial Recognition

Giving Robots an Infant’s Sense of Vision

After covering emotion detection through audio features in the first part of this mini-series, the natural next step is to teach the robot to detect emotion through image features. For the second part of my three-part mini-series on AI & robotics, I am demonstrating an experimental Face Emotion Recognition project and exploring its potential.

Here’s a quick overview of the project:

Project Description: Using a Haar cascade classifier to detect faces in a frame and a custom sequential model to classify the emotion on each detected face in real time.

Data Set:

  • Includes both male and female faces

  • Contains data split into train and validation sets, each further divided into seven emotion classes: angry, disgust, fear, happy, neutral, sad, and surprise (see the expected folder layout below)
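Keras's flow_from_directory (used in the code below) expects one sub-folder per class, so the data set is assumed to be laid out roughly like this:

images/
    train/
        angry/  disgust/  fear/  happy/  neutral/  sad/  surprise/
    validation/
        angry/  disgust/  fear/  happy/  neutral/  sad/  surprise/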

Let’s divide the project into two parts:

  • Model: loading the data and building, training, and testing the model

  • Detecting faces: finding regions of interest (ROIs), i.e. faces, in a frame and detecting emotions on them in real time

MODEL

To prepare the data for our model, we set up a train and a validation generator that read the images from the data folders. We also rescale the pixel values to the 0–1 range so the images are easier for the network to work with. We can use this code sample:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Define data generators
train_dir = "../RawData/facial-expression/images/train/"
val_dir = "../RawData/facial-expression/images/validation/"

num_train = 28709
num_val = 7178
batch_size = 64
num_epoch = 50

train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        train_dir,
        target_size=(48,48),
        batch_size=batch_size,
        color_mode="grayscale",
        class_mode='categorical')

validation_generator = val_datagen.flow_from_directory(
        val_dir,
        target_size=(48,48),
        batch_size=batch_size,
        color_mode="grayscale",
        class_mode='categorical')
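When run, flow_from_directory reports how many images and classes it found in each folder. As a quick sanity check (not part of the original script), you can also look at the label-to-index mapping, which Keras assigns in alphabetical order:

# The classes are indexed alphabetically; this ordering is why the
# emotion dictionary later in this post maps 0 to "Angry", 1 to "Disgusted", etc.
print(train_generator.class_indices)
# {'angry': 0, 'disgust': 1, 'fear': 2, 'happy': 3, 'neutral': 4, 'sad': 5, 'surprise': 6}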

Now, onto building the model:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

# Create the model
model = Sequential()

model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(48,48,1)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(7, activation='softmax'))
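Before training, it can help to sanity-check the architecture (an optional step, not part of the original script):

# Print a layer-by-layer overview with output shapes and parameter counts
model.summary()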

Finally, if we are running this code for the first time, or the model needs to be retrained, we use this code sample:

from tensorflow.keras.optimizers import Adam

# On newer versions of Keras, use learning_rate instead of lr
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.0001, decay=1e-6),
              metrics=['accuracy'])

model_info = model.fit_generator(
        train_generator,
        steps_per_epoch=num_train // batch_size,
        epochs=num_epoch,
        validation_data=validation_generator,
        validation_steps=num_val // batch_size)
plot_model_history(model_info)  # custom helper; a sketch of it follows below
model.save_weights('model.h5')

It uses the Adam optimizer and categorical cross-entropy as its loss function. Finally, it stores the generated weights in a .h5 file so they can be loaded quickly on later runs.
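The plot_model_history call above refers to a small custom helper whose implementation isn't shown in this post. A minimal sketch, assuming it simply plots the accuracy and loss curves from the Keras History object, could look like this:

import matplotlib.pyplot as plt

def plot_model_history(model_history):
    """Plot training/validation accuracy and loss from a Keras History object."""
    fig, axs = plt.subplots(1, 2, figsize=(15, 5))
    hist = model_history.history
    # accuracy curves (older Keras versions name these 'acc' / 'val_acc')
    axs[0].plot(hist['accuracy'], label='train')
    axs[0].plot(hist['val_accuracy'], label='validation')
    axs[0].set_title('Model Accuracy')
    axs[0].set_xlabel('Epoch')
    axs[0].set_ylabel('Accuracy')
    axs[0].legend()
    # loss curves
    axs[1].plot(hist['loss'], label='train')
    axs[1].plot(hist['val_loss'], label='validation')
    axs[1].set_title('Model Loss')
    axs[1].set_xlabel('Epoch')
    axs[1].set_ylabel('Loss')
    axs[1].legend()
    plt.show()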

DETECTING FACES

Before we analyze emotions from faces in a frame, we need to start a video source, prepare each image, and find the faces/regions of interest (ROIs) in it. Also, if the model has already been trained, we can simply load the saved weights instead of retraining. This code sample does all of that:

print("[*] Loading the model...")
    model.load_weights('model.h5')

    # prevents openCL usage and unnecessary logging messages
    cv2.ocl.setUseOpenCL(False)

    # dictionary which assigns each label an emotion (alphabetical order)
    emotion_dict = {0: "Angry", 1: "Disgusted", 2: "Fearful", 3: "Happy", 4: "Neutral", 5: "Sad", 6: "Surprised"}

    # start the webcam feed
    print("[*] Loading camera...")
    cap = cv2.VideoCapture(0)
    while True:
        # Find haar cascade to draw bounding box around face
        ret, frame = cap.read()
        frame = cv2.rotate(frame, cv2.ROTATE_180)
        if not ret:
            print("[*] Camera not found")
            break

        facecasc = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = facecasc.detectMultiScale(gray,scaleFactor=1.3, minNeighbors=5)

        cv2.imwrite("original_fr.png", frame)
        cv2.imwrite("grayscale_fr.png", gray)

Finally, for every face found in the frame, let’s make a prediction and draw a labeled rectangle around it in the output frame.

    for (x, y, w, h) in faces:
        # bounding box, padded above the face to leave room for the label
        cv2.rectangle(frame, (x, y-50), (x+w, y+h+10), (255, 0, 0), 2)
        # crop the face ROI, resize to 48x48, and add batch/channel dims -> (1, 48, 48, 1)
        roi_gray = gray[y:y + h, x:x + w]
        cropped_img = np.expand_dims(np.expand_dims(cv2.resize(roi_gray, (48, 48)), -1), 0)
        # rescale to [0, 1] to match the training generator's rescale=1./255
        cropped_img = cropped_img / 255.0
        prediction = model.predict(cropped_img)
        maxindex = int(np.argmax(prediction))
        cv2.putText(frame, emotion_dict[maxindex], (x+20, y-60), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
        if debug:
            print(emotion_dict[maxindex])

    # show the annotated frame; comment this out when running headless (e.g. on a robot without a display)
    cv2.imshow('Video', cv2.resize(frame, (1600, 960), interpolation=cv2.INTER_CUBIC))
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

CONCLUSION

Just as an infant may start crying when you look at them with an angry face, this model lets robots read and judge our emotions simply by looking at us. It complements the voice-based emotion recognition from the last part of this mini-series.

This code was tested on a Raspberry Pi 4 powered humanoid, Shelbot (one of my ongoing projects).

The full code for this project can be found here.

ABOUT ME

GitHub: https://github.com/LakshBhambhani

LinkedIn: https://www.linkedin.com/in/lakshbhambhani/

Laksh Bhambhani is a Student Ambassador in the Inspirit AI Student Ambassadors Program. Inspirit AI is a pre-collegiate enrichment program that exposes curious high school students globally to AI through live online classes. Learn more at https://www.inspiritai.com/.

https://lakshbhambhani.medium.com/giving-robots-an-infants-sense-of-vision-1008f720a669
