Facial Emotion Recognition using
Transfer Learning (MobileNetV2)
Author: Halil Ibrahim Ibali
Generated: 2026-01-15
Abstract
Facial Emotion Recognition (FER) aims to classify human affective states from facial images.
This report documents an end-to-end FER pipeline implemented in TensorFlow/Keras
using transfer learning with MobileNetV2. The workflow loads an image dataset from a
directory structure, applies batching and data pipeline optimizations
(cache/shuffle/prefetch), performs data augmentation, trains a MobileNetV2-based
classifier with early stopping and learning-rate scheduling, and evaluates performance
using accuracy, a classification report, and a confusion matrix. In addition, a simple learning-profile demo visualizes predicted emotions over an ordered sequence of test samples. The
described approach is computationally efficient and suitable for practical FER deployments
where training data and compute may be limited.
1. Introduction
FER is widely used in human–computer interaction, user experience analytics, affective
computing, and assistive systems. However, FER performance is affected by illumination
changes, pose variation, occlusions, and limited labeled data. Transfer learning addresses
these constraints by reusing features learned from large-scale datasets (e.g., ImageNet) and
fine-tuning them for emotion classes.
This document focuses on **implementation-level details** of the provided notebook,
including:
Dataset loading and class discovery from directory structure
tf.data pipeline optimization (cache, shuffle, prefetch)
Data augmentation configuration
MobileNetV2 transfer learning model construction
Training callbacks (EarlyStopping, ReduceLROnPlateau, ModelCheckpoint)
Evaluation pipeline (classification report + confusion matrix)
Learning profile (emotion timeline) demo
2. Environment and Reproducibility
The notebook uses Python with TensorFlow/Keras and scikit-learn. Key imports include:
# %pip install pandas numpy matplotlib scikit-learn tensorflow
import warnings

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.metrics import (
    ConfusionMatrixDisplay,
    classification_report,
    confusion_matrix,
)
from tensorflow.keras import layers, models
from tensorflow.keras.callbacks import (
    EarlyStopping,
    ModelCheckpoint,
    ReduceLROnPlateau,
)

# Silence Python warnings and TensorFlow log output below ERROR level.
warnings.filterwarnings('ignore')
tf.get_logger().setLevel('ERROR')
Important runtime constants detected in the notebook:
Parameter     Value (from notebook)
IMAGE_SIZE    96
BATCH_SIZE    32
DATASET_DIR   'c:\\Users\halil.ibali\Desktop\IntroductionToDataScience\Project\FaceEmotionRecognition\\archive\processed_data'
Note: The dataset path in the notebook is a local Windows path. For portability, you should
replace DATASET_DIR with a relative path (e.g., './data/processed_data') or an environment
variable.
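One way to make the path portable is sketched below. It assumes an environment variable named `FER_DATASET_DIR` (a hypothetical name, not taken from the notebook) and falls back to the relative path suggested above; the helper `resolve_dataset_dir` is likewise illustrative.

```python
import os
from pathlib import Path


def resolve_dataset_dir(env_var="FER_DATASET_DIR", default="./data/processed_data"):
    """Return the dataset directory from an environment variable, else a default.

    Both the variable name and the default path are illustrative assumptions.
    """
    return Path(os.environ.get(env_var, default)).expanduser()


DATASET_DIR = resolve_dataset_dir()
print(DATASET_DIR)
```

The resulting `Path` can be passed directly to the dataset loader, so the same notebook would run on another machine after setting one variable.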
3. Dataset Loading and Splitting
The dataset is loaded using `tf.keras.preprocessing.image_dataset_from_directory`, which
automatically:
Reads images from subfolders (one subfolder per class)
Assigns integer labels from the alphanumerically sorted folder names
Creates a tf.data Dataset with batching
Provides `dataset.class_names` to list detected emotion classes
Code used for dataset loading:
IMAGE_SIZE = 96
BATCH_SIZE = 32
# Raw string: every backslash is literal, so the value equals the notebook's path.
DATASET_DIR = r'c:\Users\halil.ibali\Desktop\IntroductionToDataScience\Project\FaceEmotionRecognition\archive\processed_data'

dataset = tf.keras.preprocessing.image_dataset_from_directory(
    DATASET_DIR,
    seed=123,
    shuffle=True,
    image_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE
)
class_names = dataset.class_names
n_classes = len(class_names)
print("Classes:", class_names)
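The label assignment performed by the loader can be imitated with the standard library alone. The sketch below is not the Keras implementation; it only mirrors the documented convention that class folders are sorted alphanumerically and numbered from zero. The folder names are hypothetical.

```python
import tempfile
from pathlib import Path


def discover_classes(root):
    """List class subfolders in sorted order; the list position is the label."""
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


# Hypothetical emotion folders, created in a temporary directory for the demo.
with tempfile.TemporaryDirectory() as root:
    for name in ["sad", "happy", "angry"]:
        (Path(root) / name).mkdir()
    class_names_demo = discover_classes(root)
    label_of = {name: idx for idx, name in enumerate(class_names_demo)}

print(class_names_demo)  # ['angry', 'happy', 'sad']
print(label_of)          # {'angry': 0, 'happy': 1, 'sad': 2}
```

This ordering is what ties the integer predictions back to emotion names in the evaluation sections.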
After loading, the notebook creates train/validation/test subsets (as shown by the presence
of `train_ds`, `val_ds`, and `test_ds`). These splits ensure that evaluation is performed on
unseen data.
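The split code itself is not reproduced in this report. A common pattern for a batched tf.data dataset is to divide it by batch count with `take` and `skip`; the sketch below computes those counts. The 80/10/10 fractions and the helper name `split_counts` are assumptions, not values from the notebook.

```python
def split_counts(n_batches, train_frac=0.8, val_frac=0.1):
    """Return (train, val, test) batch counts; test receives the remainder."""
    if not 0 < train_frac + val_frac < 1:
        raise ValueError("train_frac + val_frac must lie strictly between 0 and 1")
    n_train = int(n_batches * train_frac)
    n_val = int(n_batches * val_frac)
    return n_train, n_val, n_batches - n_train - n_val


n_train, n_val, n_test = split_counts(100)
print(n_train, n_val, n_test)  # 80 10 10

# With a batched tf.data dataset, the counts would be applied as:
#   train_ds = dataset.take(n_train)
#   val_ds   = dataset.skip(n_train).take(n_val)
#   test_ds  = dataset.skip(n_train + n_val)
```

Giving the remainder to the test split ensures that no batch is dropped when the fractions do not divide the batch count evenly.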
4. Data Pipeline Optimization (cache/shuffle/prefetch)
To improve training throughput, the notebook applies caching, shuffling, and prefetching.
`cache()` keeps the loaded data in memory after the first epoch, `shuffle(1000)` randomizes the
order of dataset elements (whole batches, when applied to an already batched dataset) using a
1000-element buffer, and `prefetch(AUTOTUNE)` overlaps input preparation with model execution.
Optimization code:
AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
test_ds = test_ds.cache().prefetch(buffer_size=AUTOTUNE)
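The effect of placing `shuffle` after `cache` can be observed on a toy dataset. The sketch below uses ten integers as a stand-in for image batches; it is an illustration of the ordering, not a benchmark of throughput.

```python
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE

# Toy stand-in for the image dataset: ten integers instead of image batches.
toy_ds = tf.data.Dataset.range(10)

# Same ordering as the notebook: cache first, then shuffle, then prefetch.
toy_ds = toy_ds.cache().shuffle(10, seed=0).prefetch(buffer_size=AUTOTUNE)

# Each full pass over the dataset corresponds to one epoch.
epoch_1 = [int(x) for x in toy_ds.as_numpy_iterator()]
epoch_2 = [int(x) for x in toy_ds.as_numpy_iterator()]

print(epoch_1)
print(epoch_2)
# Both epochs contain exactly the same elements, whatever their order.
print(sorted(epoch_1) == sorted(epoch_2) == list(range(10)))
```

Because the cache sits before the shuffle, the stored elements are reused while the order is drawn again on each pass. Caching after shuffling would instead freeze the first epoch's order.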
5. Data Augmentation
Data augmentation is used to reduce overfitting by introducing realistic input variations.
The notebook defines a Keras `Sequential` augmentation pipeline applied during training.
Augmentation code:
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.15),
    layers.RandomZoom(0.1),
    layers.RandomContrast(0.1),
])
Augmentation operations in this notebook:
Random horizontal flip
Random rotation (0.15)
Random zoom (0.1)
Random contrast (0.1)
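The numeric arguments above are fractions rather than physical units. As a rough guide, assuming the documented Keras conventions for a single float argument (rotation as a fraction of a full turn, zoom as a fraction of image size, contrast as a deviation from a multiplier of 1.0), they translate as follows. The helper names are illustrative.

```python
def rotation_range_degrees(factor):
    """RandomRotation(factor): angle sampled from [-factor, +factor] of 360 degrees."""
    return (-factor * 360.0, factor * 360.0)


def zoom_range_percent(factor):
    """RandomZoom(factor): zoom amount sampled from [-factor, +factor] of image size."""
    return (-factor * 100.0, factor * 100.0)


def contrast_multiplier_range(factor):
    """RandomContrast(factor): contrast multiplier sampled from [1 - factor, 1 + factor]."""
    return (1.0 - factor, 1.0 + factor)


print(rotation_range_degrees(0.15))    # roughly (-54, 54) degrees
print(zoom_range_percent(0.1))         # roughly (-10, 10) percent
print(contrast_multiplier_range(0.1))  # roughly (0.9, 1.1)
```

A rotation of up to about 54 degrees is fairly aggressive for upright face images, so this value may be worth revisiting if augmented samples look unrealistic.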
6. Model Architecture (MobileNetV2 Transfer Learning)
The model uses MobileNetV2 as a pretrained convolutional feature extractor. MobileNetV2
is efficient due to inverted residual blocks and depthwise separable convolutions, making it
suitable for real-time and resource-constrained settings.
Architecture code (base model + classification head + compile):
# Pretrained backbone without its ImageNet classification head.
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3),
    include_top=False,
    weights='imagenet'
)
base_model.trainable = False  # freeze the backbone; only the new head is trained

model = tf.keras.Sequential([
    data_augmentation,
    layers.Rescaling(1./255),
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.4),
    layers.Dense(n_classes, activation='softmax')
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    metrics=['accuracy']
)
model.summary()
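One detail of the code above may be worth checking. The Keras ImageNet weights for MobileNetV2 expect inputs scaled to [-1, 1], which is what `tf.keras.applications.mobilenet_v2.preprocess_input` produces, whereas `Rescaling(1./255)` yields [0, 1]. The NumPy sketch below only contrasts the two ranges; whether switching changes accuracy on this dataset has not been tested here.

```python
import numpy as np

# Three representative 8-bit pixel intensities: black, mid-gray, white.
pixels = np.array([0.0, 127.5, 255.0], dtype=np.float32)

scaled_01 = pixels / 255.0          # range produced by Rescaling(1./255)
scaled_pm1 = pixels / 127.5 - 1.0   # [-1, 1] convention of MobileNetV2 preprocessing

print(scaled_01)   # values 0, 0.5, 1
print(scaled_pm1)  # values -1, 0, 1
```

In Keras, the second convention could be expressed as `layers.Rescaling(1./127.5, offset=-1)` in place of `layers.Rescaling(1./255)`.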
Key design choices in the notebook:
Pretrained MobileNetV2 backbone (ImageNet weights)
GlobalAveragePooling2D to reduce spatial feature maps
Dense(128, ReLU) as a compact classifier layer
Dropout(0.4) for regularization
Softmax output with `n_classes` units
Adam optimizer with learning_rate=0.001
SparseCategoricalCrossentropy loss (integer labels)
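With the backbone frozen, only the two Dense layers are trained. The short calculation below estimates their parameter count. It relies on the 1280-channel output of the standard MobileNetV2 backbone; the class count of 7 is a hypothetical value, since the report does not state `n_classes`.

```python
# Channels output by MobileNetV2 (width multiplier 1.0) with include_top=False.
MOBILENETV2_FEATURES = 1280


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def head_params(n_classes, hidden=128, n_features=MOBILENETV2_FEATURES):
    """Trainable parameters of GAP -> Dense(hidden) -> Dropout -> Dense(n_classes).

    Global average pooling and dropout contribute no parameters.
    """
    return dense_params(n_features, hidden) + dense_params(hidden, n_classes)


print(head_params(7))  # 164871 for a hypothetical 7 classes
```

The figure can be compared against the trainable-parameter line printed by `model.summary()` as a sanity check.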
7. Training Strategy and Callbacks
The notebook uses callbacks to stabilize training and automatically keep the best model.
EarlyStopping prevents overfitting by monitoring validation loss; ReduceLROnPlateau
lowers the learning rate when validation loss plateaus; ModelCheckpoint saves the best
weights.
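A configuration of the three callbacks might look like the sketch below. All numeric values, the monitored quantity for the checkpoint, and the checkpoint filename are illustrative assumptions; they are not taken from the notebook.

```python
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau

# Stop when validation loss has not improved for 5 epochs and restore the best weights.
early_stop = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)

# Halve the learning rate after 2 epochs without improvement, down to a floor of 1e-6.
reduce_lr = ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2, min_lr=1e-6)

# Keep only the best model seen so far (hypothetical filename).
checkpoint = ModelCheckpoint("best_model.keras", monitor="val_loss", save_best_only=True)

callbacks = [early_stop, reduce_lr, checkpoint]

# Typical use with the model and datasets defined earlier (epoch count is also assumed):
# history = model.fit(train_ds, validation_data=val_ds, epochs=30, callbacks=callbacks)
```

Setting the learning-rate patience lower than the early-stopping patience gives the reduced rate a chance to help before training is stopped.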
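Accuracy curves plotted after training (notebook fragment):
# The training-accuracy line is assumed from the plot title.
plt.plot(history.history['accuracy'], label='Train Acc')
plt.plot(history.history['val_accuracy'], label='Val Acc')
plt.legend()
plt.title("Training & Validation Accuracy")
plt.show()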
8. Evaluation
After training, the model is evaluated on the test set. The notebook computes predicted
labels and prints a classification report (precision, recall, F1-score) and visualizes the
confusion matrix.
Evaluation code snippet:
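# Collect true and predicted labels over the test set.
y_true = []
y_pred = []
for images, labels in test_ds:
    preds = model.predict(images)
    y_true.extend(labels.numpy())
    y_pred.extend(np.argmax(preds, axis=1))

# Classification report (precision, recall, F1-score per class).
y_true_array = np.array(y_true)
y_pred_array = np.array(y_pred)
print("Classification Report (per class):\n")
print(classification_report(
    y_true_array,
    y_pred_array,
    target_names=class_names,
    digits=4
))

# Confusion matrix.
cm = confusion_matrix(y_true, y_pred)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=class_names)
disp.plot(xticks_rotation=45, cmap='Blues')
plt.title("Confusion Matrix - Facial Emotion Recognition")
plt.show()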
Interpretation guideline: A strong diagonal in the confusion matrix indicates correct
predictions; off-diagonal entries highlight confusions between emotions (often between
visually similar expressions).
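The interpretation guideline can be made quantitative with a few lines of NumPy. The sketch below derives per-class recall from the diagonal and locates the largest off-diagonal entry. The 3-class matrix is a toy example, not a result from the notebook, and the helper names are illustrative.

```python
import numpy as np


def per_class_recall(cm):
    """Diagonal divided by row sums: share of each true class predicted correctly."""
    cm = np.asarray(cm, dtype=float)
    row_sums = cm.sum(axis=1)
    return np.divide(np.diag(cm), row_sums,
                     out=np.zeros_like(row_sums), where=row_sums > 0)


def most_confused_pair(cm):
    """Return (true_idx, pred_idx) of the largest off-diagonal entry."""
    off = np.asarray(cm, dtype=float).copy()
    np.fill_diagonal(off, -1)  # exclude correct predictions from the search
    true_idx, pred_idx = np.unravel_index(np.argmax(off), off.shape)
    return int(true_idx), int(pred_idx)


# Toy matrix (rows = true class, columns = predicted class); not real results.
toy_cm = [[8, 1, 1],
          [2, 6, 2],
          [0, 5, 5]]

print(per_class_recall(toy_cm))    # [0.8 0.6 0.5]
print(most_confused_pair(toy_cm))  # (2, 1)
```

Applied to the notebook's `cm`, the returned indices can be mapped through `class_names` to name the pair of emotions the model confuses most often.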
9. Learning Profile Demo (Emotion Timeline)
The notebook contains a simple demo that treats test predictions as a time-ordered
sequence and builds a DataFrame to visualize an emotion profile over time. This can be
extended to create user-specific learning profiles or reading-session emotion timelines.
Learning profile code (from notebook section 11):
# 11: Simple Learning Profile (Emotion Timeline Demo)
# y_true and y_pred come from the confusion-matrix cell.
# They are used here to draw a simple "emotion over time" plot.

# 1) Build a DataFrame with time steps and class names
time_steps = np.arange(len(y_pred))  # treat each test sample as one time step
emotion_profile_df = pd.DataFrame({
    "time_step": time_steps,
    "true_label_idx": y_true,
    "pred_label_idx": y_pred
})
emotion_profile_df["true_label"] = emotion_profile_df["true_label_idx"].apply(
    lambda i: class_names[i])
emotion_profile_df["pred_label"] = emotion_profile_df["pred_label_idx"].apply(
    lambda i: class_names[i])

# 2) Share of each emotion among all predictions (distribution)
print("Predicted Emotion Distribution (%):\n")
emotion_distribution = (
    emotion_profile_df["pred_label"]
    .value_counts(normalize=True)
    .sort_index() * 100
)
print(emotion_distribution.round(2))

# 3) Predicted emotion along the time axis (a very simple timeline)
plt.figure(figsize=(10, 4))
# Plot the predicted label as its numeric class index
emotion_idx_seq = emotion_profile_df["pred_label_idx"].values
plt.plot(emotion_profile_df["time_step"], emotion_idx_seq, marker='o', linestyle='-')
plt.yticks(ticks=range(len(class_names)), labels=class_names, rotation=45)
plt.xlabel("Time step (test sample index)")
plt.ylabel("Predicted Emotion")
plt.title("Emotion Timeline - Simple Learning Profile Demo")
plt.tight_layout()
plt.show()
10. Algorithm Summary (Pseudocode)
High-level procedure implemented by the notebook:
Algorithm 1: Facial Emotion Recognition with MobileNetV2 Transfer Learning
1: Load image dataset from DATASET_DIR using image_dataset_from_directory
2: Extract class_names and n_classes
3: Split dataset into train_ds, val_ds, test_ds
4: Optimize pipelines: cache → shuffle (train) → prefetch (all)
5: Define data_augmentation (flip/rotate/zoom/contrast)
6: Load base_model = MobileNetV2(pretrained on ImageNet)
7: Build classification head: GAP → Dense(128, ReLU) → Dropout(0.4) → Dense(n_classes, softmax)
8: Compile model with Adam(lr=0.001) and SparseCategoricalCrossentropy
9: Train with callbacks: EarlyStopping, ReduceLROnPlateau, ModelCheckpoint
10: Evaluate on test set; compute classification report and confusion matrix
11: (Optional) Build emotion timeline DataFrame for learning-profile visualization
11. Practical Notes and Recommended Improvements
Make DATASET_DIR configurable (argument, env var, or relative path).
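Save the trained model for deployment, e.g. model.save('fer_model.keras') for the native Keras format. Keras 3 rejects an extension-less path in model.save and provides model.export('saved_model/') for SavedModel output.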
Report dataset statistics: number of images per class, train/val/test sizes.
Include training curves (accuracy/loss vs epoch) for the Results section.
If classes are imbalanced, consider class weights or balanced sampling.
For reproducibility, set seeds for Python/NumPy/TensorFlow and log package versions.
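The class-weight suggestion above can be sketched with NumPy. The formula used here, n_samples / (n_classes * count), is the common "balanced" heuristic; the label list is a toy example and the helper name is illustrative.

```python
import numpy as np


def balanced_class_weights(labels, n_classes):
    """weight_c = n_samples / (n_classes * count_c); absent classes get weight 0."""
    counts = np.bincount(np.asarray(labels), minlength=n_classes).astype(float)
    weights = np.zeros(n_classes)
    present = counts > 0
    weights[present] = len(labels) / (n_classes * counts[present])
    return {i: float(w) for i, w in enumerate(weights)}


# Toy, imbalanced label list: six samples of class 0, three of class 1, one of class 2.
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
print(balanced_class_weights(toy_labels, n_classes=3))
```

The returned dictionary has the form accepted by the `class_weight` argument of `model.fit`, so rarer emotions contribute more to the loss.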