Assignment 1

Deep Learning & SVM Classification on MNIST & FashionMNIST

Overview

Objective

Train ResNet-18, ResNet-50, and SVM classifiers with various hyperparameter settings on the MNIST and FashionMNIST datasets.

Key Tasks

  • Q1(a): Deep learning classification with ResNet
  • Q1(b): SVM classification with polynomial (poly) and RBF kernels
  • Q2: CPU vs GPU performance analysis

Technologies

PyTorch · CUDA · scikit-learn · AMP (automatic mixed precision)

Key Results

  • 99.20%: best MNIST accuracy (ResNet-18), with batch size 16, SGD, LR=0.001
  • 92.60%: best FashionMNIST accuracy (ResNet-50), with batch size 16, Adam, LR=0.0001
  • 97.61%: best SVM accuracy on MNIST, with the poly kernel, C=10.0
  • ~30x: GPU speedup over CPU training

Q1(a) Deep Learning Results - MNIST

| Batch Size | Optimizer | Learning Rate | ResNet-18 (%) | ResNet-50 (%) |
|------------|-----------|---------------|---------------|---------------|
| 16         | SGD       | 0.001         | 99.20         | 99.11         |
| 16         | SGD       | 0.0001        | 97.70         | 97.27         |
| 16         | Adam      | 0.001         | 98.19         | 98.46         |
| 16         | Adam      | 0.0001        | 99.15         | 98.24         |
| 32         | SGD       | 0.001         | 98.95         | 98.67         |
| 32         | SGD       | 0.0001        | 96.59         | 94.21         |
| 32         | Adam      | 0.001         | 98.96         | 98.75         |
| 32         | Adam      | 0.0001        | 98.71         | 98.02         |

* Results with pin_memory=True, Epochs=5

Q1(a) Deep Learning Results - FashionMNIST

| Batch Size | Optimizer | Learning Rate | ResNet-18 (%) | ResNet-50 (%) |
|------------|-----------|---------------|---------------|---------------|
| 16         | SGD       | 0.001         | 92.06         | 91.97         |
| 16         | SGD       | 0.0001        | 90.41         | 89.63         |
| 16         | Adam      | 0.001         | 92.11         | 89.11         |
| 16         | Adam      | 0.0001        | 92.43         | 92.60         |
| 32         | SGD       | 0.001         | 90.59         | 89.71         |
| 32         | SGD       | 0.0001        | 88.86         | 85.04         |
| 32         | Adam      | 0.001         | 91.12         | 50.81         |
| 32         | Adam      | 0.0001        | 92.06         | 92.43         |

* Results with pin_memory=False, Epochs=10
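The pin_memory flag noted under both tables controls whether DataLoader batches are allocated in page-locked host memory, which lets host-to-GPU copies run asynchronously when paired with non_blocking=True. A minimal sketch, with random tensors standing in for FashionMNIST:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random data in place of FashionMNIST; shapes match the real dataset.
data = TensorDataset(torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,)))
loader = DataLoader(data, batch_size=32, shuffle=True,
                    pin_memory=torch.cuda.is_available())

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
for images, labels in loader:
    # non_blocking only overlaps the copy with compute when the source
    # tensor is pinned; otherwise it silently falls back to a sync copy.
    images = images.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)
print(images.shape)  # torch.Size([32, 1, 28, 28])
```

Pinning costs host RAM and allocation time, which may explain why its benefit varies between runs.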

Q1(b) SVM Classification Results

| Dataset      | Kernel | C    | Accuracy (%) | Train Time (ms) |
|--------------|--------|------|--------------|-----------------|
| MNIST        | poly   | 10.0 | 97.61        | 286,093         |
| MNIST        | rbf    | 10.0 | 96.99        | 368,092         |
| FashionMNIST | rbf    | 10.0 | 89.89        | 479,740         |
| FashionMNIST | poly   | 10.0 | 89.84        | 367,438         |
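A comparison like the one above can be sketched with scikit-learn's SVC. This sketch uses the library's small built-in 8x8 digits set as a stand-in for full MNIST, so accuracies and times will differ from the table; the kernels and C=10.0 match it.

```python
import time
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel values to [0, 1]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

for kernel in ("poly", "rbf"):
    clf = SVC(kernel=kernel, C=10.0)
    t0 = time.perf_counter()
    clf.fit(X_train, y_train)
    train_ms = (time.perf_counter() - t0) * 1000
    acc = accuracy_score(y_test, clf.predict(X_test))
    print(f"{kernel}: accuracy {acc:.4f}, train time {train_ms:.0f} ms")
```

SVC trains in roughly O(n²)–O(n³) time in the number of samples, which is why the full-MNIST times in the table run to several minutes.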

Q2 CPU vs GPU Performance (FashionMNIST)

| Compute | Model     | Accuracy (%) | Train Time (ms) | FLOPs  |
|---------|-----------|--------------|-----------------|--------|
| GPU     | ResNet-18 | 87.24        | 62,192          | 1.824G |
| CPU     | ResNet-18 | 87.40        | 1,149,299       | 1.824G |
| GPU     | ResNet-50 | 85.69        | 137,549         | 4.132G |
| CPU     | ResNet-50 | 83.47        | 4,234,740       | 4.132G |
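Timing GPU code fairly requires synchronizing before reading the clock, because CUDA kernels launch asynchronously. A minimal timing harness under that assumption is sketched below; a tiny linear model stands in for the ResNets, and the FLOPs figures above are presumably produced by a separate FLOP-counting tool (e.g. fvcore or ptflops), not by a harness like this.

```python
import time
import torch

def time_forward(model, x, device):
    """Return the wall-clock time of one forward pass, in milliseconds."""
    model = model.to(device).eval()
    x = x.to(device)
    if device.type == "cuda":
        torch.cuda.synchronize()  # drain pending kernels before starting
    t0 = time.perf_counter()
    with torch.no_grad():
        model(x)
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for the forward pass to finish
    return (time.perf_counter() - t0) * 1000

# Stand-in model; a real comparison would pass the trained ResNets here.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.randn(32, 1, 28, 28)

cpu_ms = time_forward(model, x, torch.device("cpu"))
print(f"CPU: {cpu_ms:.2f} ms")
if torch.cuda.is_available():
    print(f"GPU: {time_forward(model, x, torch.device('cuda')):.2f} ms")
```

Without the synchronize calls, the timer can stop while GPU work is still in flight, making the GPU look arbitrarily fast.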

Key Insights

Optimizer Choice

SGD with LR=0.001 gave the highest accuracies overall (99.20% on MNIST). Adam was more sensitive to the learning rate: it trained stably at LR=0.0001 but collapsed in one FashionMNIST run (ResNet-50, batch 32, LR=0.001, 50.81%).

Batch Size Impact

The smaller batch size (16) yielded higher accuracy in almost every configuration. Larger batches run faster per epoch but typically need the learning rate retuned to match.

GPU Acceleration

The GPU delivered roughly an 18x speedup for ResNet-18 and 31x for ResNet-50, so the larger model benefited more from GPU parallelization.