
Assignment 2

CNN Training on CIFAR-10 with Gradient & Weight Visualization

Overview

Objective

Train a CNN on CIFAR-10 with a custom dataloader, and visualize gradient flow and weight updates using Weights & Biases (W&B).

Key Tasks

  • Custom CIFAR-10 DataLoader
  • SimpleCNN model (~500K params)
  • FLOPs counting with ptflops (see the sketch after this list)
  • Gradient flow visualization
  • Weight update tracking
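
As a sketch of the FLOPs-counting step, ptflops' `get_model_complexity_info` can report complexity and parameter counts for a 3 × 32 × 32 CIFAR-10 input. The model class name `SimpleCNN` is assumed from the tables below, and note that ptflops reports multiply-accumulate operations (MACs) rather than raw FLOPs.

```python
# Sketch of complexity counting with ptflops (model class name is assumed).
from ptflops import get_model_complexity_info

model = SimpleCNN()  # assumed model class, sketched later on this page
macs, params = get_model_complexity_info(
    model,
    (3, 32, 32),                 # CIFAR-10 input resolution
    as_strings=True,             # human-readable output, e.g. "0.04 GMac"
    print_per_layer_stat=True,   # per-layer breakdown printed to stdout
)
print(f"Computational complexity: {macs}")
print(f"Parameters: {params}")
```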

Technologies

PyTorch, W&B, ptflops, CUDA

Model & Results

  • Model Architecture: SimpleCNN (3 Conv Blocks + 2 FC Layers)
  • Parameters: ~500K (lightweight for fast training)
  • Epochs: 30 (with cosine annealing)
  • Logging: W&B (all visualizations tracked)

SimpleCNN Architecture

| Layer        | Type                         | Output Shape | Parameters |
|--------------|------------------------------|--------------|------------|
| Input        | -                            | 3 × 32 × 32  | -          |
| Conv Block 1 | Conv2d + BN + ReLU + MaxPool | 32 × 16 × 16 | ~1K        |
| Conv Block 2 | Conv2d + BN + ReLU + MaxPool | 64 × 8 × 8   | ~18K       |
| Conv Block 3 | Conv2d + BN + ReLU + MaxPool | 128 × 4 × 4  | ~74K       |
| FC1          | Linear + ReLU + Dropout      | 256          | ~524K      |
| FC2 (Output) | Linear                       | 10           | ~2.5K      |
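
The exact implementation is not reproduced here, so the following is a minimal sketch consistent with the table above; the 3 × 3 kernels, padding of 1, and 0.5 dropout rate are assumptions.

```python
import torch.nn as nn

class SimpleCNN(nn.Module):
    """Sketch of a lightweight CNN matching the layer table above."""
    def __init__(self, num_classes=10):
        super().__init__()

        def block(in_ch, out_ch):
            # Conv2d + BatchNorm + ReLU + MaxPool, halving the spatial size
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            )

        self.features = nn.Sequential(
            block(3, 32),    # -> 32 x 16 x 16
            block(32, 64),   # -> 64 x 8 x 8
            block(64, 128),  # -> 128 x 4 x 4
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 4 * 4, 256),  # FC1, ~524K parameters
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),              # dropout rate is an assumption
            nn.Linear(256, num_classes),  # FC2, ~2.5K parameters
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```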

Training Configuration

| Parameter            | Value            |
|----------------------|------------------|
| Dataset              | CIFAR-10         |
| Train/Val/Test Split | 40K / 10K / 10K  |
| Batch Size           | 128              |
| Optimizer            | Adam             |
| Learning Rate        | 0.001            |
| Weight Decay         | 1e-4             |
| LR Scheduler         | Cosine Annealing |
| Epochs               | 30               |
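
A minimal sketch of this setup, assuming torchvision's CIFAR-10 dataset and a random 40K/10K split of the official 50K training images; variable names and the plain `ToTensor` transform are illustrative.

```python
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

transform = transforms.ToTensor()  # augmentation sketch shown under Key Findings
full_train = datasets.CIFAR10("./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10("./data", train=False, download=True, transform=transform)

# 40K train / 10K val split of the official training set
train_set, val_set = random_split(full_train, [40_000, 10_000])

train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=2)
val_loader = DataLoader(val_set, batch_size=128, shuffle=False, num_workers=2)
test_loader = DataLoader(test_set, batch_size=128, shuffle=False, num_workers=2)

model = SimpleCNN().cuda()  # SimpleCNN as sketched above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=30)
```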

Visualizations

Gradient Flow

Bar charts of the maximum and average gradient magnitude per layer, which help detect vanishing or exploding gradients.
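
A sketch of how these per-layer statistics could be collected right after `loss.backward()` and logged to W&B; the metric names are illustrative.

```python
import wandb

def log_gradient_flow(model, step):
    """Log max/mean absolute gradient per parameter tensor to W&B."""
    stats = {}
    for name, p in model.named_parameters():
        if p.grad is not None:
            grad = p.grad.detach().abs()
            stats[f"grad_mean/{name}"] = grad.mean().item()
            stats[f"grad_max/{name}"] = grad.max().item()
    wandb.log(stats, step=step)
```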

Weight Histograms

Distribution of weights in each layer, logged every 5 epochs to W&B for analysis.
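
One way to log these histograms, assuming an active `wandb.init(...)` run and the 5-epoch interval stated above.

```python
import wandb

def log_weight_histograms(model, epoch):
    """Log a histogram of each layer's weights to W&B every 5 epochs."""
    if epoch % 5 != 0:
        return
    hists = {
        f"weights/{name}": wandb.Histogram(p.detach().cpu().numpy())
        for name, p in model.named_parameters()
    }
    wandb.log(hists, step=epoch)
```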

Weight Updates

Tracking how much the weights change between epochs, which reveals the learning dynamics of each layer.
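
A sketch of this tracking, keeping a snapshot of the previous epoch's weights and logging a per-layer change metric; the relative-norm formulation is an assumption.

```python
import copy
import wandb

prev_weights = copy.deepcopy(model.state_dict())  # snapshot before the first epoch

def log_weight_updates(model, prev_weights, epoch):
    """Log the relative change of each parameter tensor since the last epoch."""
    updates = {}
    for name, p in model.state_dict().items():
        if p.dtype.is_floating_point:
            delta = (p - prev_weights[name]).norm()
            updates[f"update/{name}"] = (delta / (prev_weights[name].norm() + 1e-12)).item()
    wandb.log(updates, step=epoch)
    return copy.deepcopy(model.state_dict())  # becomes prev_weights for the next epoch
```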

Training Curves

Loss and accuracy curves for training and validation, plus learning rate schedule.
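
The per-epoch curves can be logged in a single call; `train_loss`, `train_acc`, `val_loss`, and `val_acc` are placeholders for metrics computed during that epoch.

```python
wandb.log({
    "train/loss": train_loss,
    "train/acc": train_acc,
    "val/loss": val_loss,
    "val/acc": val_acc,
    "lr": scheduler.get_last_lr()[0],  # cosine-annealed learning rate
}, step=epoch)
```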

Key Findings

  • Gradient Flow: Gradients flow properly through all layers without vanishing/exploding issues
  • BatchNorm Effect: Helps maintain stable gradient magnitudes across layers
  • Weight Updates: FC layers show larger updates than conv layers
  • Learning Dynamics: Cosine annealing provides smooth convergence
  • Data Augmentation: RandomCrop and HorizontalFlip improve generalization (see the transform sketch below)
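
For reference, a typical CIFAR-10 training-split augmentation pipeline matching that finding; the normalization statistics are the commonly used CIFAR-10 values, not values taken from this assignment.

```python
from torchvision import transforms

# RandomCrop with padding + HorizontalFlip, applied to the training split only
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
```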