Coursework Project

AI vs Real Image Classifier

Computer vision project focused on classifying whether an image is AI-generated or real. The workflow covers preprocessing, augmentation, CNN baseline modeling, and hyperparameter experiments with TensorFlow/Keras.

Problem

As synthetic images become more common, distinguishing generated images from real ones has become a practical classification task, with direct implications for trust in visual content and for content quality.

Approach

Built a binary classifier pipeline using TensorFlow/Keras with train/validation/test splits, augmentation (brightness, contrast, flips), and iterative architecture tuning.
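The augmentation step could look roughly like the sketch below. It is illustrative, not the notebook's code: the function name `augment`, the brightness and contrast ranges, and the choice of horizontal flips are all assumptions, and images are assumed to be float tensors scaled to [0, 1].

```python
import tensorflow as tf


def augment(image, seed=None):
    """Randomly perturb one image of shape (H, W, C) with values in [0, 1].

    The ranges below are placeholder values, not the ones used in the notebook.
    """
    # Shift brightness by a random delta in [-0.1, 0.1].
    image = tf.image.random_brightness(image, max_delta=0.1, seed=seed)
    # Scale contrast by a random factor in [0.9, 1.1].
    image = tf.image.random_contrast(image, lower=0.9, upper=1.1, seed=seed)
    # Mirror the image left-to-right with probability 0.5 (assumed flip direction).
    image = tf.image.random_flip_left_right(image, seed=seed)
    # Brightness and contrast changes can leave [0, 1]; clip back into range.
    return tf.clip_by_value(image, 0.0, 1.0)
```

With a `tf.data` pipeline, this would typically be applied to the training split only, for example `train_ds.map(lambda x, y: (augment(x), y))`, so that validation and test images stay unmodified.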

Outcome

Established a baseline CNN achieving around 68% test accuracy on the held-out subset and used additional experiments to understand generalization and overfitting tradeoffs.

Pipeline Details

Data + Preprocessing

Loaded image classes with binary labels (`fake` and `real`).
Created smaller randomized subsets for efficient iteration.
Applied train/validation/test splitting in the notebook workflow.
Saved processed subsets to `.npz` for reproducible modeling runs.
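The steps above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the helper names (`make_splits`, `save_splits`, `load_splits`), the 70/15/15 split fractions, the seed, and the array key names are hypothetical, and labels are assumed to be already encoded as integers (which of `fake` or `real` maps to 1 is not specified in the write-up).

```python
import numpy as np


def make_splits(images, labels, subset_size, val_frac=0.15, test_frac=0.15, seed=42):
    """Draw a random subset, then split it into train/validation/test arrays."""
    rng = np.random.default_rng(seed)
    # Shuffle all indices once and keep the first `subset_size` of them.
    idx = rng.permutation(len(images))[:subset_size]
    n = len(idx)
    n_test = int(round(n * test_frac))
    n_val = int(round(n * val_frac))
    # First chunk is test, second is validation, the remainder is training.
    test_idx, val_idx, train_idx = np.split(idx, [n_test, n_test + n_val])
    return {
        "x_train": images[train_idx], "y_train": labels[train_idx],
        "x_val": images[val_idx], "y_val": labels[val_idx],
        "x_test": images[test_idx], "y_test": labels[test_idx],
    }


def save_splits(path, splits):
    """Write all split arrays into one compressed .npz archive."""
    np.savez_compressed(path, **splits)


def load_splits(path):
    """Read the arrays back into a plain dict keyed by split name."""
    with np.load(path) as data:
        return {key: data[key] for key in data.files}
```

Fixing the seed and persisting the resulting arrays means later modeling runs see exactly the same subset and split, which is what makes comparisons between architecture variants meaningful.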

Modeling + Evaluation

Baseline CNN: Conv2D + pooling + dense binary classifier.
Tracked training, validation, and test accuracy across runs.
Compared baseline against tuned architecture variants.
Used metrics to diagnose overfitting and robustness limits.
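A baseline of the kind described above (Conv2D, pooling, dense binary head) might be defined as in the following sketch. The layer counts, filter sizes, input shape, optimizer, and the function name `build_baseline_cnn` are assumptions for illustration; the notebook's actual architecture may differ. Inputs are assumed to be raw pixel values in [0, 255].

```python
import tensorflow as tf


def build_baseline_cnn(input_shape=(64, 64, 3)):
    """Small Conv2D + pooling + dense binary classifier (illustrative sizes)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        # Scale raw pixel values from [0, 255] to [0, 1].
        tf.keras.layers.Rescaling(1.0 / 255),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        # Single sigmoid unit: predicted probability of the class labeled 1.
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Training with `model.fit(x_train, y_train, validation_data=(x_val, y_val))` and then calling `model.evaluate(x_test, y_test)` yields the three accuracy figures tracked across runs.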

Key Results

In the recorded notebook outputs, the baseline CNN reached approximately 0.996 training accuracy, 0.672 validation accuracy, and 0.681 test accuracy. The gap of roughly 32 percentage points between training and validation accuracy indicates substantial overfitting: the model fits the training subset almost perfectly but generalizes much less well to unseen images.
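The difference between the training accuracy and the held-out accuracies reported above can be computed directly, as in this short sketch. The helper name `generalization_gap` is introduced here for illustration only.

```python
# Accuracies taken from the recorded notebook outputs for the baseline CNN.
train_acc, val_acc, test_acc = 0.996, 0.672, 0.681


def generalization_gap(train, heldout):
    """Difference between training accuracy and accuracy on unseen data."""
    return train - heldout


val_gap = generalization_gap(train_acc, val_acc)
test_gap = generalization_gap(train_acc, test_acc)

print(f"train - validation gap: {val_gap:.3f}")
print(f"train - test gap:       {test_gap:.3f}")
```

A gap near zero would suggest the model generalizes about as well as it fits; a large positive gap like this one is the usual signature of overfitting.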

A tuned variant did not outperform the baseline on validation or test metrics, which reinforced the importance of careful architecture and regularization decisions for this dataset.

Artifacts

GitHub Repository