Person Re-Identification using ResNet18 on Market-1501 Dataset

Project Overview

This project implements a person re-identification (ReID) system using a modified ResNet18 architecture on the Market-1501 dataset. The goal is to accurately identify and match individuals across multiple non-overlapping camera views in a multi-camera surveillance system.

Key Features

Model Architecture: Modified ResNet18 backbone
Dataset: Market-1501
Evaluation Metrics: mean Average Precision (mAP), Cumulative Matching Characteristic (CMC)
Loss Function: Cross-Entropy Loss with Label Smoothing

Dataset

The Market-1501 dataset consists of 32,668 annotated bounding boxes of 1,501 identities. Images are captured from 6 different cameras, with 5 high-resolution cameras and 1 low-resolution camera. The dataset is split into training and testing sets.

Methodology

Model Architecture

We use a modified ResNet18 as our backbone network. The final fully connected layer is replaced with a new embedding layer followed by a classification layer. This allows the network to learn discriminative features for person re-identification while maintaining a relatively lightweight architecture.

Training Process

Data Preprocessing: Images are resized to 256x128 pixels and normalized.
Data Augmentation: Random horizontal flipping and color jittering are applied during training.
Optimization: Adam optimizer with a learning rate of 0.001 and weight decay of 5e-4.
Learning Rate Schedule: Step decay, reducing the learning rate by a factor of 0.1 every 20 epochs.

Loss Function

We employ Cross-Entropy Loss with Label Smoothing. Label smoothing helps to prevent the model from becoming overconfident and improves generalization. The loss function is defined as:

class CrossEntropyLabelSmooth(nn.Module):
    def __init__(self, num_classes, epsilon=0.1):
        super(CrossEntropyLabelSmooth, self).__init__()
        self.num_classes = num_classes
        self.epsilon = epsilon
        self.logsoftmax = nn.LogSoftmax(dim=1)

    def forward(self, inputs, targets):
        log_probs = self.logsoftmax(inputs)
        targets = torch.zeros_like(log_probs).scatter_(1, targets.unsqueeze(1), 1)
        targets = (1 - self.epsilon) * targets + self.epsilon / self.num_classes
        loss = (-targets * log_probs).mean(0).sum()
        return loss

This loss function distributes a small probability (ε) uniformly among all classes, which helps to reduce overfitting and improve the model's ability to generalize.

Results

After training, our model achieved the following performance on the test set:

mAP (mean Average Precision): 0.7837
Rank-1 Accuracy (CMC@1): 0.9348
Rank-5 Accuracy (CMC@5): 0.9825
Rank-10 Accuracy (CMC@10): 0.9903

These results demonstrate strong performance, with the model correctly identifying the same person in the top-1 match 93.48% of the time, and in the top-5 matches 98.25% of the time.

The training curves show consistent improvement in both training and validation metrics over epochs, indicating good model convergence and generalization.

The sample results image demonstrates the model's ability to match query images with correct gallery images across different camera views and poses.

Conclusion

This project demonstrates the effectiveness of a modified ResNet18 architecture for person re-identification on the Market-1501 dataset. The use of label smoothing in the loss function and careful data preprocessing contributed to achieving competitive results while maintaining a relatively lightweight model architecture.

Future Work

Experiment with more advanced architectures like ResNet50 or DenseNet
Implement triplet loss or other metric learning approaches
Explore attention mechanisms to focus on discriminative parts of the person
Investigate domain adaptation techniques for cross-dataset generalization

Requirements

Python 3.7+
PyTorch 2.0+
torchvision
numpy
matplotlib

Acknowledgements

Market-1501 Dataset:

@inproceedings{zheng2015scalable,
  title={Scalable Person Re-identification: A Benchmark},
  author={Zheng, Liang and Shen, Liyue and Tian, Lu and Wang, Shengjin and Wang, Jingdong and Tian, Qi},
  booktitle={Computer Vision, IEEE International Conference on},
  year={2015}
}

ResNet Architecture:

He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 770-778).

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
models		models
output		output
utils		utils
.gitignore		.gitignore
README.md		README.md
test.py		test.py
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Person Re-Identification using ResNet18 on Market-1501 Dataset

Project Overview

Key Features

Dataset

Methodology

Model Architecture

Training Process

Loss Function

Results

Conclusion

Future Work

Requirements

Acknowledgements

About

Releases

Packages

Languages

owenstrength/Person-ReID

Folders and files

Latest commit

History

Repository files navigation

Person Re-Identification using ResNet18 on Market-1501 Dataset

Project Overview

Key Features

Dataset

Methodology

Model Architecture

Training Process

Loss Function

Results

Conclusion

Future Work

Requirements

Acknowledgements

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages