This repository provides a summary of algorithms from our review paper

jsrdcht/ASurveyonSelf-SupervisedLearning

A Survey on Self-Supervised Learning

This repository provides a summary of algorithms from our review paper [link].

We classify self-supervised learning (SSL) algorithms by their pretext tasks. A popular approach in SSL is to design a pretext task for the network to solve; the network is trained by optimizing the objective function of that pretext task, and useful features are learned in the process. Existing pretext tasks can be roughly classified into three categories: context-based methods, contrastive learning (CL), and generative algorithms.

See our paper for more details.

Algorithms

Context Based Methods

  • (Rotation): Unsupervised representation learning by predicting image rotations. [paper] [code]

  • (Colorization): Colorful Image Colorization. [paper] [code]

  • (Jigsaw): Scaling and Benchmarking Self-Supervised Visual Representation Learning. [paper] [code]
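
As a rough illustration of how a context-based pretext task derives labels from unlabeled data, the sketch below builds training pairs for rotation prediction: each image is rotated by a random multiple of 90 degrees, and the rotation index becomes the classification target. The helper names (`rotate90`, `make_rotation_batch`) and the toy list-of-lists image are assumptions made for this example, not code from the listed papers.

```python
import random


def rotate90(image, k):
    """Rotate a 2D list-of-lists image counter-clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counter-clockwise quarter turn.
        image = [list(row) for row in zip(*image)][::-1]
    return [list(row) for row in image]


def make_rotation_batch(images, rng):
    """Return (rotated_image, label) pairs; the label is the rotation index 0..3."""
    batch = []
    for image in images:
        k = rng.randrange(4)  # 0, 90, 180 or 270 degrees
        batch.append((rotate90(image, k), k))
    return batch


# Toy 2x2 "image"; a real pipeline would use image tensors instead.
image = [[1, 2],
         [3, 4]]
rng = random.Random(0)
batch = make_rotation_batch([image] * 4, rng)
```

A classifier trained to predict the label from the rotated image never sees a human annotation; the supervision signal comes entirely from the transformation.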

Contrastive Learning

  • (MoCo v1): Momentum Contrast for Unsupervised Visual Representation Learning. [paper] [code]

  • (MoCo v2): Improved Baselines with Momentum Contrastive Learning. [paper] [code]

  • (MoCo v3): An Empirical Study of Training Self-Supervised Vision Transformers. [paper] [code]

  • (SimCLR V1): A Simple Framework for Contrastive Learning of Visual Representations. [paper] [code]

  • (SimCLR V2): Big Self-Supervised Models are Strong Semi-Supervised Learners. [paper] [code]

  • (BYOL): Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning. [paper] [code]

  • (SimSiam): Exploring Simple Siamese Representation Learning. [paper] [code]

  • (Barlow Twins): Barlow Twins: Self-Supervised Learning via Redundancy Reduction. [paper] [code]

  • (VICReg): VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning. [paper] [code]
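
Some of the contrastive methods above (MoCo, for example) train with an InfoNCE-style objective that pulls a query toward its positive key and pushes it away from negative keys. The sketch below computes that loss for a single query in plain Python. The function names, the toy 2-D embeddings, and the temperature of 0.1 are illustrative assumptions; methods such as BYOL, SimSiam, Barlow Twins, and VICReg use different objectives that do not rely on negative pairs.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE loss for one query: negative log-softmax score of the positive key."""
    q = normalize(query)
    keys = [normalize(positive)] + [normalize(n) for n in negatives]
    logits = [dot(q, k) / temperature for k in keys]
    m = max(logits)  # subtract the max before exponentiating for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[0]


# The loss is small when the positive is close to the query ...
aligned = info_nce([1.0, 0.0], [1.0, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
# ... and large when a negative is closer to the query than the positive.
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.1], [-1.0, 0.0]])
```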

Generative Algorithms

  • (BEiT): BEiT: BERT Pre-Training of Image Transformers. [paper] [code]

  • (MAE): Masked Autoencoders Are Scalable Vision Learners. [paper] [code]

  • (iBOT): iBOT: Image BERT Pre-Training with Online Tokenizer. [paper] [code]

  • (CAE): Context Autoencoder for Self-Supervised Representation Learning. [paper] [code]

  • (SimMIM): SimMIM: A Simple Framework for Masked Image Modeling. [paper] [code]
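
The masked image modeling idea behind several of these methods can be sketched as follows: hide a random subset of image patches, let a model predict the hidden content, and score the prediction only on the hidden patches. This is a toy pixel-reconstruction variant; the helper names, the mask ratio of 0.75, and the all-zeros stand-in "model" are assumptions for illustration. Some listed methods predict other targets (BEiT, for instance, predicts discrete visual tokens).

```python
import random


def random_mask(num_patches, mask_ratio, rng):
    """Return a sorted list of patch indices to hide from the encoder."""
    num_masked = int(num_patches * mask_ratio)
    return sorted(rng.sample(range(num_patches), num_masked))


def masked_mse(target_patches, predicted_patches, masked_indices):
    """Mean squared error computed only over the masked patches."""
    total, count = 0.0, 0
    for i in masked_indices:
        for t, p in zip(target_patches[i], predicted_patches[i]):
            total += (t - p) ** 2
            count += 1
    return total / count


rng = random.Random(0)
patches = [[float(i), float(i) + 0.5] for i in range(8)]  # 8 toy patches, 2 pixels each
masked = random_mask(len(patches), mask_ratio=0.75, rng=rng)
visible = [i for i in range(len(patches)) if i not in masked]

# Stand-in for a trained model: it predicts zeros for every patch.
prediction = [[0.0, 0.0] for _ in patches]
loss = masked_mse(patches, prediction, masked)
```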

Applications

4.1 Sequential data

Natural language processing (NLP)

  • (Skip-Gram): Distributed Representations of Words and Phrases and their Compositionality. [paper] [code]

  • (BERT): BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. [paper] [code]

  • (GPT): Improving Language Understanding by Generative Pre-Training. [paper]
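
To make the self-supervised signal in Skip-Gram concrete, the sketch below extracts (center word, context word) training pairs from raw text using a sliding window. The function name and the window size are illustrative assumptions, and the negative sampling and embedding training that follow in the actual method are omitted. BERT and GPT derive their targets differently (masked tokens and next tokens, respectively).

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


tokens = "self supervised learning needs no labels".split()
pairs = skipgram_pairs(tokens, window=1)
```

Each pair is a free training example: the model is asked to predict the context word given the center word, with no annotation beyond the text itself.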

Sequential models for image processing and computer vision

  • (CPC): Representation Learning with Contrastive Predictive Coding. [paper]

  • (Image GPT): Generative Pretraining from Pixels. [paper] [code]

4.2 Image processing and computer vision

Video

  • (MIL-NCE): End-to-End Learning of Visual Representations From Uncurated Instructional Videos. [paper] [code]

  • Unsupervised Learning of Visual Representations using Videos. [paper]

  • Unsupervised Learning of Video Representations using LSTMs. [paper] [code]

1. Temporal information in videos:

The order of the frames:

  • Shuffle and Learn: Unsupervised Learning using Temporal Order Verification. [paper]

  • Self-Supervised Video Representation Learning With Odd-One-Out Networks. [paper]

Video playing direction:

  • Learning and Using the Arrow of Time. [paper]

Video playing speed:

  • (SpeedNet): SpeedNet: Learning the Speediness in Videos. [paper]
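
The three temporal cues above (frame order, playing direction, and playing speed) each turn an unlabeled clip into a binary classification example. The sketch below generates such examples from a list of frame indices. All function names and label conventions here are assumptions for illustration, and the listed papers differ in how they sample and encode clips.

```python
import random


def order_verification_sample(frames, rng):
    """Return (clip, label): label 1 if the clip keeps temporal order, else 0.

    Assumes at least two distinct frames, so a shuffled negative exists.
    """
    clip = list(frames)
    if rng.random() < 0.5:
        return clip, 1
    shuffled = clip[:]
    while shuffled == clip:  # reshuffle until the clip is really out of order
        rng.shuffle(shuffled)
    return shuffled, 0


def direction_sample(frames, rng):
    """Return (clip, label): label 1 for forward playback, 0 for reversed."""
    if rng.random() < 0.5:
        return list(frames), 1
    return list(frames)[::-1], 0


def speed_sample(frames, rng, speedup=2):
    """Return (clip, label): label 0 for normal speed, 1 for sped-up playback.

    Both clips have the same length so the label cannot be read off the size.
    """
    n = len(frames) // speedup
    if rng.random() < 0.5:
        return list(frames)[:n], 0
    return list(frames)[::speedup][:n], 1


frames = list(range(8))  # stand-in for 8 consecutive video frames
rng = random.Random(0)
order_clip, order_label = order_verification_sample(frames, rng)
direction_clip, direction_label = direction_sample(frames, rng)
speed_clip, speed_label = speed_sample(frames, rng)
```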

2. Motion of objects such as optical flow:

  • (DynamoNet): DynamoNet: Dynamic Action and Motion Network. [paper]

  • (CoCLR): Self-supervised Co-training for Video Representation Learning. [paper] [code]

3. Multi-modal data such as RGB, audio, and narrations:

  • Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization. [paper]

  • Time-Contrastive Networks: Self-Supervised Learning from Video. [paper]

4. Spatial-temporal coherence of objects such as colours and shapes:

  • Learning Correspondence from the Cycle-Consistency of Time. [paper]

  • (VCP): Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning. [paper]

  • Joint-task Self-supervised Learning for Temporal Correspondence. [paper] [code]

Other fields

  • medical field: Preservational Learning Improves Self-supervised Medical Image Models by Reconstructing Diverse Contexts. [paper] [code]

  • medical image segmentation: Contrastive learning of global and local features for medical image segmentation with limited annotations. [paper] [code]

  • 3D medical image analysis: Rubik’s Cube+: A self-supervised feature learning framework for 3D medical image analysis. [paper]

Summary

context based | contrastive learning | masked image modeling
Rotation      |                      |
Colorization  |                      |
Jigsaw        |                      |

Contact

If you have any suggestions or find our work helpful, feel free to contact us.

Email: {guijie,tchen}@seu.edu.cn
