Mitigating Feature Overfitting in Barlow Twins via Mixed-Sample Regularization for Stable Long-Horizon Representation Learning

Authors

  • Arwa Saad, Nahda University
    Competing Interests

    The author declares no competing interests.

  • Prasun Chakrabarti, ITM University
    Competing Interests

    The author declares no competing interests.

  • Mona Ali Abdelrahman, American University in the Emirates
    Competing Interests

    The author declares no competing interests.

  • Vinayakumar Ravi, Prince Mohammad bin Fahd University
    Competing Interests

    The author declares no competing interests.

DOI:

https://doi.org/10.66279/7scztv14

Keywords:

Self-supervised learning, Barlow Twins, Redundancy Reduction, Mixed-Sample Regularization, Feature Overfitting

Abstract

In self-supervised learning, feature overfitting during extended training remains a significant problem, especially in redundancy-reduction frameworks such as Barlow Twins. Barlow Twins performs well initially, but after prolonged training (e.g., beyond 600 epochs) its representation quality deteriorates, primarily because of limited data diversity and overfitting to feature correlations. To overcome this limitation, an improved Mixed Barlow Twins framework is presented that incorporates mixed-sample regularization via linear interpolation in the input space. By enforcing consistency between mixed inputs and their corresponding embeddings, the method encourages simpler, less redundant feature correlation matrices and mitigates redundancy-induced overfitting. Extensive experiments on CIFAR-10 with a ResNet-50 backbone over 1000 training epochs demonstrate stable optimization without performance degradation. In long-horizon scenarios, the proposed approach outperforms the standard Barlow Twins baseline, achieving a k-NN classification accuracy of 92.1%. The technique also remains computationally efficient, requiring only 7.2 GB of GPU memory and about 15 hours of training time. These findings indicate that mixed-sample regularization is a simple yet effective method for improving representation robustness and training stability in self-supervised learning.
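
The mixed-sample idea described in the abstract can be made concrete. Below is a minimal PyTorch sketch of how a mixup-style consistency term could be combined with the standard Barlow Twins redundancy-reduction loss (Zbontar et al., 2021); the Beta(α, α) mixing coefficient, the `gamma` weight, and the MSE form of the consistency term are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Beta

def barlow_twins_loss(z1, z2, lambd=5e-3):
    """Standard Barlow Twins loss: push the cross-correlation matrix
    of the two embedding batches toward the identity."""
    n, _ = z1.shape
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + 1e-6)   # normalize each feature dimension
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + 1e-6)
    c = (z1.T @ z2) / n                            # D x D cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()                # invariance term
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()   # redundancy term
    return on_diag + lambd * off_diag

def mixed_bt_step(model, x1, x2, alpha=1.0, gamma=1.0):
    """One training step with a hypothetical mixed-sample consistency term.
    x1, x2 are two augmented views of the same image batch."""
    lam = Beta(alpha, alpha).sample().item()       # mixing coefficient
    idx = torch.randperm(x1.size(0))               # pair each image with a random partner
    x_mix = lam * x1 + (1 - lam) * x2[idx]         # linear interpolation in input space

    z1, z2, z_mix = model(x1), model(x2), model(x_mix)

    # Assumed consistency term: the embedding of the mixed input should match
    # the same linear interpolation of the unmixed embeddings.
    target = (lam * z1 + (1 - lam) * z2[idx]).detach()
    consistency = F.mse_loss(z_mix, target)

    return barlow_twins_loss(z1, z2) + gamma * consistency
```

Because the extra forward pass on the mixed batch reuses the same backbone, the added memory and runtime cost stays small, which is consistent with the efficiency figures reported in the abstract.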

Author Biographies

  • Prasun Chakrabarti, ITM University

    ITM SLS Baroda University, 391510, Vadodara, India

  • Mona Ali Abdelrahman, American University in the Emirates

    Department Chair, Mass Communications College, American University in the Emirates

  • Vinayakumar Ravi, Prince Mohammad bin Fahd University

    Center for Artificial Intelligence, Prince Mohammad Bin Fahd University, Khobar, Saudi Arabia

Published

25-04-2026

Data Availability Statement

The CIFAR-10 dataset used in this study is publicly available. 

How to Cite

Mitigating Feature Overfitting in Barlow Twins via Mixed-Sample Regularization for Stable Long-Horizon Representation Learning. (2026). Computational Discovery and Intelligent Systems (CDIS), 3(1), 71-90. https://doi.org/10.66279/7scztv14
