Mitigating Feature Overfitting in Barlow Twins via Mixed-Sample Regularization for Stable Long-Horizon Representation Learning
DOI: https://doi.org/10.66279/7scztv14
Keywords: Self-supervised learning, Barlow Twins, Redundancy Reduction, Mixed Sample Regularization, Feature Overfitting
Abstract
Feature overfitting during extended training remains a significant problem in self-supervised learning, especially in redundancy-reduction frameworks such as Barlow Twins. Barlow Twins performs well at first, but after prolonged training (e.g., beyond 600 epochs) its representation quality deteriorates, primarily because of limited data diversity and overfitting to feature correlations. To overcome this limitation, an improved Mixed Barlow Twins framework is presented that incorporates mixed-sample regularization via linear interpolation in the input space. By enforcing consistency between mixed inputs and their corresponding embeddings, this method encourages simpler feature correlation matrices and mitigates redundancy-induced overfitting. Extensive experiments on CIFAR-10 with a ResNet-50 backbone over 1000 training epochs demonstrate stable optimization without performance degradation. In long-horizon scenarios, the proposed approach outperforms the standard Barlow Twins baseline, achieving a k-NN classification accuracy of 92.1%. The technique also remains computationally efficient, requiring only 7.2 GB of GPU memory and about 15 hours of training time. These findings indicate that mixed-sample regularization is a simple yet effective way to improve representation robustness and training stability in self-supervised learning.
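The core ideas in the abstract, the Barlow Twins redundancy-reduction loss and a mixed-sample consistency term based on linear interpolation of inputs, can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names (`barlow_twins_loss`, `mixed_consistency`), the linear toy encoder, and the specific form of the consistency target (a lambda-weighted combination of the unmixed cross-correlation matrices) are assumptions chosen to mirror the description, and real training would use a deep network and gradient-based optimization.

```python
import numpy as np


def _standardize(z, eps=1e-8):
    """Standardize each embedding dimension along the batch axis."""
    return (z - z.mean(axis=0)) / (z.std(axis=0) + eps)


def cross_corr(z_a, z_b):
    """Empirical cross-correlation matrix between two embedding batches."""
    n = z_a.shape[0]
    return _standardize(z_a).T @ _standardize(z_b) / n


def barlow_twins_loss(z_a, z_b, lambd=5e-3):
    """Barlow Twins objective: drive the diagonal of the cross-correlation
    matrix toward 1 (invariance) and the off-diagonal toward 0 (redundancy
    reduction)."""
    c = cross_corr(z_a, z_b)
    on_diag = ((np.diag(c) - 1.0) ** 2).sum()
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()
    return on_diag + lambd * off_diag


def mixup(x_a, x_b, alpha=1.0, rng=None):
    """Mixed-sample generation via linear interpolation in input space,
    with the mixing coefficient drawn from a Beta(alpha, alpha) prior."""
    rng = rng if rng is not None else np.random.default_rng(0)
    lam = float(rng.beta(alpha, alpha))
    return lam * x_a + (1.0 - lam) * x_b, lam


def mixed_consistency(encode, x_a, x_b, rng=None):
    """Hypothetical mixed-sample regularizer: the cross-correlation between
    the mixed embedding and one branch should match the lambda-weighted
    combination of the corresponding unmixed cross-correlations."""
    x_m, lam = mixup(x_a, x_b, rng=rng)
    z_a, z_b, z_m = encode(x_a), encode(x_b), encode(x_m)
    target = lam * cross_corr(z_a, z_a) + (1.0 - lam) * cross_corr(z_b, z_a)
    return ((cross_corr(z_m, z_a) - target) ** 2).sum()
```

A training step would then minimize `barlow_twins_loss(z_a, z_b) + gamma * mixed_consistency(encode, x_a, x_b)` over two augmented views `x_a`, `x_b` of a batch, where `gamma` is a hypothetical weighting hyperparameter; the consistency term constrains the correlation structure of interpolated samples and is what the abstract credits for stable long-horizon training.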
Data Availability Statement
The CIFAR-10 dataset used in this study is publicly available.
License
Copyright (c) 2026 Computational Discovery and Intelligent Systems (CDIS)

This work is licensed under a Creative Commons Attribution 4.0 International License.