Self-Supervised Learning is Approximately Supervised Learning; DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization; The Fair Language Model Paradox; Type-II Saddles and Probabilistic Stability of Stochastic Gradient Descent; Generalization Bounds for Transfer Learning with Pretrained Classifiers |