Contrastive Learning for Document Understanding
DOI: https://doi.org/10.15662/IJRAI.2021.0401001

Keywords: Contrastive Learning, Document Understanding, Self-Supervised Learning, Negative Sampling, Sentence Representations, Adversarial Sampling

Abstract
Contrastive learning is a self-supervised methodology that enables models to learn robust representations by pulling similar samples together in embedding space while pushing dissimilar ones apart. Its origins in NLP trace back to 2013, when Mikolov et al. introduced it through skip-gram word embeddings trained with co-occurrence statistics and negative sampling, substantially improving representation quality at low computational cost. Logeswaran and Lee (2018) extended contrastive ideas to sentence-level representation by casting context prediction as classification: distinguishing the true context sentence from contrastive alternatives. This enabled learning high-quality sentence embeddings from unlabeled text and yielded superior performance on downstream tasks with remarkably fast training. Bose et al. (2018) proposed Adversarial Contrastive Estimation, which strengthens contrastive learning with an adversarially trained negative sampler that produces harder negative examples, accelerating convergence and improving word embeddings, order embeddings, and knowledge graph embeddings. Together, these foundational works highlight methodologies directly applicable to document understanding: learning representations by contrasting context with non-context sentences, using adversarial negatives to strengthen embedding quality, and scaling from words to sentences, thereby forming a basis for document-level applications.
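To make the word-level objective concrete, the following is a minimal sketch of the skip-gram negative-sampling loss in the spirit of Mikolov et al. (2013). It assumes PyTorch; the function name and tensor shapes are illustrative and not taken from the original implementation.

import torch
import torch.nn.functional as F

def sgns_loss(center, positive, negatives):
    # center: (D,) embedding of the center word
    # positive: (D,) embedding of an observed context word
    # negatives: (K, D) embeddings of K randomly sampled non-context words
    pos_term = F.logsigmoid(positive @ center)             # reward true co-occurrence
    neg_term = F.logsigmoid(-(negatives @ center)).sum()   # penalize sampled negatives
    return -(pos_term + neg_term)                          # negative log-likelihood to minimize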
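The sentence-level objective of Logeswaran and Lee (2018) can be sketched in the same style: each sentence's true context sentence is the positive, and the other sentences in the batch serve as the contrastive alternatives. The variable names below are assumptions for illustration, not the authors' code.

import torch
import torch.nn.functional as F

def context_classification_loss(f_emb, g_emb):
    # f_emb: (B, D) embeddings of B sentences; g_emb: (B, D) embeddings
    # of their true context sentences, row-aligned with f_emb.
    scores = f_emb @ g_emb.t()                 # (B, B) similarity of every sentence pair
    targets = torch.arange(f_emb.size(0))      # the true context lies on the diagonal
    return F.cross_entropy(scores, targets)    # classify the true context among B candidates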
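Bose et al.'s adversarial sampler is a separately trained generator optimized with policy gradients; a much simpler stand-in that conveys the intuition is to sample negatives in proportion to how confusable the current model finds them. The routine below is that simplification, not the ACE algorithm itself.

import torch

def sample_hard_negatives(anchor, candidates, k=5, temperature=1.0):
    # anchor: (D,) query embedding; candidates: (N, D) pool of candidate negatives.
    # Candidates scoring closer to the anchor are drawn with higher probability.
    scores = (candidates @ anchor) / temperature   # (N,) similarity to the anchor
    probs = torch.softmax(scores, dim=0)           # harder negatives receive more mass
    idx = torch.multinomial(probs, k, replacement=False)
    return candidates[idx]                         # (k, D) hard negative embeddings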
References
1. Mikolov, T., et al. (2013) – Word embeddings with negative sampling (skip-gram).
2. Arora, S., et al. (2019) – Theoretical framework for contrastive representation learning on unlabeled data.
3. Logeswaran, L., & Lee, H. (2018) – Efficient contrastive sentence representation via context classification.
4. Bose, A. J., Ling, H., & Cao, Y. (2018) – Adversarial Contrastive Estimation for representation learning.