Skip to content

This project investigates similar word prediction using pre-trained word embeddings from the Gensim library.

Notifications You must be signed in to change notification settings

im-dpaul/NLP-Similar-Word-Prediction-Pretrained-Embeddings

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 

Repository files navigation

Exploring Similar Words with Pre-trained Embeddings

This project utilizes pre-trained embeddings from the Gensim library to perform similar word prediction and word analogy tasks. By leveraging the pre-trained Word2Vec and GloVe embeddings, we can efficiently discover semantic relationships between words, enhancing natural language processing applications.

Leveraging Pre-trained Embeddings

  • Bypasses the need to train a Word2Vec model from scratch, saving time and resources.
  • Utilizes pre-trained models like Word2Vec-Google-News-300 and GloVe-Twitter-25, which capture semantic relationships learned from massive text corpora.

Implementation

Library Installation

Installs necessary libraries (nltk, gensim) using pip (!pip install ...).

Importing Libraries

Importing required libraries.

Pre-trained Embeddings Listing

Retrieves and displays a list of available pre-trained embedding models from Gensim.

Word2Vec Embedding Usage

  • Loads the pre-trained Word2Vec-Google-News-300 model.
  • Finds the most similar words for a sample word ("dog").
  • Performs a word analogy task, finding the word similar to "woman" in the way "man" is similar to "king".

GloVe Embedding Usage

  • Loads the pre-trained GloVe-Twitter-25 model.
  • Finds the most similar words for a sample word ("hello").

Conclusion

This project demonstrates how to use pre-trained Word2Vec and GloVe embeddings from the Gensim library to find similar words and perform word analogy tasks. By leveraging these powerful embeddings, we can enhance natural language processing applications and discover intricate relationships between words.

About

This project investigates similar word prediction using pre-trained word embeddings from the Gensim library.

Topics

Resources

Stars

Watchers

Forks