ketkiambekar/word2vec

word2vec

This project uses the gensim library to demonstrate word2vec.

TL;DR

The program trains a word2vec model on a text corpus and, for each word we give it, finds the top 5 most closely related words. This kind of word association is a building block for text prediction.

What is word2vec?

The word2vec algorithm uses a neural network model to learn word associations from a large corpus of text. Once trained, the model represents each word as a vector, so the words most closely related to a given word can be found by ranking the other words' vectors by cosine similarity.
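To make the cosine-similarity ranking concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and the function names are made up for illustration; a trained word2vec model would supply learned vectors with many more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=5):
    """Rank every other word by cosine similarity to `word`."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    # Highest similarity first
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Hypothetical toy vectors, NOT output of a real model
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.1, 0.2, 0.9],
    "pear": [0.15, 0.1, 0.95],
}

print(most_similar("king", toy_vectors, topn=2))
```

With these toy values, "queen" ranks above "apple" for the query "king" because its vector points in nearly the same direction.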

Software Prerequisites:

Python 3.x

Instructions to Execute

  1. Clone the Repo
  2. Run the commands as below:

When running for the first time, use the following command to install gensim:
$ pip install -U gensim

Command to run the consumer Python program:
$ python test.py fake_or_real_news.csv mymodel query_words.txt

NOTE: When changing the words in query_words.txt, enter one word per line, exactly in the format below:

word1
word2
word3
wordn
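A file in this one-word-per-line format can be read with a few lines of Python. The helper below is a hypothetical sketch, not the project's own parser; it assumes UTF-8 text and, as a convenience, skips blank lines and trims surrounding whitespace, which the actual program may not do.

```python
import tempfile
from pathlib import Path


def load_query_words(path):
    """Read one query word per line, ignoring blank lines and stray whitespace."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


# Demonstrate on a temporary file so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "query_words.txt"
    sample.write_text("word1\nword2\n\n  word3  \n", encoding="utf-8")
    words = load_query_words(sample)

print(words)
```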

Next steps for the project:

Build a Flask app that implements autocomplete
