Errors due to Word2Vec being deprecated #12

Open
saxenapriyansh opened this issue Oct 7, 2018 · 3 comments

Comments

@saxenapriyansh

The vocab dictionary, which maps words to slots, counts, and so on, has been moved to a KeyedVectors object used by the model and held in its wv property.

So model.vocab should be replaced by model.wv.vocab.
In In [19] it should therefore be: print("Word2Vec vocabulary length:", len(thrones2vec.wv.vocab))
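
For anyone who needs the notebook to run across several gensim versions, here is a minimal sketch of a version-tolerant vocabulary lookup. The helper name vocabulary_size and the stand-in model are hypothetical, and the attribute names it probes are assumptions to check against your installed version: wv.key_to_index in gensim 4.x, wv.vocab in earlier releases, and vocab directly on the model in very old ones.

```python
from types import SimpleNamespace


def vocabulary_size(model):
    """Return the number of words known to a gensim-style Word2Vec model."""
    # Newer releases keep the vocabulary on the KeyedVectors object in `wv`;
    # fall back to the model itself when there is no `wv` attribute.
    holder = getattr(model, "wv", model)
    # Assumed names: `key_to_index` (gensim 4.x), `vocab` (earlier releases).
    for name in ("key_to_index", "vocab"):
        mapping = getattr(holder, name, None)
        if mapping is not None:
            return len(mapping)
    raise AttributeError("no vocabulary mapping found on the model")


# Hypothetical stand-in for a trained model, used only for illustration.
fake_model = SimpleNamespace(wv=SimpleNamespace(vocab={"stark": 0, "lannister": 1}))
print("Word2Vec vocabulary length:", vocabulary_size(fake_model))
```

With a real model you would call vocabulary_size(thrones2vec) in place of the len(...) expression in In [19].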

@saxenapriyansh
Author

The signature of the Word2Vec train() method is:
train(sentences=None, corpus_file=None, total_examples=None, total_words=None, epochs=None, start_alpha=None, end_alpha=None, word_count=0, queue_factor=2, report_delay=1.0, compute_loss=False, callbacks=())

To avoid common mistakes around the model’s ability to do multiple training passes itself, an explicit epochs argument MUST be provided. In the common and recommended case where train() is only called once, you can set epochs=self.iter, which from the notebook means thrones2vec.iter.

So instead of
In [20]: thrones2vec.train(sentences)
it should be
thrones2vec.train(sentences, total_words=token_count, epochs=thrones2vec.iter)

@saxenapriyansh changed the title from "Word2Vec.vocab" to "Errors due to Word2Vec being deprecated" on Oct 7, 2018
@saxenapriyansh
Author

In [25]: all_word_vectors_matrix = thrones2vec.syn0
throws an error: AttributeError: 'Word2Vec' object has no attribute 'syn0'

Solution:
Use model.wv.syn0 instead of model.syn0.

The error occurs because two methods and several attributes of the Word2Vec class have been deprecated. The methods are load_word2vec_format and save_word2vec_format. The attributes are syn0norm, syn0, vocab, and index2word. They have been moved to the KeyedVectors class.
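
A small sketch of how the In [25] lookup could be made tolerant of this move. The helper name embedding_matrix and the stand-in model are hypothetical, and the probed attribute names are assumptions to verify against your gensim version: wv.vectors in newer releases, wv.syn0 in the release discussed here, and syn0 directly on the model before the move.

```python
from types import SimpleNamespace


def embedding_matrix(model):
    """Return the word-vector matrix of a gensim-style Word2Vec model."""
    # Prefer the KeyedVectors object in `wv`; fall back to the model itself.
    holder = getattr(model, "wv", model)
    # Assumed names: `vectors` (newer releases), `syn0` (older releases).
    for name in ("vectors", "syn0"):
        matrix = getattr(holder, name, None)
        if matrix is not None:
            return matrix
    raise AttributeError("no word-vector matrix found on the model")


# Hypothetical stand-in: two words with three-dimensional vectors.
fake_model = SimpleNamespace(wv=SimpleNamespace(syn0=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]))
all_word_vectors_matrix = embedding_matrix(fake_model)
print(len(all_word_vectors_matrix), "vectors of size", len(all_word_vectors_matrix[0]))
```

In the notebook, all_word_vectors_matrix = embedding_matrix(thrones2vec) would stand in for the failing line.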

@bgloger3489

Regarding the train() call suggested above for In [20]:

It should actually be thrones2vec.train(sentences, total_examples=thrones2vec.corpus_count, epochs=thrones2vec.epochs). Note that corpus_count is the number of sentences seen when the vocabulary was built, so it is passed as total_examples rather than total_words.
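
The explicit-arguments pattern can be wrapped in a small helper. This is only a sketch: train_once and RecordingModel are hypothetical names, RecordingModel is a stand-in that records what it is given instead of training, and the fallback from epochs to iter assumes the older attribute name mentioned earlier in this thread.

```python
def train_once(model, sentences):
    """Make a single training call with explicit total_examples and epochs."""
    # `epochs` is the newer attribute name; `iter` is assumed for older releases.
    epochs = getattr(model, "epochs", None)
    if epochs is None:
        epochs = model.iter
    # corpus_count is the sentence count, so it goes in total_examples.
    return model.train(sentences, total_examples=model.corpus_count, epochs=epochs)


class RecordingModel:
    """Hypothetical stand-in that records the arguments passed to train()."""

    corpus_count = 3
    epochs = 5

    def train(self, sentences, total_examples=None, epochs=None):
        self.call = {"total_examples": total_examples, "epochs": epochs}
        return self.call


sentences = [["winter", "is", "coming"], ["hear", "me", "roar"], ["ours", "is", "the", "fury"]]
print(train_once(RecordingModel(), sentences))
```

With a real model, train_once(thrones2vec, sentences) would replace the call in In [20].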
