The Word2Vec object in gensim has a `null_word` parameter that is not explained in the documentation. What is the `null_word` parameter in gensim Word2Vec?

    class gensim.models.word2vec.Word2Vec(sentences=None, size=100, alpha=0.025, window=5, min_count=5, max_vocab_size=None, sample=0.001, seed=1, workers=3, min_alpha=0.0001, sg=0, hs=0, negative=5, cbow_mean=1, hashfxn=<built-in function hash>, iter=5, null_word=0, trim_rule=None, sorted_vocab=1, batch_words=10000)

What is the `null_word` parameter used for?
Checking the code at https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/models/word2vec.py#L680, I found this:

    if self.null_word:
        # create null pseudo-word for padding when using concatenative L1 (run-of-words)
        # this word is only ever input – never predicted – so count, huffman-point, etc doesn't matter
        word, v = '\0', Vocab(count=1, sample_int=0)
        v.index = len(self.wv.vocab)
        self.wv.index2word.append(word)
        self.wv.vocab[word] = v

What is "concatenative L1"?
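
My tentative guess, which I have not confirmed against the gensim source, is that "padding" means filling the missing context positions near the start and end of a sentence with the `'\0'` pseudo-word, so that every context window has the same length and the input vectors can be concatenated instead of averaged. Below is a minimal sketch of that reading in plain Python. The names `NULL_WORD` and `padded_context` are my own and are not part of gensim.

```python
NULL_WORD = '\0'  # same placeholder string as in the gensim snippet (hypothetical constant name)

def padded_context(tokens, pos, window):
    """Return exactly 2 * window context tokens around tokens[pos].

    Positions that fall outside the sentence are filled with NULL_WORD.
    This is an illustration of the padding idea, not gensim's actual code.
    """
    context = []
    for i in range(pos - window, pos + window + 1):
        if i == pos:
            continue  # skip the target word itself
        if 0 <= i < len(tokens):
            context.append(tokens[i])
        else:
            context.append(NULL_WORD)  # outside the sentence: pad
    return context

sentence = ["the", "quick", "brown", "fox"]
for pos, target in enumerate(sentence):
    print(repr(target), padded_context(sentence, pos, window=2))
```

If this reading is right, the pseudo-word only ever appears on the input side, which would match the comment that it is "never predicted". Is that what the parameter is for?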