2017-05-22

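Specifying the number of hidden units in Facebook's fastText

In the paper on fastText's supervised classification, the authors specify different numbers of hidden units by changing some parameter (h, the one introduced on pages 3 and 4; see Table 1: "It has 10 hidden units and we evaluate it with and without bigrams."). After reading the documentation, however, there does not appear to be a "hidden units" parameter to change. Is there a way to specify the number of hidden units? Or is this the same as specifying the -dim option?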


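From the Facebook group page: I asked, "When using fastText for classification, how many nodes are used in the softmax of the neural network?" The reply: "The number of nodes in the softmax layer is the same as the number of classes (or a bit more for the hierarchical softmax). Then there is the hidden layer, whose size is set by you (with -dim)." This is consistent with the accepted answer below.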
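To make the comment above concrete, here is a minimal sketch of a training run that mirrors the paper's Table 1 setting (10 hidden units, with bigrams). `-dim` and `-wordNgrams` are options of the fasttext command-line tool; the binary location, the training file, and the output name `model_h10` are assumptions chosen for illustration.

```shell
#!/bin/sh
# Assumed locations; adjust to your own checkout and data.
FASTTEXT=./fasttext
TRAIN=cooking.stackexchange.txt

DIM=10      # size of the hidden layer (h in the paper)
NGRAMS=2    # 2 = also use word bigrams; 1 = unigrams only

# Build the training command; the output name is hypothetical.
cmd="$FASTTEXT supervised -input $TRAIN -output model_h$DIM -dim $DIM -wordNgrams $NGRAMS"

if [ -x "$FASTTEXT" ] && [ -f "$TRAIN" ]; then
    # Binary and data are present: actually train.
    $cmd
else
    # Otherwise just show what would be executed.
    echo "would run: $cmd"
fi
```

Setting `NGRAMS=1` gives the "without bigrams" variant of the same experiment.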

Answer


k is the number of classes.

https://arxiv.org/pdf/1607.01759v3.pdf

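Section 2.1: "More precisely, the computational complexity is O(kh), where k is the number of classes and h the dimension of the text representation."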
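As a rough illustration of what O(kh) means in practice, the sketch below plugs in h = 10 (the value from the paper's Table 1) and a hypothetical k = 1024 classes. It also prints the O(h log2 k) figure that the same section of the paper gives for the hierarchical softmax. Constant factors are ignored, so these are orders of magnitude, not measured costs.

```shell
#!/bin/sh
k=1024   # hypothetical number of classes
h=10     # hidden units (text representation dimension)

# Full softmax: one length-h dot product per class.
full=$((k * h))

# Hierarchical softmax: roughly log2(k) dot products of length h.
hier=$(awk -v k="$k" -v h="$h" 'BEGIN { printf "%.0f", h * log(k) / log(2) }')

echo "full softmax:         about $full multiply-adds per example"
echo "hierarchical softmax: about $hier multiply-adds per example"
```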


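When predicting classes in text classification, from the docs:

The argument k is optional, and equal to 1 by default. In order to obtain the k most likely labels for a piece of text, use:

$ ./fasttext predict model.bin test.txt k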


When training a model in supervised mode, the set of classes is specified implicitly by the __label__* labels in the training data.
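Since the classes are implied by the training data, one way to see the value of k for a given file is to count the distinct `__label__` tokens in it. The sketch below does this on a tiny invented sample; the three lines of text are hypothetical and only imitate the supervised input format.

```shell
#!/bin/sh
# Hypothetical miniature training file in fastText's supervised format.
train=$(mktemp)
cat > "$train" <<'EOF'
__label__sauce __label__cheese How do I keep a cheese sauce from splitting?
__label__food-safety Is it safe to leave cooked rice out overnight?
__label__baking __label__cheese Can a baked cheesecake be frozen?
EOF

# Pull out every label token, de-duplicate, and count them: this is k.
k=$(grep -o '__label__[^[:space:]]*' "$train" | sort -u | wc -l | tr -d '[:space:]')
echo "number of classes k = $k"

rm -f "$train"
```

Pointing the same `grep` pipeline at a real training file, such as the cooking.stackexchange.txt used in the tutorial, reports the number of classes that file defines.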

Example, from the tutorial:

$ wget https://s3-us-west-1.amazonaws.com/fasttext-vectors/cooking.stackexchange.tar.gz && tar xvzf cooking.stackexchange.tar.gz 
--2017-05-23 09:03:26-- https://s3-us-west-1.amazonaws.com/fasttext-vectors/cooking.stackexchange.tar.gz 
Resolving s3-us-west-1.amazonaws.com... 54.231.236.45 
Connecting to s3-us-west-1.amazonaws.com|54.231.236.45|:443... connected. 
HTTP request sent, awaiting response... 200 OK 
Length: 457609 (447K) [application/x-gzip] 
Saving to: ‘cooking.stackexchange.tar.gz.1’ 

cooking.stackexchange.tar.gz.1  100%[================================================================>] 446.88K 385KB/s in 1.2s  

2017-05-23 09:03:28 (385 KB/s) - ‘cooking.stackexchange.tar.gz.1’ saved [457609/457609] 

x cooking.stackexchange.id 
x cooking.stackexchange.txt 
x readme.txt 


$ cat readme.txt 
The data in this archive is derived from the user-contributed content on the 
Cooking Stack Exchange website (https://cooking.stackexchange.com/), used under 
CC-BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0/). 

The original data dump can be downloaded from: 
https://archive.org/download/stackexchange/cooking.stackexchange.com.7z 
and details about the dump obtained from: 
https://archive.org/details/stackexchange 

We distribute two files, under CC-BY-SA 3.0: 

- cooking.stackexchange.txt, which contains all question titles and 
    their associated tags (one question per line, tags are prefixed by 
    the string "__label__") ; 

- cooking.stackexchange.id, which contains the corresponding row IDs, 
    from the original data dump. 