Python for loop using a file instead of a dictionary

I am using a file of my own instead of a Python dictionary, but when I run a for loop over that file I get this error:

TypeError: string indices must be integers, not str

My code is given below, where "sai.json" is the file containing the dictionary.

import json 
from naiveBayesClassifier import tokenizer 
from naiveBayesClassifier.trainer import Trainer 
from naiveBayesClassifier.classifier import Classifier 

nTrainer = Trainer(tokenizer) 

ofile = open("sai.json","r") 

dataset=ofile.read() 
print dataset 

for n in dataset: 
    nTrainer.train(n['text'], n['category']) 

nClassifier = Classifier(nTrainer.data, tokenizer) 

unknownInstance = "Even if I eat too much, is not it possible to lose some weight" 

classification = nClassifier.classify(unknownInstance) 
print classification


2015-11-07 Neha

'n' is a string, not a dictionary. Please do some research on how to parse JSON. Use the 'json' module – Pynchia

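You are importing the json module, but you are not using it!

You can use json.load to load JSON data from an open file into a Python dict, or you can read the file into a string and then use json.loads to load the data into a dict.

For example,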

ofile = open("sai.json","r") 
data = json.load(ofile) 
ofile.close()

Or even better

with open("sai.json", "r") as ifile: 
    data = json.load(ifile)

Or, using json.loads:

with open("sai.json", "r") as ifile: 
    dataset = ifile.read() 
data = json.loads(dataset)

Then you can access the contents of data with data['text'] and data['category'], assuming the dictionary has those keys.
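As a minimal sketch of that access pattern, assume the parsed JSON is either a single dictionary with "text" and "category" keys or a list of such dictionaries. The helper name iter_records and the inline sample strings are illustrative only and do not come from the original code.

```python
import json


def iter_records(data):
    """Yield (text, category) pairs from parsed JSON data.

    Hypothetical helper: accepts one dict or a list of dicts.
    """
    records = [data] if isinstance(data, dict) else data
    for record in records:
        yield record["text"], record["category"]


# Inline samples standing in for the file contents (an assumption).
single = json.loads('{"text": "hello everyone", "category": "NO"}')
several = json.loads('[{"text": "you jerk", "category": "yes"}]')

print(list(iter_records(single)))
print(list(iter_records(several)))
```

Wrapping a lone dictionary in a list lets the same loop handle both shapes, which may be convenient while the exact structure of the file is still unknown.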

You get an error because dataset is a string, so

for n in dataset: 
    nTrainer.train(n['text'], n['category'])

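loops over that string character by character, turning each character into a one-element string. Strings can only be indexed by integers, not by other strings. There is also not much point in indexing into a one-element string, because if s is a one-element string then s[0] has the same content as s.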
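The behaviour described above can be reproduced without any file. This is a small sketch in which the string literal stands in for the raw contents returned by read(); the sample text is an assumption.

```python
dataset = '[{"text": "hi"}]'  # raw file contents: still a string, not a list

# Iterating over a string yields one-character strings.
chars = [n for n in dataset]
print(chars[:3])

# Indexing a string with another string raises TypeError.
try:
    chars[0]["text"]
    error = None
except TypeError as exc:
    error = exc
print(type(error).__name__)
```

The exact wording of the TypeError message varies between Python versions, so the sketch prints only the exception type.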

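Here is the data you posted in the comments. I had assumed that your data was a list wrapped in a dictionary, but a plain list is also fine as the top-level JSON value. I used print json.dumps(dataset, indent=4) to format it. Note that there is no comma after the final } in the file: a trailing comma there is fine in Python, but it is an error in JSON.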

sai.json

[ 
    { 
     "category": "NO", 
     "text": "hello everyone" 
    }, 
    { 
     "category": "YES", 
     "text": "dont use words like jerk" 
    }, 
    { 
     "category": "NO", 
     "text": "what the hell." 
    }, 
    { 
     "category": "yes", 
     "text": "you jerk" 
    } 
]
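The trailing-comma point can be checked directly. The sketch below uses two short inline strings (assumed samples, not the full file) and compares the json module with Python's own literal parser, ast.literal_eval.

```python
import ast
import json

valid = '[{"category": "NO", "text": "hello everyone"}]'
trailing_comma = '[{"category": "NO", "text": "hello everyone"},]'

# Valid JSON parses into a list of dicts.
print(json.loads(valid))

# The same text with a trailing comma is rejected by the json module.
try:
    json.loads(trailing_comma)
    rejected = False
except ValueError:  # json.JSONDecodeError is a ValueError subclass in Python 3
    rejected = True
print(rejected)

# As a Python literal, the trailing comma is accepted.
as_python = ast.literal_eval(trailing_comma)
print(len(as_python))
```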

Now, if we read it in with json.load, your code should work properly. But here is a simple demo that just prints the contents:

import json 

with open("sai.json", "r") as f: 
    dataset = json.load(f) 

# Each n is now a dict, so string keys work.
for n in dataset: 
    print "text: '%s', category: '%s'" % (n['text'], n['category'])

Output

text: 'hello everyone', category: 'NO' 
text: 'dont use words like jerk', category: 'YES' 
text: 'what the hell.', category: 'NO' 
text: 'you jerk', category: 'yes'
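Applied to the loop from the question, the same fix might look like the sketch below. StubTrainer is a hypothetical stand-in that only records calls, so the example runs without the naiveBayesClassifier package, and the sample records are written to a temporary file rather than to "sai.json".

```python
import json
import os
import tempfile


class StubTrainer(object):
    """Hypothetical stand-in for the Trainer used in the question."""

    def __init__(self):
        self.seen = []

    def train(self, text, category):
        # Only records the call; a real trainer would update its model.
        self.seen.append((text, category))


sample = [
    {"category": "NO", "text": "hello everyone"},
    {"category": "YES", "text": "dont use words like jerk"},
]

# Write the sample to a temporary file so the sketch runs on its own.
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w") as f:
    json.dump(sample, f)

with open(path, "r") as f:
    dataset = json.load(f)  # a list of dicts, not a string
os.remove(path)

nTrainer = StubTrainer()
for n in dataset:
    nTrainer.train(n["text"], n["category"])

print(len(nTrainer.seen))
```

The only change to the question's loop is that dataset comes from json.load instead of read(), so each n is a dictionary.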


2015-11-07 07:18:18

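I am still getting the error "TypeError: string indices must be integers, not str". How can I change n from a string to an integer in this loop? – Neha

I am getting the error in this for loop: for n in dataset: nTrainer.train(n['text'], n['category']) – Neha

@Neha: That code looks exactly like the code in your question. But maybe you misunderstood what I said earlier. That code is **not correct**. Have you tried one of the three ways I showed for reading the "sai.json" file into a Python dictionary? If the file does not contain valid JSON data, the 'json' module will raise an error when it tries to load it. Once the data is loaded, the correct way to access the fields you need depends on the structure of the JSON data. Maybe you should post a sample of your data in the question. –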
