2015-08-20 63 views
2

json dumps TypeError: keys must be a string with dict

I want to solve a problem similar to "Pandas remove null values when to_json".

My approach is to:

  1. Drop the NaN values while converting the DataFrame to a dict, and then
  2. Convert the dict to JSON using json.dumps()

Here is my code and the resulting error:

In [9]:df 

Out[9]: 
    101 102 
    a 123 NaN 
    b 234 234 
    c NaN 456 

In [10]:def to_dict_dropna(data): 
      return dict((k, v.dropna().to_dict()) for k, v in compat.iteritems(data)) 

In [47]:k2 = to_dict_dropna(df) 
In [48]:k2 
Out[48]:{101: {'a': 123.0, 'b': 234.0}, 102: {'b': 234.0, 'c': 456.0}} 
In [49]:json.dumps(k2) 
--------------------------------------------------------------------------- 
TypeError         Traceback (most recent call last) 
<ipython-input-76-f0159cf5a097> in <module>() 
----> 1 json.dumps(k2) 

C:\Python27\lib\json\__init__.pyc in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, encoding, default, sort_keys, **kw) 
    241   cls is None and indent is None and separators is None and 
    242   encoding == 'utf-8' and default is None and not sort_keys and not kw): 
--> 243   return _default_encoder.encode(obj) 
    244  if cls is None: 
    245   cls = JSONEncoder 

C:\Python27\lib\json\encoder.pyc in encode(self, o) 
    205   # exceptions aren't as detailed. The list call should be roughly 
    206   # equivalent to the PySequence_Fast that ''.join() would do. 
--> 207   chunks = self.iterencode(o, _one_shot=True) 
    208   if not isinstance(chunks, (list, tuple)): 
    209    chunks = list(chunks) 

C:\Python27\lib\json\encoder.pyc in iterencode(self, o, _one_shot) 
    268     self.key_separator, self.item_separator, self.sort_keys, 
    269     self.skipkeys, _one_shot) 
--> 270   return _iterencode(o, 0) 
    271 
    272 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr, 

TypeError: keys must be a string 

But it works if I initialize the dict directly:

In [65]:k = {101: {'a': 123.0, 'b': 234.0}, 102: { 'b': 234.0, 'c': 456.0}} 
In [66]:k == k2 
Out[66]:True 
In [63]:json.dumps(k) 
Out[63]:'{"101": {"a": 123.0, "b": 234.0}, "102": {"c": 456.0, "b": 234.0}}' 

What is wrong with my code?

+0

Interesting, I would have expected both dictionaries to fail. A workaround is to use 'dict((str(k), v.dropna().to_dict()) for k, v in compat.iteritems(data))' (or '{str(k): v.dropna().to_dict() for k, v in compat.iteritems(data)}' using dict comprehension notation).

+1

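The JSON C source code explicitly tests for 'int', 'long', 'float' and 'bool' keys and converts all of those to strings. This means your keys are not really integers but only *mimic* integers (their representation is the same and they test equal, but 'isinstance(key, int)' fails).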
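The behaviour described in this comment can be reproduced without pandas. The following is a minimal sketch in Python 3, where `IntLike` is a hypothetical stand-in (not a pandas or NumPy class) for a key that compares equal to an int without being one. The exact error message differs between Python versions, so only the exception type is checked.

```python
import json


class IntLike:
    """Hypothetical stand-in: equal to an int and hashes like one, but is not an int."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other

    def __hash__(self):
        return hash(self.value)

    def __repr__(self):
        return repr(self.value)


literal = {101: {"a": 123.0}}          # real int key
mimic = {IntLike(101): {"a": 123.0}}   # key that only looks like an int

# The two dicts compare equal, just like k == k2 in the question.
same = (mimic == literal)

# json.dumps accepts the real int key and turns it into the string "101".
literal_json = json.dumps(literal)

# The look-alike key is rejected, because the encoder checks the key's type.
try:
    json.dumps(mimic)
    error = None
except TypeError as exc:
    error = exc

# Converting the keys to str first makes the dict serializable.
fixed_json = json.dumps({str(key): value for key, value in mimic.items()})

print(same, literal_json, type(error).__name__, fixed_json)
```

Equality between the dicts says nothing about the key types, which is why the literal dict serializes while the one built from the DataFrame does not.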

Answer

2

The "integers" in your pandas DataFrame are not really integers. They are float64 objects; see the Pandas Gotchas documentation.

You will have to convert them back to int() objects, or convert them directly to strings:

def to_dict_dropna(data): 
    return {int(k): v.dropna().astype(int).to_dict() for k, v in compat.iteritems(data)} 

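For comparison, the two steps from the question (drop NaN, then dump to JSON) can also be sketched on plain nested dicts. This is an illustration under assumptions, not the answerer's code: `drop_nan_and_stringify` is a hypothetical helper name, the input is assumed to already be a `{column: {row: value}}` mapping, and every key is converted to `str` rather than `int`.

```python
import json
import math


def drop_nan_and_stringify(table):
    """Return a JSON-ready copy of a {column: {row: value}} mapping.

    NaN cells are dropped and every key is converted to str, so the
    result no longer depends on the original key types.
    """
    cleaned = {}
    for column, cells in table.items():
        cleaned[str(column)] = {
            str(row): value
            for row, value in cells.items()
            if not (isinstance(value, float) and math.isnan(value))
        }
    return cleaned


nan = float("nan")
# Same shape as the DataFrame in the question, written as nested dicts.
table = {
    101: {"a": 123.0, "b": 234.0, "c": nan},
    102: {"a": nan, "b": 234.0, "c": 456.0},
}

# sort_keys=True makes the output order deterministic.
payload = json.dumps(drop_nan_and_stringify(table), sort_keys=True)
print(payload)
# {"101": {"a": 123.0, "b": 234.0}, "102": {"b": 234.0, "c": 456.0}}
```

Converting keys to `str` matches what JSON stores anyway, since JSON object keys are always strings.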

+0

Thanks Martijin. That answers my question. – cssmlulu
