Python and IBM DB2: UnicodeDecodeError

I get this error message:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc8 in position 38: ordinal not in range(128)

when I try to execute any SQL query from Python, like this:

>>> import ibm_db 
>>> conn = ibm_db.connect("sample","root","root") 
>>> ibm_db.exec_immediate(conn, "select * from act")

I checked the default encoding, and it appears to be 'utf-8':

>>> import sys 
>>> sys.getdefaultencoding() 
'utf-8'

I am also aware of this thread, where people discuss a fairly similar problem. One of the suggestions there is:

Have you applied the required database PTFs (SI57014 and SI57015 for 7.1 and SI57146 and SI57147 for 7.2)? They are included as a distreq, so they should have been in the order with your PTFs, but won't be automatically applied.

However, I do not know what a database PTF is or how to apply one. I need help.

P.S. I am using Windows 10.

Edit

This is how I get my error message:

>>> print(ibm_db.stmt_errormsg()) 
Traceback (most recent call last): 
File "<stdin>", line 1, in <module> 
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc8 in position 38:  
ordinal not in range(128)

However, when I run the same query, "select * from act", in the DB2 CLP, it works fine. This is the driver information, which I got by running this code in Python:

if client: 
    print("DRIVER_NAME: string(%d) \"%s\"" % (len(client.DRIVER_NAME), client.DRIVER_NAME)) 
    print("DRIVER_VER: string(%d) \"%s\"" % (len(client.DRIVER_VER), client.DRIVER_VER)) 
    print("DATA_SOURCE_NAME: string(%d) \"%s\"" % (len(client.DATA_SOURCE_NAME), client.DATA_SOURCE_NAME)) 
    print("DRIVER_ODBC_VER: string(%d) \"%s\"" % (len(client.DRIVER_ODBC_VER), client.DRIVER_ODBC_VER)) 
    print("ODBC_VER: string(%d) \"%s\"" % (len(client.ODBC_VER), client.ODBC_VER)) 
    print("ODBC_SQL_CONFORMANCE: string(%d) \"%s\"" % (len(client.ODBC_SQL_CONFORMANCE), client.ODBC_SQL_CONFORMANCE)) 
    print("APPL_CODEPAGE: int(%s)" % client.APPL_CODEPAGE) 
    print("CONN_CODEPAGE: int(%s)" % client.CONN_CODEPAGE) 
    ibm_db.close(conn) 
else: 
    print("Error.")

It prints:

DRIVER_NAME: string(10) "DB2CLI.DLL" 
DRIVER_VER: string(10) "10.05.0007" 
DATA_SOURCE_NAME: string(6) "SAMPLE" 
DRIVER_ODBC_VER: string(5) "03.51" 
ODBC_VER: string(10) "03.01.0000" 
ODBC_SQL_CONFORMANCE: string(8) "EXTENDED" 
APPL_CODEPAGE: int(1251) 
CONN_CODEPAGE: int(1208) 
True

Edit

I also tried this:

>>> cnx = ibm_db.connect("sample","root","root") 
>>> query = "select * from act" 
>>> query.encode('ascii') 
b'select * from act' 
>>> ibm_db.exec_immediate(cnx, query) 
Traceback (most recent call last): 
File "<stdin>", line 1, in <module> 
Exception 
>>> print(ibm_db.stmt_errormsg()) 
Traceback (most recent call last): 
File "<stdin>", line 1, in <module> 
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc8 in position 38: 
ordinal not in range(128)

As you can see, in this case I get exactly the same error message.

Summary

Here are all my attempts:

C:\Windows\system32>chcp 
Active code page: 65001 

C:\Windows\system32>python 
Python 3.4.4 (v3.4.4:737efcadf5a6, Dec 20 2015, 20:20:57) [MSC v.1600 64 bit (AMD64)] on win32 
Type "help", "copyright", "credits" or "license" for more information. 
>>> import ibm_db 
>>> cnx = ibm_db.connect("sample","root","root") 
>>> ibm_db.exec_immediate(cnx, "select * from act") 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
Exception 
>>> print(ibm_db.stmt_errormsg()) 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc8 in position 38: ordinal not in range(128) 
>>> ibm_db.exec_immediate(cnx, b"select * from act") 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
Exception: statement must be a string or unicode 
>>> query = "select * from act" 
>>> query = query.encode() 
>>> ibm_db.exec_immediate(cnx, query) 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
Exception: statement must be a string or unicode 
>>> ibm_db.exec_immediate(cnx, "select * from act").decode('cp-1251') 
Traceback (most recent call last): 
File "<stdin>", line 1, in <module> 
Exception

Source

2016-05-23 Jacobian

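Which platform and version of DB2? The PTFs and versions mentioned (7.1 and 7.2) are for DB2 for IBM i. – Charles

What is your database configuration? Try 'get db cfg' while connected to the database to get that information. –

When I run 'get db cfg', I get a long list of information. In it I can see that the default database encoding is 'UTF-8'. By the way, I should add that I can work with the database from the console: I can connect to the database instance and execute simple queries. The whole problem is related to the Python driver. – Jacobian

In this kind of case, with a UTF-8 environment and something that expects ASCII, I use the decode method.

'ascii' codec can't decode byte 0xc8

Well, that is normal: this is not an ASCII string but a UTF-8 string, so you should decode it with the utf8 encoding.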

... 
query.decode('utf8') 
ibm_db.exec_immediate(cnx, query)

After that, you may need to re-encode the results to write or print them.

Source

2016-05-31 08:19:37 mquantin

I'll check it in a minute. – Jacobian

It does not work. 'query.decode('utf8')' results in 'AttributeError: 'str' object has no attribute 'decode'' – Jacobian
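
The AttributeError reported in the comment above is expected on Python 3, where text (str) and binary data (bytes) are separate types: str has encode() but no decode(), and bytes has decode(). A minimal sketch using only built-in types (the variable names are purely illustrative):

```python
# Python 3: str holds text, bytes holds raw data.
query = "select * from act"

# str -> bytes: encode() exists on str
raw = query.encode("utf-8")

# bytes -> str: decode() exists on bytes
text = raw.decode("utf-8")

# str has no decode() in Python 3, which is what the AttributeError reports
has_decode = hasattr(query, "decode")

print(type(raw).__name__, type(text).__name__, has_decode)
```

So calling decode() on the query string cannot help here; the query is already text.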

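The problem is that the DB2 server is returning CP-1251 (also known as Windows-1251) text, as shown by APPL_CODEPAGE: int(1251) in the configuration output. Python (specifically the interactive Python REPL) expects UTF-8 or ASCII output, and that causes the problem.

The solution is to do: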

ibm_db.exec_immediate(conn, "select * from act").decode('cp-1251')

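In addition, you need to make sure that your terminal's text encoding is set to UTF-8. The details of changing that setting depend on the particular terminal you are using. Since you said you are using cmd, the appropriate command is chcp 65001.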
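
To illustrate the code page mismatch on its own, here is a small sketch that does not touch ibm_db at all. It assumes that the byte 0xc8 from the traceback is Windows-1251 text, as APPL_CODEPAGE suggests. The mapping table and the helper name codec_for_codepage are made up for this example; Python's own name for the codec is cp1251.

```python
# Hypothetical mapping from DB2 code page numbers to Python codec names.
# 1251 is Windows-1251 (Cyrillic); 1208 is DB2's code page number for UTF-8.
CODEPAGE_TO_CODEC = {1251: "cp1251", 1208: "utf-8"}


def codec_for_codepage(codepage):
    """Return the Python codec name for a DB2 code page number."""
    return CODEPAGE_TO_CODEC[codepage]


raw = b"\xc8"  # the byte named in the traceback

# Decoding with the application code page yields a Cyrillic letter
text = raw.decode(codec_for_codepage(1251))

# Re-encoding for a UTF-8 terminal gives a two-byte sequence
utf8_bytes = text.encode(codec_for_codepage(1208))

print(repr(text), utf8_bytes)
```

This only works on bytes objects. Whether the driver ever hands back raw bytes to decode is a separate question.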

Source

2016-06-01 07:52:44 Mego

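Thanks! I'll check it in a minute! – Jacobian

I hoped it would work, but it does not. When I run 'ibm_db.exec_immediate(cnx, b"select * from act")', I get this error message: 'Exception: statement must be a string or unicode' – Jacobian

Doing 'query = query.encode()' and then passing the query string to ibm_db.exec_immediate does not work either. The error message is once again 'Exception: statement must be a string or unicode' – Jacobian

What you have here is an incompatibility between your client code (ibm_db) and the DB2 server. As you can see in the client code, the logic for a query is basically: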

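1. Extract and check the arguments passed in (lines 4873 to 4918).
2. Allocate the local objects for the query (up to line 4954).
3. Run the query and decode the results (the rest of the function).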

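Based on what we have found so far, you know that the data you pass to the query is well formed (so the failure is not in step 1). Looking at the error paths in step 2, you can see that they produce simple error messages explaining those failures. Therefore you are failing in step 3.

You get an empty exception from the query, and when you try to get the details of the error you get another Unicode decoding exception. This looks like either a bug in ibm_db or a misconfiguration that makes your DB2 installation incompatible with it. So how can we find out which?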

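As flagged elsewhere, this problem is fundamentally about code pages. All of the ibm_db code basically interprets strings as ASCII (by converting them with StringOBJ_FromASCII, which maps to a call into a Python API that insists on receiving ASCII characters and throws a Unicode exception if it does not).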
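
The failure mode described here can be imitated in pure Python. This is only an analogy, not ibm_db's actual C code: the message bytes below are invented and padded so that the first non-ASCII byte lands at position 38, as in the traceback.

```python
# Invented stand-in for an error message that arrives in code page 1251:
# 38 ASCII bytes followed by three Cyrillic letters.
message = b"x" * 38 + "\u0418\u043c\u044f".encode("cp1251")

try:
    # Strict ASCII decoding rejects any byte >= 0x80
    message.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Decoding the same bytes with the matching codec succeeds
decoded = message.decode("cp1251")

print(error)
```

The exception carries the offending position in error.start, which is how a message like "byte 0xc8 in position 38" is produced.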

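Depending on your problem, you could try to prove or disprove this by installing or configuring your system (both the client and the DB2 server) to use US English. That should get you past the code page incompatibility so that you can find the real error.

If the query really does go out over the network, you could also capture a network trace showing the response that comes back from the server. However, given that you see nothing in the logs, I am not convinced this will bear fruit.

Failing that, you need to patch the ibm_db code to handle non-ASCII content, either by filing a bug report with the maintainers or by trying it yourself (if you know how to build and debug C extensions).

Source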

2016-06-01 09:08:52

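Thanks for your help. But it now looks as though the ibm_db library is completely incompatible with Python 3. Have you tried connecting to DB2 from Python 3 code? – Jacobian

I have not used it personally, but the ibm_db library is compatible with Python 3. This is clearly documented at https://pypi.python.org/pypi/ibm_db/, which shows that the library is written for Python 2 and 3. Your problem is not Python 3. I suspect the maintainers have only ever used it on US English systems. –

Hmm, you are right. Thanks! – Jacobian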
