我有一堆需要加载到MySQL数据库中的CSV数据。那么,也许是CSV-ISH。 (编辑:actually, it looks like the stuff described in RFC 4180)从双引号用作转义字符的CSV文件中加载数据
每行是逗号分隔双引号字符串的列表。为了避免出现在列值内的任何双引号,使用双重双引号。反斜杠允许表示自己。
例如,该行:
"", "\wave\", ""hello,"" said the vicar", "what are ""scare-quotes"" good for?", "I'm reading ""Bossypants"""
如果解析成JSON应该是:
[ "", "\\wave\\", "\"hello,\" said the vicar", "what are \"scare-quotes\" good for?", "I'm reading \"Bossypants\"" ]
我试图使用LOAD DATA
阅读的CSV,但我跑陷入一些奇怪的行为。
作为一个例子,考虑如果我有一个简单的两列的表
shell% mysql exampledb -e "describe person"
+-------+-----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-----------+------+-----+---------+-------+
| ID | int(11) | YES | | NULL | |
| UID | char(255) | YES | | NULL | |
+-------+-----------+------+-----+---------+-------+
shell%
如果我的输入文件的第一非报头行上""
结束:
shell% cat temp-1.csv
"ID","UID"
"9",""
"0","Steve the Pirate"
"1","\Alpha"
"2","Hoban ""Wash"" Washburne"
"3","Pastor Veal"
"4","Tucker"
"10",""
"5","Simon"
"6","Sonny"
"7","Wat\"
我可以加载每个非标题行但第一个:
mysql> DELETE FROM person;
Query OK, 0 rows affected (0.00 sec)
mysql> LOAD DATA
LOCAL INFILE 'temp-1.csv'
INTO TABLE person
FIELDS
TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '"'
LINES
TERMINATED BY '\n'
IGNORE 1 LINES
;
Query OK, 9 rows affected (0.00 sec)
Records: 9 Deleted: 0 Skipped: 0 Warnings: 0
mysql> SELECT * FROM person;
+------+------------------------+
| ID | UID |
+------+------------------------+
| 0 | Steve the Pirate |
| 10 | |
| 1 | \Alpha |
| 2 | Hoban "Wash" Washburne |
| 3 | Pastor Veal |
| 4 | Tucker |
| 5 | Simon |
| 6 | Sonny |
| 7 | Wat\ |
+------+------------------------+
9 rows in set (0.00 sec)
或者我可以加载所有线路包括标题:
mysql> DELETE FROM person;
Query OK, 9 rows affected (0.00 sec)
mysql> LOAD DATA
LOCAL INFILE 'temp-1.csv'
INTO TABLE person
FIELDS
TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '"'
LINES
TERMINATED BY '\n'
IGNORE 0 LINES
;
Query OK, 11 rows affected, 1 warning (0.01 sec)
Records: 11 Deleted: 0 Skipped: 0 Warnings: 1
mysql> show warnings;
+---------+------+--------------------------------------------------------+
| Level | Code | Message |
+---------+------+--------------------------------------------------------+
| Warning | 1366 | Incorrect integer value: 'ID' for column 'ID' at row 1 |
+---------+------+--------------------------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT * FROM person;
+------+------------------------+
| ID | UID |
+------+------------------------+
| 0 | UID |
| 9 | |
| 0 | Steve the Pirate |
| 10 | |
| 1 | \Alpha |
| 2 | Hoban "Wash" Washburne |
| 3 | Pastor Veal |
| 4 | Tucker |
| 5 | Simon |
| 6 | Sonny |
| 7 | Wat\ |
+------+------------------------+
11 rows in set (0.00 sec)
如果我输入文件结束对""
没有台词:
shell% cat temp-2.csv
"ID","UID"
"0","Steve the Pirate"
"1","\Alpha"
"2","Hoban ""Wash"" Washburne"
"3","Pastor Veal"
"4","Tucker"
"5","Simon"
"6","Sonny"
"7","Wat\"
然后我可以加载没有台词:
mysql> DELETE FROM person;
Query OK, 11 rows affected (0.00 sec)
mysql> LOAD DATA
LOCAL INFILE 'temp-2.csv'
INTO TABLE person
FIELDS
TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '"'
LINES
TERMINATED BY '\n'
IGNORE 1 LINES
;
Query OK, 0 rows affected (0.00 sec)
Records: 0 Deleted: 0 Skipped: 0 Warnings: 0
mysql> SELECT * FROM person;
Empty set (0.00 sec)
或者我可以加载包括标题在内的所有行:
mysql> DELETE FROM person;
Query OK, 0 rows affected (0.00 sec)
mysql> LOAD DATA
LOCAL INFILE 'temp-2.csv'
INTO TABLE person
FIELDS
TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '"'
LINES
TERMINATED BY '\n'
IGNORE 0 LINES
;
Query OK, 9 rows affected, 1 warning (0.03 sec)
Records: 9 Deleted: 0 Skipped: 0 Warnings: 1
mysql> show warnings;
+---------+------+--------------------------------------------------------+
| Level | Code | Message |
+---------+------+--------------------------------------------------------+
| Warning | 1366 | Incorrect integer value: 'ID' for column 'ID' at row 1 |
+---------+------+--------------------------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT * FROM person;
+------+------------------------+
| ID | UID |
+------+------------------------+
| 0 | UID |
| 0 | Steve the Pirate |
| 1 | \Alpha |
| 2 | Hoban "Wash" Washburne |
| 3 | Pastor Veal |
| 4 | Tucker |
| 5 | Simon |
| 6 | Sonny |
| 7 | Wat\ |
+------+------------------------+
9 rows in set (0.00 sec)
因此,现在我发现了很多错误的方法,我怎样才能使用LOAD DATA
将这些文件中的数据导入到我的数据库中?
+1您的建议帮助我解决了另一个问题。我在csv中用双引号括住了所有的字段,并且如果该字段为空,那么csv将只包含空的两个双引号“” - 它假定它是转义字符,而我的数据导入命令不起作用。把ESCAPED BY''完成了这项工作。谢谢。 – Aakash
我的数据完全是rfc 4180,因为没有逃逸角色。如果在一个字段中有一个逗号,那么它必须用双引号括起来。在这种情况下是否使'ESCAPED BY''工作? – CMCDragonkai