Using csv
This can be done as follows:
from urllib.request import urlopen
import csv
import io

url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv'

# Download the file and decode the bytes to text
with urlopen(url) as response:
    csv_data = response.read().decode('utf-8')

# The file is semicolon-delimited
csv_input = csv.reader(io.StringIO(csv_data), delimiter=';')
header = next(csv_input)  # the first row holds the column names
print("Header is:", header)
data = list(csv_input)

# Display start of data
for row in data[:5]:
    print(row)
This would give you:
Header is: ['fixed acidity', 'volatile acidity', 'citric acid', 'residual sugar', 'chlorides', 'free sulfur dioxide', 'total sulfur dioxide', 'density', 'pH', 'sulphates', 'alcohol', 'quality']
['7.4', '0.7', '0', '1.9', '0.076', '11', '34', '0.9978', '3.51', '0.56', '9.4', '5']
['7.8', '0.88', '0', '2.6', '0.098', '25', '67', '0.9968', '3.2', '0.68', '9.8', '5']
['7.8', '0.76', '0.04', '2.3', '0.092', '15', '54', '0.997', '3.26', '0.65', '9.8', '5']
['11.2', '0.28', '0.56', '1.9', '0.075', '17', '60', '0.998', '3.16', '0.58', '9.8', '6']
['7.4', '0.7', '0', '1.9', '0.076', '11', '34', '0.9978', '3.51', '0.56', '9.4', '5']
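Note that every field comes back as a string. If you want numbers, here is a minimal sketch of one way to convert them. It uses a small inline sample (the `sample` string is made up for illustration and only mimics the layout of the file) instead of the download, and it assumes every column is numeric:

```python
import csv
import io

# Hypothetical inline sample standing in for the downloaded file
sample = (
    "fixed acidity;volatile acidity;quality\n"
    "7.4;0.7;5\n"
    "7.8;0.88;5\n"
)

reader = csv.reader(io.StringIO(sample), delimiter=';')
header = next(reader)  # column names from the first row

# Convert every field to float; this fails if a column is not numeric
rows = [[float(value) for value in row] for row in reader]

# Pair each value with its column name
records = [dict(zip(header, row)) for row in rows]
print(records[0])
```

Keeping the header separate from the rows means you stay in control of how the first line is treated.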
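Appreciated. But pandas automatically uses the first row as the index, and I would rather handle that myself.
You can add the parameter 'header=None' to make sure the first row is not turned into the column names (it becomes column names, not the index) – Mathias711
Great, that works. It looks like pandas is still the best choice for handling the data, although I was looking for my own way to get the same output.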
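For reference, the header=None suggestion might look roughly like this. It is only a sketch: the `sample` string is a made-up stand-in for the real file, and it assumes pandas is installed.

```python
import io

import pandas as pd

# Hypothetical inline sample standing in for the downloaded file
sample = (
    "fixed acidity;volatile acidity;quality\n"
    "7.4;0.7;5\n"
    "7.8;0.88;5\n"
)

# header=None keeps the first line as an ordinary data row;
# the columns are then labelled 0, 1, 2, ...
df = pd.read_csv(io.StringIO(sample), sep=';', header=None)

first_row = df.iloc[0].tolist()  # the would-be header, kept as data
print(first_row)
print(df.shape)
```

Because the first line stays in the data, each column mixes text and numbers and is read as strings, so you would still need to split off that row and convert the rest yourself.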