Plotting flaws in Matplotlib

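A scatter plot of x, y coordinates suggests that plots made in Matplotlib differ from plots obtained with other programs. For example, below are the results of a PCA on two fitness scores. The same figure drawn with R from the same data looks different. I also checked with Excel and LibreOffice: they give the same display as R. Before ranting against Matplotlib or reporting a bug, I would like to get other opinions and check whether I did things correctly. What is my mistake?

I checked that the floats are fine, that the coordinate order is the same, and so on. So here is the R plot: [image: plot made by R]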

mydata = read.csv("C:/Users/Anon/Desktop/data.txt") # read csv file 
summary(mydata) 
attach(mydata) 
plot(mydata)

[Image: scatter plot with Matplotlib]

The same data plotted with Matplotlib:

import matplotlib.pyplot as mpl
import numpy as np
import os
# open the file with PCA results and convert every field to float
file_data = os.path.join(os.getcwd(), "data.txt")
with open(file_data, 'r') as F:
    DATA = F.readlines()
# split each line into its comma-separated fields
for x in range(len(DATA)):
    DATA[x] = DATA[x].split(',')
for i in range(len(DATA)):
    for j in range(len(DATA[i])):
        DATA[i][j] = float(DATA[i][j])
print(DATA[0])
X_train = np.mat(DATA)
print("X_train\n", X_train)

mpl.scatter(X_train[:, 0], X_train[:, 1], c='white')
mpl.show()

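[Image: scatter plot made by Matplotlib]

And the result of printing X_train (so you can verify that the data are the same): [image]

With Excel: [image]

Data (I cannot post all of it; please tell me how to attach a *.txt file of about 40.5 KB):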

0.02753547770433 -0.037999362802379 
0.05179194064903 0.0257492713593311 
-0.0272928319004863 0.0065143681863637 
0.0891355504379135 -0.00801696955147688 
0.0946809371499167 -0.00502202338807476 
-0.0445799941736001 -0.0435759273767196 
-0.333617999778119 -0.204222004815357 
-0.127212025425053 -0.110264460064754 
-0.0243459270896855 -0.0622273166478512 
0.0497080821876597 0.0272080474151131 
-0.181221703468915 -0.134945934382777 
-0.0699503258694739 -0.0835239795690277

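Edit: I also exported the PCA data (from SciPy) to a text file and opened that same text file with both Python/Matplotlib and R, to rule out problems with the PCA itself. The plots are made after the PCA processing (the plot before PCA looks like a dome).

Edit 2: With numpy.loadtxt() the plot looks the same as in R. But my custom method and numpy.loadtxt() give data with the same shape, size, type and values, so what mechanism is involved?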

X_train numpy.loadtxt() 
[[ 0.02753548 -0.03799936] 
[ 0.05179194 0.02574927] 
[-0.02729283 0.00651437] 
..., 
[ 0.02670961 -0.00696177] 
[ 0.09011859 -0.00661216] 
[-0.04406559 0.09285291]] 
shape and size 
(1039L, 2L) 2078 

X_train custom-method 
[[ 0.02753548 -0.03799936] 
[ 0.05179194 0.02574927] 
[-0.02729283 0.00651437] 
..., 
[ 0.02670961 -0.00696177] 
[ 0.09011859 -0.00661216] 
[-0.04406559 0.09285291]] 
shape and size 
(1039L, 2L) 2078

Source

2013-04-18 sol

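What function did you use in R?

"Here are the results of some PCA on two fitness scores": I am sure the problem is in the PCA (or in your input), not in the plotting. Can you provide a reproducible example?

Can you post the data?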

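The problem is that you are representing X_train as a matrix rather than as a 2-dimensional array. This means that when you subset it with X_train[:, 0], you do not get a 1-dimensional array; you get a matrix with a single column (which matplotlib then tries to scatter). You can see this for yourself by printing X_train[:, 0].*

You can fix this simply by changing the line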

X_train = np.mat(DATA)

to

X_train = np.array(DATA)

*For example, with the data you posted, X_train[:, 0] is:

[[ 0.02753548] 
[ 0.05179194] 
[-0.02729283] 
[ 0.08913555] 
[ 0.09468094] 
[-0.04457999] 
[-0.333618 ] 
[-0.12721203] 
[-0.02434593] 
[ 0.04970808] 
[-0.1812217 ] 
[-0.06995033]]
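To make the difference concrete, here is a minimal sketch using the first three rows of the posted data. It uses np.asmatrix, which builds the same np.matrix type that np.mat builds in the question (np.mat is an alias that newer NumPy releases no longer provide). The variable names are only for illustration.

```python
import numpy as np

# First three rows of the data posted in the question.
rows = [
    [0.02753547770433, -0.037999362802379],
    [0.05179194064903, 0.0257492713593311],
    [-0.0272928319004863, 0.0065143681863637],
]

as_matrix = np.asmatrix(rows)  # same type that np.mat(DATA) produces
as_array = np.array(rows)      # plain 2-D ndarray

# Slicing one column out of a matrix keeps it 2-D (a column matrix).
matrix_column = as_matrix[:, 0]
# Slicing one column out of a 2-D array drops to 1-D.
array_column = as_array[:, 0]

print(matrix_column.shape)  # (3, 1)
print(array_column.shape)   # (3,)

# If the data is already a matrix, one way to get a 1-D column back
# is to convert to ndarray and flatten it.
flattened = np.asarray(matrix_column).ravel()
print(flattened.shape)      # (3,)
```

Passing 1-D columns such as array_column or flattened to scatter avoids handing it a column matrix.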

Source

2013-04-18 19:00:15

Thanks for the clear explanation! I will look into the difference between a matrix and a 2-D array. Best, Sol ;-) – sol

It seems to me that the problem is the code you use to read in the array. You end up with the wrong dimensions. Try using numpy.loadtxt instead: http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html
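A minimal sketch of that suggestion, assuming the file is comma-separated (the question's code splits each line on ','). The three rows are copied from the posted data, and io.StringIO stands in for data.txt only so the snippet is self-contained.

```python
import io

import numpy as np

# Stand-in for data.txt; assumed to be comma-separated like the
# file the question's code parses with split(',').
sample = io.StringIO(
    "0.02753547770433,-0.037999362802379\n"
    "0.05179194064903,0.0257492713593311\n"
    "-0.0272928319004863,0.0065143681863637\n"
)

# loadtxt parses the text and returns a plain ndarray of floats.
X_train = np.loadtxt(sample, delimiter=",")

print(type(X_train).__name__)  # ndarray
print(X_train.shape)           # (3, 2)
print(X_train[:, 0].shape)     # (3,)
```

With a real file, the path would be passed in place of the StringIO object, and the columns X_train[:, 0] and X_train[:, 1] are already 1-D.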

Source

2013-04-18 18:16:29

Works great! But what mechanism could be involved, given that when I print the two data tables they have the same shape, type and values? Is it in the conversion to a numpy matrix? X_train numpy.loadtxt() [[ 0.02753548 -0.03799936] [ 0.05179194 0.02574927] [-0.02729283 0.00651437] ..., [ 0.02670961 -0.00696177] [ 0.09011859 -0.00661216] [-0.04406559 0.09285291]] (1039L, 2L) 2078 X_train custom method [[ 0.02753548 -0.03799936] [ 0.05179194 0.02574927] [-0.02729283 0.00651437] ..., [ 0.02670961 -0.00696177] [ 0.09011859 -0.00661216] [-0.04406559 0.09285291]] (1039L, 2L) 2078 – sol

@sol: See my answer for an explanation of why. It is because it is a matrix rather than a 2-D array, and subsetting a matrix gives you another matrix rather than a 1-D array.
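One way to spot this kind of mismatch is to compare the container type and the shape of a single column, since shape and size alone are identical in both cases. The sketch below is only illustrative: describe is a hypothetical helper, and the three rows are the rounded values from the printout above.

```python
import numpy as np

# Rounded values taken from the printed X_train above.
rows = [
    [0.02753548, -0.03799936],
    [0.05179194, 0.02574927],
    [-0.02729283, 0.00651437],
]


def describe(name, data):
    """Hypothetical helper: report type, shape, size and column shape."""
    first_column = data[:, 0]
    summary = (type(data).__name__, data.shape, data.size, first_column.shape)
    print(name, summary)
    return summary


# The custom method wrapped the rows in a matrix; loadtxt returns an ndarray.
from_custom = describe("custom method", np.asmatrix(rows))
from_loadtxt = describe("loadtxt-style", np.array(rows))
```

Both summaries report shape (3, 2) and size 6; only the type name and the shape of the extracted column differ.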
