2015-06-12 64 views
-2

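Index values do not match in R

I summarized the table before making any changes. Then I extracted the data, first excluding rows with empty values and then rows with "I or II NOS", and assigned the results to a1 and a2 respectively. a1 holds the correct data, but a2 still shows 4 "I or II NOS" rows. When I index the original table for "I or II NOS", it returns 10 rows, yet 4 of those rows do not have the value "I or II NOS". How can this happen? Can anyone help me? I don't have enough reputation to post a screenshot of the results, so I am pasting only the code. Thanks in advance.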

a1 = a[AJCC_PATHOLOGIC_TUMOR_STAGE!='',] 

a2 = a1[AJCC_PATHOLOGIC_TUMOR_STAGE!='I or II NOS',] 

Sorry, I have updated the question and pasted the whole code.

library("cgdsr", lib.loc="~/R/win-library/3.1") 
library("R.oo", lib.loc="~/R/win-library/3.1") 
library("R.methodsS3", lib.loc="~/R/win-library/3.1") 
# Create CGDS object 
mycgds = CGDS("http://www.cbioportal.org/public-portal/") 
test(mycgds) 
# Get list of cancer studies at server 
getCancerStudies(mycgds)[, c(1,2)] 

mycancerstudy = getCancerStudies(mycgds)[78,1] 
# Get available case lists (collection of samples) for a given cancer study 
getCaseLists(mycgds,mycancerstudy)[,1] 

mycaselist = getCaseLists(mycgds,mycancerstudy)[2,1] 

# Get available genetic profiles 
getGeneticProfiles(mycgds,mycancerstudy)[,1] 

mygeneticprofile = getGeneticProfiles(mycgds,mycancerstudy)[2,1] 

# Get clinical data for the case list 
myclinicaldata = getClinicalData(mycgds,mycaselist) 

# skcm_tcga_rna_seq_v2_mrna_median_Zscores 
z_score_caselist = getCaseLists(mycgds,mycancerstudy)[7,1] 

# Get data slices for a specified list of genes, genetic profile and case list 
WNT5A = getProfileData(mycgds,c('WNT5A'),mygeneticprofile,mycaselist) 

# documentation 
help('cgdsr') 
help('CGDS') 

WNT5A_stage = merge(WNT5A,myclinicaldata, by = 'row.names') 
WNT5A_stage_table = WNT5A_stage[, c(2, 6)] 
a = na.omit(WNT5A_stage_table) 
a1 = a[a$AJCC_PATHOLOGIC_TUMOR_STAGE!=''] 
a2 = a1[AJCC_PATHOLOGIC_TUMOR_STAGE!='I or II NOS',] 

Here is an update with part of the results. You can see that the values returned do not match the index condition.

> a1[AJCC_PATHOLOGIC_TUMOR_STAGE=='I or II NOS',]
         WNT5A AJCC_PATHOLOGIC_TUMOR_STAGE
8     712.1645 I or II NOS
28      7.5434 I or II NOS
33      3.6290 I or II NOS
34      8.7881 I or II NOS
38    150.3167 I or II NOS
47     34.3643 I or II NOS
180    19.1529 Stage IB
304    20.1072 Stage IIC
324    44.0167 Stage IB
337 19142.6676 Stage IIIC
+2

You need to provide some sample data and a reproducible example.

+0

However, it looks like this is a standard data frame and you are not actually subsetting it properly. You need `a2 <- a1[a1$AJCC_PATHOLOGIC_TUMOR_STAGE != 'I or II NOS', ]`, or use `subset`.

+0

Sample data, code, and the expected and actual output are needed to fully help you solve the problem.

Answer

0

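As my comment pointed out, you are not subsetting with the column from the new data frame. The bare name AJCC_PATHOLOGIC_TUMOR_STAGE does not refer to the column of a1, so the logical index does not line up with the rows of a1. You need: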

a2 = a1[a1$AJCC_PATHOLOGIC_TUMOR_STAGE!='I or II NOS',]

or, equivalently, with subset:

a2 = subset(a1, AJCC_PATHOLOGIC_TUMOR_STAGE != 'I or II NOS')
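
To illustrate why the mismatch appears, here is a minimal sketch on made-up data. The question does not show how a bare AJCC_PATHOLOGIC_TUMOR_STAGE became visible in the session, so the free-standing vector below (as might be left behind by `attach()` or an earlier assignment) is an assumption, and the toy values are hypothetical. The point is that a logical index built from the longer, unfiltered vector is applied by position to the shorter, filtered data frame.

```r
# Hypothetical toy data standing in for the merged table `a`
a <- data.frame(
  WNT5A = c(1.5, 2.5, 3.5, 4.5, 5.5),
  AJCC_PATHOLOGIC_TUMOR_STAGE = c("I or II NOS", "", "Stage IB",
                                  "I or II NOS", "Stage IIC"),
  stringsAsFactors = FALSE
)

# Assumption: a free-standing vector with the same name as the column
# exists in the session and still reflects the unfiltered table `a`
AJCC_PATHOLOGIC_TUMOR_STAGE <- a$AJCC_PATHOLOGIC_TUMOR_STAGE

# Dropping the empty value removes row 2, so a1 has 4 rows, not 5
a1 <- a[a$AJCC_PATHOLOGIC_TUMOR_STAGE != "", ]

# Misaligned: the bare name is the 5-element vector, so its TRUE/FALSE
# positions no longer correspond to the rows of a1
wrong <- a1[AJCC_PATHOLOGIC_TUMOR_STAGE == "I or II NOS", ]

# Aligned: the condition is evaluated on a1's own column
right <- a1[a1$AJCC_PATHOLOGIC_TUMOR_STAGE == "I or II NOS", ]

# The filter from the answer, which removes every "I or II NOS" row
a2 <- a1[a1$AJCC_PATHOLOGIC_TUMOR_STAGE != "I or II NOS", ]

print(wrong)  # contains a "Stage IIC" row despite the == condition
print(right)
print(a2)
```

Rows before the first removed row still line up, which would be consistent with the first rows of the output in the question being correct and only the later ones being wrong. `subset()` avoids the issue because it looks up the column name inside the data frame it is given before looking anywhere else.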