2014-12-05

read.table: line 15 did not have 23 elements - R

Here is the code I'm using:

d = read.table("Movies.txt", 
      sep="\t", 
      col.names=c("id", "name", "date", "link", "c1", "c2", "c3","c4", "c5", "c6","c7", "c8", "c9","c10", "c11", "c12","c13", "c14", "c15","c16", "c17", "c18", "c19"), 
      fill=FALSE, 
      strip.white=TRUE) 

And here is the text file:

1 Toy Story (1995) 01-Jan-95 http://us.imdb.com/M/title-exact?Toy%20Story%20(1995) 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 
2 GoldenEye (1995) 01-Jan-95 http://us.imdb.com/M/title-exact?GoldenEye%20(1995) 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
3 Four Rooms (1995) 01-Jan-95 http://us.imdb.com/M/title-exact?Four%20Rooms%20(1995) 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
4 Get Shorty (1995) 01-Jan-95 http://us.imdb.com/M/title-exact?Get%20Shorty%20(1995) 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 
5 Copycat (1995) 01-Jan-95 http://us.imdb.com/M/title-exact?Copycat%20(1995) 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 
6 Shanghai Triad (Yao a yao yao dao waipo qiao) (1995) 01-Jan-95 http://us.imdb.com/Title?Yao+a+yao+yao+dao+waipo+qiao+(1995) 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 
7 Twelve Monkeys (1995) 01-Jan-95 http://us.imdb.com/M/title-exact?Twelve%20Monkeys%20(1995) 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 
8 Babe (1995) 01-Jan-95 http://us.imdb.com/M/title-exact?Babe%20(1995) 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 
9 Dead Man Walking (1995) 01-Jan-95 http://us.imdb.com/M/title-exact?Dead%20Man%20Walking%20(1995) 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 
10 Richard III (1995) 22-Jan-96 http://us.imdb.com/M/title-exact?Richard%20III%20(1995) 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 
11 Seven (Se7en) (1995) 01-Jan-95 http://us.imdb.com/M/title-exact?Se7en%20(1995) 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 
12 "Usual Suspects, The (1995)" 14-Aug-95 "http://us.imdb.com/M/title-exact?Usual%20Suspects,%20The%20(1995)" 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 
13 Mighty Aphrodite (1995) 30-Oct-95 http://us.imdb.com/M/title-exact?Mighty%20Aphrodite%20(1995) 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 
14 "Postino, Il (1994)" 01-Jan-94 "http://us.imdb.com/M/title-exact?Postino,%20Il%20(1994)" 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 
15 Mr. Holland's Opus (1995) 29-Jan-96 http://us.imdb.com/M/title-exact?Mr.%20Holland's%20Opus%20(1995) 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 
16 French Twist (Gazon maudit) (1995) 01-Jan-95 http://us.imdb.com/M/title-exact?Gazon%20maudit%20(1995) 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 
17 From Dusk Till Dawn (1996) 05-Feb-96 http://us.imdb.com/M/title-exact?From%20Dusk%20Till%20Dawn%20(1996) 0 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 0 0 
18 "White Balloon, The (1995)" 01-Jan-95 http://us.imdb.com/M/title-exact?Badkonake%20Sefid%20(1995) 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 
19 Antonia's Line (1995) 01-Jan-95 http://us.imdb.com/M/title-exact?Antonia%20(1995) 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 
20 Angels and Insects (1995) 01-Jan-95 http://us.imdb.com/M/title-exact?Angels%20and%20Insects%20(1995) 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 

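Line 15 clearly has 23 elements. This is easier to see in a text editor, where the fields are separated by tabs. Why am I getting this error message?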

Try removing line 15 and see whether it then fails on Antonia's Line. I suspect it's the apostrophe. – JasonAizkalns 2014-12-05 18:03:08

I only get a warning message. However, No. 19 should be "Antonia's Line (1995)", but instead I get the whole rest of the file as the name, like this: Antonias Line (1995)\t01-Jan-95\thttp://us.imdb.com/M/title-exact?Antonia%20(1995)\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n20\tAngels and Insects (1995)\t01-Jan-95\thttp://us.imdb.com/M/title-exact?Angels%20and%20Insects%20(1995)\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0 – 2014-12-05 18:07:52

The rest works fine. So Antonia's Line is clearly a problem. But I can't tell whether it is the same problem as with "Mr. Holland's Opus". – 2014-12-05 18:09:20

Answer

Disable quoting by adding the argument quote = "" to read.table().

I think the problem is the mix of single (') and double (") quote characters throughout the file.
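One way to check this suspicion is sketched below, under stated assumptions: the three rows are a hypothetical, shortened sample modeled on rows 15, 18 and 19 (five columns instead of 23), written to a temporary file. count.fields() reports the number of fields per line under a given quoting rule, so comparing its default quoting with quote = "" should show whether the apostrophes are what changes the counts.

```r
# Hypothetical five-column sample modeled on rows 15, 18 and 19 of the file
rows <- c(
  "15\tMr. Holland's Opus (1995)\t29-Jan-96\thttp://us.imdb.com/M/title-exact?Mr.%20Holland's%20Opus%20(1995)\t0",
  "18\t\"White Balloon, The (1995)\"\t01-Jan-95\thttp://us.imdb.com/M/title-exact?Badkonake%20Sefid%20(1995)\t0",
  "19\tAntonia's Line (1995)\t01-Jan-95\thttp://us.imdb.com/M/title-exact?Antonia%20(1995)\t1"
)
tmp <- tempfile(fileext = ".txt")
writeLines(rows, tmp)

# Default quoting: both ' and " can open a quoted string
with_quotes <- count.fields(tmp, sep = "\t")

# Quoting disabled: ' and " are treated as ordinary characters
without_quotes <- count.fields(tmp, sep = "\t", quote = "")

print(with_quotes)
print(without_quotes)
```

With quoting disabled, every line of the sample counts as five fields. Under the default quoting the counts should come out differently, because each apostrophe is read as the start of a quoted string.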

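See ?read.table for more information. According to the documentation, you can also look at ?scan for how quotes embedded inside quoted strings are handled.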
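A minimal sketch of the suggested fix follows. It assumes a hypothetical three-row, five-column sample written to a temporary file rather than the real Movies.txt, and adds stringsAsFactors = FALSE so the names stay as plain character strings. The second call is an alternative not mentioned above: quote = "\"" keeps double-quote handling but ignores apostrophes.

```r
# Hypothetical five-column sample modeled on rows 15, 18 and 19 of the file
rows <- c(
  "15\tMr. Holland's Opus (1995)\t29-Jan-96\thttp://us.imdb.com/M/title-exact?Mr.%20Holland's%20Opus%20(1995)\t0",
  "18\t\"White Balloon, The (1995)\"\t01-Jan-95\thttp://us.imdb.com/M/title-exact?Badkonake%20Sefid%20(1995)\t0",
  "19\tAntonia's Line (1995)\t01-Jan-95\thttp://us.imdb.com/M/title-exact?Antonia%20(1995)\t1"
)
tmp <- tempfile(fileext = ".txt")
writeLines(rows, tmp)

cols <- c("id", "name", "date", "link", "c1")  # shortened column list (assumption)

# Fix from the answer: disable quoting entirely
d <- read.table(tmp, sep = "\t", quote = "", col.names = cols,
                fill = FALSE, strip.white = TRUE, stringsAsFactors = FALSE)

# Alternative: treat only " as a quote character, so apostrophes are literal
d_dq <- read.table(tmp, sep = "\t", quote = "\"", col.names = cols,
                   fill = FALSE, strip.white = TRUE, stringsAsFactors = FALSE)

print(d$name)
print(d_dq$name)
```

With quote = "" the double quotes around titles such as "White Balloon, The (1995)" stay in the data as literal characters; with quote = "\"" they are stripped while apostrophes are still read as ordinary text.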