Trying to quickly read in MTA turnstile data from the following page: EOF error?
http://web.mta.info/developers/turnstile.html
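I had planned to loop over the files linked on that page with fread or download.file, then store and bind the data, but for some of the files I get an error. Here are two examples, one that works and one that doesn't. I noticed that the second file looks a bit different: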
test_mta_works = fread("http://web.mta.info/developers/data/nyct/turnstile/turnstile_161224.txt", sep = ',')
test_mta_wont_work = fread("http://web.mta.info/developers/data/nyct/turnstile/turnstile_140419.txt", sep = ',')
The error I get for the second one:
Error in fread("http://web.mta.info/developers/data/nyct/turnstile/turnstile_140419.txt", :
Expected sep (',') but new line, EOF (or other non printing character) ends field 12 when detecting types from point 0: A002,R051,02-00-00,04-18-14,16:00:00,REGULAR,004575433,001558298,04-18-14,20:00:00,REGULAR,004575838,001558374
Any ideas what the problem might be and/or how to fix it? I tried using fill = T, but it caused problems with the data.
Thanks!
Edit
With fill = T I get the following output:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20
1: A002 R051 02-00-00 04-12-14 00:00:00 REGULAR 4566812 1555499 04-12-14 04:00:00 REGULAR 4566850 1555508 04-12-14 08:00:00 REGULAR 4566875 1555536 04-12-14 12:00:00
2: A002 R051 02-00-00 04-13-14 08:00:00 REGULAR 4567968 1555789 04-13-14 12:00:00 REGULAR 4568069 1555842 04-13-14 16:00:00 REGULAR 4568278 1555903 04-13-14 20:00:00
3: A002 R051 02-00-00 04-14-14 16:00:00 REGULAR 4569148 1556362 04-14-14 20:00:00 REGULAR 4569786 1556420 04-15-14 00:00:00 REGULAR 4569949 1556447 04-15-14 04:00:00
4: A002 R051 02-00-00 04-16-14 00:00:00 REGULAR 4571423 1556965 04-16-14 04:00:00 REGULAR 4571442 1556966 04-16-14 08:00:00 REGULAR 4571486 1557049 04-16-14 12:00:00
5: A002 R051 02-00-00 04-17-14 08:00:00 REGULAR 4573294 1557587 04-17-14 12:00:00 REGULAR 4573469 1557848 04-17-14 16:00:00 REGULAR 4573800 1557901 04-17-14 20:00:00
6: A002 R051 02-00-00 04-18-14 16:00:00 REGULAR 4575433 1558298 04-18-14 20:00:00 REGULAR 4575838 1558374 NA NA
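The output above suggests that each line of the older file holds three key fields followed by repeating groups of five fields, with no header row. A minimal sketch of turning that wide layout into one row per reading follows. It assumes the five fields in each group correspond to DATE, TIME, DESC, ENTRIES and EXITS (names borrowed from the newer file's header, not confirmed for the older file), and it uses two illustrative lines built from values shown above rather than the real download.

```r
library(data.table)

# Illustrative lines in the older wide layout (values taken from the output above).
sample_txt <- c(
  "A002,R051,02-00-00,04-18-14,16:00:00,REGULAR,004575433,001558298,04-18-14,20:00:00,REGULAR,004575838,001558374",
  "A002,R051,02-00-00,04-12-14,00:00:00,REGULAR,4566812,1555499"
)

# Read everything as character so padded fields are easy to spot.
wide <- fread(text = sample_txt, sep = ",", header = FALSE, fill = TRUE,
              colClasses = "character")

key_cols   <- 3L   # assumed: C/A, UNIT, SCP
group_size <- 5L   # assumed: DATE, TIME, DESC, ENTRIES, EXITS
n_groups   <- (ncol(wide) - key_cols) %/% group_size

# Stack each five-field group under the same column names.
long <- rbindlist(lapply(seq_len(n_groups), function(g) {
  idx  <- key_cols + (g - 1L) * group_size + seq_len(group_size)
  part <- wide[, c(seq_len(key_cols), idx), with = FALSE]
  setnames(part, c("C/A", "UNIT", "SCP", "DATE", "TIME", "DESC", "ENTRIES", "EXITS"))
  part
}))

# Drop the padding that fill = TRUE adds for shorter lines, then convert counts.
long <- long[!is.na(DATE) & DATE != ""]
long[, c("ENTRIES", "EXITS") := lapply(.SD, as.numeric),
     .SDcols = c("ENTRIES", "EXITS")]
```

With the real file, `sample_txt` would be replaced by the URL passed to `fread()`; the group size and column names should be checked against the MTA field description before relying on the result.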
Meanwhile, the first file, which does not need fill = T, gives me the following:
C/A UNIT SCP STATION LINENAME DIVISION DATE TIME DESC ENTRIES EXITS
1: A002 R051 02-00-00 59 ST NQR456W BMT 12/17/2016 03:00:00 REGULAR 5967477 2022101
2: A002 R051 02-00-00 59 ST NQR456W BMT 12/17/2016 07:00:00 REGULAR 5967485 2022116
3: A002 R051 02-00-00 59 ST NQR456W BMT 12/17/2016 11:00:00 REGULAR 5967553 2022233
4: A002 R051 02-00-00 59 ST NQR456W BMT 12/17/2016 15:00:00 REGULAR 5967790 2022331
5: A002 R051 02-00-00 59 ST NQR456W BMT 12/17/2016 19:00:00 REGULAR 5968186 2022421
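Since the two files differ in whether they start with a header row, one possible way to loop over many files is to inspect the first line and pick the `fread()` options from it. The sketch below is only an illustration: `read_turnstile` is a hypothetical helper, the check for a leading `C/A` is an assumption based on the header shown above, and temporary files with a few sample lines stand in for the real downloads.

```r
library(data.table)

# Hypothetical helper: choose fread() options from the first line of a file.
read_turnstile <- function(path) {
  first_line <- readLines(path, n = 1L, warn = FALSE)
  has_header <- startsWith(first_line, "C/A")   # assumed marker of the newer layout
  fread(path, sep = ",", header = has_header, fill = !has_header)
}

# Stand-in files built from lines shown above (not the real downloads).
new_file <- tempfile(fileext = ".txt")
old_file <- tempfile(fileext = ".txt")
writeLines(c(
  "C/A,UNIT,SCP,STATION,LINENAME,DIVISION,DATE,TIME,DESC,ENTRIES,EXITS",
  "A002,R051,02-00-00,59 ST,NQR456W,BMT,12/17/2016,03:00:00,REGULAR,5967477,2022101"
), new_file)
writeLines(c(
  "A002,R051,02-00-00,04-18-14,16:00:00,REGULAR,004575433,001558298,04-18-14,20:00:00,REGULAR,004575838,001558374",
  "A002,R051,02-00-00,04-12-14,00:00:00,REGULAR,4566812,1555499"
), old_file)

tables <- lapply(c(new_file, old_file), read_turnstile)
```

The two layouts have different columns, so the resulting tables cannot be bound directly; the older ones would first need reshaping into the newer column set.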
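What problems do you see in the data when using `fill = T`? I was able to read the data using the `fill` argument and it looks fine to me. – krish
Added the output above as an edit – LoF10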
test_mta_wont_work = fread("http://web.mta.info/developers/data/nyct/turnstile/turnstile_140419.txt", sep = ',', fill = TRUE, na.strings = c("", "NA")) – krish