2013-05-22

Reaching the memory limit in R

I have a strange problem in R.

I have a large data.table, dataTs1:

Classes ‘data.table’ and 'data.frame': 419172 obs. of 5 variables: 
$ TimeStamp: chr "01MAR13:07:15:00" "01MAR13:07:16:00" "01MAR13:07:18:00" ... 
$ col1  : chr "ALL1" "ALL1" "ALL1" "ALL1" ... 
$ col2  : int NA NA NA NA NA NA NA NA NA NA ... 
$ col3  : int 4 4 4 4 4 4 4 4 4 4 ... 
$ col4  : int 621 810 4 4 8 1 3 1 1 1 ... 

I loaded this table using the fread function.

Memory allocation seems fine:

> memory.size(max=TRUE) 
[1] 82.94 

I want to change the class of the first column to POSIX, so I wrote:

dataTs1$TimeStamp <- strptime(dataTs1$TimeStamp, "%d%b%y:%H:%M:%S")

With this line I hit my 16 GB memory limit... But when I write:

test <- 1:length(dataTs1$TimeStamp) 
dataTs1$TimeStamp <- test 

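it works perfectly, without any memory overload.

I am quite new to R, and I would appreciate it if you could help me figure out what I am doing wrong here.

Thanks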


Edit:

Actually, when I don't get a memory overload, I sometimes get a strange warning instead:

>dataTs1[,TimeStamp:=strptime(TimeStamp,"%d%b%y:%H:%M:%S")] 
Warning messages: 
1: In `[<-.data.table`(x, j = name, value = value) : 
    Supplied 9 items to be assigned to 419172 items of column 'TimeStamp' (recycled leaving remainder of 6 items). 
2: In `[<-.data.table`(x, j = name, value = value) : 
    Coerced 'list' RHS to 'character' to match the column's type. Either change the target column to 'list' first (by creating a new 'list' vector length 419172 (nrows of entire table) and assign that; i.e. 'replace' column), or coerce RHS to 'character' (e.g. 1L, NA_[real|integer]_, as.*, etc) to make your intent clear and for speed. Or, set the column type correctly up front when you create the table and stick to it, please. 
> str(dataTs1) 
Classes ‘data.table’ and 'data.frame': 419172 obs. of 5 variables: 
$ TimeStamp: chr "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ ... 
$ V6FCDSB : chr "ALL1" "ALL1" "ALL1" "ALL1" ... 
$ V6FCDTD : int NA NA NA NA NA NA NA NA NA NA ... 
$ _TYPE_ : int 4 4 4 4 4 4 4 4 4 4 ... 
$ N  : int 621 810 4 4 8 1 3 1 1 1 ... 
- attr(*, ".internal.selfref")=<externalptr> 
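
A note on the warning above: strptime returns a POSIXlt object, which is stored internally as a list of date-time components rather than one value per row. That would be consistent with the "Supplied 9 items" message and with the 'list' to 'character' coercion. The following is only a minimal sketch, not a confirmed fix: it assumes the month abbreviations parse in the current locale, picks tz = "UTC" arbitrarily, and uses a small made-up table dt in place of dataTs1. It converts to POSIXct, which holds a single number per row.

```r
library(data.table)

# Hypothetical stand-in for dataTs1, using the same timestamp format
dt <- data.table(TimeStamp = c("01MAR13:07:15:00", "01MAR13:07:16:00"))

# strptime() yields POSIXlt: a list of components (sec, min, hour, ...)
lt <- strptime(dt$TimeStamp, "%d%b%y:%H:%M:%S", tz = "UTC")
is.list(unclass(lt))

# as.POSIXct() yields one numeric value per row, which fits in a column
dt[, TimeStamp := as.POSIXct(TimeStamp, format = "%d%b%y:%H:%M:%S", tz = "UTC")]
str(dt)
```

If the column shows NA after conversion, the locale may not recognise "MAR" as a month abbreviation, or the time zone assumption may not suit the data.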

Which version of R are you using? There used to be a memory leak in 'strptime'. – James

You should assign by reference: 'dataTs1[, TimeStamp := strptime(TimeStamp, "%d%b%y:%H:%M:%S")]' – Roland

@James I am using version 3.0.0 –

Answers