2014-03-31 13 views

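I am experimenting with R code and the snow package. I have a function of my own that I would like to use with clusterApply from snow.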

imp <- function(x, y)

How can I use clusterApply with this function?

cl <- makeCluster(c("localhost","localhost"), type = "SOCK") 
clusterApply(cl, 1:6, get("+"), 3) 
stopCluster(cl) 

Instead of this, I want to use my own function:

cl <- makeCluster(c("localhost","localhost"), type = "SOCK") 
clusterApply(cl, imp(dataset,3), 3) 
stopCluster(cl) 

Suppose this is my function. How can I run it on a parallel, distributed system?

impap<-function(x,y) 
{ 
data<-as(x,"matrix") 

t<-data+y 

print(t) 
} 
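For reference, here is a minimal sketch of passing a user-defined function to clusterApply. It makes several assumptions that are not in the question: it uses the parallel package bundled with R (whose makeCluster and clusterApply mirror snow's), the input matrix `m` is made-up example data, as.matrix stands in for as(x, "matrix"), and the function returns its result instead of only printing it. clusterApply calls the function once per list element, so the data is first split into row chunks.

```r
library(parallel)

# Variant of the function above that returns the result so the master can collect it
impap <- function(x, y) {
  data <- as.matrix(x)
  data + y
}

m <- matrix(1:12, nrow = 6)              # hypothetical example data

cl <- makeCluster(2)                     # two workers on the local machine
idx <- splitIndices(nrow(m), length(cl)) # one block of row indices per worker
chunks <- lapply(idx, function(i) m[i, , drop = FALSE])

# Each worker evaluates impap(chunk, 3); extra arguments follow the function
res <- clusterApply(cl, chunks, impap, 3)
stopCluster(cl)

result <- do.call(rbind, res)            # reassemble the chunks in their original order
```

The call in the question, clusterApply(cl, imp(dataset,3), 3), evaluates imp on the master and then passes 3 where a function is expected; the function itself has to be the third argument.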

Are you trying to run it across multiple machines, or in parallel on multiple cores of the same machine? –


Multiple machines.. – vct


@LucasFortini I think the makeCluster part will change when I use multiple machines.. am I right? – vct
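As a rough illustration of that point, the sketch below assumes a socket cluster created with the parallel package bundled with R (snow's makeCluster takes a host vector in the same way, with type = "SOCK"). Only the host vector should need to change between one machine and several. The names "node1" and "node2" in the comment are hypothetical, and remote hosts would typically need R installed and password-less ssh access from the master.

```r
library(parallel)

# Two workers on this machine; for a real cluster this might instead be
# c("node1", "node2") (hypothetical host names)
hosts <- c("localhost", "localhost")

cl <- makeCluster(hosts, type = "PSOCK")

# Same call shape as in the question: each worker computes x + 3
res <- clusterApply(cl, 1:6, function(x, y) x + y, 3)
stopCluster(cl)
```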

Answers

1

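I tend to prefer snowfall for parallel and distributed computing. Here is generic code that parallelizes well in both cases (a cluster or a single multi-core machine) and can also write a log file for each instance, for better progress and error tracking.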

rm(list = ls()) #remove all past worksheet variables 
n_projs=5 #this is the number of iterations. Each of them gets sent to an available CPU core 
proj_name_root="model_run_test" 
proj_names=paste0(proj_name_root,"__",c(1:n_projs)) 

#FUNCTION TO RUN 
project_exec=function(proj_name){ 
    cat('starting with', proj_name, '\n') 
    ##ADD CODE HERE 
    cat('done with ', proj_name, '\n') 
} 

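library(snowfall) 
# Init Snowfall with settings from sfCluster 
# NUMBER_OF_PROCESSORS is a Windows environment variable; on other systems this yields NA, so set cpucores by hand there 
cpucores=as.integer(Sys.getenv('NUMBER_OF_PROCESSORS')) 

#TWO WAYS TO RUN (CLUSTER OR SINGLE MACHINE) 
##BELOW IS THE CODE IF YOU ARE RUNNING ON A CLUSTER; COMMENT IT OUT FOR A SINGLE MACHINE 
hosts=c(commandArgs(TRUE)) #character vector of computer names in the cluster, taken from the command-line arguments 
sfInit(socketHosts=hosts, parallel=TRUE, cpus=cpucores, type="SOCK", slaveOutfile="/home/cluster_user/output.log") 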

##BELOW IS THE CODE IF YOU ARE RUNNING PARALLEL IN THE SAME MACHINE (MULTI CORE) 
#sfInit(parallel=T, cpus=cpucores) #This is where you would need to configure snowfall to create a cluster with the AWS instances 

#sfLibrary(sp) ##import libraries used in your function here into your snowfall instances 
sfExportAll() 
all_reps=sfLapply(proj_names,fun=project_exec) 
sfRemoveAll() 
sfStop() 
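To make the template more concrete, here is a small sketch of one way to fill it in on a single machine, with a log file per instance. The log directory (tempdir()), the worker count, and the placeholder computation are assumptions for illustration only; your own model code would go where nchar() is called.

```r
library(snowfall)

working_dir <- tempdir()                          # hypothetical log location
proj_names <- paste0("model_run_test__", 1:3)

project_exec <- function(proj_name) {
  # Send this instance's console output to its own log file
  log_con <- file(file.path(working_dir, paste0(proj_name, "_log.txt")), open = "wt")
  sink(log_con)
  on.exit({ sink(); close(log_con) })             # stop the sink even if an error occurs
  cat("starting with", proj_name, "\n")
  result <- nchar(proj_name)                      # placeholder for the real work
  cat("done with", proj_name, "\n")
  result
}

sfInit(parallel = TRUE, cpus = 2)                 # two local workers
sfExport("working_dir")                           # workers need this variable
all_reps <- sfLapply(proj_names, project_exec)
sfStop()
```

all_reps is a list with one element per project name, in the same order as proj_names.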

What changes do I need to make for my project code? – vct


Can I add my code here? .... sink(file(paste0(working_dir, proj_name, "_log.txt"), open = "wt")) #log file progress cat('starting with', proj_name, '\n') ##ADD CODE HERE cat('done with ', proj_name, '\n') #Stop sinks – vct


Could you give a concrete example instead of the placeholder names? – vct