R foreach loop - package load fails

BizForecaster

I'm unable to load any packages on the parallel workers in a foreach %dopar% loop.

I successfully create a SOCK cluster with 4 workers using the foreach and doSNOW packages, then try to run a trivial parallel loop. The loop works with %dopar% when no packages are loaded, but loading any package on the workers throws an error.

Sample code below, followed by session info showing the packages in use.

I have used this type of code before without issues. Some recent network changes at my company meant I had to change some settings (default library path, etc.), so the problem may be related to that change. I'm not sure where to start troubleshooting, so any help is greatly appreciated!

# load foreach and doSNOW packages, set up a cluster of 4 workers #

> require(foreach)

> require(doSNOW)

> registerDoSNOW(makeCluster(4, type = "SOCK"))

> getDoParWorkers()
[1] 4

> getDoParName()
[1] "doSNOW"

# %dopar% loop without loading any packages -- works OK #

> foreach(i=1:2) %dopar% {
      i+1
  }

[[1]]
[1] 2

[[2]]
[1] 3

# %dopar% loop with loading a package -- does not work #

> foreach(i=1:2, .packages="forecast") %dopar% {
      i+1
  }

Error in e$fun(obj, substitute(ex), parent.frame(), e$data) : 
  worker initialization failed: package or namespace load failed for 'forecast'

Session Info:

R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] WriteXLS_3.6.1 rlist_0.4 timeSeries_3012.99 reshape2_1.4.1 plyr_1.8.3 lubridate_1.3.3 lmtest_0.9-34 lattice_0.20-31 knitr_1.10.5
[10] hts_4.5 Matrix_1.2-1 SparseM_1.6 ggplot2_1.0.1 data.table_1.9.4 car_2.0-25 forecast_6.1 timeDate_3012.100 zoo_1.7-12
[19] doSNOW_1.0.12 snow_0.3-13 iterators_1.0.7 foreach_1.4.2

loaded via a namespace (and not attached):
[1] Rcpp_0.11.6 compiler_3.2.1 nloptr_1.0.4 tseries_0.10-34 tools_3.2.1 lme4_1.1-8 digest_0.6.8 memoise_0.2.1 gtable_0.1.2 nlme_3.1-120
[11] mgcv_1.8-6 parallel_3.2.1 proto_0.3-10 stringr_1.0.0 grid_3.2.1 nnet_7.3-9 minqa_1.2.4 magrittr_1.5 scales_0.2.5 codetools_0.2-11
[21] MASS_7.3-40 splines_3.2.1 pbkrtest_0.4-2 colorspace_1.2-6 fracdiff_1.4-2 quantreg_5.11 quadprog_1.5-5 stringi_0.5-5 munsell_0.4.2 chron_2.3-47

Other info that may be useful: the .libPaths() output and the details printed by the makeCluster function.

> .libPaths()
 [1] "C:/Users/G082580/Documents/My Documents/R/R-3.2.1" "C:/R-3.2.1/library"                               

> makeCluster(4,type="SOCK",manual=TRUE)

Manually start worker on localhost with C:/R-32~1.1/bin/Rscript.exe "C:/Users/G082580/Documents/My Documents/R/R-3.2.1/snow/RSOCKnode.R" MASTER=localhost PORT=11535 OUT=/dev/null SNOWLIB=C:/Users/G082580/Documents/My Documents/R/R-3.2.1

r foreach parallel-processing

Answers

answered 3 years ago Steve Weston #1

If you need to call the .libPaths function in order to load packages on the master, you'll need to call it on the workers as well, because each worker is a separate R process and does not inherit library paths set in the master session. This example uses the clusterCall function to set the workers' library paths to match the master's:

library(doSNOW)
cl <- makeSOCKcluster(4)
registerDoSNOW(cl)
clusterCall(cl, function(x) .libPaths(x), .libPaths())
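
To check that the fix took effect, one option is to ask each worker for its library paths and then retry a loop that loads a package. The following is only a sketch under a few assumptions: it uses 2 workers to keep it light, the names `worker_paths`, `paths_match`, `pkg` and `res` are made up for illustration, and `"stats"` is a stand-in for the package that failed in the question (`"forecast"`), so that the snippet runs even where forecast is not installed.

```r
library(doSNOW)  # doSNOW depends on foreach and snow, so both are attached too

cl <- makeSOCKcluster(2)  # assumption: 2 local workers are enough for a check
registerDoSNOW(cl)

# Push the master's library paths to every worker, as in the answer above.
clusterCall(cl, function(x) .libPaths(x), .libPaths())

# Ask each worker what it now sees and compare with the master.
worker_paths <- clusterEvalQ(cl, .libPaths())
paths_match <- vapply(worker_paths,
                      function(p) identical(p, .libPaths()),
                      logical(1))
print(paths_match)

# Retry a loop that loads a package on the workers.
pkg <- "stats"  # stand-in; replace with "forecast" to reproduce the question
res <- foreach(i = 1:2, .packages = pkg) %dopar% {
    i + 1
}
print(res)

stopCluster(cl)  # shut the workers down when finished
```

If `paths_match` contains any `FALSE`, comparing the corresponding entry of `worker_paths` against `.libPaths()` on the master should show which directory the worker is missing.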
