2013-02-01 110 views
24

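Listing R package dependencies without installing packages

Is there a simple way to get the list of R package dependencies (all recursive dependencies) for a given package, without installing the package and its dependencies? Something like a fake (dry-run) install in portupgrade or apt.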

+3

'tools::dependsOnPkgs' – hadley

+3

Thanks, this will save me some time :). Since it isn't explicit in the docs: an example, supposing the package is ggplot2, is dependsOnPkgs("ggplot2", installed = available.packages()) –

+0

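If there is a helper function (in 'utils' or 'tools'?) that extracts all dependencies non-recursively from a local 'DESCRIPTION' file, it would be nice to see it posted as an answer. Otherwise, a wrapper around 'read.dcf' that extracts the various dependency types and strips whitespace would do the job. – jangorecki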
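
The 'read.dcf' wrapper mentioned in this comment could look roughly like the sketch below. The function name description_deps and the demo DESCRIPTION contents are hypothetical, made up for illustration; only read.dcf, strsplit, sub and trimws are real base R functions.

```r
# Hypothetical helper (not part of utils or tools): read the dependency
# fields of a local DESCRIPTION file, non-recursively.
description_deps <- function(path,
                             fields = c("Depends", "Imports", "LinkingTo", "Suggests")) {
  dcf <- read.dcf(path, fields = fields)   # 1-row character matrix, NA for absent fields
  raw <- dcf[1, ]
  raw <- raw[!is.na(raw)]                  # keep only the fields that are present
  lapply(raw, function(field) {
    pkgs <- strsplit(field, ",", fixed = TRUE)[[1]]
    pkgs <- sub("\\([^)]*\\)", "", pkgs)   # drop version constraints such as (>= 1.0)
    pkgs <- trimws(pkgs)                   # strip spaces and continuation newlines
    pkgs[nzchar(pkgs) & pkgs != "R"]       # drop empty entries and R itself
  })
}

# Demo on a made-up DESCRIPTION file written to a temporary location
tmp <- tempfile()
writeLines(c("Package: demo",
             "Version: 0.1",
             "Depends: R (>= 3.5), methods",
             "Imports: Rcpp (>= 1.0.0),",
             "    jsonlite"), tmp)
res <- description_deps(tmp)
print(res)
```

The result is a named list with one character vector per dependency field found in the file; fields that are absent are simply left out.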

Answers

28

You can use the result of the available.packages function. For example, to see what ggplot2 depends on:

pack <- available.packages() 
pack["ggplot2","Depends"] 

which gives:

[1] "R (>= 2.14), stats, methods" 

Note that, depending on what you want to achieve, you may need to check the Imports field as well.
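
For the full recursive list, one option is tools::package_dependencies, which accepts a matrix of the same shape as the one returned by available.packages(). The following is a minimal sketch rather than a definitive recipe: the toy_db matrix and the package names pkgA to pkgD are made up so that the example runs without network access.

```r
library(tools)

# Made-up stand-in for available.packages(); real use would be
#   db <- available.packages()   # needs access to a CRAN mirror
toy_db <- rbind(
  c(Package = "pkgA", Depends = "R (>= 3.0), pkgB", Imports = "pkgC (>= 1.0)", LinkingTo = NA),
  c(Package = "pkgB", Depends = NA, Imports = "pkgD", LinkingTo = NA),
  c(Package = "pkgC", Depends = NA, Imports = NA, LinkingTo = NA),
  c(Package = "pkgD", Depends = NA, Imports = NA, LinkingTo = NA)
)
rownames(toy_db) <- toy_db[, "Package"]

# Follow Depends and Imports recursively; nothing gets installed
deps <- package_dependencies("pkgA", db = toy_db,
                             which = c("Depends", "Imports"),
                             recursive = TRUE)
print(sort(deps[["pkgA"]]))
```

The return value is a list named by the requested packages. Version constraints and the R entry itself are stripped from the result.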

+0

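Cool - I always like discovering useful tools. Sadly, this doesn't work for those of us stuck behind a corporate firewall. We may be stuck with something like 'browseURL('http://cran.r-project.org/web/packages/package.name')' –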

+0

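Thanks, this helps a lot. I did change the scope of the question, but by recursively searching the Depends and Imports lists I can build the complete list. –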
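
The manual recursion over the Depends and Imports columns described in this comment can be sketched as a simple worklist loop. The helper name all_deps and the toy matrix are hypothetical and only serve to keep the example runnable offline; in real use the db argument would be the result of available.packages().

```r
# Hypothetical helper: breadth-first walk over the Depends and Imports
# columns of a matrix shaped like the result of available.packages().
all_deps <- function(pkg, db, fields = c("Depends", "Imports")) {
  direct <- function(p) {
    if (!p %in% rownames(db)) return(character())  # e.g. base packages or typos
    raw <- db[p, fields]
    raw <- raw[!is.na(raw)]
    pkgs <- unlist(strsplit(raw, ",", fixed = TRUE), use.names = FALSE)
    pkgs <- trimws(sub("\\([^)]*\\)", "", pkgs))   # drop version constraints
    pkgs[nzchar(pkgs) & pkgs != "R"]               # drop empty entries and R itself
  }
  seen <- character()
  queue <- direct(pkg)
  while (length(queue) > 0) {
    p <- queue[1]
    queue <- queue[-1]
    if (p %in% seen) next          # already visited; also guards against cycles
    seen <- c(seen, p)
    queue <- c(queue, direct(p))   # schedule the dependencies of p
  }
  seen
}

# Made-up data for illustration
toy_db <- rbind(
  c(Package = "pkgA", Depends = "R (>= 3.0), pkgB", Imports = "pkgC"),
  c(Package = "pkgB", Depends = NA, Imports = "pkgC, pkgD (>= 2.0)"),
  c(Package = "pkgC", Depends = NA, Imports = NA),
  c(Package = "pkgD", Depends = "methods", Imports = NA)
)
rownames(toy_db) <- toy_db[, "Package"]

found <- all_deps("pkgA", toy_db)
print(found)
```

Packages that do not appear in the matrix, such as methods here, are still reported as dependencies but are not expanded further.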

+0

@CarlWitthoft If you're on Windows, 'setInternet2()' might help. – hadley

5

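I don't have R installed, and I needed to find out which R packages are dependencies of a list of R packages used at my company.

I wrote a bash script that iterates over a list of R packages in a file and discovers their dependencies recursively.

The script takes a file named rinput_orig.txt as input (example below). When it runs, it creates a working copy named rinput.txt.

The script creates the following files:

  • rdepsfound.txt - lists each dependency that was found, together with the R package that depends on it (example below).
  • routput.txt - lists all R packages (from the original list plus the discovered dependencies) with their license and CRAN URL (example below).
  • r404.txt - lists the R packages for which curl received a 404. Handy if your original list contains typos.

The bash script: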

#!/bin/bash

# CLEANUP (-f: no error if the files do not exist yet)
rm -f routput.txt rdepsfound.txt r404.txt

# COPY ORIGINAL INPUT TO WORKING INPUT
cp rinput_orig.txt rinput.txt

# The unquoted for-loops below rely on word splitting at commas
IFS=","
# rinput.txt is appended to while it is being read, so newly found dependencies get processed too
while read -r PACKAGE; do
    echo "Processing $PACKAGE..."

    PACKAGEURL="http://cran.r-project.org/web/packages/${PACKAGE}/index.html"

    if [ "$(curl -o /dev/null --silent --head --write-out '%{http_code}' "${PACKAGEURL}")" != "404" ]; then
        # GET LICENSE INFO OF PACKAGE
        LICENSEINFO=$(curl "${PACKAGEURL}" 2>/dev/null | grep -A1 "License:" | grep -v "License:" | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[0]}' | sed "s/|/,/g" | sed "s/+/,/g")
        # Reset so a package without a license link does not inherit the previous license
        LICENSE=""
        for x in ${LICENSEINFO}
        do
            # SAVE LICENSE (first entry only)
            LICENSE=$(echo "${x}" | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[1]}')
            break
        done

        # WRITE PACKAGE AND LICENSE TO OUTPUT FILE
        echo "$PACKAGE $LICENSE $PACKAGEURL" >> routput.txt

        # GET DEPENDENCIES OF PACKAGE (only the Depends field is followed, not Imports)
        DEPS=$(curl "${PACKAGEURL}" 2>/dev/null | grep -A1 "Depends:" | grep -v "Depends:" | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[0]}')
        for x in ${DEPS}
        do
            FOUNDDEP=$(echo "${x}" | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[1]}' | sed "s/<\/span>//g")
            if [ -n "$FOUNDDEP" ]; then
                echo "Found dependency $FOUNDDEP for $PACKAGE..."
                # -x -F: literal whole-line match, so Rcpp does not match RcppEigen
                if grep -qxF -- "$FOUNDDEP" rinput.txt; then
                    echo "$FOUNDDEP already exists in package list..."
                else
                    echo "Adding $FOUNDDEP to package list..."
                    # SAVE FOUND DEPENDENCY BACK TO INPUT LIST
                    echo "$FOUNDDEP" >> rinput.txt
                    # SAVE FOUND DEPENDENCY TO DEPENDENCY LIST FOR EASY VIEWING OF ALL FOUND DEPENDENCIES
                    echo "$FOUNDDEP is a dependency of $PACKAGE" >> rdepsfound.txt
                fi
            fi
        done
    else
        echo "Skipping $PACKAGE because 404 was received..."
        echo "$PACKAGE $PACKAGEURL" >> r404.txt
    fi

done < rinput.txt
echo -e "\nRESULT:"
sort -u routput.txt

Example rinput_orig.txt:

shiny 
rmarkdown 
xtable 
RODBC 
RJDBC 
XLConnect 
openxlsx 
xlsx 
Rcpp 

Example console output when running the script:

Processing shiny... 
Processing rmarkdown... 
Processing xtable... 
Processing RODBC... 
Processing RJDBC... 
Found dependency DBI for RJDBC... 
Adding DBI to package list... 
Found dependency rJava for RJDBC... 
Adding rJava to package list... 
Processing XLConnect... 
Found dependency XLConnectJars for XLConnect... 
Adding XLConnectJars to package list... 
Processing openxlsx... 
Processing xlsx... 
Found dependency rJava for xlsx... 
rJava already exists in package list... 
Found dependency xlsxjars for xlsx... 
Adding xlsxjars to package list... 
Processing Rcpp... 
Processing DBI... 
Processing rJava... 
Processing XLConnectJars... 
Processing xlsxjars... 
Found dependency rJava for xlsxjars... 
rJava already exists in package list... 

Example rdepsfound.txt:

DBI is a dependency of RJDBC 
rJava is a dependency of RJDBC 
XLConnectJars is a dependency of XLConnect 
xlsxjars is a dependency of xlsx 

Example routput.txt:

shiny GPL-3 http://cran.r-project.org/web/packages/shiny/index.html 
rmarkdown GPL-3 http://cran.r-project.org/web/packages/rmarkdown/index.html 
xtable GPL-2 http://cran.r-project.org/web/packages/xtable/index.html 
RODBC GPL-2 http://cran.r-project.org/web/packages/RODBC/index.html 
RJDBC GPL-2 http://cran.r-project.org/web/packages/RJDBC/index.html 
XLConnect GPL-3 http://cran.r-project.org/web/packages/XLConnect/index.html 
openxlsx GPL-3 http://cran.r-project.org/web/packages/openxlsx/index.html 
xlsx GPL-3 http://cran.r-project.org/web/packages/xlsx/index.html 
Rcpp GPL-2 http://cran.r-project.org/web/packages/Rcpp/index.html 
DBI LGPL-2 http://cran.r-project.org/web/packages/DBI/index.html 
rJava GPL-2 http://cran.r-project.org/web/packages/rJava/index.html 
XLConnectJars GPL-3 http://cran.r-project.org/web/packages/XLConnectJars/index.html 
xlsxjars GPL-3 http://cran.r-project.org/web/packages/xlsxjars/index.html 

I hope this helps someone!

1

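Another neat and simple solution is the internal function recursivePackageDependencies from the packrat library. However, the package in question must be installed in some library on your machine. The advantage is that it also works with self-made, non-CRAN packages. For example: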

packrat:::recursivePackageDependencies("ggplot2",lib.loc = .libPaths()[1]) 

which gives:

[1] "R6"   "RColorBrewer" "Rcpp"   "colorspace" "dichromat" "digest"  "gtable"  
[8] "labeling"  "lazyeval"  "magrittr"  "munsell"  "plyr"   "reshape2"  "rlang"  
[15] "scales"  "stringi"  "stringr"  "tibble"  "viridisLite" 