Strip YAML from child docs in knitr

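I am writing a set of related documents in rmarkdown that I will compile into a website via jekyll. In doing so, I have run into a problem: stripping the YAML from knitr child documents.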

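Some of the Rmd files I use call other Rmd files as child documents. When I render with knitr, the resulting document contains the YAML front matter from both the parent and the child docs. An example is given below.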

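So far, I have not seen any way to specify that only part of a child doc should be included when the doc is an Rmd. Does anyone know of a way to strip the YAML out of a child doc when it is read into the parent Rmd during knit()?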

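I would be happy to consider answers outside of R, preferably something I can embed in a rake file. However, I do not want to alter the child docs permanently, so stripping the YAML cannot be permanent. Finally, the YAML varies in length from file to file, so I am guessing that any solution must be able to find the beginning and end of the YAML with regex/grep/sed/etc...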

Example:

%%%% Parent_Doc.rmd %%%%

--- 
title: parent doc 
layout: default 
etc: etc 
--- 
This is the parent... 

```{r child import, child="./child_doc."} 
```

%%%% child_doc.rmd %%%%

--- 
title: child doc 
layout: default 
etc: etc 
--- 

lorem ipsum etc

%%%% output.md %%%%

--- 
title: parent doc 
layout: default 
etc: etc 
--- 
This is the parent... 
--- 
title: child doc 
layout: default 
etc: etc 
--- 

lorem ipsum etc

%%%% Ideal Output.md %%%%

--- 
title: parent doc 
layout: default 
etc: etc 
--- 
This is the parent... 

lorem ipsum etc


2014-05-13 Tom

I can consider this a feature request for the next version of knitr if you file it at https://github.com/yihui/knitr/issues

@Yihui: I will file a feature request, but it may not be worth your while. My use case is probably fairly specific. Thanks for your response. – Tom

Not at all. I don't mind small feature requests :)

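In the meantime, maybe the following will work for you. It is a somewhat ugly and inefficient workaround (I am new to this and not a real programmer), but it achieves what I believe you want to do.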

I wrote a function for a similar personal use, which includes the following relevant bit. The original was in Spanish, so I have partly translated it below:

extraction <- function(matter, escape = FALSE, ruta = ".", patron) {

    require(yaml)

    # Gather together directory of documents to be processed
    doc_list <- list.files(
        path = ruta,
        pattern = patron,
        full.names = TRUE
    )

    # Extract desired contents
    lapply(
        X = doc_list,
        FUN = function(i) {
            raw_contents <- readLines(con = i, encoding = "UTF-8")

            switch(
                EXPR = matter,

                # !YAML (e.g., HTML)
                "no_yaml" = {
                    if (escape == FALSE) {
                        paste(raw_contents, sep = "", collapse = "\n")
                    } else if (escape == TRUE) {
                        require(XML)
                        to_be_escaped <- paste(raw_contents, sep = "", collapse = "\n")
                        xmlTextNode(value = to_be_escaped)
                    }
                },

                # YAML header and Rmd contents
                "rmd" = {
                    # The first two delimiter lines (--- or ...) bound the YAML header
                    yaml_pattern <- "[-]{3}|[.]{3}"
                    limits_yaml <- grep(pattern = yaml_pattern, x = raw_contents)[1:2]
                    indices_yaml <- seq(
                        from = limits_yaml[1] + 1,
                        to = limits_yaml[2] - 1
                    )
                    # Parse each header line separately into a key/value pair
                    yaml <- mapply(
                        FUN = function(i) {yaml.load(string = i)},
                        raw_contents[indices_yaml],
                        USE.NAMES = FALSE
                    )
                    # Everything after the closing delimiter is the Rmd body
                    indices_rmd <- seq(
                        from = limits_yaml[2] + 1,
                        to = length(x = raw_contents)
                    )
                    rmd <- paste(raw_contents[indices_rmd], sep = "", collapse = "\n")
                    c(yaml, "contents" = rmd)
                },

                # Anything else (just in case)
                {
                    stop("Matter not extractable")
                }
            )
        }
    )
}
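
The header/body split at the heart of the "rmd" branch can also be sketched on its own in base R. This is only an illustration under a few assumptions: `split_front_matter` is a made-up helper name, the header is taken to open with `---` on the very first line, and the delimiter pattern is anchored so that lines that merely contain three dashes (a table rule, say) are not mistaken for delimiters.

```r
# Hypothetical helper (not part of knitr): split Rmd lines into header and body.
split_front_matter <- function(lines) {
  no_header <- list(yaml = character(0), body = lines)
  # Assumption: front matter must open with --- on the first line
  if (length(lines) < 2 || !grepl("^-{3}[[:space:]]*$", lines[1])) {
    return(no_header)
  }
  # The closing delimiter is the next line consisting only of --- or ...
  closers <- grep("^(-{3}|[.]{3})[[:space:]]*$", lines[-1]) + 1
  if (length(closers) == 0) {
    return(no_header)
  }
  end <- closers[1]
  list(
    yaml = lines[seq_len(end)[-c(1, end)]],  # lines between the delimiters
    body = lines[-seq_len(end)]              # everything after the closer
  )
}

child <- c("--- ", "title: child doc", "layout: default", "--- ", "", "lorem ipsum etc")
parts <- split_front_matter(child)
cat(parts$body, sep = "\n")
```

Because the file on disk is never written to, the child document stays unaltered, which is the constraint stated in the question.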

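Say my main Rmd file, main.Rmd, lives in my_directory and my child documents, 01-abstract.Rmd, 02-intro.Rmd, ..., 06-conclusion.Rmd, are housed in ./sections. Note that for my amateur function it is best to save the child docs in the order in which they will be pulled into the main doc (see below). I keep my function, extraction.R, in ./assets. This is my example directory structure: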

. 
+--assets 
| +--extraction.R 
+--sections 
| +--01-abstract.Rmd 
| +--02-intro.Rmd 
| +--03-methods.Rmd 
| +--04-results.Rmd 
| +--05-discussion.Rmd 
| +--06-conclusion.Rmd 
+--stats 
| +--analysis.R 
+--main.Rmd

In main.Rmd, I import my child docs from ./sections:

--- 
title: Main 
author: me 
date: Today 
output: 
    html_document 
--- 

```{r, 'setup', include = FALSE} 
opts_chunk$set(autodep = TRUE) 
dep_auto() 
``` 

```{r, 'import_children', cache = TRUE, include = FALSE} 
source('./assets/extraction.R') 
rmd <- extraction(
    matter = 'rmd', 
    ruta = './sections', 
    patron = "\\.Rmd$" 
) 
``` 

# Abstract 

```{r, 'abstract', echo = FALSE, results = 'asis'} 
cat(x = rmd[[1]][["contents"]], sep = "\n") 
``` 

# Introduction 

```{r, 'intro', echo = FALSE, results = 'asis'} 
cat(x = rmd[[2]][["contents"]], sep = "\n") 
``` 

# Methods 

```{r, 'methods', echo = FALSE, results = 'asis'} 
cat(x = rmd[[3]][["contents"]], sep = "\n") 
``` 

# Results 

```{r, 'results', echo = FALSE, results = 'asis'} 
cat(x = rmd[[4]][["contents"]], sep = "\n") 
``` 

# Discussion 

```{r, 'discussion', echo = FALSE, results = 'asis'} 
cat(x = rmd[[5]][["contents"]], sep = "\n") 
``` 

# Conclusion 

```{r, 'conclusion', echo = FALSE, results = 'asis'} 
cat(x = rmd[[6]][["contents"]], sep = "\n") 
``` 

# References
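
The near-identical chunks in main.Rmd could probably be collapsed into a single step. The sketch below is an assumption-laden illustration rather than part of the original workflow: `rmd` is a hand-built stand-in for the list returned by extraction(), and `headings`, `bodies` and `combined` are names introduced here.

```r
# Stand-in for the list returned by extraction(); two sections only, for brevity.
rmd <- list(
  list(title = "child doc 1", contents = "\nThis is **Child Doc 1**, my abstract."),
  list(title = "child doc 2", contents = "\nThis is **Child Doc 2**, my introduction.")
)
# Section headings, in the same order as the child documents
headings <- c("Abstract", "Introduction")

# Pull the body text out of each child document
bodies <- vapply(rmd, function(doc) doc[["contents"]], character(1))

# Prefix each body with its heading and join the sections with a blank line
combined <- paste0("# ", headings, "\n", bodies, collapse = "\n\n")

# Inside a chunk with results = 'asis', this would emit all sections at once
cat(combined, "\n")
```

The trade-off is that headings and files must stay in the same order, which is the same ordering constraint the numbered access has.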

I then knit this document, and only the contents of my child docs are incorporated into it, e.g.:

--- 
title: Main 
author: me 
date: Today 
output: 
    html_document 
--- 





# Abstract 


This is **Child Doc 1**, my abstract. 

# Introduction 


This is **Child Doc 2**, my introduction. 

- Point 1 
- Point 2 
- Point *n* 

# Methods 


This is **Child Doc 3**, my "Methods" section. 

| method 1 | method 2 | method *n* | 
|---------------|---------------|----------------| 
| fffffffffffff | fffffffffffff | fffffffffffff d| 
| fffffffffffff | fffffffffffff | fffffffffffff d| 
| fffffffffffff | fffffffffffff | fffffffffffff d| 

# Results 


This is **Child Doc 4**, my "Results" section. 

## Result 1 

```{r} 
library(knitr) 
``` 

```{r, 'analysis', cache = FALSE} 
source(file = '../stats/analysis.R') 
``` 

# Discussion 


This is **Child Doc 5**, where the results are discussed. 

# Conclusion 


This is **Child Doc 6**, where I state my conclusions. 

# References

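The document above is the knitted version of main.Rmd, i.e., main.md. Note that under ## Result 1, in my child doc 04-results.Rmd, I source an external R script, ./stats/analysis.R, which has now been carried into my knitted document as a new knitr chunk. I therefore need to knit the document again.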

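When the child docs also include chunks, instead of knitting to .md, I knit the main doc into another .Rmd as many times as I have levels of nested chunks. For example, continuing the example above: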

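Using knit(input = './main.Rmd', output = './main_2.Rmd'), instead of knitting main.Rmd into main.md, I knit it into another .Rmd so that I can then knit the resulting file, which contains the newly imported chunks, e.g., the call to my R script analysis.R above.
I can now knit main_2.Rmd into main.md, or render it as main.html via rmarkdown::render(input = './main_2.Rmd', output_file = './main.html').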

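Note: in the main.md example above, the path to my R script is ../stats/analysis.R. This path is relative to the child doc that sources it, ./sections/04-results.Rmd. Once I import the child doc into the main doc located at the root of my_directory, i.e., ./main.md or ./main_2.Rmd, the path becomes wrong. I therefore have to correct it manually to ./stats/analysis.R before the next knit.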
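
The manual path correction could plausibly be automated with a literal search-and-replace on the imported text before it is printed. This is a minimal sketch: `rebase_paths` is a hypothetical helper, and it assumes the paths appear in single quotes exactly as in the example chunk.

```r
# Hypothetical helper: swap one literal path prefix for another in chunk text.
# fixed = TRUE treats `from` as a plain string, so the dots are not regex wildcards.
rebase_paths <- function(text, from, to) {
  gsub(from, to, text, fixed = TRUE)
}

chunk <- "source(file = '../stats/analysis.R')"

# Including the opening quote in `from` anchors the match to the start of a path
fixed_chunk <- rebase_paths(chunk, from = "'../stats/", to = "'./stats/")
print(fixed_chunk)
```

Applied to each `contents` element before the cat() calls, something like this would remove the hand-editing step, at the cost of hard-coding the directory layout.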

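I mentioned above that it is best to save the child docs in the same order in which they are imported into the main doc. This is because my simple function extraction() just stores the contents of all the files passed to it in an unnamed list, so I have to access each file in main.Rmd by number, i.e., rmd[[5]][["contents"]] refers to the child doc ./sections/05-discussion.Rmd. Consider: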

> str(rmd) 
List of 6 
$ :List of 4 
    ..$ title  : chr "child doc 1" 
    ..$ layout : chr "default" 
    ..$ etc  : chr "etc" 
    ..$ contents: chr "\nThis is **Child Doc 1**, my abstract." 
$ :List of 4 
    ..$ title  : chr "child doc 2" 
    ..$ layout : chr "default" 
    ..$ etc  : chr "etc" 
    ..$ contents: chr "\nThis is **Child Doc 2**, my introduction.\n\n- Point 1\n- Point 2\n- Point *n*" 
$ :List of 4 
    ..$ title  : chr "child doc 3" 
    ..$ layout : chr "default" 
    ..$ etc  : chr "etc" 
    ..$ contents: chr "\nThis is **Child Doc 3**, my \"Methods\" section.\n\n| method 1 | method 2 | method *n* |\n|--------------|--------------|----"| __truncated__ 
$ :List of 4 
    ..$ title  : chr "child doc 4" 
    ..$ layout : chr "default" 
    ..$ etc  : chr "etc" 
    ..$ contents: chr "\nThis is **Child Doc 4**, my \"Results\" section.\n\n## Result 1\n\n```{r}\nlibrary(knitr)\n```\n\n```{r, cache = FALSE}\nsour"| __truncated__ 
$ :List of 4 
    ..$ title  : chr "child doc 5" 
    ..$ layout : chr "default" 
    ..$ etc  : chr "etc" 
    ..$ contents: chr "\nThis is **Child Doc 5**, where the results are discussed." 
$ :List of 4 
    ..$ title  : chr "child doc 6" 
    ..$ layout : chr "default" 
    ..$ etc  : chr "etc" 
    ..$ contents: chr "\nThis is **Child Doc 6**, where I state my conclusions."

So extraction() here actually stores both the R Markdown contents of the specified child docs and their YAML, in case you have a use for that as well (I do myself).
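
Since access by position is what makes the file order matter, one option might be to name the list elements after their files. The sketch below uses made-up stand-ins for `doc_list` and `rmd`; in extraction() itself the names would have to be set on the result of lapply().

```r
# Stand-in for what list.files(full.names = TRUE) would return for ./sections
doc_list <- c("./sections/01-abstract.Rmd", "./sections/05-discussion.Rmd")

# Stand-in for the unnamed list produced by extraction()
rmd <- list(
  list(contents = "\nThis is **Child Doc 1**, my abstract."),
  list(contents = "\nThis is **Child Doc 5**, where the results are discussed.")
)

# Name each element after its file, dropping the directory and the extension
names(rmd) <- tools::file_path_sans_ext(basename(doc_list))

# Sections can now be looked up by name instead of by position
rmd[["05-discussion"]][["contents"]]
```

With names in place, renaming or reordering files would no longer silently shift which section a given chunk prints.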


2014-06-25 01:18:33 user109114
