Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
153 views
in Technique[技术] by (71.8m points)

r - In `knitr` how can I test for if the output will be PDF or word?

I would like to include specific content based on which format is being created. In this specific example, my tables look terrible in MS word output, but great in HTML. I would like to add in some test to leave out the table depending on the output.

Here's some pseudocode:

output.format <- opts_chunk$get("output")

if(output.format != "MS word"){
print(table1)
}

I'm sure this is not the correct way to use opts_chunk, but this is the limit of my understanding of how knitr works under the hood. What would be the correct way to test for this?

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Short Answer

In most cases, opts_knit$get("rmarkdown.pandoc.to") delivers the required information.

Otherwise, query rmarkdown::all_output_formats(knitr::current_input()) and check if the return value contains word_document:

if ("word_document" %in% rmarkdown::all_output_formats(knitr::current_input()) {
  # Word output
}

Long answer

I assume that the source document is RMD because this is the usual/most common input format for knitting to different output formats such as MS Word, PDF and HTML.

In this case, knitr options cannot be used to determine the final output format because it doesn't matter from the perspective of knitr: For all output formats, knitr's job is to knit the input RMD file to a MD file. The conversion of the MD file to the output format specified in the YAML header is done in the next stage, by pandoc.

Therefore, we cannot use the package option knitr::opts_knit$get("out.format") to learn about the final output format but we need to parse the YAML header instead.

So far in theory. Reality is a little bit different. RStudio's "Knit PDF"/"Knit HTML" button calls rmarkdown::render which in turn calls knit. Before this happens, render sets a (undocumented?) package option rmarkdown.pandoc.to to the actual output format. The value will be html, latex or docx respectively, depending on the output format.

Therefore, if (and only if) RStudio's "Knit PDF"/"Knit HTML" button is used, knitr::opts_knit$get("rmarkdown.pandoc.to") can be used to determine the output format. This is also described in this answer and that blog post.

The problem remains unsolved for the case of calling knit directly because then rmarkdown.pandoc.to is not set. In this case we can exploit the (unexported) function parse_yaml_front_matter from the rmarkdown package to parse the YAML header.

[Update: As of rmarkdown 0.9.6, the function all_output_formats has been added (thanks to Bill Denney for pointing this out). It makes the custom function developed below obsolete – for production, use rmarkdown::all_output_formats! I leave the remainder of this answer as originally written for educational purposes.]

---
output: html_document
---
```{r}
knitr::opts_knit$get("out.format") # Not informative.

knitr::opts_knit$get("rmarkdown.pandoc.to") # Works only if knit() is called via render(), i.e. when using the button in RStudio.

rmarkdown:::parse_yaml_front_matter(
    readLines(knitr::current_input())
    )$output
```

The example above demonstrates the use(lesness) of opts_knit$get("rmarkdown.pandoc.to") (opts_knit$get("out.format")), while the line employing parse_yaml_front_matter returns the format specified in the "output" field of the YAML header.

The input of parse_yaml_front_matter is the source file as character vector, as returned by readLines. To determine the name of the file currently being knitted, current_input() as suggested in this answer is used.

Before parse_yaml_front_matter can be used in a simple if statement to implement behavior that is conditional on the output format, a small refinement is required: The statement shown above may return a list if there are additional YAML parameters for the output like in this example:

---
output: 
  html_document: 
    keep_md: yes
---

The following helper function should resolve this issue:

getOutputFormat <- function() {
  output <- rmarkdown:::parse_yaml_front_matter(
    readLines(knitr::current_input())
    )$output
  if (is.list(output)){
    return(names(output)[1])
  } else {
    return(output[1])
  }
}

It can be used in constructs such as

if(getOutputFormat() == 'html_document') {
   # do something
}

Note that getOutputFormat uses only the first output format specified, so with the following header only html_document is returned:

---
output:
  html_document: default
  pdf_document:
    keep_tex: yes
---

However, this is not very restrictive. When RStudio's "Knit HTML"/"Knit PDF" button is used (along with the dropdown menu next to it to select the output type), RStudio rearranges the YAML header such that the selected output format will be the first format in the list. Multiple output formats are (AFAIK) only relevant when using rmarkdown::render with output_format = "all". And: In both of these cases rmarkdown.pandoc.to can be used, which is easier anyways.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...