Reproducible research with markdown, knitr and pandoc

Over the last few weeks I have been trying to optimise my workflow using markdown in combination with knitr and pandoc. Knitr is a great new package by Yihui that expands R’s capabilities for reproducible research.

I will illustrate my workflow with the following example, where I have a small R script (script.r) that I want to embed into a report. However, I do not want to write LaTeX, nor do I want to (or can I) fix my final output format at the beginning. That is where pandoc comes in. Pandoc is the Swiss Army knife when it comes to converting between markup languages.

The file script.r:


## @knitr gen-dat
a <- matrix(rnorm(100), nrow=10)

## @knitr plot
image(a)

The report is written in pandoc-flavored markdown. The file (report_knit_.md) contains:


% A sample report
% The author
% `r date()`

<!-- Setting up R -->
`ro warning=FALSE, dev="png", fig.cap="", cache=FALSE or`

<!-- read external r code -->
```{r reading, echo=FALSE}
read_chunk("script.r")
```

# The first part of my R script
Here I can generate my data
```{r}
<<gen-dat>>
```

# Results
And now the results are plotted
```{r plot-fig, results="asis"}
<<plot>>
```

# More
Of course I can use inline elements: 3 + 3 = `r 3+3`.

For each chunk there are plenty of options to modify it (see options).

To render my report, I first need to knit it in R and then use pandoc to convert it to the final format. Knitting can be done with:


Rscript -e "library(knitr); knit('report_knit_.md')"
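
The file name matters here: knitr drops the `_knit_` part of the input name when it picks the output name, so report_knit_.md becomes report.md. The following is a minimal sketch of that naming rule as a shell wrapper. The helper `knit_output_name` is a hypothetical name introduced for illustration, and passing the output name explicitly as the second argument of `knit()` is just one way to make the result predictable.

```shell
#!/bin/sh
# Hypothetical helper: predict the output name for an input called *_knit_*.
knit_output_name() {
    # Drop the first "_knit_" from the file name (report_knit_.md -> report.md).
    printf '%s\n' "$1" | sed 's/_knit_//'
}

input="report_knit_.md"
output=$(knit_output_name "$input")
echo "$output"

# Knit only if Rscript and the input file are present, so the sketch is
# harmless to run in an empty directory.
if command -v Rscript >/dev/null 2>&1 && [ -f "$input" ]; then
    # The second argument of knit() sets the output file explicitly.
    Rscript -e "library(knitr); knit('$input', '$output')"
fi
```

If the input file has a name without `_knit_`, the helper returns the name unchanged, in which case an explicit and different output name should be passed to `knit()`.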

Knitting produces a pandoc-flavored markdown document (report.md). Now I can use pandoc to convert this document into any output format that pandoc supports (list of formats):

  • A PDF file: pandoc -s report.md -t latex -o report.pdf
  • An HTML file: pandoc -s report.md -o report.html (with the -c flag, CSS stylesheets can be added easily)
  • OpenOffice: pandoc report.md -o report.odt
  • Word docx: pandoc report.md -o report.docx
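
The four conversions above can be collected in one small script. This is only a sketch: the helper `pandoc_cmd` and the `DRY_RUN` switch are hypothetical names introduced here, and the script assumes file names without spaces. By default it only prints the commands; run it with `DRY_RUN=0` to execute them (this needs pandoc, and a LaTeX installation for the PDF).

```shell
#!/bin/sh
# Hypothetical helper: build the pandoc command line for one target file,
# using the same flags as in the list above.
pandoc_cmd() {
    src=$1
    out=$2
    case "$out" in
        *.pdf)  echo "pandoc -s $src -t latex -o $out" ;;
        *.html) echo "pandoc -s $src -o $out" ;;
        *)      echo "pandoc $src -o $out" ;;
    esac
}

# DRY_RUN=1 (the default in this sketch) prints the commands without running them.
DRY_RUN=${DRY_RUN:-1}

for ext in pdf html odt docx; do
    cmd=$(pandoc_cmd report.md "report.$ext")
    if [ "$DRY_RUN" = 1 ]; then
        echo "$cmd"
    else
        # Unquoted on purpose: the string is split into command and arguments.
        $cmd
    fi
done
```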

The files are available on GitHub.


11 Responses to Reproducible research with markdown, knitr and pandoc

  1. Drew says:

    Hi, great post on an up and coming tool. It’d be great if you linked to a full version (maybe in gist) of the code you are using for people to play with themselves without copy/paste.

  2. divin sorcier says:

    Seems it does not work for me on Windows XP with R version 2.15.0 and knitr installed.

    Errors happen when executing Rscript:
    # /cygdrive/c/softs/elaboration/R-2.15.0/bin/Rscript.exe -e "library(knitr); knit('report.md')"

    processing file: report.md
    |>>>>>>>>>>>>> | 20%
    inline R code fragments

    |>>>>>>>>>>>>>>>>>>>>>>>>>> | 40%
    label: reading (with options)
    List of 1
    $ echo: logi FALSE

    |>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 60%
    ordinary text without R code

    |>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 80%
    label: plot-fig (with options)
    List of 1
    $ result: chr "asis"

    label: plot-fig (with options)
    List of 1
    $ result: chr "asis"

    Error in process_file(text) :
    Quitting from lines 13-14: Error in res$text.tidy : $ operator is invalid for atomic vectors
    Calls: knit -> process_file

    Execution halted

    I checked the files and they seem fine. Ideas welcome …

  3. kariert says:

    Did you run:
    /cygdrive/c/softs/elaboration/R-2.15.0/bin/Rscript.exe -e "library(knitr); knit('report.md')"
    or:
    /cygdrive/c/softs/elaboration/R-2.15.0/bin/Rscript.exe -e "library(knitr); knit('report_knit_.md')"
    In case you used the first, have a try with the second.

    • divin sorcier says:

      I used the first form. You should know that I named the file report.md instead of report_knit_.md.

      I copied it to your name and ran the second command, and I got exactly the same error message.

      The error message says "Quitting from lines 13-14: Error in res$text.tidy : $ operator is invalid for atomic vectors"

      Source code for line 13 and 14 is
      ```{r plot-fig, result="asis"}
      <<plot>>

      I see nothing wrong here, but as this is the first time I am using knitr, I am really not sure about the parameter "asis".
      Thanks for your help.

      • kariert says:

        Hi,

        I can think of two possible reasons:

        1) You are mixing up the files. The reason I used report_knit_.md is that knitr recognizes it automatically and saves the output as report.md. You can of course change that to whatever you like, but then you probably want to adjust the command to something like this:
        /cygdrive/c/softs/elaboration/R-2.15.0/bin/Rscript.exe -e "library(knitr); knit('report.md', 'report_knitted.md')"
        and then run pandoc on report_knitted.md
        2) Did you copy the command from the blog? If so, maybe the ``` in the code changed to something else.

        Hope this helps

      • divin sorcier says:

        Thanks Kariert.

        Your suggestions gave me a hint. I downloaded the code from git and ran the commands exactly as shown in the article. It works. Investigating further, I discovered that although the file contents were the same, the file names were not.

        The article uses script.r and report_knit_.md; I used trial.txt and report.md. Apparently those two changes are enough to break the chain. I don’t know why, but I am now able to produce such documents.

        I still need to install pdflatex on Windows to be able to produce PDF files. Producing HTML and docs works fine.

        Many thanks for your support.

      • kariert says:

        Great to hear that, glad it worked!

  4. Pingback: Useful for referring–5-20-2012 « Honglang Wang's Blog

  5. Great post.
    I just wanted to add that when using pandoc for the final conversion you can add citations, as long as you have a BibTeX file of your citations (which you do if you use Mendeley, for example; look for it in the settings).
    In the text you put citation keys like so: [@Drake1998]
    And when you do the conversion you add a bibliography option to pandoc:

    pandoc -s report.md -t latex -o report.pdf --bibliography d:\library.bib

    You can also use the --csl option to define a citation style.
    See examples on my blog post on the subject: http://blog.yoavram.com/creating-pdfs-with-pandoc/

  6. Pingback: Write MS-Word documents using R (with as little overhead as possible)R-statistics blog | R-statistics blog
