Automating R-markdown Tables With Hooks

Note: this article was first posted on the Telethon Kids blog in January, 2019, but has been re-posted as a contribution to R-bloggers.

kableExtra was chosen as the primary way to format tabular data in the HTML R markdown templates used at Telethon Kids Institute (see the Biometrics R package and this article). I like the tables produced by kableExtra: they look tidy, and the package has a feature that highlights the table row the pointer is hovering over. The package is also very easy to use: it is invoked by adding the following two lines of code (using dplyr pipe syntax with the R iris data set):

head(iris) %>%
  kable("html") %>%
  kable_styling("hover", full_width = F)
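
| Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
|---|---|---|---|---|
| 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 5.0 | 3.6 | 1.4 | 0.2 | setosa |
| 5.4 | 3.9 | 1.7 | 0.4 | setosa |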

This is all well and good, and I would be mostly happy to leave the template here and move on. BUT. The motivation behind these R markdown templates was to streamline as much of the formatting code as possible, which helps to maintain a consistent theme across the reports and communications produced by Telethon Kids’ researchers, staff and students. I also like my code to be as DRY as possible.

The requirement of adding these 2 lines of code at the end of each table chunk must be fixed!

As a starting point, the default output of a data.frame in an R markdown document looks like this:

##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

I approached the problem using code suggested in this old Stack Overflow answer by Konrad Rudolph. Konrad used knitr's knit_hooks object to re-parse the chunk output and prepare the string for further manipulation.

The difficulty in formatting the chunk output is that character data is passed to the hook. The workflow first needed to remove the ##’s from each line with gsub() then re-parse the table with read.table(), before adding in the formatting code. This feels messy to me and it fails for wide tables that have rows broken into multiple lines (which was acknowledged in the original answer from 2013).
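To make that output-based workflow concrete, here is a minimal base R sketch of the strip-and-re-parse step. It is not Konrad's code: the string x is a made-up stand-in for the character data an output hook would receive, and the variable names are illustrative only.

```r
# Assumed stand-in for the chunk output string passed to a hook
x <- paste(
  "##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species",
  "## 1          5.1         3.5          1.4         0.2  setosa",
  "## 2          4.9         3.0          1.4         0.2  setosa",
  sep = "\n"
)

# Split into lines and drop the leading "## " comment prefix from each one
lines <- strsplit(x, "\n", fixed = TRUE)[[1]]
lines <- sub("^## ?", "", lines)

# Re-parse the text as a table; because the header has one fewer field than
# the data rows, read.table() treats the first column as row names
parsed <- read.table(text = lines, header = TRUE)
```

A table wide enough to wrap across several printed blocks would no longer have this simple one-row-per-line layout, which is where this approach breaks down.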

With a night to dwell on my problem, I realised/remembered that R markdown also sends the chunk source code to knitr for display, which can also be picked up and manipulated by a hook!

My solution, which is very similar to the one on Stack Overflow (but better), takes the source code, evaluates it with eval() and appends the formatting lines from kableExtra. This is my knit_hooks code:

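# Keep knitr's default source hook for chunks that don't ask for a table
default_source_hook <- knit_hooks$get("source")
knit_hooks$set(
  source = function(x, options) {
    if (is.null(options$table))
      default_source_hook(x, options)
    else {
      # Re-evaluate the chunk's source and style the result with kableExtra
      eval(parse(text = x)) %>%
        kable("html") %>%
        kable_styling("hover", full_width = F)
    }
  }
)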

which is invoked by:


    ```{r table_format, results = "hide", table = T, eval = F}
      head(iris)

    ```

and produces a table identical to the first one illustrated in this article.

Enabled via the chunk option table = T, the hook applies formatting only to the chunks that I want formatted with kableExtra. Since the source code is re-evaluated in the knitr hook, the option eval = F should be included in the chunk header so the chunk isn’t evaluated twice; however, this double evaluation wouldn’t be a problem for small tables.
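The re-evaluation step can be tried outside of knitr. The sketch below assumes the source arrives as a character vector of code lines (the two lines in x are invented for illustration) and shows that eval(parse(text = x)) runs every expression and returns the value of the last one, which is what gets piped on to kable().

```r
# Assumed example of chunk source as it might reach the hook: one string per line
x <- c(
  "first_rows <- head(iris, 3)",
  "first_rows"
)

# parse() turns the text into expressions; eval() runs them in order and
# returns the value of the final expression
result <- eval(parse(text = x))
```

Because only the last value is returned, a chunk handled this way should end with the expression that produces the data frame to be formatted.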

Finally, because the table is displayed using source, and output isn’t touched, the option results = "hide" should be included so that the default table is excluded from the final knit.

Mission accomplished!? No… Unfortunately there are still a couple of problems with this solution:

  1. You are unable to display the code used to generate the table in the knitted document (echo = T), because the hook replaces the source code string with the table rather than changing the output string.
  2. I’m replacing 2 lines of code with 3 chunk options. Although not a deal breaker (as you may have set these options anyway), it’s not quite as auto-magical as I would have liked.
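One possible way to trim the second problem down to a single chunk option is knitr's option hooks (knitr::opts_hooks), which can rewrite a chunk's options whenever a given option is set. The sketch below is an untested assumption rather than part of the original solution: table_option_hook is a hypothetical helper name, and it has not been verified together with the source hook above.

```r
# Hypothetical helper: whenever `table` is set on a chunk, fill in the two
# companion options so they need not be typed in the chunk header
table_option_hook <- function(options) {
  options$results <- "hide"
  options$eval <- FALSE
  options
}

# Register the option hook only if knitr is installed
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_hooks$set(table = table_option_hook)
}
```

If this behaves as hoped, a chunk header such as {r table_format, table = T} would be enough on its own. It does nothing for the first problem, since the source string is still replaced by the table.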

At the time of writing I still haven’t got an accepted answer to this problem, which I posted on Stack Overflow. If you have a better solution, or want to up-vote my question, then you know what to do.

This post has been shared with http://www.R-bloggers.com

Full example .Rmd file


---
title: "Untitled"
author: "Paul"
date: "27 September 2018"
output: html_document
---

```{r setup, include = F}

library(dplyr)
library(ggplot2)
library(kableExtra)
library(knitr)

default_source_hook <- knit_hooks$get('source')

knit_hooks$set(
  source = function(x, options) {
    if(is.null(options$table))
      default_source_hook(x, options)
    else {
      eval(parse(text = x)) %>%
        kable("html") %>%
        kable_styling("hover", full_width = F)
    }
    
  }
)

data(iris)

```

## Normal
With no chunk options:

```{r normal}
head(iris)

```

## The desired output
With chunk options `results = "hide"`, `table = T` and `eval = F`:

```{r table_format, results = "hide", table = T, eval = F}
head(iris)

```

## It still works as normal for other output types
With no chunk options:

```{r image}
iris %>%
  ggplot(aes(x = Sepal.Length, y = Sepal.Width, group = Species)) +
  geom_point()

```
