R Markdown To Html



  1. Write prose with Markdown formatting, and run each code chunk interactively by clicking its run icon within RStudio. You can also click 'Knit to HTML' again to render the full document with all code chunks. For more help getting started in R Markdown, please see the R Markdown website.
  2. When you click the Knit HTML button, a pane titled R Markdown will open next to your console. This pane shows the knitting progress. The output file (HTML in this case) will automatically be saved in the current working directory.

When talking about PDF and printing, we often think of tools like LaTeX and Microsoft Word. When talking about HTML and CSS, we may have never imagined their possible off-screen use such as printing to PDF.

The knitr and rmarkdown packages are used in conjunction with pandoc to convert R code and figures to a variety of formats, including PDF and Word. Here, I’m exploring how to convert HTML back to Markdown format. This post came about when I was searching for how to convert XML to Markdown, which I still haven’t found an easy way to do. Becoming familiar with LaTeX will give you a lot more options to make your R Markdown .pdf look pretty, as LaTeX commands are mostly compatible with R Markdown, though some googling is often required. To compile a .pdf instead of a .html document, change output: from html_document to pdf_document, or use the dropdown menu of the “Knit” button. R Markdown is a great way to integrate R code into a document.

Can we print a book with HTML and CSS? W3C published the first working draft on “Paged Media Properties for CSS(3)” in 1999, and it was last updated in 2013. Although the working draft has been there for nearly two decades, it is still not common to see authors write or print books with HTML and CSS. The main reason is that the W3C specs are still in draft mode, so most web browsers have not really implemented them.

HTML and CSS still cannot beat other dominating tools like Word or LaTeX when it comes to typesetting content under the constraint of “pages”. You may be disappointed by a lot of typesetting details on a paged HTML page. However, HTML and CSS can be extremely powerful and flexible in other aspects, especially when combined with the power of JavaScript. By the way, HTML works almost anywhere because it only requires a web browser.


Although most web browsers have not implemented the W3C specs for Paged Media, a JavaScript polyfill library named “paged.js” is currently being developed to fill the gap. This library is still experimental and has many rough edges, but looks promising enough to us, so we created an R package, pagedown (Xie et al. 2021), based on this JavaScript library to paginate the HTML output of R Markdown documents. You can install the package from GitHub with remotes::install_github('rstudio/pagedown').

To learn more about paged.js and CSS for Paged Media, you may check out the cheatsheet of paged.js.

The pagedown package contains output formats for paged HTML documents, letters, resumes, posters, business cards, and so on. Usually there is an R Markdown template for each output format, which you can access from RStudio’s menu File -> New File -> R Markdown -> From Template.

To create a paged HTML document, you can use the output format pagedown::html_paged, e.g.,
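A minimal sketch of such a document, written from R so that it can be run as-is. The file name, title, and body text are placeholders; only the output: pagedown::html_paged line comes from the text above.

```r
# Build a minimal R Markdown file whose YAML header selects the paged HTML format.
# The title, heading, and file name are illustrative placeholders.
rmd_lines <- c(
  "---",
  "title: \"A paged HTML document\"",
  "output: pagedown::html_paged",
  "---",
  "",
  "# Introduction",
  "",
  "Some text to paginate."
)

rmd_file <- file.path(tempdir(), "paged-example.Rmd")
writeLines(rmd_lines, rmd_file)

# Show the YAML header that was written.
cat(readLines(rmd_file)[1:4], sep = "\n")
```

Knitting this file in RStudio, or calling rmarkdown::render(rmd_file) with pagedown installed, should then produce the paged HTML output.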

2.1 Preview paged HTML documents

This format is based on paged.js. Some other formats in this package are extensions of html_paged, such as html_letter and html_resume. Please note that you need a web server to view the output pages of these formats, because paged.js requires a web server. The web server could be either a local server or a remote one. When you compile an R Markdown document to HTML in RStudio, RStudio will display the HTML page through a local web server, so paged.js will work in RStudio Viewer. However, when you view such pages in a real web browser, you will need a separate web server. The easiest way to preview these HTML pages in a web browser may be through the RStudio addin “Infinite Moon Reader”, which requires the xaringan package (Xie 2021b). Or equivalently, you can call the function xaringan::inf_mr(). This will launch a local web server via the servr package (Xie 2021a).

Please note that the layout of the pages is very sensitive to the zoom level in your browser. Elements on a page are often not zoomed strictly linearly, e.g., as you zoom out, certain elements may start to collapse into each other. The 100% zoom level usually gives the best result (press Ctrl + 0 or Command + 0). You are strongly recommended to use this level when printing the page to PDF.

2.2 The CSS overriding mechanism

We have provided a few default CSS stylesheets in this output format, namely default-fonts, default-page, and default.

To find the actual CSS files behind these names, use pagedown:::list_css(). For example, default-fonts means the file resources/css/default-fonts.css in the installation directory of pagedown. The stylesheet default-fonts defines the typefaces of the document, default-page defines some page properties (such as the page size, running headers and footers, and rules for page breaks), and default defines the style of various elements (such as the table of contents).

If you do not like any of these default stylesheets, you can use a subset of them, or override certain CSS rules. For example, if you do not like the default typeface, you may create a CSS file my-fonts.css (assuming it is in the same directory as your Rmd file):

Then include this CSS file via the css option:
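The two steps above might look like the following sketch. The p selector and the Palatino typeface are assumptions chosen for illustration, and the helper name render_with_my_fonts is hypothetical. The css argument of pagedown::html_paged() receives the stylesheet list, here with default-fonts swapped for the custom file.

```r
# Step 1: create my-fonts.css (written to a temporary directory here).
# The selector and font family are illustrative choices, not pagedown defaults.
css_file <- file.path(tempdir(), "my-fonts.css")
writeLines(c(
  "p {",
  "  font-family: \"Palatino\", serif;",
  "}"
), css_file)

# Step 2: use it in place of default-fonts, keeping the other two defaults.
css_list <- c(css_file, "default-page", "default")

# Render an Rmd file with the customised stylesheet list (requires pagedown).
render_with_my_fonts <- function(input) {
  rmarkdown::render(input, output_format = pagedown::html_paged(css = css_list))
}
```

In a YAML header, the same list would be given under the css option of pagedown::html_paged.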

Note that this overriding mechanism also works for other output formats in pagedown.

2.3 Print to PDF

There are three ways to print to PDF:

  1. with Google Chrome or Chromium using the menu “Print” or by pressing Ctrl + P (or Command + P on macOS). Remember to allow background graphics to be printed.

  2. using the function chrome_print(). Its first argument (input) accepts a path to a local Rmd or HTML file or a URL. Google Chrome or Chromium must be installed on your system.

  3. in RStudio, adding the line knit: pagedown::chrome_print to the YAML header of your Rmd file.

    With this metadata parameter, the behavior of the “Knit” button of RStudio is modified: it produces both the HTML document and the PDF with Chrome. This functionality is suitable for any R Markdown HTML output format and is mainly convenient for small documents like presentations or notes.
    If chrome_print() cannot find Chrome or Chromium automatically, set the PAGEDOWN_CHROME environment variable to the path to Chrome or Chromium executable.
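As a sketch of the second option, the wrapper below sets PAGEDOWN_CHROME before calling chrome_print(). The executable path and the wrapper's name (print_to_pdf) are hypothetical; adjust the path to your system.

```r
# Hypothetical location of the browser; change it to match your installation.
chrome_path <- "/usr/bin/chromium"

# Print a local Rmd/HTML file or a URL to PDF with headless Chrome.
print_to_pdf <- function(input, chrome = chrome_path) {
  # Tell pagedown where the browser lives in case auto-detection fails.
  Sys.setenv(PAGEDOWN_CHROME = chrome)
  pagedown::chrome_print(input = input)
}
```

Calling print_to_pdf("my-doc.Rmd") on a local file should then write the PDF next to the input file, assuming Chrome or Chromium really is installed at that path.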

2.3.1 Print to PDF on a server

If you want to use chrome_print() on a server (an RStudio server, for instance), Chromium or Chrome has to be present on this server and available from the PATH or PAGEDOWN_CHROME environment variables.

Be sure that the local IP address 127.0.0.1 is referenced in the no_proxy environment variable. Otherwise the R session won’t be able to connect to Chrome.
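A small base R sketch of that check: it appends 127.0.0.1 to no_proxy when it is missing. The function name is hypothetical, and whether a change made mid-session is picked up may depend on the networking library in use, so setting the variable in .Renviron before R starts is probably the safer route.

```r
# Make sure 127.0.0.1 is listed in the no_proxy environment variable.
ensure_localhost_no_proxy <- function() {
  current <- Sys.getenv("no_proxy", unset = "")
  entries <- trimws(strsplit(current, ",", fixed = TRUE)[[1]])
  if (!"127.0.0.1" %in% entries) {
    # Drop empty entries, then append the local address.
    entries <- c(entries[nzchar(entries)], "127.0.0.1")
    Sys.setenv(no_proxy = paste(entries, collapse = ","))
  }
  invisible(Sys.getenv("no_proxy"))
}
```

Calling ensure_localhost_no_proxy() before chrome_print() leaves an existing entry untouched and only adds the address when it is absent.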

2.3.2 Print to PDF with CI/CD services

It is possible to produce a PDF with Chrome using a continuous integration service.

With Travis, activate the Chrome addon by adding these lines to the .travis.yml file:

With GitLab CI, you have to use a Docker image with R, Pandoc, pagedown and Chromium or Chrome (see the Docker section below for an example). Depending on the base image, you may have to install some extra fonts.

Travis and GitLab CI are container-based environments running as root. As explained in the Travis documentation, you have to pass the --no-sandbox argument to chrome_print() (this is required for both Travis and GitLab CI):
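A sketch of such a call, using the extra_args argument of chrome_print(). The wrapper name is hypothetical, and --disable-gpu is kept alongside --no-sandbox on the assumption that it is the function's default flag.

```r
# Flags passed to Chrome; --no-sandbox is needed when running as root in a container.
ci_chrome_args <- c("--disable-gpu", "--no-sandbox")

# Print a document to PDF on a CI runner (requires pagedown and Chrome/Chromium).
print_on_ci <- function(input) {
  pagedown::chrome_print(input, extra_args = ci_chrome_args)
}
```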

Since the --no-sandbox option can lead to major security threats, do not use these CI/CD services to print untrusted web pages.

2.3.3 Print to PDF using Docker

Here is a minimal Dockerfile using a popular image from the Rocker project which uses RStudio:

If you save this Dockerfile in your current directory, you can build the image with:

If you intend to use pagedown::chrome_print() in a container running this image, do not launch the container as usual.

You have to use Jessie Frazelle’s seccomp file for Chrome in Docker (download it from https://raw.githubusercontent.com/jessfraz/dotfiles/master/etc/docker/seccomp/chrome.json) as follows:

With this seccomp file, you do not have to use the '--no-sandbox' option: this is much more secure!

2.3.4 Troubleshooting the generation of large PDF files

In rare circumstances, particularly if your document contains a lot of images, chrome_print() may fail to generate your PDF.

On Linux environments with minimal resources (like a container), you may get this error message:

Here, you just have to follow the advice.

If you use an old version of Chrome, you may obtain the following error message:

In that case, you must install a more recent version of Chrome.

3.1 Resume

Currently pagedown has one resume format: pagedown::html_resume. See https://pagedown.rbind.io/html-resume/ for an example.

The R Markdown source document should contain two parts titled “Aside” (for the sidebar) and “Main” (for the main body), respectively. Each part can contain any number of sections (e.g., education background and working experience). Each section can contain any number of subsections, and you can write more content in a subsection. In case you are not very familiar with the Markdown syntax for section headings, a series of = under a heading is equivalent to # before a heading, and a series of - is equivalent to ##. See Pandoc’s manual for details: https://pandoc.org/MANUAL.html#headers. Below is a quick example:

The “Aside” part will be displayed in the right sidebar. This part can contain arbitrary content (not necessarily structured). For example, you can include a picture in the beginning. The “Disclaimer” section will be placed at the bottom of the first page. All icons for this resume template are from Font Awesome. For example, <i class="fa fa-envelope"></i> will generate an envelope icon. You can look up all icon names on the Font Awesome website if you want to use other icons.

For the “Main” part, all sections must follow a specific structure except the first section. The first section usually shows your name and information that you want to highlight. The remaining sections should each contain a title and a number of subsections. You can specify an icon for a section title via the attribute data-icon, e.g., {data-icon=graduation-cap} means an icon of a graduation cap (which can be used as a symbol for education). Each subsection should contain a title, followed by at least three paragraphs:

  • The first paragraph is a brief description of the subsection.

  • The second paragraph is the location.

  • The third paragraph is the time period. If this subsection has both a starting and ending time, separate them by a dash, e.g., 2014 - 2015 or 2014/07/27 - 2015/07/23.

The description, location, and time period can each be N/A if the relevant information is not available.

You can write arbitrary content after the third paragraph (e.g., more paragraphs, bullet lists, and so on). If you want to write content in two columns, you may use a “concise” block, e.g.,

If you want to write a side note, use an “aside” block, e.g.,

There is a caveat about page breaks. By default, we allow page breaks within a subsection. Sometimes this may lead to odd output like the example in Figure 3.3. The first bullet should not be split into two columns, and the rest of the bullets should have a larger left margin.

Figure 3.3: Allow page breaks in a subsection.

If you want to avoid layout problems like this, you may disallow page breaks via CSS:

However, this may lead to a new issue demonstrated in Figure 3.4: there may be a large bottom margin on the previous page. At this point, you may start to miss your old friends, Word and LaTeX.

Figure 3.4: Do not allow page breaks in a subsection.

3.2 Poster

You can create a poster with the output format pagedown::poster_relaxed or pagedown::poster_jacobs. See https://pagedown.rbind.io/poster-relaxed/ and https://pagedown.rbind.io/poster-jacobs/ for examples.

We have not had time to document the poster formats yet, but here is a caveat: the layout of poster sections is hardcoded in CSS stylesheets, which means you cannot add or delete sections unless you know CSS (in particular, CSS Grid Layout). If you are interested in learning CSS Grid Layout, you may take a look at the CSS of poster_jacobs.

3.2.2 The Jacobs University style

3.3 Business card

To create a simple business card, you can use the pagedown::business_card format. See https://pagedown.rbind.io/business-card/ for an example.

3.3.1 Single card

A single business card can be created with the following R Markdown file (this file contains only a YAML header).

You can repeat the card on multiple pages using the repeat variable. The following example produces as many pages as cards (12).

In order to print the cards, you may prefer a layout with several cards on the same page: you can adjust the paper size with the paperwidth and paperheight variables and define a grid layout with the cols and rows variables (you may test some combinations of these parameters to find the most appropriate one).


You can also use Markdown to define a card. Be sure to use the slot attributes as follows.

3.3.2 Different cards with shared information

You can produce business cards for members of an organization who share some information (address, website…).
Common information is declared as top-level variables in the YAML header. Custom cards are defined using the person variable: each key: value pair of a person block overrides the corresponding top-level pair.

If you prefer, you can use markdown to create a card as follows.

3.3.3 Styling business cards

3.3.3.1 Fonts

You can change the text font with the mainfont and/or googlefonts top level YAML variables:

  • mainfont will use the local font installed on your computer, e.g.

  • googlefonts to use fonts from https://fonts.google.com, e.g.

3.3.3.2 Card sizing

You can modify the card size with the cardwidth and cardheight variables. You can get a landscape card with:

If you render this card, you will see that the default style does not suit a landscape card well. Read the next section to find an example of a landscape card with a better style.

3.3.3.3 CSS

Finally, you can modify the style of the card using CSS rules.
The markup of a card can be represented as follows (this is technically incorrect, since the card template uses a shadow DOM):

You can use these built-in classes to style your card with CSS.
A landscape card could be styled like this:

3.4 Letter

You can write a letter with the pagedown::html_letter format. See https://pagedown.rbind.io/html-letter/ for an example.

3.5 Thesis

You can write a thesis with the pagedown::thesis_paged format created by Brent Thorne. See https://pagedown.rbind.io/thesis-paged/ for an example.

3.6 A Journal of Statistical Software article

You can write an article for the Journal of Statistical Software. See https://pagedown.rbind.io/jss-paged/ for an example.

4.1 Lists of tables and figures

Lists of tables and/or figures can be inserted in the document using the lot and lof variables in the YAML metadata. You can also customize their titles with the lot-title and lof-title variables. By default, these lists are referenced in the table of contents, if any. You can use the lot-unlisted and lof-unlisted options to remove them. For instance:
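A sketch of a YAML header that uses these variables, assembled in R so it can be printed or written to a file. The titles are placeholders, and treating lot-unlisted and lof-unlisted as boolean switches is an assumption.

```r
# YAML header requesting both lists; titles and the document title are placeholders.
yaml_header <- c(
  "---",
  "title: \"A paged HTML document\"",
  "output: pagedown::html_paged",
  "lot: true",
  "lot-title: \"Tables\"",
  "lof: true",
  "lof-title: \"Figures\"",
  # Assumed boolean switches that keep the lists out of the table of contents.
  "lot-unlisted: true",
  "lof-unlisted: true",
  "---"
)

cat(yaml_header, sep = "\n")
```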


4.2 List of abbreviations

A list of abbreviations is automatically built if the document contains at least one HTML abbr element.

For instance, if the R Markdown document contains <abbr title="Cascading Style Sheets">CSS</abbr>, a list of abbreviations is built with the CSS definition.

List of abbreviations example:

CSS: Cascading Style Sheets

The title of this list of abbreviations can be customized using the loa-title field in the YAML header.

4.3 Front matter

By default, the front matter is composed of the cover, the table of contents and the lists of figures, tables and abbreviations if any. The only difference between the front matter pages and the main content pages is the style of the page numbers.

You can add extra sections to the front matter using the front-matter class. For instance, if you want to add a preface to the front matter, you need to write:

4.4 Chapters prefix

The word “Chapter” can be prepended to chapter numbers in chapter titles using the chapter class:

4.4.1 Internationalization

The chapter title prefix can be customized using the chapter_name field in the YAML header or in the _bookdown.yml file (see https://bookdown.org/yihui/bookdown/internationalization.html).

When defined in the YAML header, the chapter_name field is parsed by Pandoc. Therefore, special characters like spaces have to be escaped (see https://pandoc.org/MANUAL#backslash-escapes). For instance, if you want to use 'CHAPTER ' as a chapter title prefix, you have to write:

A suffix string can be added after the chapter number. For instance, to add a dot (.) after the chapter number, use the following value for the chapter_name field:

If defined in the _bookdown.yml file, the chapter_name field will override a chapter_name field declared in the YAML header.
Note that, contrary to the bookdown HTML formats, a custom function is not allowed as a value for the chapter_name field.

4.5 Links


In Markdown, the usual ways to insert links are automatic and inline links.

4.5.1 Automatic links

Automatic links are created using pointy brackets, e.g., <https://bookdown.org>. The full URL is then inserted in the final document https://bookdown.org. This is convenient for a short and meaningful URL.

4.5.2 Inline links

Inline links are useful when you do not want to show the full URL but an alternative text (because URLs are usually long and ugly). In Markdown, the link text is inserted in square brackets and the URL in parentheses, e.g. [bookdown website](https://bookdown.org). On a website, the URL is hidden and replaced by the link text: bookdown website. The user can interactively access the URL by clicking on the link.

When printing a document, we lose interactivity, so we need to show the hidden URLs. By default, the html_paged format adds the URLs after the link text in parentheses, for instance [bookdown website](https://bookdown.org) is rendered as bookdown website (https://bookdown.org).

You also can use the links-to-footnotes top-level YAML parameter: it transforms all the URLs to footnotes. You will get the same result as bookdown website^[https://bookdown.org]. To activate the links-to-footnotes option, insert links-to-footnotes: true in the YAML header. For instance:
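To make the equivalence concrete, the base R sketch below rewrites inline links into the footnote form quoted above. It only illustrates the transformation on plain Markdown text; it is not how pagedown implements the option, the function name is hypothetical, and it does not try to handle images or nested brackets.

```r
# Rewrite [text](url) as text^[url], the form the option is said to be equivalent to.
links_to_footnotes <- function(markdown_text) {
  gsub("\\[([^]]+)\\]\\(([^)]+)\\)", "\\1^[\\2]", markdown_text)
}

links_to_footnotes("See the [bookdown website](https://bookdown.org) for details.")
```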

4.6 Footnotes

The default behavior of pagedown is to render notes as endnotes because Paged.js does not natively support footnotes for now. However, we introduced experimental support for footnotes. You can test it by including paged-footnotes: true in the YAML header of your document. Please note that the paged-footnotes option only supports inline content (see https://github.com/rstudio/pagedown/issues/156). If you run into any trouble with this experimental feature, please open an issue in the pagedown repository on GitHub.

If you need to override the default footnotes style, you should use an !important rule on elements of class footnote. For example,

4.7 Custom running headers

Sometimes a section title may be too long to fit the header or footer area. In this case, you can specify a shorter title for the running headers/footers via the attribute data-short-title after a title, e.g.,

4.8 Covers

Cover images can be added using the front_cover and back_cover parameters of pagedown::html_paged(). You can pass any path to a file or a URL.
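A sketch of passing both parameters from R. The image locations and the wrapper name are hypothetical placeholders.

```r
# Hypothetical image locations: a local file and a remote URL.
front <- "images/front-cover.png"
back  <- "https://example.com/back-cover.png"

# Render an Rmd file with front and back cover images (requires pagedown).
paged_with_covers <- function(input) {
  rmarkdown::render(
    input,
    output_format = pagedown::html_paged(front_cover = front, back_cover = back)
  )
}
```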

Several paths or links can be passed to the front_cover and back_cover parameters. For each image declared in the front_cover or back_cover parameter, a CSS variable is created: --front-cover, --back-cover, --front-cover-2, --back-cover-2, etc.
They can be used as value for the background-image CSS property:

You also can add textual content on the front and back covers using two special divs of classes front-cover and back-cover:

If the background properties of the default template do not suit your needs, here is a small hack to modify them.
The following lines are used to position the pagedown hex logo on the front page of the current document:

4.9 Page references

Internal links will be followed by page numbers by default. For example, you can refer to another section using either [Section Title] or [Link Text](#section-id) (where section-id is the ID of the section header).

Do you still remember Paged.js that we mentioned earlier?

4.10 Line numbering

For templates built on top of html_paged, line numbering can be added using the top-level YAML parameter number-lines. For example:

The line numbers can be reset on each page using the reset-page option:

You also can select the HTML elements by passing a CSS selector to the selector parameter. To number the lines of all the paragraphs and headers, you need to write:

The default CSS selector is '.level1:not(.front-matter) h1, .level1 h2, .level1 h3, .level1 p'. Line numbering is deactivated for display math environments.

Be aware that the value 'normal' of the CSS line-height property is not supported: elements with a normal line height are not numbered. Since 'normal' is the default value for the line-height property, the CSS stylesheets must define a different value. If your template relies on custom CSS files, you can add for example:

You can modify the horizontal positioning and the font size of the line numbers using two CSS variables: --line-numbers-padding-right (default value 10px) and --line-numbers-font-size (default value 8pt). For further customisation, you can modify the style of the elements of class maintextlinenumbers. Here is an example:

Please note that this feature is sensitive to elements which break the vertical rhythm of the text, like inline math.

4.11 Page breaks

There are two ways to force a page break:

  • with the \newpage LaTeX command (\pagebreak also works)

  • using one of these two CSS classes: page-break-before or page-break-after
    For example, to force a page break before a given section, use:

4.12 MathJax

The following test comes from http://www.cs.toronto.edu/~yujiali/test/mathjax.html.

Some RBM stuff:

\[ E(\mathbf{v}, \mathbf{h}) = -\sum_{i,j} w_{ij} v_i h_j - \sum_i b_i v_i - \sum_j c_j h_j \]

Multiline equations:

\[ \begin{align} p(v_i=1|\mathbf{h}) & = \sigma\left(\sum_j w_{ij}h_j + b_i\right) \\ p(h_j=1|\mathbf{v}) & = \sigma\left(\sum_i w_{ij}v_i + c_j\right) \end{align} \]

Here is an example of an inline expression: \(p(x|y) = \frac{p(y|x)p(x)}{p(y)}\).

4.13 Figures and tables

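Table 4.1: An example table.

| Sepal.Length | Sepal.Width | Petal.Length | Petal.Width |
|-------------:|------------:|-------------:|------------:|
| 5.1 | 3.5 | 1.4 | 0.2 |
| 4.9 | 3.0 | 1.4 | 0.2 |
| 4.7 | 3.2 | 1.3 | 0.2 |
| 4.6 | 3.1 | 1.5 | 0.2 |
| 5.0 | 3.6 | 1.4 | 0.2 |
| 5.4 | 3.9 | 1.7 | 0.4 |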

Xie, Yihui. 2021a. servr: A Simple HTTP Server to Serve Static Files or Dynamic Documents. https://github.com/yihui/servr.

Xie, Yihui. 2021b. xaringan: Presentation Ninja. https://github.com/yihui/xaringan.

Xie, Yihui, Romain Lesur, Brent Thorne, and Xianying Tan. 2021. pagedown: Paginate the HTML Output of R Markdown with CSS for Print. https://github.com/rstudio/pagedown.

There are some things that I run into fairly frequently (and some not so much) when I’m rendering my rmarkdown documents. This section details some of the common problems, and the solutions that I have found work for me.

If you want to practice fixing broken rmarkdown documents, check out some pathologically broken examples on GitHub at njtierney/rmd-errors.

15.1 Avoiding problems

To avoid problems in the first place, I try and do the following:

  • Develop code in chunks and execute the chunks until they work, then move on.
  • knit the document regularly to check for errors.

Then, if there is an error:

  • recreate the error in an interactive session:
    • restart R
    • run all chunks below
    • find the chunk that did not work, fix until it does
    • run all chunks below
    • explore working directory issues
      • remember that the rmarkdown directory is where the .Rmd file lives
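For the working directory point, a small base R sketch can check whether a path used in a chunk exists when resolved from the folder of the .Rmd file. The function names are hypothetical, and the sketch assumes the default behaviour where chunks are evaluated relative to the .Rmd file's directory.

```r
# Resolve a path the way a knitted chunk would: relative to the .Rmd's folder.
resolve_from_rmd <- function(rmd_file, relative_path) {
  file.path(dirname(normalizePath(rmd_file, mustWork = FALSE)), relative_path)
}

# Report whether a file referenced in a chunk exists from the .Rmd's point of view.
exists_from_rmd <- function(rmd_file, relative_path) {
  file.exists(resolve_from_rmd(rmd_file, relative_path))
}
```

For example, exists_from_rmd("analysis/report.Rmd", "data/raw.csv") looks for analysis/data/raw.csv, which is often not the same file as data/raw.csv seen from the console.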

15.2 The errors

What follows from here are all the errors you might encounter in an rmarkdown document, with the following structure:

  • What they might look like
  • What the error message might appear to be, and
  • How to solve them

15.3 “Duplication”: Duplicated chunk names

What it might look like

Chunks like this:

The error message

This is caught before the document compiles with a warning like:

The important part to note is the start:

How to solve

  • In our case we have the same chunk name twice: ‘title-one’. Change the chunk name of one of them!
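To find the offending labels in a longer document, a base R sketch like the one below can scan the lines of an .Rmd file for repeated chunk labels. The function name and the regular expression are my own simplification of the chunk header syntax (it only looks at r chunks and ignores unlabelled ones), not something knitr provides.

```r
# Return chunk labels that appear more than once in the lines of an .Rmd file.
find_duplicate_chunk_labels <- function(rmd_lines) {
  # A chunk header is three or more backticks followed by {r label, options}.
  header_pattern <- "^\\s*`{3,}\\s*\\{r[ ,]+([^,}]*).*$"
  headers <- grep(header_pattern, rmd_lines, value = TRUE)
  labels <- trimws(sub(header_pattern, "\\1", headers))
  # A first token containing "=" is a chunk option, not a label.
  labels <- labels[nzchar(labels) & !grepl("=", labels, fixed = TRUE)]
  unique(labels[duplicated(labels)])
}

# Small example built in memory; the fence is assembled to keep this block simple.
fence <- strrep("`", 3)
example_rmd <- c(
  paste0(fence, "{r title-one}"), "1 + 1", fence,
  paste0(fence, "{r title-one}"), "2 + 2", fence
)
find_duplicate_chunk_labels(example_rmd)
```

On a real file, find_duplicate_chunk_labels(readLines("my-doc.Rmd")) would list the labels to rename.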

15.4 “Not what I ordered”: Objects not created in the right order

What it might look like

The error message

How to solve

15.5 “Forgotten Trails I”: Missing “,”, or “(”, “}”, or “’”

What it might look like

The error message

How to solve

15.6 “Forgotten Trails II”: Chunk option with trailing ', or not input

What it might look like

The error message

How to solve it?

  • The easiest way is to press Cmd+Shift+F, which opens up a global search in your RStudio project, and then type in the offending string mentioned in the NOTE. In this case, I would search for the partial string 'fig.cap = 'Setting the options right for rstudio, so you don't restore previous sessions work, and. I search for the partial string because there might be parts at the end of the error message that aren’t in the text.

15.7 “Forgotten Trail III”:

What it might look like

The error message

How to solve

This error message is pretty good: I needed to add a comma after my chunk name.


So, go from:

to

15.8 “The Path Not Taken” File path incorrect

What it might look like

The error message

How to solve

15.9 “Spolling I” Incorrectly spelled chunk options

These are often not an error, but you just won’t get the behaviour that you expect.

What it might look like

  • fig.caption instead of fig.cap. This once caused me to rewrite a lot of code and an entire section of a paper until I realised the problem.

The error message

How to solve

  • There might be a switch you can flick to ask knitr to catch this.
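One possible workaround, sketched under the assumption that names(knitr::opts_chunk$get()) lists knitr's built-in chunk option names, is to compare the option names you used against that list. The function name is hypothetical, and custom or engine-specific options would show up as false positives.

```r
# Return the option names that are not in the list of known chunk options.
check_chunk_options <- function(option_names,
                                known = names(knitr::opts_chunk$get())) {
  setdiff(option_names, known)
}

# With an explicit list of known names, knitr is not needed:
check_chunk_options(c("fig.cap", "fig.caption"), known = c("fig.cap", "echo"))
```

Leaving out the known argument uses knitr's own defaults, so fig.caption should be flagged while fig.cap is not.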

15.10 “Spolling II” Incorrectly spelled chunk option inputs

So this is when you provide the wrong input to your chunk options, like when something that requires TRUE gets “yes”, or something that needs '100%' instead gets 100.

What it might look like

The error message

How to solve

What was the problem? Turns out I provided the option FALSe instead of FALSE.
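The sketch below reproduces the underlying problem outside of knitr: chunk option values are evaluated as R code, so a misspelled FALSe is treated as an object name that does not exist. The variable names are illustrative.

```r
# Evaluate the misspelled option value the way R would and capture the error.
option_value <- "FALSe"
result <- tryCatch(
  eval(parse(text = option_value)),
  error = function(e) conditionMessage(e)
)
result

# The correctly spelled value evaluates to a logical.
eval(parse(text = "FALSE"))
```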

Go from:

to


15.11 “The Legend of Link I”: Your images in ![]() don’t work.

I often forget that it is ![](path/to/image), and not ![]('path/to/image'). There are no quote marks!

15.12 LaTeX errors

There is no panacea for LaTeX errors, but if you aren’t familiar with “what that error message” might look like, here are some details.

What it might look like

The error message

How to solve

15.13 I want to include inline R code verbatim to show an example

… Like for a book on using rmarkdown or something.

Check out this great blog post by T. Hovorka from R Views

It boils down to this:

`` `r '\u0060r expression\u0060'` ``.

15.14 My Figure or Table isn’t being cited

What it might look like

You create a figure,

The error message

There isn’t one - you just get @ref(fig:figure-chunk-name) printed.

How to solve

You need to make sure that you actually print the table or plot. If you create the plot and save it, but do not print it in the document, then you will not be able to reference the plot or table.

15.15 Your Turn

  1. Go to this repo njtierney/rmd-errors, and give debugging some of these common rmarkdown errors a go.



