Getting Started with rmake
Michal Burda
2026-01-09
Introduction
R is a mature scripting language for statistical computations and data processing. An important advantage of R is that it allows writing repeatable statistical analyses by programming all steps of data processing in scripts, which allows re-executing the whole process after any change in data or processing steps.
There are several useful packages for R to obtain repeatability of
statistical computations, such as knitr and
rmarkdown. These tools allow writing R scripts that
generate reports combining text with tables and figures generated from
data.
However, if analyses grow in complexity, manual re-execution of the whole process may become tedious, prone to errors, and very demanding computationally. Complex analyses typically involve:
- Many pre-processing steps on large datasets
- Repetitive execution of commands differing only in parameters
- Production of multiple output files in various formats
It is inefficient to re-run all pre-processing steps repeatedly to
refresh the final report after any change. A caching mechanism provided
by knitr is helpful but limited to a single report.
Splitting complex analyses into several parts and saving intermediate
results into files is rational, but brings another challenge:
management of dependencies between inputs, outputs, and
underlying scripts.
This is where Make comes in. Make is a tool that
controls the generation of files from source data and script files by
reading dependencies from a Makefile and comparing
timestamps to determine which files need to be refreshed.
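The timestamp comparison at the heart of Make can be sketched in a few lines of base R. The helper below, needs_rebuild(), is a hypothetical name used only for illustration and is not part of rmake or Make. It treats a target as stale when it is missing or older than any of its prerequisites, and it assumes that all prerequisites exist.

```r
# Hypothetical helper illustrating Make's staleness rule; not part of rmake.
needs_rebuild <- function(target, depends) {
  if (!file.exists(target)) {
    return(TRUE)  # a missing target must always be built
  }
  # Rebuild when any prerequisite was modified after the target.
  any(file.mtime(depends) > file.mtime(target))
}

# Demonstration in a temporary directory with explicitly set timestamps.
demo_dir <- tempfile("make-demo")
dir.create(demo_dir)
input  <- file.path(demo_dir, "data.csv")
output <- file.path(demo_dir, "sums.csv")

writeLines("V1,V2", input)
missing_target <- needs_rebuild(output, input)  # TRUE: output does not exist yet

writeLines("V1,V2", output)
Sys.setFileTime(input,  as.POSIXct("2024-01-01 10:00:00", tz = "UTC"))
Sys.setFileTime(output, as.POSIXct("2024-01-01 11:00:00", tz = "UTC"))
up_to_date <- needs_rebuild(output, input)      # FALSE: output is newer than input

Sys.setFileTime(input,  as.POSIXct("2024-01-01 12:00:00", tz = "UTC"))
stale <- needs_rebuild(output, input)           # TRUE: input changed afterwards
```

Make applies the same check to every rule in the Makefile, which is why only the outputs downstream of a changed file are regenerated.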
The rmake package provides tools for easy generation of
Makefiles for statistical and data manipulation tasks in R.
Key Features
The main features of rmake are:
- Use of the well-known Make tool
- Easy definitions of file dependencies in the R language
- High flexibility through parameterized execution and programmatic rule generation
- Simple, short code thanks to the %>>% pipeline operator and templating
- Support for R scripts and R Markdown files
- Extensibility for user-defined rule types
- Isolated and parallel execution via Make’s parallel processing
- Support for all platforms: Unix (Linux), macOS, Windows, and Solaris
- Compatibility with RStudio
Why Use rmake?
R allows the development of repeatable statistical
analyses. However, when analyses grow in complexity, manual re-execution
on any change may become tedious and error-prone. Make
is a widely accepted tool for managing the generation of resulting files
from source data and script files. rmake makes it easy to
generate Makefiles for R analytical projects.
Installation
To install rmake from CRAN:
install.packages("rmake")
Alternatively, install the development version from GitHub:
install.packages("devtools")
devtools::install_github("beerda/rmake")
Load the package:
library(rmake)
Prerequisites
System Requirements
- R: Version 3.5.0 or higher
- Make: GNU Make or compatible make tool
  - On Linux/macOS: Usually pre-installed
  - On Windows: Install Rtools (which includes make)
Environment Variables
The package requires the R_HOME environment variable to
be properly set. This variable indicates the directory where R is
installed and is automatically set when running from within R or
RStudio.
When is R_HOME needed?
When running make from the command line (outside of R),
you may need to set R_HOME manually.
Finding R_HOME
To find the correct value for your system, run this in R:
R.home()
You can also check the current values of R environment variables:
Sys.getenv("R_HOME")
Setting R_HOME
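The command depends on the shell: on Linux/macOS use export, in the Windows Command Prompt use set, and in PowerShell assign to $env:R_HOME. In each case, supply the path reported by R.home().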
For permanent setup, add the export commands to your shell
configuration file (.bashrc, .zshrc, etc. on
Unix-like systems, or system environment variables on Windows).
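Because the right path differs between machines, it can help to let R print the command for each shell. The sketch below uses only base R. The helper name r_home_commands() is made up for this example, and the printed lines should be checked against your shell before being pasted into a configuration file.

```r
# Hypothetical helper: format shell commands that set R_HOME to a given path.
r_home_commands <- function(path = R.home()) {
  win_path <- gsub("/", "\\\\", path)  # Windows shells conventionally use backslashes
  c(
    unix       = sprintf('export R_HOME="%s"', path),
    cmd        = sprintf('set "R_HOME=%s"', win_path),
    powershell = sprintf('$env:R_HOME = "%s"', win_path)
  )
}

# Print only the variants that make sense on the current platform.
cmds <- r_home_commands()
if (.Platform$OS.type == "windows") {
  writeLines(cmds[c("cmd", "powershell")])
} else {
  writeLines(cmds["unix"])
}
```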
For more information on R environment variables, see the official R documentation.
Project Initialization
Creating Skeleton Files
To start a new project with rmake:
library(rmake)
rmakeSkeleton(".")
This creates two files:
- Makefile.R: R script to generate the Makefile
- Makefile: The generated Makefile (initially minimal)
The initial Makefile.R contains:
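To see the skeleton for your installed version of rmake, a quick sketch is to generate it in a throwaway directory and print the result. The directory name is arbitrary, and the output is whatever rmakeSkeleton() wrote.

```r
library(rmake)

# Create the skeleton in a fresh temporary directory to avoid touching a real project.
proj <- tempfile("rmake-skeleton")
dir.create(proj)
rmakeSkeleton(proj)

# Show the files that were created and the contents of the generated Makefile.R.
print(list.files(proj))
writeLines(readLines(file.path(proj, "Makefile.R")))
```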
Basic Example
Let’s walk through a simple example. Suppose we have:
- data.csv: input data file
- script.R: R script to process the data
- sums.csv: output file with the computed results
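To follow along, an input file is needed. The sketch below writes a small data.csv with the numeric columns V1 and V2 that the processing script sums. The values are invented for illustration.

```r
# Made-up sample data; only the column names V1 and V2 matter for the example.
d <- data.frame(
  ID = c("a", "b", "c"),
  V1 = c(2, 9, 3),
  V2 = c(8, 1, 3)
)
write.csv(d, "data.csv", row.names = FALSE)

# Quick check that the file reads back with the expected structure.
str(read.csv("data.csv"))
```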
Step 2: Create the Processing Script
Create script.R:
d <- read.csv("data.csv")
sums <- data.frame(ID = "sum",
                   V1 = sum(d$V1),
                   V2 = sum(d$V2))
write.csv(sums, "sums.csv", row.names = FALSE)
Using the Pipe Operator
The %>>% pipe operator makes rule definitions more
readable:
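For the data.csv, script.R, and sums.csv example, a sketch of both spellings follows: the explicit rRule() call with named arguments, mirroring the form used later in this guide, and the same rule written as a pipe chain. The variable names job_explicit and job_piped are chosen here only to keep the two apart.

```r
library(rmake)

# Explicit form: target, script, and dependency are passed as named arguments.
job_explicit <- list(
  rRule(target = "sums.csv", script = "script.R", depends = "data.csv")
)

# Piped form: dependency %>>% rule %>>% target.
job_piped <- "data.csv" %>>% rRule("script.R") %>>% "sums.csv"

# Write the Makefile from the piped definition.
makefile(job_piped, "Makefile")
```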
This is equivalent to the previous example but more concise.
Adding a Markdown Report
Let’s extend our example to create a PDF report. Create
analysis.Rmd:
---
title: "Analysis"
output: pdf_document
---
# Sums of data rows
```{r, echo=FALSE, results='asis'}
sums <- read.csv('sums.csv')
knitr::kable(sums)
```
Update Makefile.R:
library(rmake)
job <- list(
  rRule(target = "sums.csv", script = "script.R", depends = "data.csv"),
  markdownRule(target = "analysis.pdf", script = "analysis.Rmd",
               depends = "sums.csv")
)
makefile(job, "Makefile")
Or using pipes:
library(rmake)
job <- "data.csv" %>>%
  rRule("script.R") %>>%
  "sums.csv" %>>%
  markdownRule("analysis.Rmd") %>>%
  "analysis.pdf"
makefile(job, "Makefile")
Run make again:
make()
Visualizing Dependencies
Visualize the dependency graph:
visualize(job, legend = FALSE)
This creates an interactive graph showing:
- Squares: Data files
- Diamonds: Script files
- Ovals: Rules
- Arrows: Dependencies
Multiple Dependencies
Handle complex dependencies:
chain1 <- "data1.csv" %>>% rRule("preprocess1.R") %>>% "intermed1.rds"
chain2 <- "data2.csv" %>>% rRule("preprocess2.R") %>>% "intermed2.rds"
chain3 <- c("intermed1.rds", "intermed2.rds") %>>%
  rRule("merge.R") %>>% "merged.rds" %>>%
  markdownRule("report.Rmd") %>>% "report.pdf"
job <- c(chain1, chain2, chain3)
Alternatively, you can define all chains directly without intermediate variables:
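One way to write this, reusing the file names from the chains above, is to pass the pipe expressions straight to c(). This is a sketch that assumes each chain evaluates to a list of rules, as the combination of chains with c() elsewhere in this guide suggests.

```r
library(rmake)

# The same three chains, combined without intermediate variables.
job <- c(
  "data1.csv" %>>% rRule("preprocess1.R") %>>% "intermed1.rds",
  "data2.csv" %>>% rRule("preprocess2.R") %>>% "intermed2.rds",
  c("intermed1.rds", "intermed2.rds") %>>%
    rRule("merge.R") %>>% "merged.rds" %>>%
    markdownRule("report.Rmd") %>>% "report.pdf"
)
```

Named intermediate variables remain the better choice when a chain is reused or when the job is assembled programmatically.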
Rule Types
rmake provides several pre-defined rule types:
- rRule(): Execute R scripts
- markdownRule(): Render R Markdown documents
- knitrRule(): Process knitr documents
- copyRule(): Copy files
- offlineRule(): Manual tasks with reminders
For detailed documentation on all rule types including
depRule(), subdirRule(), and custom rules, see
the Build Rules vignette.
Next Steps
For more information on specific topics, see these vignettes:
- rmake Project Management: Learn about project initialization, running builds, cleaning, and parallel execution
- Build Rules: Comprehensive reference for all rule types (rRule, markdownRule, knitrRule, copyRule, depRule, subdirRule, offlineRule)
- Tasks and Templates: Advanced features including tasks, parameterized execution, and rule templates
Summary
Key takeaways:
1. Use rmakeSkeleton() to initialize projects
2. Define rules in Makefile.R
3. Use %>>% for readable rule chains
4. Run make() to execute the build process
5. Use visualize() to understand dependencies
Resources
- Package documentation: ?rmake
- GitHub: https://github.com/beerda/rmake
- Issues: https://github.com/beerda/rmake/issues