General Rules
Dependencies
Always avoid package dependencies when possible (see tinyverse.org).
Adding a new dependency should be considered as a last resort (always prefer a base R solution).
Namespaces
Specify the namespace of each function used, unless it comes from the base package.
# GOOD
ggplot2::ggplot()
dplyr::select()
# BAD
ggplot()
select()
Side-effects
Never mix side-effects and computing in a single method: always return the computation and perform the output in the print(), show() or plot() method.
A function called for its side-effects must return the first argument invisibly (this makes it possible to use it as part of a pipe).
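The two side-effect rules can be illustrated with a minimal sketch. The names summarize_sample() and the class "sample_summary" are hypothetical, chosen only for this illustration: the constructor computes and returns a value, while the S3 print method does the output and returns its first argument invisibly.

```r
# Hypothetical constructor: computes and returns, produces no output.
summarize_sample <- function(x) {
  stopifnot(is.numeric(x))
  structure(
    list(mean = mean(x), sd = stats::sd(x)),
    class = "sample_summary"
  )
}

# S3 print method: performs the output, then returns its first
# argument invisibly so that the call can be chained.
print.sample_summary <- function(x, ...) {
  cat("mean:", x$mean, "\n")
  cat("sd:  ", x$sd, "\n")
  invisible(x)
}

result <- summarize_sample(c(1, 2, 3))
returned <- print(result)
```

Because print() hands back the object unchanged, returned and result are identical, and the call can sit in the middle of a chain of operations.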
Loops
Use seq_along(x) to protect against instances where x is empty.
# GOOD
for (i in seq_along(x)) {
  do_something(i)
}
# BAD
for (i in 1:length(x)) {
  do_something(i)
}
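The difference only shows up when x is empty. The sketch below uses a hypothetical helper, count_iterations(), to count how many times each loop body would run.

```r
x <- character(0)

# With an empty vector, 1:length(x) is 1:0, which counts down from 1 to 0.
bad_index <- 1:length(x)
# seq_along(x) is an empty integer vector, so the loop body never runs.
good_index <- seq_along(x)

# Hypothetical helper: counts the iterations a for loop performs.
count_iterations <- function(index) {
  n <- 0
  for (i in index) {
    n <- n + 1
  }
  n
}

bad_runs <- count_iterations(bad_index)
good_runs <- count_iterations(good_index)
```

Here the 1:length(x) loop runs twice, with i equal to 1 and then 0, even though there is nothing to iterate over.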
Use apply, lapply, etc. when possible.
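As a small sketch of what this can look like, the loop below and the vapply() call compute the same thing. The samples list is made-up data for illustration, and vapply() is used here as one member of the apply family that also checks the type of each result.

```r
# Hypothetical input: a named list of numeric vectors.
samples <- list(a = c(1, 2, 3), b = c(10, 20), c = numeric(0))

# Loop version: pre-allocate the result, then fill it element by element.
sizes_loop <- integer(length(samples))
for (i in seq_along(samples)) {
  sizes_loop[i] <- length(samples[[i]])
}

# Functional version: one call, and FUN.VALUE states the expected
# type and length of each individual result.
sizes_apply <- vapply(samples, FUN = length, FUN.VALUE = integer(1))
```

The functional version keeps the names of the input list, which the loop version drops.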
Pipes
Do not use pipes (%>% or |>) inside functions.
Naming
As a general rule, abbreviations must be avoided when naming.
Naming Files
File names must use the .R extension.
# GOOD
plot.R
# BAD
plot
File names must be meaningful.
# GOOD
plot.R
# BAD
Untitled1.R
File names must not contain / or spaces. Instead, a dash (-) or underscore (_) must be used.
# GOOD
read_csv.R
plot-methods.R
# BAD
read csv.R
File names must use letters from Basic Latin, NOT from Latin-1 Supplement, and must be lowercase.
# GOOD
plot.R
# BAD
Plot.R
données.R
Use meaningful verbs for file names.
# GOOD
fit_model.R
# BAD
addition.R
Mind special names:
AllClasses.R stores all S4 class definitions.
AllGenerics.R stores all S4 generic functions.
zzz.R contains .onLoad() and friends.
Use the -methods suffix for S4 class methods.
If the file contains only one function, name it by the function name.
Naming Variables
Variable names must be as short as possible.
Variable names must be meaningful nouns.
Variable names must be lowercase.
Never separate words within the name by . (reserved for an S3 dispatch) or use PascalCase (reserved for S4 classes definitions). Instead, use an underscore (_).
# GOOD
std_dev <- 3
# BAD
std.dev <- 3
StdDev <- 3
Do not use names of existing functions and variables (especially built-in ones).
# GOOD
std_dev <- 3
# BAD
T <- 1
c <- 2 * 2
mean <- 10
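A short sketch of why reusing a built-in name is risky: unlike TRUE, T is an ordinary variable, so assigning to it silently changes the meaning of any later code that relies on it. The variable names flag_before, flag_masked and flag_after are made up for this illustration.

```r
# T is an ordinary variable bound to TRUE in the base package;
# only TRUE itself is a reserved word.
flag_before <- isTRUE(T)

T <- 0                    # BAD: masks base::T in this environment
flag_masked <- isTRUE(T)  # T is now the number 0, not a logical

rm(T)                     # remove the mask; base::T is visible again
flag_after <- isTRUE(T)
```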
Naming Functions
Function names must contain a verb that refers to the primary action of the function.
Function names must be in snake_case. Use . only for dispatching S3 generics.
An object_verb() naming scheme should be preferred as often as possible. This scheme is easy to read and auto-complete.
# GOOD
peak_detect()
# BAD
addition()
readFile()
Avoid function name conflicts with base packages or other popular ones.
Naming S4 Classes
Class names must be nouns in PascalCase: the initial letter and the first letter of each subsequent concatenated word are capitalized.
Syntax
Line Length
The maximum length of lines is limited to 80 characters.
Do NOT put more than one statement per line. Do NOT use semicolon as termination of the command.
# GOOD
x <- 1
x <- x + 1
# BAD
x <- 1; x <- x + 1
Function Call
In a function call, specify arguments by name. Never specify by partial name and never mix by position and complete name.
# GOOD
mean(x, na.rm = TRUE)
# BAD
mean(x, na = TRUE)
The required arguments should be first, followed by optional arguments.
# GOOD
raise_to_power(x, power = 2.7)
# BAD
raise_to_power(power = 2.7, x)
The ... argument should be either at the beginning or at the end.
# GOOD
standardize(..., scale = TRUE, center = TRUE)
# BAD
standardize(scale = TRUE, ..., center = TRUE)
Set default arguments inside the function using the NULL idiom, and avoid dependence between arguments.
Always validate arguments in a function.
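One way to read the two rules above is sketched below. The function rescale() and its center argument are hypothetical. The default is declared as NULL and resolved inside the body, and every argument is validated before any work is done.

```r
# Hypothetical function illustrating the NULL-default idiom.
rescale <- function(x, center = NULL) {
  # Validate arguments before doing any work.
  stopifnot(is.numeric(x), length(x) > 0)
  stopifnot(
    is.null(center) || (is.numeric(center) && length(center) == 1)
  )

  # Resolve the default inside the body instead of writing
  # center = mean(x) in the signature, which would tie one
  # argument to another.
  if (is.null(center)) center <- mean(x)

  x - center
}

centered_default <- rescale(c(1, 2, 3))
centered_custom <- rescale(c(1, 2, 3), center = 1)
```

Keeping the signature free of expressions that refer to other arguments makes the defaults easier to document and to override.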
Assignment
Use <- for assignment, NOT =.
# GOOD
x <- 1
# BAD
x = 1
1 -> x
Spacing
Put spaces around all infix binary operators (=, +, *, ==, &&, <-, %*%, etc.).
# GOOD
x == y
z <- 2 + 1
# BAD
x==y
z<-2+1
Put spaces around = in function calls.
# GOOD
mean(x = c(1, 2, 3), na.rm = TRUE)
# BAD
mean(x=c(1, 2, NA), na.rm=TRUE)
Do not put spaces around the operators for subsetting ($ and @), namespace manipulation (:: and :::), or sequence generation (:).
# GOOD
car$cyl
dplyr::select()
1:10
# BAD
car $cyl
dplyr :: select()
1: 10
Put a space after a comma.
# GOOD
mtcars[1, ]
mean(x = c(1, NA, 2), na.rm = TRUE)
# BAD
mtcars[1 ,]
mean(x = c(1,NA,2),na.rm = TRUE)
Use a space before left parentheses, except in a function call.
# GOOD
for (element in element_list)
if (total == 5)
sum(1:10)
# BAD
for(element in element_list)
if(total == 5)
sum (1:10)
Do not put spaces around code inside parentheses or square brackets.
# GOOD
if (is_true) message("Hello!")
species["tiger", ]
# BAD
if ( is_true ) message("Hello!")
species[ "tiger" ,]
Curly Braces
An opening curly brace should never go on its own line and should always be followed by a new line.
# GOOD
if (is_true) {
  # do something
}
if (is_true) {
  # do something
} else {
  # do something else
}
# BAD
if (is_true)
{# do something
}
if (is_true) { # do something }
else { # do something else }
A closing curly brace should always go on its own line, unless it’s followed by else.
# GOOD
if (is_true) {
  # do something
} else {
  # do something else
}
# BAD
if (is_true) {
# do something
}else {
# do something else
}
Always indent the code inside curly braces.
# GOOD
if (is_true) {
  # do something
  # and then something else
}
# BAD
if (is_true) {
# do something
# and then something else
}
Curly braces and new lines can be avoided, if a statement after if is very short.
# OK
if (is_true) return(value)
Indentation
Do not use tabs or mixes of tabs and spaces for indentation.
Use two spaces for indentation.
New Line
In a function definition or call, if the arguments fit on two lines, the second line must be indented to align with the first argument, just after the opening parenthesis.
# GOOD
long_function_name <- function(arg1, arg2, arg3, arg4,
                               long_argument_name1 = TRUE)
plot(table(rpois(100, 5)), type = "h", col = "red", lwd = 10,
     main = "rpois(100, lambda = 5)")
Otherwise, each argument can go into a separate line, starting with a new line after the opening parenthesis.
# GOOD
long_function_name <- function(long_argument_name1 = c("value1", "value2"),
                               long_argument_name2 = TRUE,
                               long_argument_name3 = NULL,
                               long_argument_name4 = FALSE)
list(
  mean = mean(x),
  sd = sd(x),
  var = var(x),
  min = min(x),
  max = max(x),
  median = median(x)
)
If the condition in an if statement expands into several lines, then each condition must end with a logical operator, NOT start with it.
# GOOD
if (some_very_long_name_1 == 1 &&
    some_very_long_name_2 == 1 ||
    some_very_long_name_3 %in% some_very_long_name_4)
# BAD
if (some_very_long_name_1 == 1
    && some_very_long_name_2 == 1
    || some_very_long_name_3 %in% some_very_long_name_4)
If the statement, which contains operators, expands into several lines, then each line should end with an operator and not begin with it.
# GOOD
normal_pdf <- 1 / sqrt(2 * pi * d_sigma ^ 2) *
  exp(-(x - d_mean) ^ 2 / 2 / d_sigma ^ 2)
# BAD
normal_pdf <- 1 / sqrt(2 * pi * d_sigma ^ 2)
  * exp(-(x - d_mean) ^ 2 / 2 / d_sigma ^ 2)
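This rule is more than cosmetic. Outside of parentheses, R ends a statement at the end of any line that is already syntactically complete, so a leading * on the next line is a parse error, and a leading + or - is silently read as a separate statement. A minimal sketch, with the variable names total_good and total_bad made up for illustration:

```r
# GOOD: the trailing operator tells the parser the statement continues.
total_good <- 1 + 2 +
  3

# BAD: the first line is already complete, so the assignment stops there.
# The second line is evaluated on its own as unary plus applied to 3.
total_bad <- 1 + 2
  + 3
```

After running this, total_good is 6 while total_bad is 3, and no error or warning is raised.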
Each grammar statement of dplyr (after %>%) and ggplot2 (after +) should start with a new line.
Commenting
Comments start with # followed by space and text of the comment.
# This is a comment.
Comments should explain the why, not the what. Comments should explain the overall intention of the command.
# GOOD
# define iterator
i <- 1
# BAD
# set i to 1
i <- 1
Short comments can be placed on the same line as the code.
plot(price, weight) # Plot a scatter chart of price and weight
It makes sense to split the source into logical chunks by # followed by - or =.
# Read data -------------------------------------------------------------
# Clean data ------------------------------------------------------------
Function and object descriptions should adhere to roxygen2 guidelines.
#' Add Together Two Numbers
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of x and y.
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  # general comment
  x + y # inline comment
}
Attribution
This coding style is derived from the tidyverse style guide, the rOpenSci Packages book and Iegor Rudnytskyi’s R Coding Style Guide.