The tidyverse makes heavy use of the R concept of forward pipes. Forward pipes, represented via %>%, are provided by the magrittr package, which is loaded automatically by the tidyverse.
A forward pipe %>% forwards the value on its left into the first argument of the function on its right, e.g.
"kitten" %>% print()
[1] "kitten"
will forward the string “kitten” so that it is the first argument to the function print. Hence this is exactly identical to
print("kitten")
[1] "kitten"
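As a minimal sketch of the same rule with an extra argument: the value on the left becomes the first argument, and anything written inside the brackets follows it. The numbers below are made up for illustration, and only the magrittr package is assumed to be installed.

```r
library(magrittr)  # provides %>%; library(tidyverse) makes it available too

# x %>% f(y) is evaluated as f(x, y)
piped  <- c(1.234, 5.678) %>% round(1)
direct <- round(c(1.234, 5.678), 1)
same   <- identical(piped, direct)  # both calls give the same vector

# Pipes can be chained: take square roots first, then add them up
total <- c(1, 4, 9) %>% sqrt() %>% sum()
print(total)
```

Reading left to right, the last line says "take these numbers, take their square roots, then sum them", which is the order in which the work is actually done.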
This is useful because it enables you to chain together a lot of functions. For example, the tidyverse dplyr package provides the function filter, for filtering data.
library(tidyverse)
cats <- read_csv("https://raw.githubusercontent.com/Bristol-Training/intermediate-r/refs/heads/main/data/cats.csv")
cats %>% filter(Sex=="F")
# A tibble: 47 × 3
Sex BodyWeight HeartWeight
<chr> <dbl> <dbl>
1 F 2 7
2 F 2 7.4
3 F 2 9.5
4 F 2.1 7.2
5 F 2.1 7.3
6 F 2.1 7.6
7 F 2.1 8.1
8 F 2.1 8.2
9 F 2.1 8.3
10 F 2.1 8.5
# ℹ 37 more rows
This has filtered the cats data set from the last page to return a tibble that contains data only for female cats. It is identical to typing filter(cats, Sex=="F").
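The equivalence between the piped and the direct call can be checked on a small example. The sketch below uses a hypothetical toy_cats tibble with the same column names as the cats data. Its values are invented so that the example runs without downloading anything.

```r
library(dplyr)  # provides filter(), tibble() and the %>% pipe

# Toy data: same column names as cats.csv, but the values are made up
toy_cats <- tibble(
  Sex         = c("F", "M", "F", "M"),
  BodyWeight  = c(2.1, 2.8, 2.7, 3.5),
  HeartWeight = c(7.2, 9.6, 10.2, 12.9)
)

piped  <- toy_cats %>% filter(Sex == "F")  # pipe form
direct <- filter(toy_cats, Sex == "F")     # ordinary function call

# Compare the two results; all.equal() returns TRUE when they match
print(all.equal(piped, direct))
```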
The power comes from the fact that we can now chain filters, e.g.
cats %>% filter(Sex=="F") %>% filter(BodyWeight > 2.5)
# A tibble: 11 × 3
Sex BodyWeight HeartWeight
<chr> <dbl> <dbl>
1 F 2.6 8.7
2 F 2.6 10.1
3 F 2.6 10.1
4 F 2.7 8.5
5 F 2.7 10.2
6 F 2.7 10.8
7 F 2.9 9.9
8 F 2.9 10.1
9 F 2.9 10.1
10 F 3 10.6
11 F 3 13
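Chaining two filters keeps only the rows that pass both tests. As a sketch of an alternative spelling, dplyr's filter() also accepts several conditions separated by commas and keeps the rows where all of them hold. The toy_cats tibble below is hypothetical, with invented values, so that the comparison is self-contained.

```r
library(dplyr)

# Toy data: same column names as cats.csv, but the values are made up
toy_cats <- tibble(
  Sex         = c("F", "F", "F", "M", "M"),
  BodyWeight  = c(2.1, 2.7, 3.0, 2.8, 3.5),
  HeartWeight = c(7.2, 10.2, 13.0, 9.6, 12.9)
)

# Two filters chained with pipes
chained <- toy_cats %>%
  filter(Sex == "F") %>%
  filter(BodyWeight > 2.5)

# One filter with both conditions, combined as a logical AND
combined <- toy_cats %>%
  filter(Sex == "F", BodyWeight > 2.5)

print(all.equal(chained, combined))
```

Which form to use is largely a matter of taste: separate filters make each step easy to comment out, while a single call is more compact.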
We can then use the dplyr summarise() function to create a new data frame with the mean of a specified column of this filtered data. For example,
cats %>%
  filter(Sex=="F") %>%
  filter(BodyWeight>2.5) %>%
  summarise(mean=mean(HeartWeight))
# A tibble: 1 × 1
mean
<dbl>
1 10.2
This is the mean heart weight, in grams, of female cats whose body weight is greater than 2.5 kg.
Note how we have split this over multiple lines, putting the forward pipe %>% at the end so that it is clear that the line continues. If you are using the R Console you can start a new line with Shift+Enter without running the command.
To save this to a variable, we would use the assignment operator <- as normal:
average_heart_weight <- cats %>%
  filter(Sex=="F") %>%
  filter(BodyWeight>2.5) %>%
  summarise(mean=mean(HeartWeight))
This is a very dense bit of code, which is typical for R. You will often see dense blocks of code that use forward pipes to push data through several functions to produce a final result. As you can see, it is important that you name your variables, data, columns and functions clearly, so that it is easier for future readers of your code to understand what is going on.
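One way to see what the pipes contribute is to write the same calculation without them. The sketch below does this on a hypothetical toy_cats tibble (invented values, same column names as the cats data). The nested version has to be read from the inside out, whereas the piped version reads in the order the steps are applied.

```r
library(dplyr)

# Toy data: same column names as cats.csv, but the values are made up
toy_cats <- tibble(
  Sex         = c("F", "F", "F", "M", "M"),
  BodyWeight  = c(2.1, 2.7, 3.0, 2.8, 3.5),
  HeartWeight = c(7.2, 10.2, 13.0, 9.6, 12.9)
)

# Without pipes: the innermost call runs first
nested <- summarise(
  filter(filter(toy_cats, Sex == "F"), BodyWeight > 2.5),
  mean = mean(HeartWeight)
)

# With pipes: steps appear in the order they are applied
piped <- toy_cats %>%
  filter(Sex == "F") %>%
  filter(BodyWeight > 2.5) %>%
  summarise(mean = mean(HeartWeight))

print(all.equal(nested, piped))
```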
Finally, note that average_heart_weight is a 1x1 tibble. You can extract the actual numeric value by typing as.numeric(average_heart_weight).
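There are other ways to get at the number, sketched below. The tibble here is a hand-built stand-in for average_heart_weight, using the rounded value 10.2 printed above, so the example does not depend on the cats data being loaded.

```r
library(dplyr)

# Stand-in for the 1x1 tibble produced by summarise(); 10.2 is the rounded
# value shown in the printed output, not the exact mean
average_heart_weight <- tibble(mean = 10.2)

by_coercion <- as.numeric(average_heart_weight)    # approach used in the text
by_dollar   <- average_heart_weight$mean           # select the column by name
by_pull     <- average_heart_weight %>% pull(mean) # dplyr's pull(), pipe friendly

print(by_pull)
```

The pull() form can be added as a final step of a pipeline, which avoids a separate conversion afterwards.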