Pipes %>%


The tidyverse makes heavy use of forward pipes. The forward pipe, written %>%, is provided by the magrittr package, which is made available automatically when you load the tidyverse.

The forward pipe %>% passes the value on its left as the first argument of the function on its right, e.g.

"kitten" %>% print()

forwards the string “kitten” so that it becomes the first argument of the function print. It is therefore identical to

print("kitten")
[1] "kitten"
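
Any other arguments in the call on the right are kept, and the piped value is placed in front of them. A minimal sketch, using made-up numbers and loading magrittr directly rather than the whole tidyverse:

```r
library(magrittr)

# Hypothetical values, chosen only for illustration
weights <- c(2.345, 3.051, 2.987)

# The piped value becomes the first argument; digits = 1 is passed along as usual
piped  <- weights %>% round(digits = 1)
direct <- round(weights, digits = 1)

# Both forms give the same result
identical(piped, direct)
```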

This is useful because it lets you chain many functions together. For example, the tidyverse dplyr package provides the function filter, for filtering data.

library(tidyverse)

cats <- read_csv("https://raw.githubusercontent.com/Bristol-Training/intermediate-r/refs/heads/main/data/cats.csv")

cats %>% filter(Sex=="F")

This filters the cats data set from the last page and returns a tibble that contains data only for female cats. It is identical to typing filter(cats, Sex=="F").
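
The equivalence can be checked on a small example. The sketch below uses a made-up tibble called toy_cats (a hypothetical stand-in with the same column names as the cats data, not the real file) so that it runs without downloading anything:

```r
library(dplyr)

# Made-up stand-in for the cats data; values are for illustration only
toy_cats <- tibble(
    Sex = c("F", "M", "F", "M"),
    BodyWeight = c(2.3, 3.1, 2.7, 2.9),
    HeartWeight = c(9.6, 12.1, 10.4, 11.3)
)

# Piped form and ordinary function-call form
piped  <- toy_cats %>% filter(Sex == "F")
direct <- filter(toy_cats, Sex == "F")

# Compare the two results
all.equal(piped, direct)
```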

The power comes from being able to chain filters, e.g.

cats %>% filter(Sex=="F") %>% filter(BodyWeight > 2.5)
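
As a side note, dplyr's filter() also accepts several conditions separated by commas and keeps only the rows that satisfy all of them, so two chained filters can be written as one call. A minimal sketch on a made-up tibble (toy_cats is hypothetical, not the real data):

```r
library(dplyr)

# Made-up stand-in for the cats data; values are for illustration only
toy_cats <- tibble(
    Sex = c("F", "M", "F", "M"),
    BodyWeight = c(2.3, 3.1, 2.7, 2.9),
    HeartWeight = c(9.6, 12.1, 10.4, 11.3)
)

# Two filters chained with pipes
chained <- toy_cats %>% filter(Sex == "F") %>% filter(BodyWeight > 2.5)

# One filter with both conditions; rows must satisfy both
combined <- toy_cats %>% filter(Sex == "F", BodyWeight > 2.5)

all.equal(chained, combined)
```

Which form reads better is a matter of taste; chaining keeps each step on its own line, while the combined call is shorter.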

We can then use the dplyr summarise() function to create a new dataframe with the mean of a specified column of this filtered data. For example,

cats %>% 
    filter(Sex=="F") %>% 
    filter(BodyWeight>2.5) %>% 
    summarise(mean=mean(HeartWeight))

This gives the mean heart weight, in grams, of female cats whose body weight is greater than 2.5 kg.

Note how we have split this over multiple lines, putting the forward pipe %>% at the end so that it is clear that the line continues. If you are using the R Console you can start a new line with Shift+Enter without running the command.
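
To see what the pipes are doing, the same calculation can be written as nested function calls, which have to be read from the inside out. The sketch below uses a made-up tibble (toy_cats is a hypothetical stand-in for the cats data) and compares the two forms:

```r
library(dplyr)

# Made-up stand-in for the cats data; values are for illustration only
toy_cats <- tibble(
    Sex = c("F", "M", "F", "F"),
    BodyWeight = c(2.3, 3.1, 2.7, 3.0),
    HeartWeight = c(9.6, 12.1, 10.4, 11.0)
)

# Piped form: read top to bottom
piped <- toy_cats %>%
    filter(Sex == "F") %>%
    filter(BodyWeight > 2.5) %>%
    summarise(mean = mean(HeartWeight))

# Nested form: the innermost call runs first
nested <- summarise(
    filter(filter(toy_cats, Sex == "F"), BodyWeight > 2.5),
    mean = mean(HeartWeight)
)

all.equal(piped, nested)
```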

To save this to a variable, we use the assignment operator <- as normal

average_heart_weight <- cats %>%
    filter(Sex=="F") %>%
    filter(BodyWeight>2.5) %>%
    summarise(mean=mean(HeartWeight))

This is a very dense piece of code, which is typical for R. You will often see dense blocks of code that use forward pipes to push data through several functions to produce a final result. This makes it important to name your variables, data, columns and functions clearly, so that future readers of your code can understand what is going on.

Finally, note that average_heart_weight is a 1x1 tibble. You can extract the actual numeric value by typing as.numeric(average_heart_weight).
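
Two other common ways to get at the value are dplyr's pull() function and the $ operator, both of which extract a column as a plain vector. A minimal sketch, where the tibble is built by hand with a made-up value rather than computed from the cats data:

```r
library(dplyr)

# Hypothetical 1x1 tibble standing in for the result of summarise()
average_heart_weight <- tibble(mean = 10.7)

# Convert the whole tibble to a number, as described above
via_as_numeric <- as.numeric(average_heart_weight)

# Extract the column named "mean" as a numeric vector
via_pull   <- average_heart_weight %>% pull(mean)
via_dollar <- average_heart_weight$mean

via_pull
```

Unlike as.numeric(), pull() and $ name the column explicitly, which can make the intent clearer when the tibble has more than one column.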

Exercise

Calculate the average heart weight of male cats whose body weight is greater than or equal to 3.0 kg.

Using the dplyr summarise() function, we can write

cats %>%
     filter(Sex=="M") %>%
     filter(BodyWeight>=3.0) %>%
     summarise(mean=mean(HeartWeight))

Alternatively, in this case we could use the dplyr select() function

cats %>%
     filter(Sex=="M") %>%
     filter(BodyWeight>=3.0) %>%
     select(HeartWeight) %>%
     unlist() %>%
     mean()

Exercise

Calculate the maximum body weight of the male cats, and of the female cats, that have a heart weight of less than or equal to 9 grams.

max_male <- cats %>%
   filter(Sex=="M") %>%
   filter(HeartWeight <= 9.0) %>%
   summarise(max=max(BodyWeight))
max_female <- cats %>%
   filter(Sex=="F") %>%
   filter(HeartWeight <= 9.0) %>%
   summarise(max=max(BodyWeight))
cat( "Maximum body weight: male =",
     as.numeric(max_male) %>% round(digits=2),
     "kg, female =",
     as.numeric(max_female) %>% round(digits=2),
     "kg\n" )

Exercise

Look back at the “broom and dplyr” vignette you found when searching for Pearson’s product-moment correlation. How much more of this vignette do you now understand?