One useful strategy for starting to learn R is to focus on mastering a simple goal. This will build confidence and lead to immediate improvement in current workflows. In this tweetorial, I will illustrate how easy it is to build and customize scatterplots in R. #rstats
Step 1: Preparation (i.e., install and load required R packages).

install.packages("ggplot2")
install.packages("dplyr")
install.packages("gghighlight")
install.packages("ggeasy")

library(ggplot2)
library(dplyr)
library(gghighlight)
library(ggeasy)
Step 2: Load and prepare the mtcars dataset (a well-known dataset that ships with R).

# load dataset
data(mtcars)

# convert the gear variable to a categorical variable
# with categories 3, 4 and 5
mtcars <- mutate(mtcars, gear = factor(gear))
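As a quick sanity check on the conversion above, the levels and per-category counts of gear can be inspected. This is only a sketch; the names gear_levels and gear_counts are illustrative and not part of the original thread.

```r
library(dplyr)

# load dataset and convert gear to a factor, as in Step 2
data(mtcars)
mtcars <- mutate(mtcars, gear = factor(gear))

# the categories of the new factor
gear_levels <- levels(mtcars$gear)
gear_levels   # "3" "4" "5"

# number of cars in each gear category
gear_counts <- table(mtcars$gear)
gear_counts   # 15 cars with 3 gears, 12 with 4, 5 with 5
```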
Step 3: Create a "base" graph which will be used as the starting point for our scatterplots. This will be an empty canvas showing which variable will be plotted on the x-axis (wt; car weight) and which variable will be plotted on the y-axis (mpg; miles per gallon).
Step 3 (ctd.): The R code for creating the "base" graph is:

g0 <- ggplot(data = mtcars, aes(x = wt, y = mpg))
g0
Step 4: Now it becomes easy to add layers to the "base" graph. Let's add the actual observations corresponding to (wt, mpg) values in the dataset.

g1 <- g0 +
  geom_point(size = 2)
g1
Step 5: We can beautify the current scatterplot, g1, by adding a title, subtitle and axis labels.

# labs() from ggplot2 sets the title, subtitle and axis labels
g1 <- g1 +
  labs(title = "Plot Title",
       subtitle = "Plot Subtitle",
       x = "x axis label",
       y = "y axis label")

g1
Step 6: We can highlight observations in the scatterplot (e.g., observations for cars with gear == "3").

g1 <- g1 +
  aes(colour = gear) +
  gghighlight(gear == "3", use_direct_label = FALSE) +
  easy_remove_legend()
g1
Step 7: We can add a regression line and associated 95% pointwise confidence band through the scatterplot observations.

g1 <- g0 +
  geom_point(size = 2) +
  geom_smooth(method = "lm", se = TRUE)
g1

Set se = FALSE to suppress the confidence band.
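For comparison, here is a minimal sketch of the same plot without the confidence band. The name g1_no_band is illustrative and does not appear in the original thread.

```r
library(ggplot2)

data(mtcars)

# same "base" graph as in Step 3
g0 <- ggplot(data = mtcars, aes(x = wt, y = mpg))

# regression line only; se = FALSE drops the shaded band
g1_no_band <- g0 +
  geom_point(size = 2) +
  geom_smooth(method = "lm", se = FALSE)

g1_no_band
```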
Step 8: We can use grouping by gear in our scatterplot, to see how mpg varies with wt by gear.

g2 <- g0 +
  geom_point(aes(group = gear, colour = gear),
             size = 2) +
  geom_smooth(aes(group = gear, colour = gear, fill = gear),
              method = "lm", se = TRUE)
g2
Step 9: We can show the relationship between mpg and wt in separate panels, one per gear (rather than in the same panel, as done with grouping). Panelling and grouping both show the relationship separately for each gear; they differ only in how the results are displayed: multiple panels vs a single panel.
Step 9 (ctd.): The R code for panelling by gear is:

g3 <- g0 +
  geom_point(aes(colour = gear), size = 2) +
  geom_smooth(aes(colour = gear, fill = gear),
              method = "lm", se = TRUE) +
  facet_wrap(~gear)
g3

You can also use facet_grid() here.
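A possible facet_grid() version is sketched below, laying out the gear panels as columns. The name g3_grid is illustrative, and the data preparation is repeated so the snippet runs on its own.

```r
library(ggplot2)

data(mtcars)
mtcars$gear <- factor(mtcars$gear)   # same conversion as in Step 2, in base R

g0 <- ggplot(data = mtcars, aes(x = wt, y = mpg))

# one column of panels per gear category
g3_grid <- g0 +
  geom_point(aes(colour = gear), size = 2) +
  geom_smooth(aes(colour = gear, fill = gear),
              method = "lm", se = TRUE) +
  facet_grid(cols = vars(gear))

g3_grid
```

Using rows = vars(gear) instead would stack the panels vertically.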
Step 10: Finally, we can combine two or more scatterplots in the same graphical window with patchwork.

install.packages("patchwork")
library(patchwork)

g1 + g2 # combine 2 graphs

(g1 + g2)/g3 # combine 3 graphs
Step 10 (ctd.): With patchwork, you can combine your graphs in any way you want:

g1/g2/g3 # g1 on top row; g2 on middle row;
# g3 on bottom row

g1/(g2 + g3) # g1 at the top in its own row;
# g2 and g3 at the bottom in same row
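As a further sketch, a combined layout can be given an overall title and panel tags with patchwork's plot_annotation(). The plots are rebuilt here in compact form so the snippet is self-contained; the title text and the name combined are illustrative assumptions.

```r
library(ggplot2)
library(patchwork)

data(mtcars)
mtcars$gear <- factor(mtcars$gear)

g0 <- ggplot(data = mtcars, aes(x = wt, y = mpg))

# compact versions of the plots from Steps 7 to 9
g1 <- g0 +
  geom_point(size = 2) +
  geom_smooth(method = "lm", se = TRUE)
g2 <- g0 +
  geom_point(aes(colour = gear), size = 2) +
  geom_smooth(aes(colour = gear, fill = gear), method = "lm", se = TRUE)
g3 <- g2 + facet_wrap(~gear)

# g1 and g2 on the top row, g3 below; panels tagged A, B, C
combined <- (g1 + g2) / g3 +
  plot_annotation(title = "mpg versus wt", tag_levels = "A")

combined
```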
Step 11: Labelling observations in a scatterplot is easy. Here, each point is labelled with its gear value.

install.packages("ggrepel")
library(ggrepel)

g4 <- g0 +
  aes(label = gear) +
  geom_point(aes(colour = gear), size = 2) +
  geom_text_repel(aes(colour = gear), show.legend = FALSE)

g4
You can follow @IsabellaGhement.