Scatter plots in R Language

Scatter plots in detail

A scatter plot displays individual observations as points positioned along a horizontal and a vertical axis. When the values of two variables are plotted along the X-axis and Y-axis, the pattern of the resulting points reveals any correlation between them.

R – Scatter Plots

We can create a scatter plot in R Programming Language using the plot() function.

Syntax:

plot(x, y, main, xlab, ylab, xlim, ylim, axes)

Parameters:

  • x: This parameter sets the horizontal coordinates.
  • y: This parameter sets the vertical coordinates.
  • xlab: This parameter is the label for the horizontal axis.
  • ylab: This parameter is the label for the vertical axis.
  • main: This parameter is the title of the chart.
  • xlim: This parameter sets the limits (minimum and maximum) of the x-axis.
  • ylim: This parameter sets the limits (minimum and maximum) of the y-axis.
  • axes: This parameter indicates whether both axes should be drawn on the plot.
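
As a minimal sketch of how xlim, ylim, and axes interact, the following example uses made-up vectors (x_vals and y_vals are illustrative names and values, not part of the original tutorial). With axes = FALSE the default axes are suppressed, so they are drawn by hand with axis() and box().

```r
# Illustrative data (hypothetical values, not from a real dataset)
x_vals <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y_vals <- c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2)

# xlim and ylim set the visible range of each axis;
# axes = FALSE suppresses the default axes and the surrounding box
plot(x = x_vals, y = y_vals,
     main = "Custom axis limits",
     xlab = "x", ylab = "y",
     xlim = c(0, 12), ylim = c(0, 25),
     axes = FALSE)

# Draw the axes manually with chosen tick positions, then add the box
axis(side = 1, at = seq(0, 12, by = 2))
axis(side = 2, at = seq(0, 25, by = 5))
box()

# par("usr") reports the extent of the plotting region actually in use
plot_region <- par("usr")
```

The region reported by par("usr") is slightly wider than the requested limits, because by default R pads each axis range a little.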

Simple Scatterplot Chart

To create a Scatterplot Chart:

  • We use the dataset iris.
  • Use the columns Sepal.Length and Petal.Length in iris.

Example:

# Get the input values.
input <- iris[, c('Sepal.Length', 'Petal.Length')]

# Print the first few rows
print(head(input))

Output:

  Sepal.Length Petal.Length
1          5.1          1.4
2          4.9          1.4
3          4.7          1.3
4          4.6          1.5
5          5.0          1.4
6          5.4          1.7

Creating a Scatterplot Graph

To create an R Scatterplot graph:

  • We use the plot() function to generate the scatterplot.
  • The xlab parameter describes the X-axis and ylab describes the Y-axis.

Example:

# Get the input values.
input <- iris[, c('Sepal.Length', 'Petal.Length')]

# Plot the chart for Sepal.Length and Petal.Length.
plot(x = input$Sepal.Length, y = input$Petal.Length,
    xlab = "Sepal Length",
    ylab = "Petal Length",
    xlim = c(4, 8),
    ylim = c(1, 7),
    main = "Sepal Length vs Petal Length"
)

Output:
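
The relationship drawn by the plot() call above can also be quantified. The sketch below is one possible follow-up, not part of the original example: it computes the Pearson correlation with cor() and overlays a least-squares line with abline(). The variable names sepal_length, petal_length, r, and fit are illustrative.

```r
# Extract the two columns used in the scatter plot
sepal_length <- iris$Sepal.Length
petal_length <- iris$Petal.Length

# Pearson correlation coefficient between the two variables
r <- cor(sepal_length, petal_length)
print(r)

# Redraw the scatter plot and overlay a least-squares regression line
plot(x = sepal_length, y = petal_length,
     xlab = "Sepal Length", ylab = "Petal Length",
     main = "Sepal Length vs Petal Length")
fit <- lm(petal_length ~ sepal_length)
abline(fit, col = "blue", lwd = 2)
```

A coefficient close to 1 indicates a strong positive linear relationship, which matches the upward trend of the points.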

Scatterplot Matrices

When we have more than two variables and want to examine the relationship between each pair of them, we use a scatterplot matrix.

The pairs() function is used to create matrices of scatterplots.

Syntax:

pairs(formula, data)

Parameters:

  • formula: This parameter is a one-sided formula listing the variables to plot against each other, such as ~ x + y + z.
  • data: This parameter represents the dataset from which the variables will be taken.

Example:

# Load the built-in iris dataset
data(iris)

# Create the scatterplot matrix
pairs(~Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
      data = iris,
      main = "Scatterplot Matrix")

Output:
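
pairs() also accepts a data frame or matrix directly instead of a formula. The following sketch, which assumes we want to distinguish the three iris species by colour, passes the four numeric columns and a colour vector; the names measurements, species_colours, and cor_matrix are illustrative. The correlation matrix printed at the end gives the numbers behind each panel.

```r
# Keep the four numeric columns only (Species is a factor)
measurements <- iris[, 1:4]

# One colour per observation; as.integer() maps the factor levels to 1, 2, 3
species_colours <- c("red", "darkgreen", "blue")[as.integer(iris$Species)]

# Scatterplot matrix with points coloured by species
pairs(measurements,
      col = species_colours,
      pch = 19,
      main = "Scatterplot Matrix by Species")

# Pairwise correlations corresponding to the panels of the matrix
cor_matrix <- cor(measurements)
print(round(cor_matrix, 2))
```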

Scatterplot with Fitted Values

Creating a Scatterplot in R

To create a scatterplot with a fitted line, we can use the ggplot2 package, which provides the ggplot() and geom_point() functions for visualization.

In this example, we use the mtcars dataset and plot the relationship between the logarithm of mpg (miles per gallon) and the logarithm of drat (rear axle ratio), with points colored by the number of gears. The stat_smooth() function adds a fitted linear regression line.

Example:

# Loading ggplot2 package
library(ggplot2)

# Creating scatterplot with fitted values.
# Note: for lines, linewidth replaces size in ggplot2 3.4.0 and later.
ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
        geom_point(aes(color = factor(gear))) +
        stat_smooth(method = "lm", color = "#C42126", se = FALSE, linewidth = 1)

Output:
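
The line drawn by stat_smooth(method = "lm") corresponds to an ordinary least-squares fit. As a rough base R sketch of the same idea (an alternative rendering, not the ggplot2 code above), the model can be fitted explicitly with lm() so that its coefficients can be inspected. The names fit and gear_factor are illustrative.

```r
# Fit the linear model on the log-transformed variables
fit <- lm(log(drat) ~ log(mpg), data = mtcars)
print(coef(fit))

# Colour points by number of gears; integer codes index the default palette
gear_factor <- factor(mtcars$gear)

plot(x = log(mtcars$mpg), y = log(mtcars$drat),
     col = as.integer(gear_factor), pch = 19,
     xlab = "log(mpg)", ylab = "log(drat)",
     main = "Scatterplot with fitted line")

# Overlay the fitted regression line
abline(fit, col = "#C42126", lwd = 2)

# Legend mapping colours back to gear counts
legend("topleft", legend = levels(gear_factor),
       col = seq_along(levels(gear_factor)), pch = 19, title = "gear")
```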

Adding Titles Dynamically

To enhance the visualization, we add a title, subtitle, and caption using the labs() function.

Example:

# Loading ggplot2 package
library(ggplot2)

# Creating scatterplot with fitted values.
# Note: for lines, linewidth replaces size in ggplot2 3.4.0 and later.
graph <- ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
          geom_point(aes(color = factor(gear))) +
          stat_smooth(method = "lm", color = "#C42126", se = FALSE, linewidth = 1)

# Adding title, subtitle, and caption
graph + labs(
        title = "Relationship between Mileage and Drat",
        subtitle = "Data categorized by gear count",
        caption = "Computed using mtcars dataset"
)

Output:

3D Scatterplots

For 3D scatterplots, we use the plotly package, which enables interactive visualizations.

Example:

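# Loading required library
library(plotly)

# Creating a 3D scatterplot.
# The data argument supplies mtcars, so attach() is not needed;
# type and mode are set explicitly to request 3D point markers.
plot_ly(data = mtcars, x = ~mpg, y = ~hp, z = ~cyl, color = ~gear,
        type = "scatter3d", mode = "markers")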

Output:

Scatter Plot

A scatter plot visually represents the relationship between two numerical variables. The x-axis represents one data vector, while the y-axis represents another.

Syntax:

plot(x, y, type, xlab, ylab, main)

Parameters:

  • x: Data vector for the x-axis
  • y: Data vector for the y-axis
  • type: Type of plot (e.g., "l" for lines, "p" for points, "s" for steps)
  • xlab: Label for the x-axis
  • ylab: Label for the y-axis
  • main: Title of the graph

For the full list of arguments, see the built-in help:

help("plot")

Example:

# Creating a dataset
data_set <- data.frame(Height = c(150, 160, 165, 170, 175, 180, 185, 190),
                       Weight = c(50, 55, 60, 65, 70, 75, 80, 85))

# Open a PNG graphics device; the plot is written to this file
png(file = "scatterplot_output.png")

# Creating the scatter plot
plot(x = data_set$Height, y = data_set$Weight,
     xlab = "Height (cm)", ylab = "Weight (kg)",
     main = "Height vs. Weight", col = "red", pch = 19)

# Close the device to finish writing the file
dev.off()

Output:
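
To see what the type parameter changes, the sketch below draws the same height and weight values three times, once for each of the types listed above. It is only an illustration: the side-by-side layout set with par(mfrow = ...) and the names height, weight, and plot_types are assumptions added here.

```r
# Same values as the Height and Weight columns in the example above
height <- c(150, 160, 165, 170, 175, 180, 185, 190)
weight <- c(50, 55, 60, 65, 70, 75, 80, 85)

# Plot types to compare: points, lines, and stair steps
plot_types <- c("p", "l", "s")

# Arrange three plots in one row, remembering the previous settings
old_par <- par(mfrow = c(1, 3))

for (plot_type in plot_types) {
  plot(x = height, y = weight, type = plot_type,
       xlab = "Height (cm)", ylab = "Weight (kg)",
       main = paste0("type = \"", plot_type, "\""))
}

# Restore the previous layout
par(old_par)
```

Only type = "p" produces a scatter plot in the strict sense; the other two connect the observations in the order they appear in the vectors.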
