R – Pareto Chart

Pareto Chart in detail

A Pareto chart combines a bar chart and a line chart. The bars show the frequency of each category, sorted in descending order and read against the left vertical axis. The line shows the cumulative percentage, read against the right vertical axis. The chart takes its name from the Pareto principle, a rule of thumb stating that roughly 80% of effects come from 20% of the causes.
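To make the two series concrete, the short base R sketch below computes what a Pareto chart displays: the counts sorted in descending order and their running share of the total. The counts vector and its category names are hypothetical, made up only for illustration.

```r
# Hypothetical category counts (illustrative only, not from a real data set)
counts <- c(A = 5, B = 60, C = 10, D = 25)

# Bars: frequencies sorted from largest to smallest
sorted_counts <- sort(counts, decreasing = TRUE)

# Line: running total expressed as a percentage of the grand total
cum_pct <- cumsum(sorted_counts) / sum(sorted_counts) * 100

print(sorted_counts)
print(cum_pct)
```

Here the bars would appear in the order B, D, C, A, and the line would climb through 60, 85, 95 and 100 percent.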

Syntax (pareto.chart() is provided by the qcc package):

pareto.chart(x,
             ylab = "Frequency",
             ylab2 = "Cumulative Percentage",
             xlab,
             cumperc = seq(0, 100, by = 25),
             ylim,
             main,
             col = heat.colors(length(x)))

Parameters:

  • x: A vector of values. The names attached to x are used for labeling the bars.
  • ylab: A string specifying the label for the primary y-axis (left side).
  • ylab2: A string specifying the label for the secondary y-axis (right side) showing the cumulative percentage.
  • xlab: A string specifying the label for the x-axis.
  • cumperc: A vector of percentage values to be used as tick marks for the secondary y-axis.
  • ylim: A numeric vector specifying the limits for the primary y-axis.
  • main: A string specifying the main title for the plot.
  • col: A value for the color, a vector of colors, or a palette for the bars.
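As an illustration of how col and cumperc might be supplied, the sketch below builds a color vector and a tick vector for a hypothetical counts vector, then passes them to pareto.chart() only if qcc is installed. The data, the variable names and the chart title are assumptions for this example. Argument handling may differ between qcc releases, so it is worth checking ?pareto.chart for the installed version.

```r
# Hypothetical defect counts (illustrative only)
counts <- c(Scratch = 12, Dent = 30, Crack = 7, Stain = 18)

# col: here a single color repeated once per bar
bar_cols <- rep("steelblue", length(counts))

# cumperc: tick marks for the right-hand axis, every 10 percent
ticks <- seq(0, 100, by = 10)

# Draw the chart only when qcc is available on this machine
if (requireNamespace("qcc", quietly = TRUE)) {
  qcc::pareto.chart(counts,
                    xlab = "Defect Type",
                    ylab = "Frequency",
                    ylab2 = "Cumulative Percentage",
                    cumperc = ticks,
                    main = "Defects (parameter sketch)",
                    col = bar_cols)
}
```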

Steps for Plotting a Pareto Chart in R

  1. Create a vector that contains the frequency counts of different categories.
  2. Assign names to the vector elements to label each category.
  3. Plot the vector using the pareto.chart() function.
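In practice, steps 1 and 2 often start from raw records rather than ready-made counts. A minimal sketch, assuming a hypothetical character vector complaints with one entry per recorded issue, uses base R's table() to produce a named frequency vector:

```r
# Hypothetical raw records: one entry per complaint received
complaints <- c("Late Delivery", "Damaged Item", "Damaged Item",
                "Billing Error", "Damaged Item", "Late Delivery")

# Steps 1 and 2 in one go: table() counts each category and keeps its name
tab <- table(complaints)

# Convert the table to a plain named numeric vector
issue_counts <- setNames(as.vector(tab), names(tab))

print(issue_counts)
```

The resulting issue_counts vector can then be passed to pareto.chart() in step 3.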

Example 1: Customer Complaints

# Install the qcc package if it is not already available, then load it
if (!requireNamespace("qcc", quietly = TRUE)) install.packages("qcc")
library(qcc)

# Frequency counts for various customer issues
issues <- c(35, 850, 15, 50, 20, 120, 40, 10, 55, 500)

# Labels for the issues
names(issues) <- c("Late Delivery", "Damaged Item", "Incorrect Order",
                   "Poor Packaging", "Missing Item", "Wrong Product",
                   "Customer Service", "Billing Error", "Return Issues", "Other")

# Generate the Pareto chart
pareto.chart(issues,
             xlab = "Issue Categories",        # Label for x-axis
             ylab = "Frequency",               # Label for left y-axis
             col = heat.colors(length(issues)), # Colors for the bars
             cumperc = seq(0, 100, by = 20),    # Tick marks for the cumulative percentage
             ylab2 = "Cumulative Percentage",  # Label for right y-axis
             main = "Customer Complaints")     # Chart title

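Output: a Pareto chart titled "Customer Complaints". The bars run in descending order from "Damaged Item" (850) down to "Billing Error" (10), with the cumulative-percentage line drawn on top and right-axis ticks every 20%.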
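To read the complaint data numerically, the following base R sketch recomputes the sorted counts and cumulative percentages, then picks out the leading categories needed to reach 80% of all complaints. The 80% threshold and the variable names (summary_tab, vital_few) are choices made for this illustration, and the calculation is done by hand rather than taken from pareto.chart().

```r
# Same counts as in the example, written as a named vector
issues <- c("Late Delivery" = 35, "Damaged Item" = 850, "Incorrect Order" = 15,
            "Poor Packaging" = 50, "Missing Item" = 20, "Wrong Product" = 120,
            "Customer Service" = 40, "Billing Error" = 10, "Return Issues" = 55,
            "Other" = 500)

# Sort in descending order and compute the cumulative share of the total
sorted  <- sort(issues, decreasing = TRUE)
cum_pct <- cumsum(sorted) / sum(sorted) * 100

# Small summary table: one row per category
summary_tab <- data.frame(Frequency   = as.vector(sorted),
                          Cum.Percent = round(as.vector(cum_pct), 1),
                          row.names   = names(sorted))
print(summary_tab)

# Leading categories needed for the cumulative share to reach at least 80%
n_vital   <- which(cum_pct >= 80)[1]
vital_few <- names(sorted)[seq_len(n_vital)]
print(vital_few)
```

With these counts, "Damaged Item" and "Other" together reach about 79.6% of the 1,695 complaints, just short of 80%, so "Wrong Product" is also needed to cross the threshold.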

Example 2: Product Defects

# Frequency counts for product defects
defects <- c(6000, 3500, 4800, 2500, 900)

# Labels for the product categories
names(defects) <- c("Type X", "Type Y", "Type Z", "Type W", "Type V")

# Generate the Pareto chart
pareto.chart(defects,
             xlab = "Product Categories",      # Label for x-axis
             ylab = "Frequency",               # Label for left y-axis
             col = heat.colors(length(defects)), # Colors for the bars
             cumperc = seq(0, 100, by = 10),     # Tick marks for the cumulative percentage
             ylab2 = "Cumulative Percentage",  # Label for right y-axis
             main = "Product Defects")         # Chart title

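Output: a Pareto chart titled "Product Defects". The bars appear in the order Type X, Type Z, Type Y, Type W, Type V, with the cumulative-percentage line drawn on top and right-axis ticks every 10%.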
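For readers curious how the bars, the line and the second axis fit together, here is a rough base graphics sketch of the same defect data. It is not how qcc implements pareto.chart(); it is only an approximation built from barplot(), lines() and axis(), with the layout choices (margins, tick spacing, title) assumed for this example.

```r
# Same defect counts as in Example 2, written as a named vector
defects <- c("Type X" = 6000, "Type Y" = 3500, "Type Z" = 4800,
             "Type W" = 2500, "Type V" = 900)

sorted    <- sort(defects, decreasing = TRUE)  # bar heights, largest first
cum_count <- cumsum(sorted)                    # running total for the line
total     <- sum(sorted)

# Widen the right margin to leave room for the second axis
old_par <- par(mar = c(5, 4, 4, 4) + 0.1)

# Bars; the y-axis extends to the total so the cumulative line fits
mids <- barplot(sorted,
                ylim = c(0, total * 1.05),
                xlab = "Product Categories",
                ylab = "Frequency",
                main = "Product Defects (base R sketch)",
                col  = heat.colors(length(sorted)))

# Cumulative line, drawn at the bar midpoints on the frequency scale
lines(mids, cum_count, type = "b", pch = 19)

# Right axis: the same scale relabeled as percentages of the total
pct_ticks <- seq(0, 100, by = 25)
axis(side = 4, at = pct_ticks / 100 * total, labels = paste0(pct_ticks, "%"))
mtext("Cumulative Percentage", side = 4, line = 3)

par(old_par)  # restore the previous margins
```

The key design point is that the line is plotted in frequency units and only the right axis labels are converted to percentages, which keeps both series on a single coordinate system.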
