Convert Factor to Numeric and Numeric to Factor in R Programming

Convert Factor to Numeric and Numeric to Factor in detail

Factors are data objects that categorize data and store it as levels. They can hold both strings and integers, and they suit columns with a limited number of unique values. In R, a factor is created with the factor() function, which takes a vector as input. The c() function builds that vector from explicitly provided values.

Example:

items <- c("Apple", "Banana", "Grapes", "Apple", "Grapes", "Grapes", "Banana", "Banana")

print(items)
print(is.factor(items))

# Convert to factor
type_items <- factor(items)
print(levels(type_items))

Output:

[1] "Apple"  "Banana" "Grapes" "Apple"  "Grapes" "Grapes" "Banana" "Banana"
[1] FALSE
[1] "Apple"  "Banana" "Grapes"

Here, items is a vector with 8 elements. It is converted to a factor using the factor() function. The unique elements in the data are called levels, which can be retrieved using the levels() function.
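
Internally, a factor stores one integer code per element plus its vector of levels, which is why converting between factors and numbers needs some care. The sketch below illustrates this with a shortened copy of the fruit data. The names fruit_factor, num_factor, wrong, and right, as well as the numeric sample values, are illustrative assumptions and not part of the original example.

```r
# Illustrative data (hypothetical names, reusing the fruit values above)
items <- c("Apple", "Banana", "Grapes", "Apple")
fruit_factor <- factor(items)

# Each element is stored as an integer code that indexes into levels()
codes <- as.integer(fruit_factor)
print(codes)
print(levels(fruit_factor)[codes])   # maps the codes back to the labels

# Numeric to factor: every distinct number becomes a level
values <- c(10, 20, 20, 30)
num_factor <- factor(values)

# Factor to numeric: as.numeric() alone returns the integer codes,
# not the original values, so convert through character first
wrong <- as.numeric(num_factor)
right <- as.numeric(as.character(num_factor))
print(wrong)
print(right)
```

Going through as.character() first is a common way to recover the original numbers, because the level labels, not the codes, carry the values.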

Ordering Factor Levels

An ordered factor is a factor whose levels have a defined rank, from lowest to highest. To create one, call the factor() function with the levels listed in the desired order and the ordered argument set to TRUE.

Syntax:

factor(data, levels = c(""), ordered = TRUE)

Parameters:

  • data: Input vector to convert.
  • levels: Character vector of levels, built with the c() function and listed from lowest to highest.
  • ordered: Set to TRUE to make the factor ordered.

Example:

# Creating size vector
sizes <- c("small", "large", "large", "small", "medium", "large", "medium", "medium")

# Converting to factor
size_factor <- factor(sizes)
print(size_factor)

# Ordering the levels
ordered_size <- factor(sizes, levels = c("small", "medium", "large"), ordered = TRUE)
print(ordered_size)

Output:

[1] small  large  large  small  medium large  medium medium
Levels: large medium small

[1] small  large  large  small  medium large  medium medium
Levels: small < medium < large

In this example, the sizes vector is created using the c() function and converted to a factor. By default, factor() sorts the levels alphabetically (large, medium, small). Passing the levels argument together with ordered = TRUE imposes the intended order: small < medium < large.
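
Declaring an order matters in practice because comparisons and sorting then follow the levels rather than the alphabet. The following is a minimal sketch on a shortened, hypothetical sample of the size labels. The names below_large, plain_size, and unordered_result are introduced here for illustration only.

```r
# Hypothetical sample reusing the size labels from the example above
sizes <- c("small", "large", "medium", "small")
ordered_size <- factor(sizes, levels = c("small", "medium", "large"), ordered = TRUE)

# Relational operators follow the declared level order
below_large <- ordered_size < "large"
print(below_large)

# max() and sort() also follow the level order
print(max(ordered_size))
print(as.character(sort(ordered_size)))

# On an unordered factor the same comparison is not meaningful:
# R returns NA for every element and raises a warning
plain_size <- factor(sizes)
unordered_result <- suppressWarnings(plain_size < "large")
print(unordered_result)
```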

Alternative Method Using ordered():

# Creating vector sizes
sizes <- c("small", "large", "large", "small", "medium")
size_ordered <- ordered(sizes, levels = c("small", "medium", "large"))
print(size_ordered)

Output:

[1] small  large  large  small  medium
Levels: small < medium < large
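
As a quick check, ordered() and factor() with ordered = TRUE should produce the same object. The sketch below compares the two on the sizes data. The names size_levels, via_ordered, and via_factor are hypothetical helpers for this comparison.

```r
sizes <- c("small", "large", "large", "small", "medium")
size_levels <- c("small", "medium", "large")

# Build the same ordered factor in both ways
via_ordered <- ordered(sizes, levels = size_levels)
via_factor  <- factor(sizes, levels = size_levels, ordered = TRUE)

print(is.ordered(via_ordered))             # TRUE for an ordered factor
print(identical(via_ordered, via_factor))  # compares values, levels, and class
print(class(via_ordered))
```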

Level Ordering Visualization in R

This example creates a dataset of student scores categorized by class level (freshman, sophomore, junior, and senior). It then sets the level order explicitly and draws a boxplot of the score distribution for each class level using base R's boxplot() function, so the boxes appear in that order.

# Create a sample dataset of student grades
grade_data <- data.frame(
  score = c(70, 85, 60, 95, 88, 76, 82, 91, 69, 79, 92, 84, 77, 83, 90),
  class_level = factor(c(rep("freshman", 5), rep("sophomore", 4), rep("junior", 3), rep("senior", 3)))
)

# Specify level ordering for the "class_level" factor
grade_data$class_level <- factor(grade_data$class_level, levels = c("freshman", "sophomore", "junior", "senior"))

# Create a boxplot of grades by class level
boxplot(score ~ class_level, data = grade_data, main = "Student Grades by Class Level")
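
The effect of the level order can also be inspected without drawing a plot. The sketch below uses a hypothetical five-row miniature of the grade data. The names default_order, custom_order, and med are assumptions made for illustration.

```r
# Hypothetical miniature of the grade data above
class_level <- factor(c("freshman", "sophomore", "junior", "senior", "junior"))
default_order <- levels(class_level)   # alphabetical by default

# Re-create the factor with an explicit level order
class_level <- factor(class_level,
                      levels = c("freshman", "sophomore", "junior", "senior"))
custom_order <- levels(class_level)

print(default_order)
print(custom_order)

# Summaries grouped by the factor are returned in level order,
# the same order boxplot() uses for the groups
score <- c(70, 82, 92, 83, 84)
med <- tapply(score, class_level, median)
print(med)
```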
