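names() Function
The names() function in R retrieves or sets the names of the elements of an object. It is most often applied to vectors, lists, and data frames (where the names are the column names); for matrices, rownames(), colnames(), or dimnames() are usually the appropriate tools. When assigning names to a vector, supply one name per element: a shorter names vector is padded with NA, and a longer one raises an error.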
Syntax:
names(x)
names(x) <- value
Parameters:
x: The object (e.g., vector, matrix, data frame) whose names are to be set or retrieved.
value: The vector of names to be assigned to the object x.
Example 1: Assigning Names to a Vector
# R program to assign names to a vector
# Create a numeric vector
vec <- c(10, 20, 30, 40, 50)
# Assign names using the names() function
names(vec) <- c("item1", "item2", "item3", "item4", "item5")
# Display the names
names(vec)
# Print the updated vector
print(vec)
Example 2: Retrieving Column Names of a Data Frame
# R program to get the column names of a data frame
# Load built-in dataset
data("mtcars")
# Display the first few rows of the dataset
head(mtcars)
# Retrieve column names using the names() function
names(mtcars)
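Once names are set, elements can be looked up by name, individual names can be changed, and assigning NULL removes the names again. The following is a minimal sketch; the vector prices and its labels are made up for illustration.

```r
# Hypothetical named vector, for illustration only
prices <- c(apple = 1.2, banana = 0.5, cherry = 3.0)

# Look up a single element by its name
banana_price <- prices[["banana"]]
banana_price

# Rename one element in place
names(prices)[2] <- "plantain"
renamed <- names(prices)
renamed

# Assigning NULL removes all names
names(prices) <- NULL
is.null(names(prices))
```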
attributes() Function
The attributes() function in R retrieves all the attributes of an object. It can also be used to set or replace the attributes of an object.
Syntax:
attributes(x)
Parameters:
x: The object whose attributes are to be accessed or modified.
Example 1: Retrieving Attributes of a Data Frame
# R program to illustrate attributes() function
# Create a data frame
data_set <- data.frame(
Name = c("Alice", "Bob", "Charlie"),
Age = c(25, 30, 22),
Score = c(85, 90, 95)
)
# Print the first few rows of the data frame
head(data_set)
# Retrieve the attributes of the data frame
attributes(data_set)
Here, the attributes() function lists all the attributes of the data_set data frame, such as column names, class, and row names.
Example 2: Adding New Attributes to a Data Frame
# R program to add new attributes
# Create a data frame
data_set <- data.frame(
Name = c("Alice", "Bob", "Charlie"),
Age = c(25, 30, 22),
Score = c(85, 90, 95)
)
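# Create a list of new attributes
# Assigning to attributes() replaces the whole attribute set, so the
# attributes a data frame needs (names, row.names, class) are restated here
new_attributes <- list(
  names = c("Name", "Age", "Score"),
  row.names = 1:3,
  class = "data.frame",
  description = "Sample dataset"
)
# Assign new attributes to the data frame
attributes(data_set) <- new_attributes
# Display the updated attributes
attributes(data_set)
In this example, a new attribute (description) is added to the data_set data frame. Because assigning to attributes() discards any attribute that is not in the list, the existing names, row.names, and class are included so that the object remains a valid data frame.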
attr() Function
The attr() function is used to access or modify a specific attribute of an object. Unlike attributes(), it requires you to specify the name of the attribute you want to retrieve or update.
Syntax:
attr(x, which = "attribute_name")
Parameters:
x: The object whose attribute is to be accessed or modified.
which: The name of the attribute to be accessed or modified.
Example: Accessing a Specific Attribute
# R program to illustrate attr() function
# Create a data frame
data_set <- data.frame(
Name = c("Alice", "Bob", "Charlie"),
Age = c(25, 30, 22),
Score = c(85, 90, 95)
)
# Retrieve the column names using attr()
attr(x = data_set, which = "names")
Output:
[1] "Name" "Age" "Score"
Here, the attr() function retrieves the column names of the data_set data frame.
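The replacement form, attr(x, which) <- value, modifies a single attribute without touching the others. A minimal sketch, in which the vector temps and the attribute name "unit" are illustrative assumptions:

```r
# Hypothetical vector and attribute name, for illustration only
temps <- c(21.5, 23.1, 19.8)

# Set a custom attribute
attr(temps, "unit") <- "celsius"

# Read it back
unit_value <- attr(temps, "unit")
unit_value

# A missing attribute yields NULL rather than an error
attr(temps, "source")

# Assigning NULL removes the attribute
attr(temps, "unit") <- NULL
attributes(temps)
```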
In Object-Oriented Programming (OOP) within the R language, encapsulation refers to bundling data and methods within a class. The R6 package in R provides an encapsulated OOP framework, enabling the use of encapsulation effectively. The R6 package offers R6 classes, which function similarly to reference classes but are independent of S4 classes. In the R6 system, you define a class by creating a new R6Class object, specifying the class name, and including a list of properties and methods. Properties can be any R object, while methods are functions that interact with objects of the class.
To create an instance of an R6 class, you use the $new() method, passing any initial values for the properties. Once an object is instantiated, you can call its methods and access or modify its properties using the $ operator.
A key feature of R6 is its support for encapsulation and information hiding. This allows internal object details to remain hidden, simplifying the creation of complex and robust programs.
R6 classes allow for organizing code efficiently, enabling the creation of custom objects with their own properties and behaviors. Additionally, R6 supports inheritance, even across classes defined in different packages. Prominent R packages like dplyr and shiny utilize R6 classes.
Example: Basic R6 Class Implementation
library(R6)
# Define a Stack class
Stack <- R6Class("Stack",
  # Public members
  public = list(
    # Constructor/initializer
    initialize = function(...) {
      private$items <- list(...)
    },
    # Push an item onto the stack
    push = function(item) {
      # Wrap the item in list() so a vector or list is stored as one element
      private$items <- append(private$items, list(item))
    },
    # Pop an item from the stack
    pop = function() {
      if (self$size() == 0)
        stop("Stack is empty")
      item <- private$items[[length(private$items)]]
      private$items <- private$items[-length(private$items)]
      item
    },
    # Get the number of items in the stack
    size = function() {
      length(private$items)
    }
  ),
  # Private members
  private = list(
    items = list()
  )
)
# Create a Stack object
StackObject <- Stack$new()
# Push 10 onto the stack
StackObject$push(10)
# Push 20 onto the stack
StackObject$push(20)
# Pop the top item (20)
StackObject$pop()
# Pop the remaining item (10)
StackObject$pop()
Output:
[1] 20
[1] 10
In this example, the stack is implemented with private storage (items) that is hidden from external modification. The initialize method acts as the constructor, and public methods like push and pop provide controlled access to the stack.
Example: Inheritance in R6 Classes
# Define a subclass of Stack
ExtendedStack <- R6Class("ExtendedStack",
  # Inherit the Stack class
  inherit = Stack,
  public = list(
    # Override the size method to display a message
    size = function() {
      message("Calculating stack size...")
      super$size() # Call the size method of the superclass
    }
  )
)
# Create an ExtendedStack object
ExtendedStackObject <- ExtendedStack$new()
# Push 5 onto the stack
ExtendedStackObject$push(5)
# Push 15 onto the stack
ExtendedStackObject$push(15)
# Check the stack size (with a message)
ExtendedStackObject$size()
# Pop the top item (15)
ExtendedStackObject$pop()
# Pop the remaining item (5)
ExtendedStackObject$pop()
Output:
Calculating stack size...
[1] 2
Calculating stack size...
[1] 15
Calculating stack size...
[1] 5
In this example, the ExtendedStack class inherits from the Stack class. It overrides the size method to include a message while still calling the original method using super. Because the inherited pop method calls self$size(), the message also appears before each popped value. This demonstrates how methods from the parent class can be extended or customized in the subclass.
Key Features of R6 Classes
Encapsulation: Private members (e.g., private$items) ensure internal object details are protected from external modification.
Public and Private Members: Public members are accessible using $, while private members are accessible only within class methods.
Inheritance: Subclasses can inherit properties and methods from parent classes, and super allows access to parent methods.
Initialization: The initialize method acts as a constructor for setting up objects.
These features make R6 a robust and flexible system for implementing OOP in R.
Explicit coercion refers to converting an object from one type or class to another by calling a conversion function, such as as.character() or as.numeric(), rather than relying on R's automatic conversion rules.
Difference Between Conversion, Coercion, and Casting
Coercion: Refers to implicit conversion of data types.
Casting: Refers to explicit conversion of data types.
Conversion: A general term that encompasses both coercion and casting.
Explicit Coercion to Character
To explicitly convert an object to the character type, use as.character(). Base R has no as.string() function; a related helper, toString(), collapses a vector into a single comma-separated string. Character encodings are handled separately, for example with Encoding() or enc2utf8().
Example:
# Creating a vector
v <- c(10, 20, 30, 40)
# Converting to character type
as.character(v)
Output:
[1] "10" "20" "30" "40"
Explicit Coercion to Numeric and Logical
Explicit coercion to numeric, logical, or other data types is done using as.* functions, where the * represents the target data type. These functions take a vector as their parameter.
Function Descriptions
as.logical: Converts values to logical type. 0 is converted to FALSE; non-zero values are converted to TRUE.
as.integer: Converts the object to integer type.
as.double: Converts the object to double-precision type.
as.complex: Converts the object to complex type.
as.list: Converts a vector or other object to a list.
Example:
# Creating a vector
x <- c(0, 1, 2, 3)
# Checking the class of the vector
class(x)
# Converting to integer type
as.integer(x)
# Converting to double type
as.double(x)
# Converting to logical type
as.logical(x)
# Converting to a list
as.list(x)
# Converting to complex type
as.complex(x)
If R cannot coerce a value, it returns NA for that element. as.numeric() also issues a warning, whereas as.logical() returns NA silently for strings it does not recognize.
Example:
# Creating a vector with non-numeric values
x <- c("apple", "banana", "cherry")
# Attempting to convert to numeric
as.numeric(x)
# Attempting to convert to logical
as.logical(x)
Output:
[1] NA NA NA
Warning message:
NAs introduced by coercion
[1] NA NA NA
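Since failed conversions show up as NA, one common pattern is to convert first and then inspect which entries came back as NA. The sketch below assumes a made-up input vector named raw and uses suppressWarnings() only because the warning is expected here.

```r
# Hypothetical input mixing valid and invalid numbers
raw <- c("12", "3.5", "abc", "7")

# Convert, silencing the expected coercion warning
nums <- suppressWarnings(as.numeric(raw))

# Identify which entries could not be converted
bad <- raw[is.na(nums)]
bad

# Work with the values that did convert
total <- sum(nums, na.rm = TRUE)
total
```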
In R, everything is considered an object. Objects have attributes, and one of the most commonly associated attributes is the class. The class() command is used to define the class of an object or determine the class of an object.
Key Features of Class in R:
Inheritance: Objects can inherit from multiple classes.
Order of Inheritance: For complex classes, the order of inheritance can be specified.
Example: Checking the class of an object
# Creating a vector of fruits
fruits <- c("apple", "banana", "cherry")
# Checking the class of the vector
class(fruits)
Output:
[1] "character"
Example: Appending a class to an object
# Creating a vector of fruits
fruits <- c("apple", "banana", "cherry")
# Appending the class "Food" to the vector
class(fruits) <- append(class(fruits), "Food")
class(fruits)
Output:
[1] "character" "Food"
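To test whether an object carries a given class, inherits() checks the class vector, and unclass() returns the underlying object with the class attribute removed. A minimal sketch that reuses the fruits vector from above ("Vegetable" is an arbitrary class name chosen for contrast):

```r
fruits <- c("apple", "banana", "cherry")
class(fruits) <- c(class(fruits), "Food")

# inherits() tests for membership in the class vector
is_food <- inherits(fruits, "Food")
is_vegetable <- inherits(fruits, "Vegetable")
is_food
is_vegetable

# unclass() strips the class attribute and returns the underlying vector
plain <- unclass(fruits)
class(plain)
```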
Managing Memory in R
When deciding between S3 and S4 classes in R, S4 offers a more structured approach, while S3 provides flexibility due to its less strict framework.
Memory environments contribute to S3’s flexibility. An environment is similar to a local scope, maintaining a variable set associated with it. Variables within the environment can be accessed if the environment’s ID is known.
To assign or retrieve variable values in an environment, assign() and get() commands are used.
Example: Assigning and retrieving variable values in an environment
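A minimal sketch of assign() and get() on a freshly created environment; the names my_env and counter are illustrative assumptions, not taken from the original text.

```r
# Create a new, empty environment (the name my_env is illustrative)
my_env <- new.env()

# Store a value under the name "counter" inside that environment
assign("counter", 5, envir = my_env)

# Retrieve it by name
counter_value <- get("counter", envir = my_env)
counter_value

# Confirm that the variable is defined in my_env itself
exists("counter", envir = my_env, inherits = FALSE)

# List every variable held by the environment
ls(my_env)
```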
S3 is the simplest and most widely used class system in R. An S3 object is essentially a list with assigned class attributes.
Steps to create an S3 object:
Create a list containing the required components.
Use class() to assign a class name to the list.
Example 1: Creating an S3 object for employee details
# Creating a list for employee details
employee <- list(name = "Ravi", emp_id = 1001, salary = 50000)
# Assigning the class "employee" to the list
class(employee) <- "employee"
employee
Example 2: Creating an S3 object for book details
# Creating a list for book details
book <- list(title = "Data Science Basics", author = "John Doe", pages = 250)
# Assigning the class "book" to the list
class(book) <- "book"
book
R uses generic functions, which are capable of operating differently on various classes of objects. A common generic function is print().
For example, the print() function can print different types of objects like vectors, data frames, and matrices. This is possible because print() is a generic function comprising multiple methods.
Example: Methods of the print() function
# Checking methods available for the generic print() function
methods(print)
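A new method can be added to an existing generic by defining a function named generic.class. The sketch below assumes a small "book" object built just for this illustration and gives it its own print method.

```r
# Hypothetical S3 object, for illustration only
book <- structure(
  list(title = "Data Science Basics", pages = 250),
  class = "book"
)

# A print method for class "book"; print() dispatches to it automatically
print.book <- function(x, ...) {
  cat(sprintf("Book: %s (%d pages)\n", x$title, as.integer(x$pages)))
  invisible(x)
}

print(book)
```

Returning the object invisibly follows the usual convention for print methods, so the call can still be used inside larger expressions.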
Creating, Listing, and Deleting Objects in Memory with Example
In R, what is often referred to as an “object” is equivalent to what many other programming languages call a “variable.” Objects and variables, depending on the context, can have very different interpretations. Variables in all programming languages provide a way to access data stored in memory. However, R does not offer direct access to memory; instead, it provides various specialized data structures, referred to as objects.
In R, objects are accessed through symbols or variable names. Interestingly, symbols in R are themselves objects and can be manipulated like other objects. This unique feature of R has significant implications. In R, everything, from numbers to strings, arrays, vectors, lists, and data frames, is treated as an object.
Creating Objects in Memory
To perform operations in R, you must assign values to objects. This is done by giving the object a name, followed by an assignment operator (<- or =), and then the desired value.
Example:
# Assigning values to objects
num <- 10
text <- "Hello, R!"
vec <- c(4, 8, 12)
Here, the <- symbol is an assignment operator. For instance, after executing num <- 10, the value 10 is assigned to the object num. The assignment can be read as “10 is assigned to num.”
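Listing and deleting objects are handled by ls() and rm(), and exists() reports whether a name is currently defined. A minimal sketch, reusing the object names from the example above:

```r
num <- 10
text <- "Hello, R!"

# List the objects in the current environment
ls()

# Check whether a particular object exists
exists("num", inherits = FALSE)

# Delete an object by name
rm(num)

# The name is no longer defined in the current environment
exists("num", inherits = FALSE)
```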
The “for” loop is a fundamental construct in programming, but in R an explicit loop can be verbose, and a poorly written one (for example, one that grows a vector on every iteration) can be slow on large datasets. R provides a family of functions that express common looping patterns more compactly, which is particularly convenient when working interactively in a command-line environment. Let’s explore these alternatives:
Classes and Objects
A class is a template or blueprint from which objects are created by encapsulating data and methods. An object is a data structure containing attributes and methods that act upon those attributes.
Overview of Looping Functions in R
Looping Function: Operation
apply(): Applies a function over the margins of an array or matrix
lapply(): Applies a function over a list or a vector
sapply(): Similar to lapply(), but returns simplified results
tapply(): Applies a function over subsets of a vector grouped by a factor
mapply(): Multivariate version of sapply()
Let’s explore these functions with examples and their respective outputs.
1. apply(): The apply() function applies a given function across the rows or columns of an array or matrix.
Syntax:
apply(array, margins, function, ...)
array: Input array or matrix
margins: Dimension along which to apply the function (1 for rows, 2 for columns)
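A minimal sketch of apply() on a small matrix; the matrix m and the choice of sum and mean are illustrative.

```r
# A 2 x 3 matrix filled column by column
m <- matrix(1:6, nrow = 2)

# Sum each row (margin 1)
row_sums <- apply(m, 1, sum)
row_sums

# Take the mean of each column (margin 2)
col_means <- apply(m, 2, mean)
col_means
```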
2. lapply(): The lapply() function applies a function to each element of a list or vector and returns the results as a list.
Syntax:
lapply(list, function, ...)
list: Input list or vector
function: Operation to perform
Example:
# Creating a list of vectors
list_data <- list(vec1 = c(1, 2, 3), vec2 = c(4, 5, 6))
# Applying lapply()
result <- lapply(list_data, mean)
print(result)
Output:
$vec1
[1] 2
$vec2
[1] 5
3. sapply(): The sapply() function simplifies the output of lapply() when possible. It returns a vector, matrix, or list, depending on the input and output.
Syntax:
sapply(list, function, ...)
list: Input list or vector
function: Operation to perform
Example:
# Creating a list of vectors
list_data <- list(vec1 = c(1, 2, 3), vec2 = c(4, 5, 6))
# Applying sapply()
result <- sapply(list_data, mean)
print(result)
Output:
vec1 vec2
   2    5
4. tapply(): The tapply() function applies a function over subsets of a vector grouped by a factor.
Syntax:
tapply(vector, factor, function, ...)
vector: Input vector
factor: Grouping factor
function: Operation to perform
Example:
# Creating a numeric vector
values <- c(10, 20, 30, 40, 50)
# Creating a factor for grouping
groups <- c("A", "A", "B", "B", "C")
# Applying tapply()
result <- tapply(values, groups, sum)
print(result)
Output:
A B C
30 70 50
5. mapply(): The mapply() function applies a function to the corresponding elements of several vectors or lists in parallel. It is a multivariate version of sapply().
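A minimal sketch of mapply() with two input vectors; the inputs and the anonymous function are illustrative.

```r
# Repeat each value in the first vector the number of times given by the second
reps <- mapply(rep, 1:3, 3:1)
reps

# Add corresponding elements of two vectors with an anonymous function
sums <- mapply(function(a, b) a + b, c(1, 2, 3), c(10, 20, 30))
sums
```

The first call returns a list because the results have different lengths; the second simplifies to a numeric vector.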
Here are some additional ways to loop through lists and display elements:
Display All Elements on the Same Line
my_list <- c(1, 2, 3, 4, 5)
for (element in my_list) {
cat(element, " ")
}
Output:
1 2 3 4 5
Display All Elements on Different Lines
for (element in my_list) {
cat(element, "\n")
}
Output:
1
2
3
4
5
Display Specific Values
for (element in my_list) {
if (element %% 2 == 0) { # Display only even values
cat(element, "\n")
}
}
Output:
2
4
The apply-family functions described above provide concise, flexible ways to express looping operations in R. They often lead to cleaner code than an explicit “for” loop, although a speed or memory advantage is not guaranteed and depends on the task.
People who have been using the R programming language for a while are likely familiar with passing functions as arguments to other functions. However, they are less likely to return functions from their custom code. This is unfortunate, as doing so can unlock a new level of abstraction, reducing both the amount and complexity of the code required for certain tasks. Below, we present examples demonstrating how R programmers can leverage lexical closures to encapsulate both data and behavior.
Implementation in R
Simple Example: Adding Numbers
To start with a simple example, suppose you want a function that adds 3 to its argument. You might write something like this:
add_3 <- function(y) { 3 + y }
This function works as expected:
> add_3(1:10)
[1] 4 5 6 7 8 9 10 11 12 13
Now, suppose you need another function that adds 8 to its argument. Instead of writing a new function similar to add_3, a better approach is to create a function that generates these functions dynamically. Here’s how you can do that:
add_x <- function(x) {
function(y) { x + y }
}
Calling add_x with an argument returns a new function that performs the desired operation:
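As a minimal sketch of such calls (the name add_8 is chosen here for illustration):

```r
add_x <- function(x) {
  function(y) { x + y }
}

# Each call to add_x() returns a new function with its own captured x
add_3 <- add_x(3)
add_8 <- add_x(8)

add_3(1:5)
add_8(1:5)
```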
If you closely examine the definition of add_x, you may wonder how the returned function knows about x when it is called later. This behavior is due to R’s lexical scoping. When add_x is called, the x argument is captured in the environment of the returned function.
Advanced Example: Bootstrapping with Containers
Now, let’s look at a more practical example. Suppose you’re performing some complex bootstrapping, and for efficiency, you pre-allocate vectors to store results. Here’s a straightforward implementation for a single vector:
nboot <- 100
bootmeans <- numeric(nboot)
data <- rnorm(1000) # Example dataset
for (i in 1:nboot) {
bootmeans[i] <- mean(sample(data, length(data), replace = TRUE))
}
> mean(data)
[1] -0.0024
> mean(bootmeans)
[1] -0.0018
However, if you need to track multiple statistics, each requiring a unique index variable, this process can become tedious and error-prone. Using closures, you can abstract away the bookkeeping. Here’s a function that creates a pre-allocated container:
make_container <- function(n) {
  x <- numeric(n)
  i <- 1
  function(value = NULL) {
    if (is.null(value)) {
      return(x)
    } else {
      x[i] <<- value
      i <<- i + 1
    }
  }
}
Calling make_container with a size n returns a function that manages the container. If the argument to the function is NULL, it returns the entire vector. Otherwise, it adds the value to the next position in the vector:
nboot <- 100
bootmeans <- make_container(nboot)
data <- rnorm(1000)
for (i in 1:nboot) {
bootmeans(mean(sample(data, length(data), replace = TRUE)))
}
> mean(data)
[1] -0.0024
> mean(bootmeans())
[1] -0.0019
This approach simplifies the management of multiple containers and ensures that indexing is handled internally.
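To illustrate the multiple-container case, the sketch below tracks two statistics in the same loop. The names boot_means and boot_medians, the use of the median as a second statistic, and the seed value are assumptions made for this example.

```r
make_container <- function(n) {
  x <- numeric(n)
  i <- 1
  function(value = NULL) {
    if (is.null(value)) {
      return(x)
    } else {
      x[i] <<- value
      i <<- i + 1
    }
  }
}

set.seed(42)  # arbitrary seed so the run is reproducible
nboot <- 100
data <- rnorm(1000)

# One container per statistic; each keeps its own index internally
boot_means <- make_container(nboot)
boot_medians <- make_container(nboot)

for (b in 1:nboot) {
  resample <- sample(data, length(data), replace = TRUE)
  boot_means(mean(resample))
  boot_medians(median(resample))
}

length(boot_means())
length(boot_medians())
```

No index variable for either container appears in the loop body; each closure advances its own counter.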
Inheritance is one of the key concepts in object-oriented programming, enabling new classes to be derived from existing or base classes, thereby enhancing code reusability. Derived classes can inherit properties and behaviors from the base class or include additional features, creating a hierarchical structure in the programming environment. This article explores how inheritance is implemented in R programming using three types of classes: S3, S4, and Reference Classes.
Inheritance in S3 Class
S3 classes in R are informal and have no rigid definitions. They use lists with a class attribute set to a class name. Objects of S3 classes inherit only methods from their base class.
Example:
# Create a function to define a class
employee <- function(n, a, e){
  obj <- list(name = n, age = a, emp_id = e)
  attr(obj, "class") <- "employee"
  obj
}
# Define a method for the generic function print()
print.employee <- function(obj){
  cat("Name:", obj$name, "\n")
  cat("Age:", obj$age, "\n")
  cat("Employee ID:", obj$emp_id, "\n")
}
# Create an object and inherit the class
e <- list(name = "Priya", age = 30, emp_id = 123,
department = "HR")
class(e) <- c("Manager", "employee")
cat("Using the inherited method print.employee():\n")
print(e)
# Overwrite the print method
print.Manager <- function(obj){
cat(obj$name, "is a Manager in", obj$department, "department.\n")
}
cat("After overwriting the method print.employee():\n")
print(e)
# Check inheritance
cat("Is the object 'e' inherited from class 'employee'?\n")
inherits(e, "employee")
Output:
Using the inherited method print.employee():
Name: Priya
Age: 30
Employee ID: 123
After overwriting the method print.employee():
Priya is a Manager in HR department.
Is the object 'e' inherited from class 'employee'?
[1] TRUE
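A child method does not have to replace the parent method entirely. As a sketch of one alternative, NextMethod() lets the Manager method add its own output and then delegate to the employee method. The method bodies below are simplified for illustration and differ from the ones above.

```r
print.employee <- function(x, ...) {
  cat("Name:", x$name, "\n")
  cat("Employee ID:", x$emp_id, "\n")
  invisible(x)
}

# The Manager method adds its own line, then delegates to the parent method
print.Manager <- function(x, ...) {
  cat("Role: Manager in", x$department, "\n")
  NextMethod()
}

e <- structure(
  list(name = "Priya", emp_id = 123, department = "HR"),
  class = c("Manager", "employee")
)
print(e)
```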
Inheritance in S4 Class
S4 classes in R are more formal, with strict definitions for slots. Derived classes can inherit both attributes and methods from the base class.
Example:
# Define an S4 class
setClass("employee",
slots = list(name = "character",
age = "numeric", emp_id = "numeric")
)
# Define a method to display object details
# (the argument is named "object" to match the show() generic)
setMethod("show", "employee",
  function(object){
    cat("Name:", object@name, "\n")
    cat("Age:", object@age, "\n")
    cat("Employee ID:", object@emp_id, "\n")
  }
)
# Create a derived class
setClass("Manager",
slots = list(department = "character"),
contains = "employee"
)
# Create an object of the derived class
m <- new("Manager", name = "Priya", age = 30, emp_id = 123, department = "HR")
show(m)
Output:
Name: Priya
Age: 30
Employee ID: 123
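The inheritance relationship can be checked with is(), and inherited slots are read with @ just like the slots the derived class defines itself. A minimal sketch that redefines the two classes so it runs on its own:

```r
library(methods)

setClass("employee",
  slots = list(name = "character", age = "numeric", emp_id = "numeric")
)
setClass("Manager",
  slots = list(department = "character"),
  contains = "employee"
)

m <- new("Manager", name = "Priya", age = 30, emp_id = 123, department = "HR")

# is() reports whether an object belongs to a class or one of its subclasses
is(m, "employee")

# Inherited slots are accessed exactly like the class's own slots
m@name
m@department
```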
Inheritance in Reference Class
Reference classes use the setRefClass() function to create classes and manage inheritance. The concept is similar to S4 classes but supports mutable objects.
Example:
# Define a reference class
employee <- setRefClass("employee",
  fields = list(name = "character",
                age = "numeric", emp_id = "numeric"),
  methods = list(
    increment_age = function(x) {
      age <<- age + x
    },
    decrement_age = function(x) {
      age <<- age - x
    }
  )
)
# Create a derived reference class
Manager <- setRefClass("Manager",
  fields = list(department = "character"),
  contains = "employee",
  methods = list(
    decrement_age = function(x) {
      if ((age - x) < 0) stop("Age cannot be negative")
      age <<- age - x
    }
  )
)
# Create an object
mgr <- Manager(name = "Priya", age = 30, emp_id = 123, department = "HR")
cat("Decreasing age by 5:\n")
mgr$decrement_age(5)
mgr$age
cat("Attempting to decrease age by 30:\n")
mgr$decrement_age(30)
mgr$age
Output:
Decreasing age by 5:
[1] 25
Attempting to decrease age by 30:
Error in mgr$decrement_age(30) : Age cannot be negative
[1] 25
R supports parametric polymorphism, enabling methods (functions) to operate on a variety of object types. Unlike class-based systems, R’s methods are tied to functions rather than classes. This mechanism allows the creation of generic methods or functions that can handle diverse object types, even ones not explicitly defined. By utilizing this, developers can use the same function name across different object classes, with behavior tailored for each.
Generic Functions in R
Polymorphism in R is achieved through generic functions, which serve as dispatchers. These functions invoke specific methods based on the class of their input objects. For instance, R’s plot() and summary() functions adapt their behavior depending on the type of object they receive as input.
Example: plot() Function
The plot() function demonstrates polymorphism by producing different visualizations depending on the type of input—numeric vectors, factors, or data frames.
Structure of plot() Function
Typing the function name at the console prints its definition:
> plot
function (x, y, ...)
UseMethod("plot")
The UseMethod("plot") directive ensures the appropriate plot method is called based on the input object’s class.
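The same dispatch mechanism is available for user-defined generics. The sketch below defines a hypothetical generic named describe(), which is not part of base R, together with a default method and two class-specific methods.

```r
# A hypothetical generic named describe(); it is not part of base R
describe <- function(x, ...) {
  UseMethod("describe")
}

# Fallback used when no class-specific method exists
describe.default <- function(x, ...) {
  "some other object"
}

# Method chosen for numeric vectors
describe.numeric <- function(x, ...) {
  paste("numeric vector of length", length(x))
}

# Method chosen for data frames
describe.data.frame <- function(x, ...) {
  paste("data frame with", nrow(x), "rows")
}

describe(c(1.5, 2.5))
describe(data.frame(a = 1:3))
describe("text")
```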
Viewing Available Methods for plot()
Example:
methods(plot)
Output (the exact list depends on the R version and the packages loaded):
plot.default
plot.function
plot.lm
plot.ts
plot.factor
plot.data.frame
plot.histogram
plot.ecdf
plot.stepfun
plot.hclust
plot.prcomp
plot.density
plot.qqnorm
Single Numeric Vector: When a single numeric vector is passed to plot(), the values are plotted against their index. The default is a point plot; type = 'l' draws a line graph instead.
# Example: Single Numeric Vector
x <- 1:20
plot(x, type = 'l', col = 'blue')
Two Numeric Vectors: Passing two numeric vectors generates a scatter plot.
# Example: Two Numeric Vectors
x <- 1:10
y <- x * 2
plot(x, y, type = 'p', col = 'red', pch = 16)
Factor: Passing a factor to plot() produces a bar chart.