Author: Pooja Kotwani

  • Introduction to Object-Oriented Programming in R Language

    Introduction

    Object-Oriented Programming (OOP) is a programming paradigm that organizes software design around objects, rather than functions and logic alone. Objects represent real-world entities and combine data (attributes) and behavior (methods) into a single unit.

    In R, Object-Oriented Programming is especially important for:

    • Statistical modeling
    • Data analysis
    • Package development
    • Complex data structures
    • Reusable and maintainable code

    Unlike many languages (Java, C++), R supports multiple OOP systems, each designed for different use cases.


    What is Object-Oriented Programming?

    Object-Oriented Programming is based on four core principles:

    1. Encapsulation – bundling data and methods together
    2. Abstraction – exposing only essential details
    3. Inheritance – creating new objects from existing ones
    4. Polymorphism – same function behaving differently for different objects

    Why OOP is Important in R

    OOP helps:

    • Organize large programs
    • Reuse code efficiently
    • Write cleaner and modular code
    • Build extensible statistical models
    • Create professional R packages

    Many built-in R functions, such as plot(), print(), and summary(), are generic functions: they choose which code to run based on the class of the object they receive.


    Object-Oriented Systems in R

    R has three widely used OOP systems:

    1. S3 – Simple and informal (most common)
    2. S4 – Formal and strict
    3. R6 – Modern, reference-based OOP (provided by the R6 package, not by base R)

    Objects in R

    An object in R is any data structure stored in memory.

    Examples:

    x <- 10
    y <- "R"
    z <- c(1, 2, 3)
    

    Everything in R is an object:

    • Variables
    • Vectors
    • Data frames
    • Functions
    • Models

    Classes in R

    A class defines the type or category of an object.

    class(x)
    

    Example:

    v <- c(1, 2, 3)
    class(v)
    

    Output:

    "numeric"
    

    Encapsulation in R

    Encapsulation means combining data and functions that operate on that data.

    In R:

    • Objects store data
    • Functions operate on objects

    Example:

    person <- list(name = "Alice", age = 25)
    

    Abstraction in R

    Abstraction means hiding implementation details and showing only essential functionality.

    Example:

    summary(lm(mpg ~ wt, data = mtcars))
    

    You don’t need to know how summary() works internally.


    Polymorphism in R

    Polymorphism allows the same function to behave differently depending on object type.

    Example:

    print(10)
    print("R")
    print(c(1, 2, 3))
    

    The print() function behaves differently for each object.


    Method Dispatch in R

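    Method dispatch is how R decides which method to call for an object. A generic function such as plot() inspects the class of its first argument and calls the matching method.

    Example:

    plot(1:10)                  # integer vector: handled by plot.default()
    plot(density(mtcars$mpg))   # object of class "density": handled by plot.density()
    

    Different plots, same function name.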
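
    The dispatch mechanism described above can be sketched with a small user-defined generic. The generic describe() and its methods are hypothetical names chosen for illustration; they are not part of base R.

```r
# Hypothetical generic: describe() does not exist in base R.
describe <- function(x, ...) {
  UseMethod("describe")   # pick a method based on the class of x
}

# Fallback used when no class-specific method is found.
describe.default <- function(x, ...) {
  paste("an object of class", class(x)[1])
}

# Method used when the class of x includes "data.frame".
describe.data.frame <- function(x, ...) {
  paste("a data frame with", nrow(x), "rows and", ncol(x), "columns")
}

res_vector <- describe(1:10)    # no describe.integer(), so the default runs
res_frame  <- describe(mtcars)  # dispatches to describe.data.frame()

print(res_vector)
print(res_frame)
```

    The caller always writes describe(x); which body runs depends only on class(x).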


    S3 Object-Oriented System (Introduction)

    What is S3?

    S3 is the simplest and most widely used OOP system in R.

    Characteristics:

    • Informal
    • No strict class definitions
    • Easy to use and flexible

    Creating an S3 Object

    person <- list(name = "Alice", age = 25)
    class(person) <- "Person"
    

    Creating an S3 Method

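    # The signature (x, ...) matches the print() generic
    print.Person <- function(x, ...) {
      cat("Name:", x$name, "\n")
      cat("Age:", x$age, "\n")
      invisible(x)   # print methods conventionally return their argument invisibly
    }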
    
    print(person)
    

    Polymorphism with S3

    print(10)
    print(person)
    

    Same function name, different behavior.
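
    Inheritance, listed among the core principles earlier, also works in S3 through a class vector with more than one entry. The following is a minimal sketch; the constructor names new_person() and new_student() are assumptions made for this example.

```r
# Hypothetical constructor helpers; the names are illustrative only.
new_person <- function(name, age) {
  structure(list(name = name, age = age), class = "Person")
}

new_student <- function(name, age, course) {
  obj <- new_person(name, age)
  obj$course <- course
  # Most specific class first; "Person" remains as the parent class.
  class(obj) <- c("Student", class(obj))
  obj
}

# Methods for the base R generic format().
format.Person <- function(x, ...) {
  paste0(x$name, " (", x$age, ")")
}

format.Student <- function(x, ...) {
  # NextMethod() runs format.Person(), and the result is then extended.
  paste0(NextMethod(), ", studying ", x$course)
}

s <- new_student("Alice", 25, "Statistics")
print(format(s))
print(inherits(s, "Person"))
```

    Because "Person" is still in the class vector, any method written for Person also applies to a Student unless a Student method overrides it.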


    S4 Object-Oriented System (Brief Introduction)

    What is S4?

    S4 is a formal OOP system with:

    • Explicit class definitions
    • Defined slots (attributes)
    • Strict type checking

    Example:

    setClass("Person",
             slots = list(
               name = "character",
               age = "numeric"
             ))
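
    A minimal sketch of how such a class might be used follows. The class definition is repeated so the block runs on its own; the generic greet() is a hypothetical name, not something defined by the methods package.

```r
library(methods)

setClass("Person",
         slots = list(name = "character", age = "numeric"))

# Hypothetical generic and method for illustration.
setGeneric("greet", function(obj, ...) standardGeneric("greet"))

setMethod("greet", "Person", function(obj, ...) {
  paste("Hello, my name is", obj@name)   # slots are accessed with @
})

alice <- new("Person", name = "Alice", age = 25)
print(greet(alice))

# Slot types are checked when the object is created,
# so a character value for the numeric slot is rejected.
bad <- tryCatch(
  new("Person", name = "Bob", age = "twenty"),
  error = function(e) "type error"
)
print(bad)
```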
    

    R6 Object-Oriented System (Introduction)

    What is R6?

    R6 is a modern OOP system, provided by the R6 package, whose classes work much like classes in Python and Java.

    Features:

    • Reference-based objects
    • Encapsulation
    • Private and public members

    Creating an R6 Class

    library(R6)
    
    Person <- R6Class(
      "Person",
      public = list(
        name = NULL,
        age = NULL,
        initialize = function(name, age) {
          self$name <- name
          self$age <- age
        },
        greet = function() {
          cat("Hello, my name is", self$name, "\n")
        }
      )
    )
    
    p <- Person$new("Alice", 25)
    p$greet()
    

    Inheritance in R6

    Student <- R6Class(
      "Student",
      inherit = Person,
      public = list(
        course = NULL,
        initialize = function(name, age, course) {
          super$initialize(name, age)
          self$course <- course
        }
      )
    )
    

    Comparison of OOP Systems in R

    Feature            S3        S4        R6
    Formality          Low       High      High
    Syntax             Simple    Verbose   Modern
    Inheritance        Yes       Yes       Yes
    Encapsulation      Weak      Strong    Strong
    Reference-based    No        No        Yes

    When to Use Which OOP System

    • S3: Simple modeling, quick prototyping
    • S4: Complex statistical packages
    • R6: Applications, APIs, large systems

    Common Mistakes in OOP with R

    • Mixing S3 and S4 incorrectly
    • Not understanding method dispatch
    • Misusing reference-based objects
    • Overcomplicating simple tasks
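
    The "misusing reference-based objects" mistake usually comes from expecting copy semantics. The sketch below uses a plain base R environment as a stand-in for an R6 object (R6 objects are built on environments), so it runs without the R6 package. The function names are hypothetical.

```r
# Lists (and S3/S4 objects built on them) are copied when modified.
update_list <- function(p) {
  p$age <- p$age + 1
  p
}
person_list <- list(name = "Alice", age = 25)
invisible(update_list(person_list))
print(person_list$age)   # the original is unchanged

# Environments are modified in place, which is how R6 objects behave.
update_env <- function(p) {
  p$age <- p$age + 1
  invisible(p)
}
person_env <- new.env()
person_env$age <- 25
update_env(person_env)
print(person_env$age)    # the original has changed
```

    With reference-based objects, passing an object to a function and modifying it there changes the caller's object too.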

    Summary

    R supports multiple object-oriented programming systems. OOP in R allows you to write reusable, modular, and maintainable code. Understanding S3, S4, and R6 is essential for advanced R programming, package development, and professional data science work.

  • Object Oriented Programming

    max() function in detail

    max() Function in R Language
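    The max() function in R returns the largest element of its arguments, which are typically numeric, character, or logical vectors or matrices. A list must be flattened with unlist() first, and a data frame works only when all of its columns are numeric.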

    Syntax

    max(..., na.rm = FALSE)

    Parameters

    • ...: One or more numeric, character, or logical vectors (a matrix is treated as a vector of its elements).
    • na.rm: A logical value (TRUE or FALSE) that determines whether NA values are removed before the maximum is computed. The default is FALSE.

    Example 1: Finding the Maximum Element in Vectors

    # Creating vectors
    vector1 <- c(12, 25, 8, 14, 19)
    vector2 <- c(5, NA, 17, 4, 10)
    
    # Finding the maximum element
    max(vector1)               # Without NA
    max(vector2, na.rm = FALSE) # Includes NA
    max(vector2, na.rm = TRUE)  # Excludes NA

    Output:

    [1] 25
    [1] NA
    [1] 17

    Example 2: Finding the Maximum Element in a Matrix

    # Creating a matrix
    matrix_data <- matrix(1:12, nrow = 3, ncol = 4)
    print(matrix_data)
    
    # Finding the maximum element
    max(matrix_data)

    Output:

         [,1] [,2] [,3] [,4]
    [1,]    1    4    7   10
    [2,]    2    5    8   11
    [3,]    3    6    9   12
    
    [1] 12
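
    Beyond the overall maximum, a few related patterns come up often. The following is a small sketch using only base R functions; the variable names are illustrative.

```r
matrix_data <- matrix(1:12, nrow = 3, ncol = 4)

col_max <- apply(matrix_data, 2, max)   # maximum of each column
row_max <- apply(matrix_data, 1, max)   # maximum of each row

# which.max() gives the position of the largest element, not its value.
pos <- which.max(c(12, 25, 8, 14, 19))

# max() accepts several arguments and compares all their elements together.
overall <- max(c(3, 9), 7, 11)

# A list has to be flattened before calling max().
list_max <- max(unlist(list(4, 10, 6)))

print(col_max)
print(row_max)
print(pos)
print(overall)
print(list_max)
```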

  • Print a Formatted string in R Programming – sprintf() Function

    sprintf() Function in detail

    The sprintf() function in R uses a format specified by the user to return the formatted string, inserting the values provided.

    Syntax:

    sprintf(fmt, ...)

    Parameters:

    • fmt: A format string containing text and conversion specifications such as %s, %d, and %.2f.
    • ...: The values to be inserted into the string, in the order of the placeholders.

    Example 1: Formatted String using sprintf()

    # R program to illustrate
    # the use of sprintf() function
    
    # Initializing values
    name <- "Alice"
    greeting <- "Good Morning"
    
    # Calling sprintf() function
    sprintf("%s, %s!", greeting, name)

    Output:

    [1] "Good Morning, Alice!"

    In this example, sprintf() is used to create a formatted string where the placeholders %s are replaced by the values of the variables greeting and name.

    Example 2: Formatted String with Numbers using sprintf()

    # R program to illustrate
    # the use of sprintf() function
    
    # Initializing values
    item <- "Apples"
    quantity <- 12
    price_per_item <- 2.5
    
    # Calling sprintf() function
    sprintf("The price for %d %s is $%.2f", quantity, item, quantity * price_per_item)

    Output:

    [1] "The price for 12 Apples is $30.00"

    This example demonstrates using sprintf() to format a string that includes both text and numeric values, where the %d placeholder is used for integers, and %.2f is used for floating-point numbers.
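
    sprintf() is also vectorized and supports width and padding flags. The sketch below illustrates this with made-up item names and prices.

```r
items  <- c("Apples", "Pears")
prices <- c(2.5, 10)

# One formatted string per element of the input vectors.
# "%-8s" left-justifies text in 8 characters,
# "%6.2f" right-justifies a number in 6 characters with 2 decimals.
labels <- sprintf("%-8s %6.2f", items, prices)

padded <- sprintf("%03d", 7L)         # zero-padded integer
pct    <- sprintf("%.1f%%", 45.678)   # "%%" produces a literal percent sign

print(labels)
print(padded)
print(pct)
```

    Fixed widths like these keep columns aligned when the strings are printed one per line.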

    Using paste(): The paste() function in R is used for concatenating elements into a single string, with optional separators between them.

    Example 3: Formatted String using paste()

    # Generate some example statistical results
    mean_value <- 35.68
    standard_deviation <- 7.42
    
    # Create a formatted string to display the results
    formatted_string <- paste("The mean is", round(mean_value, 2),
                              "and the standard deviation is", round(standard_deviation, 2))
    
    # Print the formatted string
    cat(formatted_string)

    Output:

    The mean is 35.68 and the standard deviation is 7.42

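    Here, paste() concatenates the components of the string, and round() rounds the values to at most two decimal places. Trailing zeros are not kept (round(7.40, 2) prints as 7.4), so use sprintf() or format() when a fixed number of decimals is required.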

  • Splitting Strings in R programming – strsplit() method

    strsplit() method in detail

    The strsplit() function in R is used to divide a string into smaller parts based on a specified delimiter.

    Syntax of strsplit()

    strsplit(string, split, fixed)

    Parameters:

    • string: The input text or vector of strings.
    • split: The character or pattern used to split the string.
    • fixed: A logical value indicating whether the split should be treated as a literal match (TRUE) or as a regular expression (FALSE).
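
    The effect of the fixed argument is easiest to see with a delimiter that is also a regular-expression metacharacter. A minimal sketch, using a made-up version string:

```r
version <- "4.3.2"

# As a regular expression, "." matches any character, so every
# character acts as a delimiter and only empty strings remain.
regex_split <- strsplit(version, ".")[[1]]

# fixed = TRUE treats "." as a literal dot.
literal_split <- strsplit(version, ".", fixed = TRUE)[[1]]

# Escaping the dot in the regular expression gives the same result.
escaped_split <- strsplit(version, "\\.")[[1]]

print(regex_split)
print(literal_split)
print(escaped_split)
```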

    Return Value:

    It returns a list with one element per input string. Each element is a character vector of the substrings obtained after the split.

    Examples of Splitting Strings in R

    Example 1: Using strsplit() with a Space as a Delimiter

    In this example, we use the strsplit() function to split a given text using space (" ") as a delimiter.

    # R program to split a string
    
    # Given String
    text <- "Data Science with R"
    
    # Using strsplit() method
    result <- strsplit(text, " ")
    
    print(result)

    Output:

    [[1]]
    [1] "Data"    "Science" "with"    "R"

    Example 2: Using Regular Expression to Split a String

    Here, we use a regular expression to split the string wherever one or more numeric characters ([0-9]+) appear.

    # R program to split a string using regex
    
    # Given String
    text <- "Learn7R5Programming"
    
    # Using strsplit() method
    result <- strsplit(text, split = "[0-9]+")
    
    print(result)

    Output:

    [[1]]
    [1] "Learn"      "R"          "Programming"

    Example 3: Splitting Date Strings Using strsplit()

    We can also split date strings into separate components using a specific delimiter, such as "-".

    # R program to split date strings
    
    # Given Date Strings
    date_strings <- c("10-05-2023", "15-06-2023", "20-07-2023", "25-08-2023", "30-09-2023")
    
    # Using strsplit() function
    result <- strsplit(date_strings, split = "-")
    
    print(result)

    Output:

    [[1]]
    [1] "10"  "05"  "2023"
    
    [[2]]
    [1] "15"  "06"  "2023"
    
    [[3]]
    [1] "20"  "07"  "2023"
    
    [[4]]
    [1] "25"  "08"  "2023"
    
    [[5]]
    [1] "30"  "09"  "2023"
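
    Since the result is a list, the pieces usually need to be pulled out before further use. The following sketch shows two common approaches; the column names assume the dates are in day-month-year order, which the example strings suggest but the text does not state.

```r
date_strings <- c("10-05-2023", "15-06-2023", "20-07-2023")
parts <- strsplit(date_strings, split = "-")

# Take the first piece of every list element.
days <- sapply(parts, `[`, 1)

# Stack the pieces into a character matrix, one row per date.
date_matrix <- do.call(rbind, parts)
colnames(date_matrix) <- c("day", "month", "year")   # assumed order

years <- as.integer(date_matrix[, "year"])

print(days)
print(date_matrix)
print(years)
```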

  • Convert String from Uppercase to Lowercase in R programming – tolower() method

    tolower() Method in detail

    The tolower() function in R is used to convert uppercase letters in a string to lowercase.

    Syntax:

    tolower(s)

    Return: Returns the string in lowercase.

    Example 1:

    # R program to convert string
    # from uppercase to lowercase
    
    text <- "Hello World"
    
    # Using tolower() method
    result <- tolower(text)
    
    print(result)

    Output:

    [1] "hello world"

    Example 2: 

    # R program to convert string
    # from uppercase to lowercase
    
    # Given String
    sentence <- "ProGRamMing in R is FuN!"
    
    # Using tolower() method
    result <- tolower(sentence)
    
    print(result)

    Output:

    [1] "programming in r is fun!"
  • Convert string from lowercase to uppercase in R programming – toupper() function

    toupper() function in detail

    The toupper() function in R is used to convert a lowercase string to an uppercase string.

    Syntax:

    toupper(s)

    Return: Returns the uppercase version of the given string.

    Example 1:

    # R program to convert a string
    # from lowercase to uppercase
    
    # Given String
    text <- "Welcome to the world of R programming"
    
    # Using toupper() method
    result <- toupper(text)
    
    print(result)

    Output:

    [1] "WELCOME TO THE WORLD OF R PROGRAMMING"

    Example 2:

    # R program to convert a string
    # from lowercase to uppercase
    
    # Given String
    sentence <- "Practice makes a person perfect"
    
    # Using toupper() method
    converted <- toupper(sentence)
    
    print(converted)

    Output:

    [1] "PRACTICE MAKES A PERSON PERFECT"
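
    A common use of tolower() and toupper() is normalizing case before comparing or displaying text. The sketch below shows both; capitalize() is a hypothetical helper written for this example, not a base R function.

```r
a <- "R Programming"
b <- "r programming"

exact_same <- a == b                     # comparison is case-sensitive
same_text  <- tolower(a) == tolower(b)   # compare after normalizing case

# Hypothetical helper: uppercase only the first letter of each string.
capitalize <- function(s) {
  paste0(toupper(substring(s, 1, 1)), substring(s, 2))
}

capitalized <- capitalize(c("alice", "bob"))

print(exact_same)
print(same_text)
print(capitalized)
```
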
  • Adding elements in a vector in R programming – append() method

    append() method in detail

    The append() function in R inserts values into a vector after a specified position. If no position is given, the values are added at the end of the vector.

    Syntax:

    append(x, values, after = length(x))

    Return Value: It returns a new vector with the specified values appended.

    Example 1: Appending a Value at the End of a Vector

    vec <- c(2, 4, 6, 8)
    
    # Appending value 12 to the vector
    result <- append(vec, 12)
    
    print(result)

    Output:

    [1]  2  4  6  8 12

    Example 2: Inserting a Value at a Specific Position

    vec <- c(5, 10, 15, 20)
    
    # Inserting 7 at the second position
    result <- append(vec, 7, after = 1)
    
    print(result)

    Output:

    [1]  5  7 10 15 20
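
    Two further cases follow from the after argument: after = 0 places values at the front, and several values can be inserted at once. A short sketch:

```r
vec <- c(5, 10, 15, 20)

front  <- append(vec, 1, after = 0)           # insert before the first element
middle <- append(vec, c(11, 12), after = 2)   # insert two values after position 2

print(front)
print(middle)
print(vec)   # append() returns a new vector; vec itself is unchanged
```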

  • Finding the length of string in R programming – nchar() method

    nchar() method in detail

    The nchar() function in R is used to determine the number of characters in a string object.

    Syntax:

    nchar(string)

    Return Value:

    The function returns the length (number of characters) present in the given string.

    Example 1: Finding the Length of a String

    In this example, we will calculate the length of a string using the nchar() function.

    # R program to determine the length of a string
    
    # Define a string
    text <- "Hello R Programming"
    
    # Use nchar() function
    length_result <- nchar(text)
    
    print(length_result)

    Output:

    [1] 19

    Example 2: Using nchar() with Character Vectors

    This example applies nchar() to a vector built from mixed inputs. Because c() coerces every element to character, the number 99 becomes the string "99".

    # R program to get the length of character vectors
    
    # Defining a character vector
    vec <- c('code', '7', 'world', 99)
    
    # Displaying the type of vector
    typeof(vec)
    
    # Applying nchar() function
    nchar(vec)

    Output:

    [1] "character"
    [1] 4 1 5 2

    Example 3: Handling NA Values in nchar()

    The nchar() function provides an optional argument keepNA, which helps when dealing with NA values.

    # R program to handle NA values using nchar()
    
    # Defining a vector with NULL and NA values
    vec <- c(NULL, '3', 'data', NA)
    
    # Applying nchar() with keepNA = FALSE
    nchar(vec, keepNA = FALSE)

    Output:

    [1] 1 4 2

    Here, c() silently drops NULL, so the vector has only three elements, and NA is counted as 2 characters when keepNA = FALSE.

    If we set keepNA = TRUE, the output will be:

    # Applying nchar() with keepNA = TRUE
    vec <- c('', NULL, 'data', NA)
    
    nchar(vec, keepNA = TRUE)

    Output:

    [1]  0  4 NA

    This means that an empty string has length 0, and NA stays NA when keepNA = TRUE.

  • How to find SubString in R programming?

    String Manipulation in detail

    In this article, we will explore different ways to find substrings in the R programming language.

    R provides multiple methods for substring operations, including:

    • Using substr() function
    • Using str_detect() function
    • Using grep() function

    Method 1: Using substr() function

    The substr() function in R is used to extract a substring from a given string based on specified start and end positions.

    Syntax:

    substr(string_name, start, end)

    Return Value: This function returns the substring from the given string according to the specified start and end indexes.

    Example 1:

    # Given String
    text <- "Learning R Programming"
    
    # Using substr() function
    result <- substr(text, 1, 8)
    
    print(result)

    Output:

    [1] "Learning"

    Example 2:

    # Given String
    sentence <- "Data science and machine learning"
    
    # Using substr() function
    result <- substr(sentence, 13, 27)
    
    print(result)

    Output:

    [1] " and machine le"

    Position 13 is the space before "and", so the result starts with a space.
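
    Two related operations are worth knowing: extracting to the end of a string and replacing part of a string in place. The sketch below uses only base R; the example strings are made up.

```r
text <- "Learning R Programming"

# substring() defaults to the end of the string when no end is given.
last_word <- substring(text, 12)

# Use nchar() to address positions relative to the end.
n <- nchar(text)
tail3 <- substr(text, n - 2, n)

# substr() can also appear on the left of an assignment.
code <- "abc-123"
substr(code, 1, 3) <- "XYZ"

print(last_word)
print(tail3)
print(code)
```
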
    Method 2: Using str_detect() function

    The str_detect() function from the stringr package checks whether a specified substring exists within a string. It returns TRUE if a match is found and FALSE otherwise.

    Syntax:

    str_detect(string, pattern)

    Parameters:

    • string: The target string or vector of strings.
    • pattern: The substring pattern to be searched.

    Example:

    # Load library
    library(stringr)
    
    # Creating a vector
    words <- c("Apple", "Banana", "Cherry", "Mango")
    
    # Pattern to search
    pattern <- "Banana"
    
    # Using str_detect() function
    str_detect(words, pattern)

    Output:

    [1] FALSE  TRUE FALSE FALSE
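
    When the stringr package is not available, base R's grepl() gives the same kind of TRUE/FALSE result. A minimal sketch, using fixed = TRUE so the pattern is treated as plain text rather than a regular expression:

```r
words <- c("Apple", "Banana", "Cherry", "Mango")

# Same check as the str_detect() example, without loading a package.
found <- grepl("Banana", words, fixed = TRUE)

# A shorter substring can match more than one element.
has_an <- grepl("an", words, fixed = TRUE)

print(found)
print(has_an)
```
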
    Method 3: Using grep() function

    The grep() function returns the indices of elements in a character vector that match a specified pattern. If multiple matches are found, it returns an integer vector of their indices; if nothing matches, it returns an empty vector.

    Syntax:

    grep(pattern, string, ignore.case=FALSE)

    Parameters:

    • pattern: A regular expression pattern to match.
    • string: The character vector to search.
    • ignore.case: Whether to perform case-insensitive searching (default is FALSE).

    Example:

    # Define vector
    text_values <- c("Orange", "orchestra", "blue", "ocean")
    
    # Using grep() to find pattern "or"
    matching_indices <- grep("or", text_values, ignore.case=TRUE)
    
    print(matching_indices)

    Output:

    [1] 1 2
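
    grep() can also return the matching strings themselves through its value argument. The following sketch reuses the same vector with a different pattern ("^o", meaning "starts with o").

```r
text_values <- c("Orange", "orchestra", "blue", "ocean")

# Case-sensitive: only lowercase "o" at the start matches.
idx <- grep("^o", text_values)

# value = TRUE returns the matching elements instead of their indices.
matched <- grep("^o", text_values, ignore.case = TRUE, value = TRUE)

# No match gives a zero-length integer vector, not NA or an error.
none <- grep("xyz", text_values)

print(idx)
print(matched)
print(length(none))
```
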
  • Concatenate Two Strings in R Programming

    Introduction

    String concatenation means joining two or more strings together to form a single string. In R, string concatenation is a very common operation used in:

    • Creating messages and labels
    • Formatting output
    • Generating file names and paths
    • Data cleaning and transformation
    • Reporting and visualization

    R provides multiple built-in functions to concatenate strings efficiently and flexibly.


    Using paste() Function

    What is paste()?

    The paste() function is the most commonly used function for string concatenation in R. It combines strings and adds a separator between them by default.

    Syntax

    paste(..., sep = " ", collapse = NULL)
    
    • sep: separator between strings (default is space " ")
    • collapse: used to combine multiple elements into one string
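
    The difference between sep and collapse can be sketched as follows: sep goes between the arguments of each result, while collapse joins the results into one string. The "item" labels are made up for illustration.

```r
# sep only: one string per element of the longest argument.
ids <- paste("item", 1:3, sep = "_")

# sep and collapse: the three strings are then joined into one.
joined <- paste("item", 1:3, sep = "_", collapse = " | ")

print(ids)
print(joined)
```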

    Basic Example

    paste("Hello", "World")
    

    Output:

    "Hello World"
    

    Concatenating with Custom Separator

    paste("Data", "Science", sep = "-")
    

    Output:

    "Data-Science"
    

    Concatenating Numbers and Strings

    R automatically converts numbers to strings.

    paste("Score:", 95)
    

    Output:

    "Score: 95"
    

    Concatenating Multiple Strings

    paste("R", "is", "very", "powerful")
    

    Output:

    "R is very powerful"
    

    Using collapse

    collapse is used when you want to combine multiple elements of a vector into a single string.

    languages <- c("R", "Python", "Java")
    paste(languages, collapse = ", ")
    

    Output:

    "R, Python, Java"
    

    Using paste0() Function

    What is paste0()?

    paste0() is a shortcut version of paste() that does not insert any separator.

    Example

    paste0("R", "Studio")
    

    Output:

    "RStudio"
    

    Practical Use Case

    Creating IDs or file names:

    id <- paste0("EMP_", 101)
    id
    

    Output:

    "EMP_101"
    

    Using sprintf() for Concatenation

    Why sprintf()?

    sprintf() is used for formatted string creation, especially when combining strings with numbers in a specific format.

    Example

    name <- "Alice"
    age <- 25
    sprintf("Name: %s, Age: %d", name, age)
    

    Output:

    "Name: Alice, Age: 25"
    

    Formatting Numbers

    sprintf("Price: %.2f", 123.456)
    

    Output:

    "Price: 123.46"
    

    Concatenating Strings Using cat()

    cat() concatenates and prints strings directly to the console.

    cat("Hello", "World", "\n")
    

    Output:

    Hello World
    

    ⚠️ cat() only prints. It returns NULL invisibly, so its result cannot be stored and reused as a string.
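
    A short sketch of the practical difference between paste() and cat(), using only base R:

```r
# paste() returns a string that can be stored and reused.
msg <- paste("Hello", "World")

# cat() writes to the console; the value it returns is NULL.
out <- cat("Hello", "World", "\n")

print(msg)
print(is.null(out))

# cat() has its own sep argument for the text it prints.
cat("a", "b", "c", sep = "-")
cat("\n")
```

    To build a string for later use, call paste() or sprintf() and pass the result to cat() only when it is time to print.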


    Vectorized String Concatenation

    R performs concatenation element-wise when vectors are used.

    first <- c("Data", "Machine")
    second <- c("Science", "Learning")
    
    paste(first, second)
    

    Output:

    [1] "Data Science" "Machine Learning"
    

    Practical Example: Creating Full Names

    first_name <- c("Alice", "Bob")
    last_name <- c("Smith", "Brown")
    
    full_name <- paste(first_name, last_name)
    full_name
    

    Output:

    [1] "Alice Smith" "Bob Brown"
    

    String Matching in R Programming

    Introduction

    String matching means checking whether:

    • A string contains a specific pattern
    • A substring exists inside a string
    • A string matches a given pattern

    String matching is crucial in:

    • Text analysis
    • Data cleaning
    • Searching records
    • Filtering data
    • Regular expressions

    String Matching Using == Operator

    Exact Match

    "R" == "R"
    

    Output:

    TRUE
    

    Vector Comparison

    c("R", "Python") == "R"
    

    Output:

    TRUE FALSE
    

    Pattern Matching Using grepl()

    What is grepl()?

    • Returns TRUE/FALSE
    • Used to check if a pattern exists in a string

    Syntax

    grepl(pattern, text)
    

    Example

    grepl("data", "data science")
    

    Output:

    TRUE
    

    Case-Insensitive Matching

    grepl("Data", "data science", ignore.case = TRUE)
    

    Output:

    TRUE
    

    Matching in a Vector

    cities <- c("Delhi", "Mumbai", "Chennai")
    grepl("um", cities)
    

    Output:

    FALSE TRUE FALSE
    

    Pattern Matching Using grep()

    What is grep()?

    • Returns indices of matching elements
    • Useful for filtering data

    Example

    grep("R", c("Python", "R", "Java"))
    

    Output:

    2
    

    Extract Matching Elements

    cities[grep("i", cities)]
    

    Matching Using Regular Expressions

    R supports regular expressions (regex) for advanced matching.

    Example: Match Strings Starting with a Letter

    grepl("^D", c("Delhi", "Mumbai", "Dubai"))
    

    Match Strings Ending with a Letter

    grepl("a$", c("India", "USA", "China"))
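
    The results of the two anchored patterns above, plus a character-class pattern, can be checked with the sketch below. The last vector (room101, lobby) is made up for illustration.

```r
cities    <- c("Delhi", "Mumbai", "Dubai")
countries <- c("India", "USA", "China")

starts_d <- grepl("^D", cities)      # "^" anchors the match at the start
ends_a   <- grepl("a$", countries)   # "$" anchors the match at the end

# "USA" ends with an uppercase "A", so it only matches when case is ignored.
ends_a_ci <- grepl("a$", countries, ignore.case = TRUE)

# A character class matches any one of a set of characters.
has_digit <- grepl("[0-9]", c("room101", "lobby"))

print(starts_d)
print(ends_a)
print(ends_a_ci)
print(has_digit)
```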
    

    Exact Membership Testing Using %in%

    Checks whether an element is exactly equal to any element of a vector. It does not match substrings.

    "R" %in% c("Python", "R", "Java")
    

    Output:

    TRUE
    

    String Matching with match()

    Returns the position of the first exact match, or NA if there is none.

    match("R", c("Python", "R", "Java"))
    

    Output:

    2
    

    Filtering Data Using String Matching

    # "people" avoids reusing the name of the base function names()
    people <- c("Alice", "Bob", "Charlie")
    people[grepl("a", people, ignore.case = TRUE)]
    

    Output:

    [1] "Alice" "Charlie"
    

    Common Mistakes in String Matching

    • Forgetting case sensitivity
    • Confusing grep() with grepl()
    • Using == for partial matching
    • Not handling NA values
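
    Several of these mistakes show up when the same vector is passed to the different matching tools. The sketch below uses a made-up vector containing an NA.

```r
langs <- c("R", "Python", NA, "RStudio")

# == is an exact comparison and propagates NA.
eq <- langs == "R"

# grepl() does partial matching and treats NA as "no match".
partial <- grepl("R", langs)

# grep() returns positions rather than TRUE/FALSE.
positions <- grep("R", langs)

# %in% is exact like ==, but never returns NA.
member <- langs %in% "R"

print(eq)
print(partial)
print(positions)
print(member)
```

    Using == when a partial match is wanted silently misses "RStudio", and an NA in the result of == can break a later if() or subsetting step.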

    Summary

    • String concatenation in R is done using paste(), paste0(), sprintf(), and cat()
    • String matching is performed using ==, %in%, match(), grepl(), grep(), and regular expressions
    • These operations are essential for text processing, data cleaning, and filtering
    • R’s vectorized nature makes string operations efficient and powerful