Data Handling in R Programming

Data Handling in Detail

The R programming language is extensively used for statistical analysis and data visualization. Handling data involves importing and exporting files, and R simplifies this process by supporting various file types such as CSV, text files, Excel spreadsheets, SPSS, SAS, and more.

R provides several predefined functions to navigate and interact with system directories. These functions allow users to either retrieve the current directory path or change it as needed.

Directory Functions in R
  • getwd(): Retrieves the current working directory.
  • setwd(): Changes the working directory. The directory path is passed as an argument to this function.

Example:

# Change working directory
setwd("D:/RProjects/")

# Alternative on Windows: backslashes must be doubled (escaped) inside R strings
setwd("D:\\RProjects\\")
  • list.files(): Returns the names of all files and folders in the current working directory.
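
The three directory functions can be combined as in the sketch below. To run on any machine, it works inside R's temporary directory rather than D:/RProjects; the folder name "wd_demo" and the file name "notes.txt" are hypothetical and used only for illustration.

```r
# Remember the current working directory so it can be restored later
old_dir <- getwd()

# Create a scratch folder inside R's per-session temporary directory
# ("wd_demo" is a made-up name for this example)
demo_dir <- file.path(tempdir(), "wd_demo")
dir.create(demo_dir, showWarnings = FALSE)

# Switch to the scratch folder
setwd(demo_dir)

# Create an empty file, then list the contents of the working directory
file.create("notes.txt")
files <- list.files()
print(files)

# Restore the original working directory
setwd(old_dir)
```

Saving the result of getwd() before calling setwd() is a common habit, since it lets a script leave the session in the state it found it.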
Importing Files in R

Importing Text Files: Text files can be read into R using the read.table() function.

Syntax:

read.table(filename, header = FALSE, sep = "")

Parameters:

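  • filename: Path to the file, either absolute or relative to the working directory.
  • header: Logical value indicating whether the first line of the file contains the column names. Defaults to FALSE.
  • sep: Specifies the field delimiter used in the file. The default "" treats any run of whitespace (spaces or tabs) as a delimiter.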

For more details, use the command:

help("read.table")

Example:
Suppose the file “SampleText.txt” in the current working directory contains the following data:

101 X p
202 Y q
303 Z r
404 W s
505 V t
606 U u

Code:

# Get the current working directory
getwd()

# Read the text file into a data frame
data <- read.table("SampleText.txt", header = FALSE, sep = " ")

# Print the data frame
print(data)

# Print the class of the object
print(class(data))

Output:

[1] "D:/RProjects"
   V1 V2 V3
1 101  X  p
2 202  Y  q
3 303  Z  r
4 404  W  s
5 505  V  t
6 606  U  u
[1] "data.frame"
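
The example above has no header row. As a complementary sketch, the code below reads a tab-separated file that does have one. The file is written to a temporary location first so the example is self-contained; its contents and the column names (id, grade, code) are made up for illustration.

```r
# Write a small tab-separated file with a header row to a temporary location
# (contents and column names are hypothetical)
path <- tempfile(fileext = ".txt")
writeLines(c("id\tgrade\tcode",
             "101\tX\tp",
             "202\tY\tq"), path)

# header = TRUE takes the column names from the first line;
# sep = "\t" splits fields on tab characters only
scores <- read.table(path, header = TRUE, sep = "\t")

# Inspect the column names and the inferred column types
print(names(scores))
str(scores)
```

Setting sep explicitly is worthwhile when fields may contain spaces, because the default whitespace splitting would break such fields apart.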

Importing CSV Files: CSV files can be imported using the read.csv() function.

Syntax:

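read.csv(filename, header = TRUE, sep = ",")

Parameters:

  • header: Specifies whether the first line of the file contains the column names. Defaults to TRUE, so pass header = FALSE for files without a header row.
  • sep: Indicates the delimiter used. Defaults to a comma.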

For details, run:

help("read.csv")

Example:
Assume the file “SampleCSV.csv” contains the following data:

101,XA,pa
202,YB,qb
303,ZC,rc
404,WD,sd
505,VE,te

Code:

# Read the CSV file
data <- read.csv("SampleCSV.csv", header = FALSE)

# Print the data frame
print(data)

# Print the class of the object
print(class(data))

Output:

   V1 V2 V3
1 101 XA pa
2 202 YB qb
3 303 ZC rc
4 404 WD sd
5 505 VE te
[1] "data.frame"
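
Because read.csv() assumes a header row unless told otherwise, forgetting header = FALSE silently turns the first record into column names. The sketch below illustrates the difference using a temporary copy of the sample data; the column names passed to col.names (id, label, tag) are hypothetical.

```r
# Write a headerless CSV file to a temporary location
path <- tempfile(fileext = ".csv")
writeLines(c("101,XA,pa", "202,YB,qb", "303,ZC,rc"), path)

# header = FALSE keeps every line as data;
# col.names replaces the default V1, V2, V3 with descriptive names
records <- read.csv(path, header = FALSE, col.names = c("id", "label", "tag"))
print(records)

# With the default header = TRUE, the first record is consumed as column names
with_default <- read.csv(path)
print(nrow(records))
print(nrow(with_default))
```

Comparing the two row counts is a quick way to check whether a file was read with the intended header setting.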

Importing Excel Files: To read Excel files, install the openxlsx package and use the read.xlsx() function.

Syntax:

read.xlsx(filename, sheet = 1)

Parameters:

  • sheet: Specifies the sheet name or index.

For help:

help("read.xlsx")

Example:
Suppose the Excel file “SampleExcel.xlsx” contains the following data:

A     B    C
1001  XYA  xyz
2002  YZB  yqw
3003  ZWC  wuv

Code:

# Install the openxlsx package (needed only once), then load it
install.packages("openxlsx")
library(openxlsx)

# Read the Excel file
data <- read.xlsx("SampleExcel.xlsx", sheet = 1)

# Print the data frame
print(data)

# Print the class of the object
print(class(data))

Output:

     A   B   C
1 1001 XYA xyz
2 2002 YZB yqw
3 3003 ZWC wuv
[1] "data.frame"
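
The sheet argument also accepts a sheet name. The following is a minimal sketch of that usage, assuming the openxlsx package is already installed. It first writes a small workbook to a temporary file so that it does not depend on SampleExcel.xlsx; the sheet name "Scores" is hypothetical.

```r
# Run only if openxlsx is available; otherwise print a hint
if (requireNamespace("openxlsx", quietly = TRUE)) {
  # Write a small workbook with one named sheet to a temporary file
  path <- tempfile(fileext = ".xlsx")
  df <- data.frame(A = c(1001, 2002), B = c("XYA", "YZB"))
  openxlsx::write.xlsx(df, path, sheetName = "Scores")

  # List the sheets in the workbook, then read one of them by name
  print(openxlsx::getSheetNames(path))
  by_name <- openxlsx::read.xlsx(path, sheet = "Scores")
  print(by_name)
} else {
  message("openxlsx is not installed; run install.packages(\"openxlsx\") first")
}
```

Referring to sheets by name tends to be more robust than using an index when the order of sheets in a workbook may change.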
Exporting Files in R

Redirecting Output with cat(): The cat() function outputs objects to the console or redirects them to a file.

Syntax:

cat(..., file)

Example:

# Redirect output to a file
cat("Greetings from R!", file = "OutputText.txt")

Output (file content):

Greetings from R!
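
By default, cat() overwrites the target file on every call. The sketch below shows one way to build a file up over several calls with append = TRUE. It writes to a temporary file, and the text being written is made up for illustration.

```r
# Write to a temporary file so no existing file is overwritten
out <- tempfile(fileext = ".txt")

# First call creates (or overwrites) the file; "\n" ends the line explicitly
cat("Greetings from R!\n", file = out)

# append = TRUE adds to the end of the file instead of overwriting it;
# sep = "" joins the pieces without inserting spaces between them
cat("Mean: ", mean(c(2, 4, 6)), "\n", sep = "", file = out, append = TRUE)

# Read the file back to check its content
lines <- readLines(out)
print(lines)
```

Note that cat() does not add a trailing newline on its own, which is why each call above ends with "\n".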

Redirecting Output with sink(): The sink() function diverts all subsequent console output to a file until sink() is called again with no arguments.

Syntax:

sink(filename)
...
sink()

Example:

# Redirect output to a file
sink("OutputSink.txt")

x <- c(2, 4, 6, 8, 12)
print(mean(x))
print(class(x))
print(max(x))

# End redirection
sink()

Output (file content):

[1] 6.4
[1] "numeric"
[1] 12
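
If an error occurs between the two sink() calls, the redirection stays active and later output keeps going to the file. One possible safeguard, sketched below, is to wrap the redirection in a function and close it with on.exit(). The function name log_summary is hypothetical, and the output goes to a temporary file.

```r
# Temporary file that will receive the redirected output
out <- tempfile(fileext = ".txt")

# Hypothetical helper: prints summary statistics of x into the file at path
log_summary <- function(x, path) {
  sink(path)
  # on.exit() ends the redirection when the function exits, even after an error
  on.exit(sink())
  print(mean(x))
  print(max(x))
  invisible(path)
}

log_summary(c(2, 4, 6, 8, 12), out)

# Output is back on the console here; read the file to check what was captured
captured <- readLines(out)
print(captured)
```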

Writing CSV Files: The write.csv() function writes data to a CSV file.

Syntax:

write.csv(x, file)

Example:

# Create a data frame
df <- data.frame(A = c(11, 22, 33), B = c("X", "Y", "Z"), C = c(TRUE, FALSE, TRUE))

# Write the data frame to a CSV file
write.csv(df, file = "OutputCSV.csv", row.names = FALSE)

Output (file content):

"A","B","C"
11,"X",TRUE
22,"Y",FALSE
33,"Z",TRUE
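
By default, write.csv() wraps column names and character values in double quotes; passing quote = FALSE writes them bare. The sketch below writes the same data frame without quotes to a temporary file and reads it back with read.csv() as a simple round-trip check.

```r
# Same data frame as in the example above
df <- data.frame(A = c(11, 22, 33),
                 B = c("X", "Y", "Z"),
                 C = c(TRUE, FALSE, TRUE))

# quote = FALSE suppresses the double quotes around names and strings;
# row.names = FALSE omits the row-number column
path <- tempfile(fileext = ".csv")
write.csv(df, file = path, row.names = FALSE, quote = FALSE)

# Inspect the raw file content
raw_lines <- readLines(path)
print(raw_lines)

# Read the file back and inspect the inferred column types
restored <- read.csv(path)
str(restored)
```

Unquoted output is only safe when no value contains a comma, so the default quoting is usually the better choice for free-text columns.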
