Working with Databases in detail
In R, working with datasets is a crucial aspect of statistical analysis and visualization. Instead of manually creating datasets in the console each time, we can retrieve structured and normalized data directly from relational databases such as MySQL, Oracle, and SQL Server. This integration allows for seamless data manipulation and visualization within R.
This guide focuses on MySQL connectivity in R, covering database connection, table creation, deletion, data insertion, updating, and querying.
RMySQL Package
R provides the RMySQL package to facilitate communication between R and MySQL databases. This package needs to be installed and loaded before connecting to MySQL.
Installation
install.packages("RMySQL")
Establishing Connection to MySQL
To connect to MySQL, the dbConnect() function is used, which requires a database driver along with authentication credentials such as username, password, database name, and host details.
Syntax:
dbConnect(drv, user, password, dbname, host)
Parameters
drv– Specifies the database driveruser– MySQL usernamepassword– Corresponding passworddbname– Name of the databasehost– Server hosting the database
Example: Connecting to MySQL Database
# Load necessary library
library("RMySQL")
# Establish connection
conn <- dbConnect(MySQL(), user = 'admin', password = 'mypassword',
dbname = 'SampleDB', host = 'localhost')
# Display available tables
dbListTables(conn)
Output:
Loading required package: DBI
[1] "employees"
Creating a Table in MySQL Using R
A table can be created in MySQL from R using the dbWriteTable() function. If the table already exists, this function will replace it.
Syntax
dbWriteTable(conn, name, value)
Parameters
conn– Connection objectname– Name of the MySQL tablevalue– Dataframe to be converted into a MySQL table
Example: Creating a Table
# Establish connection
conn <- dbConnect(MySQL(), user = 'admin', password = 'mypassword',
dbname = 'SampleDB', host = 'localhost')
# Create new table with selected data
dbWriteTable(conn, "iris_table", iris[1:10, ], overwrite = TRUE)
Output:
[1] TRUE
Deleting a Table in MySQL Using R
To perform various database operations, the dbSendQuery() function can be used to execute SQL queries directly in MySQL from R.
Syntax:
dbSendQuery(conn, statement)
Importing Data from a Delimited File
The read.delim() function is used to import delimited files, where values are separated by specific symbols such as |, $, or ,.
Syntax:
read.delim("file.txt", sep="|", header=TRUE)
Parameters
conn– Connection objectstatement– SQL command to be executed
Example: Dropping a Table
# Establish connection
conn <- dbConnect(MySQL(), user = 'admin', password = 'mypassword',
dbname = 'SampleDB', host = 'localhost')
# Drop existing table
dbSendQuery(conn, 'DROP TABLE iris_table')
Output:
<MySQLResult:9845732, 3, 4>
Inserting Data into MySQL Table Using R
Data can be inserted into a MySQL table from R using SQL INSERT INTO queries.
Example: Inserting Data
# Establish connection
conn <- dbConnect(MySQL(), user = 'admin', password = 'mypassword',
dbname = 'SampleDB', host = 'localhost')
# Insert new record into employees table
dbSendQuery(conn, "INSERT INTO employees(id, name) VALUES (1, 'John Doe')")
Output:
<MySQLResult:9845732, 3, 5>
Updating Data in a MySQL Table Using R
An existing record in the table can be modified using the UPDATE query.
Example: Updating a Table
# Establish connection
conn <- dbConnect(MySQL(), user = 'admin', password = 'mypassword',
dbname = 'SampleDB', host = 'localhost')
# Update a record in employees table
dbSendQuery(conn, "UPDATE employees SET name = 'Jane Doe' WHERE id = 1")
Output:
<MySQLResult:-1, 3, 6>
Retrieving Data from MySQL Using R
To fetch data from MySQL, the dbSendQuery() function is used to send a SQL SELECT statement. The retrieved data can be stored in a dataframe using the fetch() function.
Example:
# Establish connection
conn <- dbConnect(MySQL(), user = 'admin', password = 'mypassword',
dbname = 'SampleDB', host = 'localhost')
# Fetch records from employees table
res <- dbSendQuery(conn, "SELECT * FROM employees")
# Retrieve first 3 rows as dataframe
df <- fetch(res, n = 3)
print(df)
Output:
id name
1 1 John Doe
2 2 Alice Ray
3 3 Mark Smith
Leave a Reply