Working with Data Frames in R: A Comprehensive Guide

March 27, 2024

Data frames are R's standard structure for tabular data: each column is a vector of a single type, and all columns have the same number of rows. This post walks through the basics of working with them, including creating, subsetting, and merging data frames.

Creating Data Frames

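To create a data frame in R, call the data.frame() function and pass one named vector per column. The vectors should all have the same length; a shorter vector is recycled only when its length divides the longest one evenly, and otherwise data.frame() raises an error. For example: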

data <- data.frame(
  name = c('Alice', 'Bob', 'Charlie'),
  age = c(25, 30, 28),
  height = c(160, 175, 168)
)
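
Once the data frame exists, it helps to check its shape and column types before going further. The following is a minimal sketch using base R's str(), dim(), and sapply(); it rebuilds the same example data so it runs on its own, and the names dims and col_types are chosen here for illustration only.

```r
# Rebuild the example data frame so this snippet is self-contained.
data <- data.frame(
  name = c('Alice', 'Bob', 'Charlie'),
  age = c(25, 30, 28),
  height = c(160, 175, 168)
)

str(data)                         # compact overview: one line per column with its type
dims <- dim(data)                 # integer vector: rows first, then columns
col_types <- sapply(data, class)  # named character vector of column classes
```

Note that in R 4.0.0 and later, string columns such as name stay as character vectors; in earlier versions data.frame() converts them to factors unless you pass stringsAsFactors = FALSE.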

Subsetting Data Frames

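You can subset data frames using square brackets [] or the subset() function. Inside the brackets, the part before the comma selects rows and the part after it selects columns; leaving one side empty keeps everything on that side. To subset rows based on a condition, you can use: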

subset_data <- data[data$age > 25, ]
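
The subset() function expresses the same filter without repeating the data frame name, and the two forms differ when the condition evaluates to NA: bracket indexing returns a row of NAs for each such case, while subset() drops it. The sketch below uses a hypothetical data frame, data_na, with one missing age to show the difference.

```r
# Hypothetical data with a missing age, made up for illustration.
data_na <- data.frame(
  name = c('Alice', 'Bob', 'Charlie'),
  age = c(25, NA, 28)
)

by_bracket <- data_na[data_na$age > 25, ]       # includes an all-NA row for the NA comparison
by_subset <- subset(data_na, age > 25)          # rows where the condition is NA are dropped
by_which <- data_na[which(data_na$age > 25), ]  # bracket form that also drops NA cases
```

Wrapping the condition in which() is one common way to keep bracket syntax while avoiding the extra NA rows.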

To subset specific columns, pass a character vector of column names after the comma:

subset_data <- data[, c('name', 'age')]
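
One detail to watch for: selecting a single column with the [row, column] form returns a plain vector by default rather than a data frame. The sketch below rebuilds the example data and shows two ways to keep the data frame structure; the variable names are illustrative only.

```r
# Rebuild the example data frame so this snippet is self-contained.
data <- data.frame(
  name = c('Alice', 'Bob', 'Charlie'),
  age = c(25, 30, 28),
  height = c(160, 175, 168)
)

ages_vector <- data[, 'age']               # plain numeric vector
ages_frame <- data[, 'age', drop = FALSE]  # one-column data frame
ages_single <- data['age']                 # single-index form, also a one-column data frame
```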

Merging Data Frames

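You can merge data frames using the merge() function, which joins rows whose values match in one or more key columns. By default, only rows whose key appears in both data frames are kept. For example, to merge two data frames, here called data1 and data2, based on a common id column: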

merged_data <- merge(data1, data2, by = 'id')
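
The call above assumes two existing data frames that share an id column. As a runnable sketch, data1 and data2 below are hypothetical tables made up for illustration; the all.x and all arguments of merge() control whether unmatched rows are kept.

```r
# Hypothetical inputs: ids 2 and 3 appear in both tables.
data1 <- data.frame(id = c(1, 2, 3), name = c('Alice', 'Bob', 'Charlie'))
data2 <- data.frame(id = c(2, 3, 4), score = c(88, 92, 75))

inner <- merge(data1, data2, by = 'id')                # only ids present in both tables
left <- merge(data1, data2, by = 'id', all.x = TRUE)   # every row of data1; score is NA where unmatched
full <- merge(data1, data2, by = 'id', all = TRUE)     # every id from either table
```

Unmatched cells are filled with NA, so it is worth checking for missing values after a left or full merge.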

These are just a few of the many operations you can perform with data frames in R. Creating, subsetting, and merging underpin most data analysis and manipulation work in R, so they are worth getting comfortable with first.