How To Subset Data In R Based On Condition

This article continues the data science examples started in our data frame tutorial. We're using the ChickWeight data frame example, which is included in the standard R distribution. You can easily get to this by typing: data(ChickWeight) in the R console. This data frame captures the weight of chickens that were fed different diets over a period of 21 days. If you can imagine someone walking around a research farm with a clipboard for an agricultural experiment, you've got the right idea…
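As a quick sketch of that first step, the lines below load the data set and take a first look at it. Only base R functions are used, and the choice of inspection calls is ours rather than something prescribed by the tutorial.

```r
# Load the built-in ChickWeight data set (part of R's datasets package)
data(ChickWeight)

# Inspect the structure: column names, types, and a few sample values
str(ChickWeight)

# Show the first rows to see what a single observation looks like
head(ChickWeight)

# Overall size of the data frame: rows (observations) and columns
dim(ChickWeight)
```

The columns used throughout the rest of this article are weight, Time, Chick, and Diet.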

We're going to walk through how to extract slices of a data frame in R programming. This series has a couple of parts – feel free to skip ahead to the most relevant parts.

  • Inspecting your data
  • Ways to Select a Subset of Data From an R Data Frame
  • How To Create an R Data Frame
  • How To Sort an R Data Frame
  • How to Add and Remove Columns
  • Renaming Columns
  • How To Add and Remove Rows
  • How to Merge Two Data Frames

Selecting a Subset of an R Data Frame

So let us suppose we only want to look at a subset of the data, perhaps only the chicks that were fed diet #4?

To do this, we're going to use the subset command. We are also going to save a copy of the results into a new data frame (which we will call testdiet) for easier manipulation and querying. The nrow and length functions do the rest.

    # subset in r example
    testdiet <- subset(ChickWeight, Diet==4)
    nrow(testdiet)
    length(unique(testdiet$Chick))

Running our row count and unique chick counts again, we determine that our data has a total of 118 observations from the 10 chicks fed diet 4.
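To put those counts in context, the same two questions can be asked of every diet at once. This is a minimal sketch using base R's table and tapply; the variable names obs_per_diet and chicks_per_diet are ours, chosen for illustration.

```r
data(ChickWeight)

# Number of observations (rows) recorded for each diet
obs_per_diet <- table(ChickWeight$Diet)
print(obs_per_diet)

# Number of distinct chicks on each diet
chicks_per_diet <- tapply(ChickWeight$Chick, ChickWeight$Diet,
                          function(ch) length(unique(ch)))
print(chicks_per_diet)
```

The entries for diet 4 should match the nrow and length(unique(...)) results from the subset above.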

How to Subset Data in R – Multiple Conditions

The subset command in base R (subset in R) is extremely useful and can be used to filter data using multiple conditions. For example, perhaps we would like to look at only observations taken with a late time value. This allows us to ignore the early "noise" in the data and focus our analysis on mature birds. Returning to the subset function, we enter:

    # subset in r data frame multiple conditions
    # use the element-wise & (not &&) so the test is applied to every row
    subset(ChickWeight, Diet==4 & Time == 21)
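When combining conditions for row filtering, the element-wise operator & is the one to reach for: it returns one TRUE/FALSE per row. The && operator is meant for single values (for example inside an if statement), and recent versions of R raise an error when it is given longer vectors. The sketch below shows the element-wise form; the names keep and late_diet4 are ours.

```r
data(ChickWeight)

# Element-wise AND: one TRUE/FALSE per row, which is what row filtering needs
keep <- ChickWeight$Diet == 4 & ChickWeight$Time == 21
length(keep) == nrow(ChickWeight)   # one flag per observation

# The same test inside subset(); only rows where both conditions hold remain
late_diet4 <- subset(ChickWeight, Diet == 4 & Time == 21)
nrow(late_diet4)

# Every remaining row satisfies both conditions
all(late_diet4$Diet == 4 & late_diet4$Time == 21)
```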

You can also use the subset command to select specific fields within your data frame, to simplify processing. In this case, we will filter based on column value.

    # subset in r
    # keep only the weight and Time columns for diet 4 rows recorded after day 20
    testdiet <- subset(ChickWeight, select=c(weight, Time), subset=(Diet==4 & Time > 20))

This version of the subset command narrows your data frame down to only the elements you want to look at.

Other Ways to Subset A Data Frame in R

There are actually many ways to subset a data frame using R. While the subset command is the simplest and most intuitive way to handle this, you can manipulate data directly from the data frame syntax. Consider:

    # subset in r - conditional indexing
    testdiet <- ChickWeight[ChickWeight$Diet==4,]

This approach is referred to as conditional indexing. We select rows from the data frame by applying a logical condition to one of its columns. Any row meeting that condition (within that column) is returned, in this case, the observations from birds fed the test diet. You could also use it to filter out a record from the data set with a missing value in a specific column – where the data collection process failed, for example…
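The missing-value case can be sketched as follows. Since we assume ChickWeight itself has no missing weights, the example blanks out two values by hand on a copy (the names cw and cw_clean, and the choice of rows 3 and 10, are purely illustrative) and then uses conditional indexing with is.na to drop them.

```r
data(ChickWeight)

# Work on a plain copy so the built-in data set is left untouched
cw <- as.data.frame(ChickWeight)

# ASSUMPTION: the original data has no missing weights, so we blank out
# two values by hand purely to have something to filter
cw$weight[c(3, 10)] <- NA

# Keep only the rows where a weight was actually recorded
cw_clean <- cw[!is.na(cw$weight), ]

nrow(cw) - nrow(cw_clean)   # number of rows dropped
```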

You can, in fact, use this syntax for selections with multiple conditions (using the column name and column value for multiple columns). The code below yields the same result as the examples above.

    # subset in r data frame multiple conditions
    bigbirds <- ChickWeight[(ChickWeight$Diet==4) & (ChickWeight$Time==21),]
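One way to convince yourself that the two styles agree is to run both and compare what they select. The sketch below compares row names and weights rather than the whole objects, to avoid depending on how each approach carries along extra attributes; the variable names are ours.

```r
data(ChickWeight)

# The same filter written two ways
via_subset   <- subset(ChickWeight, Diet == 4 & Time == 21)
via_indexing <- ChickWeight[ChickWeight$Diet == 4 & ChickWeight$Time == 21, ]

# Compare the selected rows and their recorded weights
same_rows    <- identical(rownames(via_subset), rownames(via_indexing))
same_weights <- identical(via_subset$weight, via_indexing$weight)

# Both values are single TRUE/FALSE flags, so && is appropriate here
same_rows && same_weights
```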

You can use logical operators to combine conditions. The AND operator (&) indicates both logical conditions are required. You also have the option of using an OR operator (|), indicating a record should be included in the result if it meets either logical condition. A possible example of this is below.

    # subset in r
    endpoints <- ChickWeight[(ChickWeight$Time < 3) | (ChickWeight$Time > 20),]

In this case, we are asking for all of the observations recorded either early in the experiment or late in the experiment.
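A quick sanity check on an OR filter like this is to look at which measurement days actually made it into the result. The following is a small sketch using base R only.

```r
data(ChickWeight)

# Rows from the start (Time < 3) or the end (Time > 20) of the experiment
endpoints <- ChickWeight[(ChickWeight$Time < 3) | (ChickWeight$Time > 20), ]

# Which measurement days survived the filter?
sort(unique(endpoints$Time))

# How many observations fall on each of those days
table(endpoints$Time)
```

Any day between the two cut-offs should be absent from both outputs.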

This can be a powerful way to transform your original data frame, using logical subsetting to prune specific elements (selecting rows with missing value(s) or multiple columns with bad values). This allows you to remove the observation(s) where you suspect external factors (data collection error, special causes) have distorted your results.

There is also the which function, which is slightly easier to read.

    # which function in R - select columns returned
    ChickWeight[which((ChickWeight$Diet == 4) & (ChickWeight$Time == 21)),
                names(ChickWeight) %in% c("weight","Time")]

This also yields the same basic result as the examples above, although we are also demonstrating in this example how you can reduce the number of columns returned, here by matching column names with %in%. We specify that we only want to look at weight and Time in our subset of data.
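Beyond readability, which behaves differently from plain logical indexing when the condition contains NA: it returns only the positions that are TRUE. The sketch below illustrates this on a copy of the data with one Time value blanked out by hand; the injected NA and the names cw, cond, with_na, and without_na are assumptions made for the example.

```r
data(ChickWeight)
cw <- as.data.frame(ChickWeight)

# ASSUMPTION: inject one missing Time value into a diet 4 row for illustration
idx <- which(cw$Diet == 4)[1]
cw$Time[idx] <- NA

# The condition is NA for that row (TRUE & NA gives NA)
cond <- cw$Diet == 4 & cw$Time == 21

# Plain logical indexing returns an extra all-NA row for the NA condition
with_na <- cw[cond, c("weight", "Time")]

# which() keeps only the TRUE positions, so the NA is silently dropped
without_na <- cw[which(cond), c("weight", "Time")]

nrow(with_na) - nrow(without_na)   # extra rows produced by plain indexing
```

Note that subset also drops rows where the condition is NA, so it behaves like the which version here.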

Ready for more? Let's move on to creating your own R data frames from raw data. Or feel free to skip around our tutorial on manipulating a data set using the R language.

  • Inspecting your data
  • Ways to Select a Subset of Data From an R Data Frame
  • How To Create an R Data Frame
  • How To Sort an R Data Frame
  • How to Add and Remove Columns
  • Renaming Columns
  • How To Add and Remove Rows
  • How to Merge Two Data Frames


Source: https://www.programmingr.com/examples/r-dataframe/subset-an-r-data-frame/

Posted by: gloverfign1969.blogspot.com
