The Select Function

The select() function picks specific columns from your data and saves them as a new data frame. This is useful when you have a large dataset and only need a few of its columns, which keeps things simple and tidy. You may also want to take a column or two from several different datasets and combine them.

weeds_select <- select(weeds, soil)

This creates the weeds_select dataset by selecting one column, “soil”, from weeds. As with most tidyverse functions, the dataset is the first argument to select(), followed by the columns you want. From here, small changes let you use select() in new ways:

weeds_select <- select(weeds, c(soil, species)) # select two columns, "soil" and "species"

weeds_select <- select(weeds, c(2:4)) # select columns using numbers. In this case, select columns 2 through to 4.

weeds_select <- select(weeds, c(soil:flowers)) # select columns "soil" through to "flowers"

weeds_select <- select(weeds, -soil) # remove "soil"
# similar syntax applies for removing multiple columns: just place a - in front, e.g. select(weeds, -c(2:4))
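
To make the removal syntax concrete, here is a minimal sketch. The weeds_demo data frame and its columns are hypothetical stand-ins invented for illustration, since the real weeds data is not shown here; only select() and the - prefix come from the text above.

```r
library(dplyr)

# Hypothetical stand-in for the weeds data; the real columns may differ.
weeds_demo <- data.frame(
  plot = 1:3,
  soil = c("clay", "loam", "sand"),
  species = c("A", "B", "A"),
  flowers = c(4, 7, 2),
  seed_weight = c(0.8, 1.1, 0.5)
)

# Remove two columns by name.
no_soil_species <- select(weeds_demo, -c(soil, species))
names(no_soil_species)

# Remove columns 2 through 4 by position.
no_middle <- select(weeds_demo, -c(2:4))
names(no_middle)
```

Removing by name is usually safer than removing by position, because positions shift if columns are added or reordered.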

weeds_select <- select(weeds, starts_with("s")) # select any column whose name starts with "s" (matching ignores case by default)

There are many more helpers like the one in the example above, such as “ends_with”, “contains” and “matches”, all of which refer to the column names.
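
As a rough illustration of these helpers, the sketch below runs each one on a small made-up data frame. The weeds_demo data and its column names are assumptions for the example, not the real weeds dataset.

```r
library(dplyr)

# Hypothetical stand-in for the weeds data; the real columns may differ.
weeds_demo <- data.frame(
  plot = 1:3,
  soil = c("clay", "loam", "sand"),
  species = c("A", "B", "A"),
  flowers = c(4, 7, 2),
  seed_weight = c(0.8, 1.1, 0.5)
)

# Columns whose names end with "s".
ends_s <- select(weeds_demo, ends_with("s"))

# Columns whose names contain the literal text "ee".
has_ee <- select(weeds_demo, contains("ee"))

# Columns whose names match a regular expression:
# here, names that start with "s" and end with "l".
s_to_l <- select(weeds_demo, matches("^s.*l$"))

names(ends_s)
names(has_ee)
names(s_to_l)
```

Note that contains() looks for literal text, while matches() treats its argument as a regular expression.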

Use the help page (?select) for more useful functions that work with select().