10 > 5
[1] TRUE
Logical operations in R
refer to operations that produce or use logical (TRUE or FALSE) values. These include comparisons (e.g. greater than, less than, equal to), negation, and logical and/or/not operations.
Comparison operations allow to compare values and produce a logical outcome.
For instance,
10 > 5
[1] TRUE
Is 10
greater than 5
equals TRUE
These operators become particularly useful when filtering rows based on conditions.
The list of comparison operator is
Comparison Operator | R symbol |
---|---|
Equal to | == |
Not equal to | != |
Greater than or equal to | >= |
Less than or equal to | <= |
Greater than | > |
Less than | < |
Filtering | %in% |
In this tutorial, we are using the swiss
dataset
data("swiss")
head(swiss)
Fertility Agriculture Examination Education Catholic
Courtelary 80.2 17.0 15 12 9.96
Delemont 83.1 45.1 6 9 84.84
Franches-Mnt 92.5 39.7 5 5 93.40
Moutier 85.8 36.5 12 7 33.77
Neuveville 76.9 43.5 17 15 5.16
Porrentruy 76.1 35.3 9 7 90.57
Infant.Mortality
Courtelary 22.2
Delemont 22.2
Franches-Mnt 20.2
Moutier 20.3
Neuveville 20.6
Porrentruy 26.6
<=
, <
, >=
, >
)Let’s filter the rows (cities) based on Education
being less or equal than 2%
in the swiss
dataset. In essence, we are extracting rows representing populations where no more than 2% have completed primary education.
Let’s run
$Education <= 2 swiss
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
For the rows that satisfy or not the condition, we receive a TRUE
or FALSE
.
To filter out the rows that are TRUE
, we can insert the condition within the square brackets (on the left side of the comma ,
)
$Education <= 2, ] swiss[swiss
Fertility Agriculture Examination Education Catholic Infant.Mortality
Echallens 68.3 72.6 18 2 24.20 21.2
Oron 72.5 71.2 12 1 2.40 21.0
Conthey 75.5 85.9 3 2 99.71 15.1
Herens 77.3 89.7 5 2 100.00 18.3
Now we have all the cities (rows) that satisfy the condition.
We can save it into a new data.frame.
= swiss[swiss$Education <= 2, ]
swiss_edu2 swiss_edu2
Fertility Agriculture Examination Education Catholic Infant.Mortality
Echallens 68.3 72.6 18 2 24.20 21.2
Oron 72.5 71.2 12 1 2.40 21.0
Conthey 75.5 85.9 3 2 99.71 15.1
Herens 77.3 89.7 5 2 100.00 18.3
Fertility
more than 50%Agriculture
less or equal than 10%==
OperatorThe ==
operator is used to test if elements are exactly equal. It compares individual elements in a vector, data frame, or other data structures, and returns a logical vector of TRUE
and FALSE
values, indicating whether each comparison was successful.
$Education == 29 swiss
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE
The ==
operator performs element-wise comparison between two vectors or data structures of the same length. Returns a logical vector where each element is TRUE
if the condition is met and FALSE
otherwise.
We can check which positions (e.g. rows) fulfill the condition with which
which(swiss$Education == 29)
[1] 46 47
# rows 46 and 47
We can save the selected rows into a new dataframe.
# save into new dataframe
= swiss[swiss$Education == 29, ]
swiss_edu_29 swiss_edu_29
Fertility Agriculture Examination Education Catholic
Rive Droite 44.7 46.6 16 29 50.43
Rive Gauche 42.8 27.7 22 29 58.33
Infant.Mortality
Rive Droite 18.2
Rive Gauche 19.3
!=
The !=
operator in R is used to determine inequality between elements. It checks if elements in a vector or data structure are not equal to a specified value or another set of elements. The !=
operator is an important tool when you want to exclude certain values or filter out elements that do not match a specific criterion.
If we want to select values not in Education is equal to 29
# save into new dataframe
= swiss[swiss$Education != 29, ]
swiss_edu_not29 1:5, ] swiss_edu_not29[
Fertility Agriculture Examination Education Catholic
Courtelary 80.2 17.0 15 12 9.96
Delemont 83.1 45.1 6 9 84.84
Franches-Mnt 92.5 39.7 5 5 93.40
Moutier 85.8 36.5 12 7 33.77
Neuveville 76.9 43.5 17 15 5.16
Infant.Mortality
Courtelary 22.2
Delemont 22.2
Franches-Mnt 20.2
Moutier 20.3
Neuveville 20.6
%in%
OperatorThe %in%
operator checks if elements from one vector are present in another vector. It is commonly used to filter or select elements that belong to a set of specific values.
Consider checking if the elements of Education
are part of a set of target values: c(2,5,20)
$Education %in% c(2,5,20) swiss
[1] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE TRUE TRUE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Select the rows and create a new data frame
= swiss$Education %in% c(2,5,20)
swiss_edu_select swiss_edu_select
[1] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE TRUE TRUE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
==
and %in%
The symbols ==
and %in%
are related but distinct.
==
when you need to perform exact comparisons between elements.%in%
when you want to check if elements are among a list or set of values.Logical operations are used to combine different conditions.
Logical Operator | R symbol |
|
---|---|---|
AND | & |
returns TRUE if both the conditions are TRUE |
OR | | |
returns TRUE if any of the conditions is TRUE |
NOT | ! |
returns the inverse of the condition |
For instance, let’s select the rows (cities/provinces) with where the percentage of the population with a primary education is greater than 10% and is over 80% Catholic.
We get only two cities (Sarine and Sion)
# we save the dataset based on the condition in a new object
<- swiss[swiss$Education > 10 & swiss$Catholic > 80, ]
swiss_cond1 swiss_cond1
Fertility Agriculture Examination Education Catholic Infant.Mortality
Sarine 82.9 45.2 16 13 91.38 24.4
Sion 79.3 63.1 13 13 96.83 18.1
Using the NOT
operator !
we can get the inverse of this condition
!(swiss$Education > 10 & swiss$Catholic > 80), ] swiss[
We can use the OR
condition (symbol |
) in a similar manner. For example, if we want to filter rows based on two conditions: less than 20% working in Agriculture or above 20% being Catholic, we would do
$Agriculture < 20 | swiss$Catholic > 20, ] swiss[swiss
Using the swiss
dataset
Agriculture
under 30%?Catholic
population and less than 30% of Agriculture
?catholic
population OR
less than 20% of Agriculture
?Answers 1.
Solutions
swiss[swiss$Agriculture < 30, ]
swiss[swiss$Catholic > 50 & swiss$Agriculture < 30, ]
swiss[swiss$Catholic < 20 | swiss$Agriculture < 20, ]
↩︎