
conditional logic - case_when - .default and missing values


Lesson Description
  • When no condition matches, case_when() returns NA unless .default is set.
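  • A comparison with NA (for example NA < 18) evaluates to NA, which case_when() treats as no match, so missing values must be caught explicitly with is.na().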
  • Use a separate label for missing values when needed.
library(dplyr)

# Sample data
dm <- tibble(
  USUBJID = c("101-001", "101-002", "101-003", "101-004"),
  AGE = c(35, 64, NA, 17)
)

# .default used when no condition matches
# (the subject with AGE = NA also falls through to .default here)
dm1 <- dm %>%
  mutate(agegrp = case_when(
    AGE < 18 ~ "< 18 Years",
    AGE >= 60 ~ ">= 60 Years",
    .default = "18-59 Years"
  ))

# Missing values handled separately
dm2 <- dm %>%
  mutate(agegrp = case_when(
    is.na(AGE) ~ "Age Missing",
    AGE < 18 ~ "< 18 Years",
    AGE >= 60 ~ ">= 60 Years",
    .default = "18-59 Years"
  ))
  • .default supplies a fallback when no condition matches.
  • Handle missing values explicitly with is.na(); otherwise a missing AGE falls through to .default and is mislabelled "18-59 Years", as in dm1.
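The three behaviours described above can be compared side by side. The following is a minimal sketch, assuming dplyr 1.1.0 or later (the release that introduced .default); the column names grp_no_default, grp_default, and grp_explicit are illustrative and not part of the lesson data.

```r
library(dplyr)

# Same sample data as in the lesson
dm <- tibble(
  USUBJID = c("101-001", "101-002", "101-003", "101-004"),
  AGE = c(35, 64, NA, 17)
)

res <- dm %>%
  mutate(
    # No .default: rows that match no condition get NA
    grp_no_default = case_when(
      AGE < 18 ~ "< 18 Years",
      AGE >= 60 ~ ">= 60 Years"
    ),
    # .default only: the missing age falls through to the fallback label
    grp_default = case_when(
      AGE < 18 ~ "< 18 Years",
      AGE >= 60 ~ ">= 60 Years",
      .default = "18-59 Years"
    ),
    # is.na() handled explicitly: the missing age gets its own label
    grp_explicit = case_when(
      is.na(AGE) ~ "Age Missing",
      AGE < 18 ~ "< 18 Years",
      AGE >= 60 ~ ">= 60 Years",
      .default = "18-59 Years"
    )
  )

print(res)
```

For subject 101-003 (AGE = NA), grp_no_default is NA, grp_default is "18-59 Years", and grp_explicit is "Age Missing". Only the last column keeps the missing age out of a real age group.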
# Base R equivalent using ifelse and is.na

dm <- data.frame(
  USUBJID = c("101-001", "101-002", "101-003", "101-004"),
  AGE = c(35, 64, NA, 17),
  stringsAsFactors = FALSE
)

agegrp <- ifelse(
  is.na(dm$AGE), "Age Missing",
  ifelse(dm$AGE < 18, "< 18 Years",
    ifelse(dm$AGE >= 60, ">= 60 Years", "18-59 Years")
  )
)

dm$agegrp <- agegrp
  • Nested ifelse() replicates the case_when() logic.
  • Check is.na() first to label missing values; without that check, ifelse() returns NA for a missing AGE.
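To see why the is.na() check matters in base R, the sketch below runs the nested ifelse() with and without it. The vector name age and the result names no_check and with_check are illustrative only.

```r
# Ages from the lesson's sample data
age <- c(35, 64, NA, 17)

# Without an is.na() check, the NA comparison propagates and the result is NA
no_check <- ifelse(age < 18, "< 18 Years",
  ifelse(age >= 60, ">= 60 Years", "18-59 Years")
)

# With is.na() checked first, the missing age gets its own label
with_check <- ifelse(
  is.na(age), "Age Missing",
  ifelse(age < 18, "< 18 Years",
    ifelse(age >= 60, ">= 60 Years", "18-59 Years")
  )
)

print(no_check)
print(with_check)
```

Unlike case_when() with .default, nested ifelse() does not silently place the missing age in the fallback group: the third element of no_check is NA, while in with_check it is "Age Missing".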