
Arrays: repeat the same logic on multiple variables


Lesson Description
-
  • Sometimes we need to apply the same transformation to several variables at once instead of writing the same statement once per variable.
  • This lesson walks through a simple example: replacing the "/" separators in three raw date variables with spaces.
  • We will see one way to do this in SAS (with an array) and two ways in R (with the tidyverse and with base R).
data ENRLMENT;
infile datalines dlm='|' dsd missover;
input STUDY : $6. PT : $4. ICDT_RAW : $11. ENRLDT_RAW : $11. RANDDT_RAW : $11.;
label STUDY ='Study Number' PT ='Subject Identifier' ICDT_RAW ='Informed Consent Date' ENRLDT_RAW ='Enrollment Date' RANDDT_RAW ='Randomization Date';
format ;
datalines4;
CSG001|1001|1/JAN/2010||
CSG001|1002|1/JAN/2010|4/JAN/2010|
CSG001|1003|1/JAN/2010|3/JAN/2010|3/JAN/2010
CSG001|1004|1/JAN/2010|4/JAN/2010|5/JAN/2010
CSG001|1005|15/JAN/2010|1/FEB/2010|5/FEB/2010
CSG001|1006|18/FEB/2010|1/MAR/2010|1/MAR/2010
CSG001|1007|4/APR/2010|14/APR/2010|14/APR/2010
CSG001|1008|20/JUN/2010|26/JUN/2010|27/JUN/2010
;;;;
run;

data enrl01;
    set enrlment;
    array rvars[*] icdt_raw enrldt_raw randdt_raw;
    do i=1 to dim(rvars);
        rvars[i]=translate(rvars[i],' ','/');
    end;
run;
  • The SAS code snippet above demonstrates how to use an array to process multiple variables in a dataset. In this example, the dataset "enrlment" is being processed, and an array named "rvars" is created to hold the variables "icdt_raw," "enrldt_raw," and "randdt_raw."
  • A DO loop then iterates over each element of the array, from 1 to dim(rvars). Inside the loop, TRANSLATE(rvars[i],' ','/') is applied to each element. The argument order is TRANSLATE(source, to, from), so this call replaces every forward slash ("/") in the value with a space. For example, "1/JAN/2010" becomes "1 JAN 2010".
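  • Because each array element refers to an existing variable, the assignment rvars[i]=translate(rvars[i],' ','/') writes the modified value straight back to that variable in the "enrl01" dataset. Note that the loop counter i is also written to "enrl01" unless it is dropped.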
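For comparison, base R has a close analogue of TRANSLATE in chartr(), which also swaps characters one for one. The sketch below is only an illustration on a few hypothetical sample values (raw_dates is not part of the lesson data). Note that the argument order differs: SAS uses TRANSLATE(source, to, from), while R uses chartr(old, new, x).

```r
# Hypothetical sample values that mimic the raw date strings in the lesson
raw_dates <- c("1/JAN/2010", "15/JAN/2010", "")

# chartr(old, new, x): every character found in `old` is replaced by the
# character at the same position in `new`
spaced <- chartr("/", " ", raw_dates)

print(spaced)
# [1] "1 JAN 2010"  "15 JAN 2010" ""
```

Empty strings pass through unchanged, just as missing character values do in the SAS loop.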
library(tidyverse)  # attaches dplyr, tibble, stringr and purrr

enrlment<-tribble(
~study,~pt,~icdt_raw,~enrldt_raw,~randdt_raw,
"CSG001","1001","1/JAN/2010","","",
"CSG001","1002","1/JAN/2010","4/JAN/2010","",
"CSG001","1003","1/JAN/2010","3/JAN/2010","3/JAN/2010",
"CSG001","1004","1/JAN/2010","4/JAN/2010","5/JAN/2010",
"CSG001","1005","15/JAN/2010","1/FEB/2010","5/FEB/2010",
"CSG001","1006","18/FEB/2010","1/MAR/2010","1/MAR/2010",
"CSG001","1007","4/APR/2010","14/APR/2010","14/APR/2010",
"CSG001","1008","20/JUN/2010","26/JUN/2010","27/JUN/2010",
)

enrl01<-enrlment %>% 
  mutate(across(c(icdt_raw, enrldt_raw, randdt_raw), ~ str_replace_all(., "/", " ")))
  • The above R Tidyverse code snippet demonstrates how to use the mutate and across functions to modify multiple variables in a dataset. In this example, the dataset enrlment is being processed.
  • The across function is used to select multiple variables (icdt_raw, enrldt_raw, randdt_raw) to apply a transformation to. The str_replace_all function is then used within mutate to replace all occurrences of the forward slash ("/") with a space (" ") in the selected variables.
  • The tilde (~) notation is used to define an anonymous function to be applied to each selected variable. The dot (.) represents the current variable being processed.
  • Finally, the result of mutate(across(...)) is assigned with <- to the new dataset enrl01. The original enrlment dataset is left unchanged.
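When the variables share a naming pattern, across() can also pick them by pattern instead of by name. The following is a minimal sketch, assuming every column that ends in "_raw" should be cleaned. The enrl_demo and enrl_demo01 names and the two-row subset are made up for illustration.

```r
library(dplyr)
library(stringr)

# Illustrative two-subject subset of the lesson data (hypothetical object name)
enrl_demo <- tibble(
  pt         = c("1001", "1003"),
  icdt_raw   = c("1/JAN/2010", "1/JAN/2010"),
  enrldt_raw = c("", "3/JAN/2010"),
  randdt_raw = c("", "3/JAN/2010")
)

# ends_with("_raw") selects all three raw date columns; the function is
# then applied to each selected column in turn
enrl_demo01 <- enrl_demo %>%
  mutate(across(ends_with("_raw"),
                function(x) str_replace_all(x, fixed("/"), " ")))

print(enrl_demo01)
```

Selecting by pattern means a new "_raw" column would be picked up automatically, which may or may not be what you want. Listing the columns explicitly, as in the lesson code, is the safer choice when that is a concern.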
enrlment <- data.frame(
  study = c("CSG001", "CSG001", "CSG001", "CSG001", "CSG001", "CSG001", "CSG001", "CSG001"),
  pt = c("1001", "1002", "1003", "1004", "1005", "1006", "1007", "1008"),  # character, as in the SAS and tidyverse versions
  icdt_raw = c("1/JAN/2010", "1/JAN/2010", "1/JAN/2010", "1/JAN/2010", "15/JAN/2010", "18/FEB/2010", "4/APR/2010", "20/JUN/2010"),
  enrldt_raw = c("", "4/JAN/2010", "3/JAN/2010", "4/JAN/2010", "1/FEB/2010", "1/MAR/2010", "14/APR/2010", "26/JUN/2010"),
  randdt_raw = c("", "", "3/JAN/2010", "5/JAN/2010", "5/FEB/2010", "1/MAR/2010", "14/APR/2010", "27/JUN/2010")
  , stringsAsFactors = FALSE
)

enrl01 <- enrlment

cols <- c("icdt_raw", "enrldt_raw", "randdt_raw")

enrl01[cols] <- lapply(
  enrl01[cols],
  function(x) gsub("/", " ", x)
)
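  • This base R version needs no extra packages. enrl01 starts as a copy of enrlment, so the original data frame is left unchanged.
  • cols holds the names of the variables to process, much like the SAS array. lapply applies gsub("/", " ", x) to each of those columns, and assigning the result back to enrl01[cols] overwrites the three columns with the cleaned values.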
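For readers coming from SAS, an explicit for loop over the column names may read closer to the array and DO loop pattern. This is a sketch on a small made-up subset (enrl_demo and enrl_loop are hypothetical names), not a replacement for the lapply version.

```r
# Illustrative two-subject subset of the lesson data (hypothetical object name)
enrl_demo <- data.frame(
  pt         = c("1001", "1003"),
  icdt_raw   = c("1/JAN/2010", "1/JAN/2010"),
  randdt_raw = c("", "3/JAN/2010"),
  stringsAsFactors = FALSE
)

# The vector of column names plays the role of the SAS array
cols <- c("icdt_raw", "randdt_raw")

# Work on a copy so the input data frame stays untouched
enrl_loop <- enrl_demo

# One pass per column, like do i=1 to dim(rvars) in SAS
for (col in cols) {
  enrl_loop[[col]] <- gsub("/", " ", enrl_loop[[col]], fixed = TRUE)
}

print(enrl_loop)
```

fixed = TRUE tells gsub to treat "/" as a literal string rather than a regular expression. It makes no difference for "/", but it avoids surprises with characters such as "." or "|".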