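/* Assign the fileref mydir to the folder to be scanned */
filename mydir "C:\Users\curio\Desktop\Rough";

data file_list;
    length full_filename filename folder basename extension $256;

    /* Resolve the fileref to its physical path, then open the folder */
    folder = pathname("mydir");
    did = dopen("mydir");

    if did > 0 then do;
        /* DNUM counts every member of the folder, subfolders included */
        nfiles = dnum(did);
        do i = 1 to nfiles;
            filename = dread(did, i);
            full_filename = cats(folder, "\", filename);

            /* Extract extension and basename */
            if index(filename, ".") then do;
                extension = scan(filename, -1, ".");
                /* Characters before the final period; 0 for names like .gitignore */
                baselen = length(filename) - length(extension) - 1;
                if baselen > 0 then basename = substr(filename, 1, baselen);
                else do;
                    /* Nothing precedes the period: keep the whole name */
                    extension = "";
                    basename = filename;
                end;
            end;
            else do;
                extension = "";
                basename = filename;
            end;

            output;
        end;
        rc = dclose(did);
    end;

    drop did nfiles i rc baselen;
run;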
- We create a fileref called mydir with the FILENAME statement, pointing it at a specific folder on our system
- Inside the DATA step that builds the file_list data set, we declare 256-byte character variables to hold the file name, full path, folder path, base name, and extension
- To get the physical path behind the fileref, we call pathname("mydir") and store the result in folder
- We open the folder with dopen("mydir"), which returns a directory ID (did) for the later calls, or 0 if the folder cannot be opened
- If the folder opens successfully (did > 0), we get the number of members with dnum(did); this count includes subfolders as well as files
- We then loop over the members with a DO loop from 1 to nfiles
- To get each member's name, we call dread(did, i) and save it in filename
- We build the full path by joining the folder and the name with cats(folder, "\", filename)
- To decide whether a name has an extension, we check for a period with index(filename, ".")
- If it does, scan(filename, -1, ".") returns the text after the last period, which becomes extension
- To get the base name (the name without its extension), we use substr() and length() to keep everything before that last period
- If the name contains no period, the whole name becomes the base name and the extension is left blank
- We use output; to write one row per member to the data set
- Once all members are processed, we close the folder with dclose(did)
- Finally, the drop statement removes the internal helper variables so that only the file details remain in the data set
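
The extension and base-name split described above can be tried without reading a real folder. The following is a minimal sketch that applies the same index(), scan(), substr(), and length() logic to a few made-up names. The data set name split_demo and the sample file names are assumptions for illustration only and are not taken from the original folder.

```sas
/* Hypothetical demo: split_demo and the names below are illustrative only */
data split_demo;
    length filename basename extension $256;
    input filename :$256.;

    if index(filename, ".") then do;
        /* Text after the last period */
        extension = scan(filename, -1, ".");
        /* Everything before that last period */
        basename = substr(filename, 1, length(filename) - length(extension) - 1);
    end;
    else do;
        /* No period: the whole name is the base name */
        extension = "";
        basename = filename;
    end;

    datalines;
report.final.xlsx
notes.txt
README
;
run;

proc print data=split_demo noobs;
    var filename basename extension;
run;
```

Because scan() is called with -1, only the text after the last period counts as the extension, so report.final.xlsx splits into the base name report.final and the extension xlsx, while README keeps a blank extension.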