Below is a glossary for all macro commands in the macro package. The glossary contains the syntax, a detailed explanation, and examples for each of the available commands.
#% <comment>
Where “<comment>” is any text string. The macro comment is a single line comment. If you need multiple lines, use multiple macro comments.
#%let <variable> <- <value>
#%let <variable> = <value>
#%let <variable>
Where “<variable>” is the name of the variable, and “<value>” is some value assigned to the variable.
Note that macro variable names must begin with an alphabetic character. They can contain only alphabetic characters, numbers, and the underscore (“_”).
Macro variables should be named without a trailing dot (“.”). The trailing dot will be appended automatically when the variable is added to the symbol table.
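The naming rule above can be expressed as a regular expression. The following is a minimal sketch in base R; the helper name is_valid_macro_name() is hypothetical and is not part of the macro package, which may validate names differently.

```r
# Hypothetical helper (not from the macro package): checks a candidate
# macro variable name against the stated rule. The name must start with
# a letter and contain only letters, digits, and underscores.
is_valid_macro_name <- function(nm) {
  grepl("^[A-Za-z][A-Za-z0-9_]*$", nm)
}

candidates <- c("a", "dev_path", "1abc", "my.var", "x_")
valid <- is_valid_macro_name(candidates)
print(setNames(valid, candidates))
```

Names such as “1abc” (leading digit) and “my.var” (embedded dot) fail the check, while “dev_path” passes.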
Either the left arrow (“<-”) or equals sign (“=”) is a valid assignment operator for macro variables.
To clear a macro variable, leave it unassigned. This action will remove the macro variable from the symbol table.
Macro variables are variables that have a temporary existence during pre-processing. The variables are stored in the pre-processor symbol table, and used as text replacement tokens during pre-processing.
The assigned value is treated as a string, and replaced exactly as assigned. That means if the value is assigned without quotes, it will be replaced without quotes. If the value is quoted, the quotes will be retained during replacement.
Here is a basic macro assignment of numeric values:
#% Assignment using left arrow
#%let a <- 1
x <- a.
#% Assignment using equals sign
#%let b = 2
y <- b.
Here is the resolved output:
x <- 1
y <- 2
Both variables, “a.” and “b.”, have been replaced as assigned.
#% Quoted Assignment
#%let a <- "One"
x <- a.
#% Unquoted Assignment
#%let b <- Two
y <- "b."
z <- 'b.'
The above code will resolve as follows:
x <- "One"
y <- "Two"
z <- 'Two'
Observe that single or double quotes have no effect on macro variable resolution. Both are resolved normally. This behavior is different from SAS.
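The replacement behavior shown above can be approximated in plain R. This is a simplified sketch, not the package's implementation: the symbol table symtab and the helper resolve_line() are hypothetical, and the naive substring match used here would also replace text that merely ends in “a.” or “b.”.

```r
# Hypothetical symbol table: names carry the trailing dot, and values are
# stored exactly as assigned (quotes included, if any).
symtab <- c("a." = "\"One\"", "b." = "Two")

# Hypothetical helper: replaces each macro variable literally, without
# caring whether it sits inside single or double quotes.
resolve_line <- function(line, symtab) {
  for (nm in names(symtab)) {
    line <- gsub(nm, symtab[[nm]], line, fixed = TRUE)
  }
  line
}

r1 <- resolve_line("x <- a.", symtab)      # x <- "One"
r2 <- resolve_line("y <- \"b.\"", symtab)  # y <- "Two"
r3 <- resolve_line("z <- 'b.'", symtab)    # z <- 'Two'
cat(r1, r2, r3, sep = "\n")
```

Because the replacement is purely textual, any quotes in the resolved code come either from the assigned value or from the surrounding code.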
If you do not quote the macro variable value, and do not quote the macro variable to be replaced, it will result in an error:
#% Unquoted Assignment
#%let c <- Three
# Unquoted resolution
z <- c.
The error is:
Error in eval(ei, envir) : object 'Three' not found
The error occurs because the value “Three” is now left unquoted in open code. R treats it as the name of a variable or object, and tries to look it up in the environment. If the name cannot be found in the environment, R generates an error.
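The same failure can be reproduced in base R without the macro package. The sketch below evaluates the resolved text in a fresh environment, which is an assumption made here so that no existing object named Three can interfere.

```r
# Evaluate the resolved line "z <- Three" in an isolated environment.
# Three is an unquoted symbol, so R attempts a lookup and fails.
res <- tryCatch(
  eval(parse(text = "z <- Three"), envir = new.env(parent = baseenv())),
  error = function(e) conditionMessage(e)
)
print(res)

# Quoting the value turns it into a string literal, which evaluates fine.
ok <- eval(parse(text = "z <- \"Three\""), envir = new.env(parent = baseenv()))
print(ok)
```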
If you want to remove an existing macro variable from the symbol table, leave it unassigned. Like this:
#% Macro assignment
#%let b <- 2
x <- b.
#% Clear assignment
#%let b
y <- b.
The above code will generate an error, because the statement y <- b. runs after the variable has been cleared. Here is the error that is generated:
Error in mreplace(ln) :
Macro variable 'b.' not valid for text replacement.
Macro variables can also be assigned in regular R code, as long as it is done before msource() is executed. In the code below, the macro variable “c.” is assigned the value 3, and then the template code file “temp1.R” is executed via msource():
# Assignment before msource()
c. <- 3
# Pre-process
msource("./temp1.R", "./temp1_mod.R")
Here is the file “temp1.R”:
# Print macro variable
z <- c.
When the “temp1.R” file is resolved to “temp1_mod.R”, it will look like this:
# Print macro variable
z <- 3
You can see that the macro variable “c.” is replaced normally. This works because any variable with a trailing dot (“.”) in the parent environment is copied to the macro symbol table when the pre-processor begins.
Also note that parent macro assignments do not need to take place in the same code file as msource(). You can assign them in a setup file if desired. As long as they exist in the parent environment, the macro variables will be copied to the symbol table.
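To illustrate the idea of picking up trailing-dot variables from an environment, here is a minimal base R sketch. It is not the package's actual code: the environment env stands in for the parent environment, and dot_names and symtab are hypothetical names.

```r
# Stand-in for the parent environment of msource()
env <- new.env()
assign("c.", 3, envir = env)      # would be treated as a macro variable
assign("other", 10, envir = env)  # no trailing dot, so it is ignored

# Collect every variable whose name ends in a dot
dot_names <- ls(env, pattern = "\\.$")
symtab <- mget(dot_names, envir = env)
str(symtab)
```

Only “c.” ends up in the collected list, which mirrors the behavior described above.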
#%if <condition>
#%if (<condition>)
#%ifelse <condition>
#%ifelse (<condition>)
#%else
#%end
Where “<condition>” is an expression that evaluates to TRUE or FALSE.
A macro conditional block must begin with an #%if condition and be finalized with an #%end. The #%ifelse and #%else blocks are optional. Failure to finalize the block with an #%end will result in an error or other unexpected behavior.
Parentheses around the conditional expressions are optional but recommended.
The syntax of conditionals in the macro package is somewhat different from SAS. One difference is that the %do directive in SAS has been entirely eliminated from the macro package syntax.
Another difference is that the #%ifelse command is one word instead of two.
A third difference is that in SAS, each conditional section must be concluded with an %end. In the macro package, you only need one #%end at the conclusion of the entire conditional. In general, the macro package syntax is simpler.
Conditional expressions should be regular R expressions, using R syntax. For comparisons, the expression should use the double-equals (“==”) comparison operator, as in a normal R “if” statement. Likewise, all other comparison operators should follow Base R syntax.
A conditional expression may contain macro variables. These macro variables will be resolved before the expression is evaluated. Remember that macro variables are text replacements, and will be resolved accordingly.
The conditional expression may also contain R functions like any() and all(). In most cases, these functions do not need to be wrapped in %sysfunc(). They will evaluate properly as part of the expression.
If an expression in a macro “if” block is TRUE, the code inside that block will be emitted during pre-processing. If the expression evaluates to FALSE, code inside that block will be ignored.
Macro conditions may be nested inside one another. There is no limit on the number of nested levels.
Here is a simple condition to construct a path. The source directory can change depending on the environment.
#%let dev_path <- ./dev/data
#%let prod_path <- ./prod/data
#%let env <- dev
# Path to data
#%if ("env." == "prod")
pth <- "prod_path./dm.sas7bdat"
#%else
pth <- "dev_path./dm.sas7bdat"
#%end
When run through the macro processor, the above code will generate the following:
# Path to data
pth <- "./dev/data/dm.sas7bdat"
The pre-processor selected the “dev” path as indicated, constructed the path appropriately, and removed all macro statements.
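The selection in the example above can be traced by hand in plain R: the macro variable is first replaced as text, and the resulting expression is then evaluated. This is only a sketch of the two steps under that assumption, not the pre-processor's actual code, and the variable names are hypothetical.

```r
# The condition as written in the macro-enabled program
cond <- '"env." == "prod"'

# Step 1: text replacement of the macro variable env. with its value dev
resolved <- gsub("env.", "dev", cond, fixed = TRUE)
print(resolved)

# Step 2: evaluate the resolved text as a regular R expression
use_prod <- isTRUE(eval(parse(text = resolved)))
print(use_prod)
```

Because the resolved condition is FALSE, the #%else branch is the one emitted.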
Macro conditionals may also be nested inside one another. Nesting allows you to construct more complicated logic. Here is an example:
#% Data source SAS or RDS
#%let src <- SAS
#% Select analysis variables
#%let anal_vars <- c("AGE", "AGEG", "SEX", "RACE", "PULSE", "TEMP")
###################
# Get data
###################
#%if ("src." == "SAS")
library(haven)
# Get adsl dataset
adsl <- read_sas("./data/ADSL.sas7bdat")
#%if (any(c("PULSE", "TEMP", "BP") %in% anal_vars.))
# Get advs dataset
advs <- read_sas("./data/ADVS.sas7bdat")
#%end
#%else
# Get adsl dataset
adsl <- readRDS("./data/ADSL.rds")
#%if (any(c("PULSE", "TEMP", "BP") %in% anal_vars.))
# Get advs dataset
advs <- readRDS("./data/ADVS.rds")
#%end
#%end
When the nested conditionals resolve, they will look like this:
###################
# Get data
###################
library(haven)
# Get adsl dataset
adsl <- read_sas("./data/ADSL.sas7bdat")
# Get advs dataset
advs <- read_sas("./data/ADVS.sas7bdat")
In this way, you can perform complex logic in the macro pre-processor, and still produce a refined output.
#%include <path>
#%include '<path>'
#%include "<path>"
Where “<path>” is a path to the file to include. The file path may be quoted or unquoted. If the path is quoted, single or double quotes may be used.
The #%include macro command inserts text from an external file into the generated output file. This behavior is different from the Base R source() function, which only executes the code from the external file. A macro include actually copies the code into the generated file. The included code does not have to be fully functional. It can be a code snippet that only works when combined with other included code.
The below example includes code from the file “dat01.R”. This file contains a snippet of sample data:
# Create sample data
#%include "./templates/dat01.R"
# Print sample data
print(dat)
The resolved macro will look like this:
# Create sample data
dat <- read.table(header = TRUE, text = '
SUBJID ARM SEX RACE AGE
"001" "ARM A" "F" "WHITE" 19
"002" "ARM B" "F" "WHITE" 21
"003" "ARM C" "F" "WHITE" 23
"004" "ARM D" "F" "BLACK" 28
"005" "ARM A" "M" "WHITE" 37
"006" "ARM B" "M" "WHITE" 34
"007" "ARM C" "M" "WHITE" 36
"008" "ARM D" "M" "WHITE" 30
"009" "ARM A" "F" "ASIAN" 39
"010" "ARM B" "F" "WHITE" 31
"011" "ARM C" "F" "BLACK" 33
"012" "ARM D" "F" "WHITE" 38
"013" "ARM A" "M" "BLACK" 37
"014" "ARM B" "M" "WHITE" 34
"015" "ARM C" "M" "ASIAN" 36
"016" "ARM A" "M" "WHITE" 40')
# Print sample data
print(dat)
Notice that the included code is integrated with the surrounding code from the macro-enabled program. In this way, you can collate a program from multiple code snippets into a unified result.
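The difference between copying code and executing it can be sketched in base R. The helper expand_includes() below is hypothetical and much simpler than the real pre-processor: it handles only a quoted or unquoted path on a single #%include line, and the snippet file is a temporary file created for the illustration.

```r
# Write a small snippet to a temporary file (stand-in for "dat01.R")
snippet <- tempfile(fileext = ".R")
writeLines("dat <- data.frame(SUBJID = c(\"001\", \"002\"))", snippet)

# A macro-enabled program held as a character vector
program <- c("# Create sample data",
             sprintf("#%%include \"%s\"", snippet),
             "# Print sample data",
             "print(dat)")

# Hypothetical helper: splice the file contents in place of each include
expand_includes <- function(lines) {
  out <- character(0)
  for (ln in lines) {
    if (grepl("^#%include\\s+", ln)) {
      pth <- trimws(sub("^#%include\\s+", "", ln))
      pth <- gsub("^[\"']|[\"']$", "", pth)  # drop optional quotes
      out <- c(out, readLines(pth))
    } else {
      out <- c(out, ln)
    }
  }
  out
}

resolved <- expand_includes(program)
cat(resolved, sep = "\n")
```

Unlike source(), nothing is executed here. The snippet's text simply becomes part of the output program.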
The macro package currently supports two macro functions: %sysfunc() and %symexist().
%sysfunc(<expression>, [<format>])
Where “<expression>” is an R expression that resolves to a numeric value, and “<format>” is an optional format code. The expression may contain R functions, operators, hard-coded values, and macro variables. If “<format>” is supplied, the function will apply the format after the expression is resolved, and return the formatted value. The format code does not need to be quoted. If the format is unquoted, the output of the %sysfunc() function will be unquoted. If the format is quoted, the output of the %sysfunc() function will be quoted.
%symexist(<name>)
Where “<name>” is an unquoted name of a macro variable. If the macro variable exists, the function will return TRUE. Otherwise, the function will return FALSE.
The %sysfunc() and %symexist() functions are helper functions that are usually contained in other macro statements. Most commonly, they will exist as part of a “let” or “if” statement. For this reason, the syntax does not include a leading comment symbol (“#”).
The purpose of %sysfunc() is to evaluate regular R expressions during macro processing. The evaluated result can then be assigned to a macro variable, or used in a macro condition.
The purpose of %symexist() is to check for the existence of a macro variable. This check is likewise most often performed as part of a macro conditional expression.
Here are some examples illustrating the basic operation of %sysfunc():
#% Unevaluated assignment
#%let a <- 2 + 2
w <- a.
#% Evaluated assignment
#%let b <- %sysfunc(2 + 2)
x <- b.
#% Unevaluated replacement
#%let c <- a. + b.
y <- c.
#% Evaluated replacement
#%let d <- %sysfunc(a. + b.)
z <- d.
The above code resolves as follows:
w <- 2 + 2
x <- 4
y <- 2 + 2 + 4
z <- 8
Observe the different ways each expression resolves depending on the use of %sysfunc().
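The contrast between stored text and evaluated text can be mimicked in base R with parse() and eval(). This sketch assumes that %sysfunc() behaves like evaluating the resolved expression and converting the result back to text. The variable names are hypothetical.

```r
# Unevaluated assignment: the value is kept as text
a_text <- "2 + 2"

# Evaluated assignment: the text is evaluated, then stored as text again
b_text <- as.character(eval(parse(text = "2 + 2")))

# Unevaluated replacement: the two texts are simply combined
c_text <- paste(a_text, "+", b_text)

# Evaluated replacement: the combined text is evaluated
d_text <- as.character(eval(parse(text = c_text)))

cat(a_text, b_text, c_text, d_text, sep = "\n")
```

The four values correspond to the right-hand sides of w, x, y, and z in the resolved output above.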
Here is another example:
#%let a <- c(1.205, 4.683, 3.812, 6.281, 9.467)
#%if (%sysfunc(mean(a.)) > 5)
x <- "> 5"
#%else
x <- "<= 5"
#%end
print(x)
The above macro function evaluates the mean of “a.”, compares it to the hard-coded value 5, and then sets the value of x. The %sysfunc() macro function allows for immediate evaluation of the Base R mean() function.
Note that you do not need multiple calls to %sysfunc() if you have multiple Base R functions in your expression. Just put your entire expression inside one call to %sysfunc() and it will evaluate as desired.
Below is a simple example checking for the existence of some macro variables before using them in a calculation:
#% Check for existence of a
#%if (%symexist(a) == FALSE)
#%let a <- 2
#%end
#% Check for existence of b
#%if (%symexist(b) == FALSE)
#%let b <- 3
#%end
# Calculate a * b
x <- a. * b.
And here is the resolved version:
# Calculate a * b
x <- 2 * 3
Notice that the variable name inside the %symexist() function is not quoted. For this function, quoting is not necessary, and will in fact prevent the function from finding the specified variable name.