Chapter 6 Prediction Control Script prediction-user-input-sim.r
This section explains how to write a prediction-user-input-sim.r control file for predicting the incidence rate using new simulated exposures. The examples shown below can be used as a template by combining all chunks in order (the user needs to update the column names and options accordingly).
The control script requires the tidyverse package.
library(tidyverse)
6.1 Bins of Observed Exposure Values
The observed exposure values are grouped into bins to calculate the observed incidence rates, which are then compared with the model predictions. If bin_n_obs is missing, the observed exposures are grouped into 5 bins by default.
#number of bins for grouping exposure values of observations (the data used in modeling)
bin_n_obs <- 7 #Default value is 5
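The binning step can be pictured with a minimal base-R sketch. It assumes quantile-based bins (roughly equal numbers of subjects per bin); the tool's actual binning rule is not stated in this chapter. The data frame `obs` and its columns `CAVE1` and `event` are hypothetical stand-ins for the observed data.

```r
# Hypothetical observed data: one exposure column and a 0/1 event indicator.
set.seed(1)
obs <- data.frame(
  CAVE1 = exp(rnorm(200, mean = 2, sd = 0.5)),
  event = rbinom(200, size = 1, prob = 0.3)
)

bin_n_obs <- 7

# Assumption: quantile-based break points, so each bin holds a similar
# number of subjects. bin_n_obs bins need bin_n_obs + 1 break points.
breaks <- quantile(obs$CAVE1, probs = seq(0, 1, length.out = bin_n_obs + 1))
obs$bin <- cut(obs$CAVE1, breaks = breaks, include.lowest = TRUE)

# Observed incidence rate per bin = proportion of subjects with an event.
obs_rate <- aggregate(event ~ bin, data = obs, FUN = mean)
print(obs_rate)
```

Each row of `obs_rate` is one bin; these per-bin proportions are the kind of observed rates that would be overlaid on the model-predicted curve.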
6.2 Simulated Exposure Data Set
sim_inc_expo_data is the file name of the new exposure data set. The file must be in the same folder as all other modeling results.
#Provide the new data set for prediction
sim_inc_expo_data <- "simdata.csv"
Obs_Expo_list gives the exposure metric (original scale) column names in obsdata.csv, and Sim_Expo_list gives the corresponding column names in simdata.csv. The same exposure metric often has different column names in different data sets.
If Obs_Expo_list is missing, it is set to all exposure metric columns in obsdata.csv.
If Sim_Expo_list is missing, it is set to the same value as Obs_Expo_list.
#Selected Exposure Metric(s) column(s) in Obs Data (not sim data) shown on prediction plot
#If Obs_Expo_list is missing, Obs_Expo_list = names(orig.exposureCov) which was saved in the modeling result
Obs_Expo_list <- c("CAVE1", "CAVE2")
#Metric column(s) in simulation data to be used as Exposure(s) on plot
#the order needs to be consistent with Obs_Expo_list
#If Sim_Expo_list is missing, Sim_Expo_list = Obs_Expo_list
Sim_Expo_list <- c("CAVE1", "CAVE2")
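When the two data sets use different column names, the two lists are matched position by position. The sketch below shows one way such a mapping could be applied. The column names `CAVESS_1` and `CAVESS_2` and the data frame `sim` are hypothetical, and the renaming approach is an illustration rather than the tool's documented behavior.

```r
Obs_Expo_list <- c("CAVE1", "CAVE2")
Sim_Expo_list <- c("CAVESS_1", "CAVESS_2")  # hypothetical names in simdata.csv

# Hypothetical simulated exposure data.
sim <- data.frame(
  GROUP    = c("10 mg QD", "30 mg QD"),
  CAVESS_1 = c(1.2, 3.5),
  CAVESS_2 = c(2.4, 7.1)
)

# The lists must have the same length, and every simulated-data
# column named in Sim_Expo_list must exist.
stopifnot(length(Obs_Expo_list) == length(Sim_Expo_list),
          all(Sim_Expo_list %in% names(sim)))

# Rename position by position: the i-th entry of Sim_Expo_list
# takes the i-th name in Obs_Expo_list.
names(sim)[match(Sim_Expo_list, names(sim))] <- Obs_Expo_list
print(names(sim))
```

Because the pairing is positional, swapping the order of one list without the other would silently pair the wrong metrics, which is why the order must be consistent.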
6.3 New Exposure Summary Metric
The center of exposures in each group can be calculated as the geometric mean:
#Statistic of Exposure used to calculate fitted value.
#"median", "geomean";
Center_Metric <- "geomean" #Default value "geomean"
Center_Metric_name <- "geometric mean"
It could also be calculated as the median:
#Statistic of Exposure used to calculate fitted value.
#"median", "geomean";
Center_Metric <- "median" #Default value "geomean"
Center_Metric_name <- "Median"
If Center_Metric is missing, the default value is “geomean”. If Center_Metric_name is missing, the metric name will be “geometric mean” or “median” based on Center_Metric.
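For reference, the two summary statistics can be computed as in the sketch below. The helper `geomean()` and the example vector `expo` are illustrative only; the geometric mean is taken as the exponential of the mean of the logs, which requires strictly positive exposures.

```r
# Hypothetical exposure values for one group (must be > 0 for the geometric mean).
expo <- c(1, 10, 100, 1000)

# Geometric mean: exponentiate the arithmetic mean of the log values.
geomean <- function(x) exp(mean(log(x)))

Center_Metric <- "geomean"

# Pick the summary function that corresponds to Center_Metric.
center_fun <- switch(Center_Metric,
                     geomean = geomean,
                     median  = stats::median)

center_value <- center_fun(expo)
print(center_value)   # geometric mean, 10^1.5 (about 31.6)
print(median(expo))   # median, 55
```

For right-skewed exposure distributions the two statistics can differ noticeably, as they do here, so the choice affects where the fitted value is evaluated.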
6.4 Group Label Variable
grp_colname is the variable by which exposure values are summarized and displayed. grp_colname_tab is the name to display in tables.
If grp_colname is missing, the exposure values are summarized and displayed as one group; if grp_colname_tab is missing, the default value is “Label”.
#Group Label column name
grp_colname <- "GROUP" #column name in simdata.csv
grp_colname_tab <- "Treatment" #name to display in tables
levels.grp and labels.grp give the levels of the group variable and the labels displayed for those levels.
#provide levels in Group Label column
#if levels.grp is missing, a vector of the unique values in grp_colname will be levels.grp
levels.grp <- c("10 mg QD",
                "30 mg QD",
                "50 mg QD",
                "100 mg QD")
#provide labels of each level in Group Label column
#if labels.grp is missing, labels.grp = levels.grp
labels.grp <- c("10 mg once daily",
                "30 mg once daily",
                "50 mg once daily",
                "100 mg once daily")
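Levels and labels of this kind map naturally onto base R's `factor()`. The sketch below assumes that this is how they are used; the vector `GROUP` is hypothetical example data.

```r
levels.grp <- c("10 mg QD", "30 mg QD", "50 mg QD", "100 mg QD")
labels.grp <- c("10 mg once daily", "30 mg once daily",
                "50 mg once daily", "100 mg once daily")

# Hypothetical values of the group column, in arbitrary order.
GROUP <- c("100 mg QD", "10 mg QD", "50 mg QD", "30 mg QD", "10 mg QD")

# levels fixes the display order; labels replaces each level's text.
grp <- factor(GROUP, levels = levels.grp, labels = labels.grp)

print(levels(grp))
print(table(grp))
```

Supplying the levels explicitly keeps the groups in dose order. Relying on alphabetical sorting of the raw strings would place "100 mg QD" before "30 mg QD".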
6.5 Filter Condition (Optional)
It is possible to use a subset of simdata.csv without modifying the data file.
# Only include partial exposure records
filter_condition <- "GROUP != '10 mg QD'" #remove 10 mg QD group
#In the example simdata.csv, only 100 mg QD has C = 1
#filter_condition <- "C == 1" # only keep 100 mg QD group
#filter_condition <- "GROUP == '100 mg QD'"
filter_condition can be omitted if all exposure values are to be used.
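Since the filter is supplied as a character string, it has to be parsed and evaluated against the simulated data. A minimal base-R sketch of that idea follows; the data frame `sim` is hypothetical, and the actual evaluation mechanism used by the tool is an assumption here.

```r
# Hypothetical simulated exposure data.
sim <- data.frame(
  GROUP = c("10 mg QD", "30 mg QD", "100 mg QD"),
  C     = c(0, 0, 1),
  CAVE1 = c(1.2, 3.5, 11.8),
  stringsAsFactors = FALSE
)

filter_condition <- "GROUP != '10 mg QD'"

# Parse the string into an expression and evaluate it with the data
# frame's columns in scope; the result is one TRUE/FALSE per row.
keep <- eval(parse(text = filter_condition), envir = sim)

sim_sub <- sim[keep, , drop = FALSE]
print(sim_sub)
```

With the tidyverse loaded, `dplyr::filter(sim, !!rlang::parse_expr(filter_condition))` would be an equivalent way to apply the same string. In either form the condition may reference only columns that exist in simdata.csv.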
6.6 Caption for Simulated Exposure Data
expo_pred_tab_caption supplies a caption describing the simulated exposure data. If expo_pred_tab_caption is missing, the standard description is used: “Predicted exposure metric for each dose are derived from simulated patients with randomly drawn random effect parameters as described by the final population PK model and body weights sampled from observations.”
expo_pred_tab_caption <- "Predicted exposure metric values for each dose are derived from 2,500 simulated subjects."
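Throughout this chapter, an option that is "missing" falls back to a default. One way to picture that behavior is the sketch below, which assumes that "missing" means the variable was never defined by the control script. The helper `get_or_default()` and the environment `ctl` are hypothetical and are not part of the tool.

```r
# Hypothetical environment holding whatever the control script defined.
ctl <- new.env()
assign("bin_n_obs", 7, envir = ctl)   # the user set only this option

# Return the user's value if the option was defined, otherwise the default.
get_or_default <- function(name, default, envir) {
  if (exists(name, envir = envir, inherits = FALSE)) {
    get(name, envir = envir, inherits = FALSE)
  } else {
    default
  }
}

bin_n_obs     <- get_or_default("bin_n_obs", 5, ctl)
Center_Metric <- get_or_default("Center_Metric", "geomean", ctl)

# The display name defaults according to the chosen metric.
Center_Metric_name <- get_or_default(
  "Center_Metric_name",
  switch(Center_Metric, geomean = "geometric mean", median = "median"),
  ctl
)
grp_colname_tab <- get_or_default("grp_colname_tab", "Label", ctl)

print(list(bin_n_obs, Center_Metric, Center_Metric_name, grp_colname_tab))
```

In this sketch only bin_n_obs was set, so it keeps the user's value of 7 while the other options take the defaults listed in the sections above.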