12  Experiment Logging

12.1 Log to directory

You can save any rtemis supervised learning model to disk by specifying an output directory using the outdir argument:

iris.cart <- s_CART(iris,
                    outdir = "./Results/iris_CART")

This will save:

  • A text .log file with the console output
  • A PDF plot: true vs. fitted values for regression, or a confusion matrix for classification
  • An RDS file with the trained model (i.e., the R6 object iris.cart in the example above)

The RDS files can be shared with others and loaded back into R at any time.
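
As a minimal sketch of that round trip, the following uses base R's saveRDS() and readRDS() on a stand-in lm model written to a temporary file. The file name that rtemis writes inside outdir is not shown in this section, so the path below is a placeholder and not the library's naming scheme.

```r
# Stand-in model: a plain linear model, not an rtemis object
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)

# Placeholder path; substitute the .rds file found in your own outdir
rds_path <- file.path(tempdir(), "example_model.rds")
saveRDS(fit, rds_path)

# Later, or on another machine: restore the object
fit_restored <- readRDS(rds_path)

# The restored model should give the same predictions as the original
all.equal(predict(fit, iris), predict(fit_restored, iris))
```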

When running a series of experiments, it makes sense to use the outdir argument to save each model to disk for later reference.
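
One way to keep such a series organized is to derive each output directory from the dataset and algorithm names. The sketch below uses only base R; experiment_dir() is a hypothetical helper written for illustration, not an rtemis function, and no models are trained here.

```r
# Hypothetical helper: one subdirectory per experiment under a common root
experiment_dir <- function(root, dataset, algorithm) {
  file.path(root, paste(dataset, algorithm, sep = "_"))
}

root <- "./Results"
algorithms <- c("CART", "GLMNET")  # labels only; nothing is trained here

# Build one output path per algorithm, named by algorithm
outdirs <- vapply(algorithms,
                  function(a) experiment_dir(root, "iris", a),
                  character(1))
outdirs
```

Each element of outdirs could then be passed as the outdir argument of the corresponding training call, for example outdir = outdirs[["CART"]].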

12.2 Interactive logging

Specifying an outdir, as above, is the main way to save models to disk. In practice, we often train a series of models interactively and want to keep track of what we have tried and how well it worked. rtemis includes rtModLogger for this purpose. You first create a new logger object: think of it as a container that holds model parameters and error metrics, not the models themselves. Once the logger is created, you can add any number of models to it.

First, we generate some synthetic data:

x <- rnormmat(400, 400, seed = 2019)
w <- rnorm(400)
y <- c(x %*% w + rnorm(400))

dat <- data.frame(x, y)
res <- resample(dat)
01-07-24 00:31:40 Input contains more than one columns; will stratify on last [resample]
.:Resampling Parameters
    n.resamples: 10 
      resampler: strat.sub 
   stratify.var: y 
        train.p: 0.75 
   strat.n.bins: 4 
01-07-24 00:31:40 Created 10 stratified subsamples [resample]

dat.train <- dat[res$Subsample_1, ]
dat.test <- dat[-res$Subsample_1, ]
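
To make the shape of this split concrete, here is a rough base R sketch of the same idea. It assumes a plain random 75% subsample rather than the stratified subsampling that resample() reports, and it replaces rnormmat() with matrix(rnorm(...)), so the simulated values will differ from those above.

```r
set.seed(2019)
n <- 400  # cases
p <- 400  # features

# Base R stand-in for the simulation above (values will not match rtemis)
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
w <- rnorm(p)
y <- c(x %*% w + rnorm(n))
dat <- data.frame(x, y)

# Simple, unstratified 75% subsample of row indices
train_idx <- sample(n, size = 0.75 * n)
dat_train <- dat[train_idx, ]
dat_test <- dat[-train_idx, ]  # negative index keeps all remaining rows

dim(dat_train)
dim(dat_test)
```

With 300 training cases and 400 predictors, there are more features than cases, which is consistent with the near-perfect training R-squared values reported in the summary later in this section.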

Initialize a new logger object:

logger <- rtModLogger$new()
logger
.:.:rtemis Supervised Model Logger

   Contents: no models yet 

12.2.1 Train some models and add them to the logger:

mod.ridge <- s_GLMNET(dat.train, dat.test,
                      alpha = 0, lambda = 0.01, verbose = FALSE)
logger$add(mod.ridge)
01-07-24 00:31:40 Added 1 model to logger; 1 total [logger$add]

mod.lasso <- s_GLMNET(dat.train, dat.test,
                      alpha = 1, lambda = 0.01, verbose = FALSE)
logger$add(mod.lasso)
01-07-24 00:31:41 Added 1 model to logger; 2 total [logger$add]

mod.elnet <- s_GLMNET(dat.train, dat.test,
                      alpha = 0.5, lambda = 0.01, verbose = FALSE)
logger$add(mod.elnet)
01-07-24 00:31:41 Added 1 model to logger; 3 total [logger$add]

12.2.2 Plot model performance:

logger$plot(names = c("Ridge", "LASSO", "Elastic Net"))

12.2.3 Get a quick summary:

results <- logger$summary()
results
         Train Rsq  Test Rsq
GLMNET_1 0.9999773 0.5522008
GLMNET_2 0.9998142 0.7465513
GLMNET_3 0.9999467 0.7420425
attr(,"metric")
[1] "Rsq"
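
The summary can be ranked or post-processed like any other table. The sketch below rebuilds the printed values as a plain matrix so that it runs on its own; it assumes the object returned by logger$summary() can be indexed by these row and column names, and that rows follow the order in which the models were added (ridge, LASSO, elastic net).

```r
# Values copied from the summary printed above
results <- matrix(
  c(0.9999773, 0.5522008,
    0.9998142, 0.7465513,
    0.9999467, 0.7420425),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("GLMNET_1", "GLMNET_2", "GLMNET_3"),
                  c("Train Rsq", "Test Rsq"))
)

# Sort models from best to worst test-set R-squared
ranked <- results[order(results[, "Test Rsq"], decreasing = TRUE), ]
ranked

# Gap between training and testing R-squared, a rough overfitting indicator
gap <- results[, "Train Rsq"] - results[, "Test Rsq"]
round(gap, 3)
```

Under those assumptions, the second model (alpha = 1) ranks first on the test set, and the first model (alpha = 0) shows the largest gap between training and testing performance.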

12.2.4 Write model hyperparameters and performance to a multi-sheet XLSX file:

# The output directory must exist before the file can be written
dir.create("~/Desktop/Results", recursive = TRUE, showWarnings = FALSE)
logger$tabulate(filename = "~/Desktop/Results/model_metrics.xlsx")

In this example, the XLSX file will contain three sheets, one per model. We can also assign the output of tabulate() to a list:

tbl <- logger$tabulate()
tbl$GLMNET_1
tbl$GLMNET_2
tbl$GLMNET_3
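
If a single combined table is more convenient than one element per model, the list can be stacked with base R. The structure of each element returned by tabulate() is not shown in this section, so the sketch below uses a stand-in list of one-row data frames; the column names are hypothetical, and the values are taken from the model calls and summary earlier in this section.

```r
# Stand-in for the list returned by logger$tabulate(); the real element
# structure is an assumption, and these column names are hypothetical
tbl <- list(
  GLMNET_1 = data.frame(alpha = 0,   lambda = 0.01, test_rsq = 0.5522008),
  GLMNET_2 = data.frame(alpha = 1,   lambda = 0.01, test_rsq = 0.7465513),
  GLMNET_3 = data.frame(alpha = 0.5, lambda = 0.01, test_rsq = 0.7420425)
)

# Stack the per-model tables; for one-row tables the list names become row names
combined <- do.call(rbind, tbl)
combined
```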