1  Setup

1.1 Install the latest rtemis version from GitHub

install.packages("remotes")
remotes::install_github("egenn/rtemis")

You can run the install_github() command as often as you like: it will only reinstall rtemis if there is an update available on GitHub. This installs rtemis with a minimal set of dependencies. A dependency check runs each time a function is called and informs you if a required package is missing. For a reasonably lightweight setup, start by installing the following packages:

packages <- c("data.table", "future", "gbm", "glmnet", "plyr", "ranger", "rpart")
install.packages(packages)
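If some of these packages are already present, you may prefer to install only the missing ones. The following is a minimal sketch in base R; missing_packages() is a hypothetical helper written for this example and is not part of rtemis.

```r
# Packages suggested above for a lightweight setup
packages <- c("data.table", "future", "gbm", "glmnet", "plyr", "ranger", "rpart")

# Hypothetical helper (not an rtemis function):
# returns the names in `pkgs` that are not found in any library path
missing_packages <- function(pkgs) {
  installed <- vapply(
    pkgs,
    function(p) nzchar(system.file(package = p)), # "" means the package was not found
    logical(1)
  )
  pkgs[!installed]
}

to_install <- missing_packages(packages)
if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
} else {
  message("All suggested packages are already installed.")
}
```

Passing to_install to install.packages() then installs only what is missing.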

1.2 R

For an introduction to R, see Programming for Data Science in R.

1.3 IDEs: VS Code, RStudio

You can run rtemis from the command line or in the IDE of your choice. VS Code and RStudio are probably the two best options right now.

1.4 macOS

1.4.1 Prerequisites

If you are installing on macOS, make sure you have installed:

Note on R + Java on macOS: To run R packages that use rJava, such as bartMachine, you may need to add a link to libjvm.dylib inside your R lib folder, as explained here.

1.5 External frameworks

The following are all optional - install as needed.

1.5.1 H2O

To use H2O (d.H2OGLRM(), s.H2ODL(), s.H2OGBM(), s.H2ORF(), u.H2OKMEANS()), you will need to install H2O first. Follow the instructions on the H2O website.

1.5.2 Spark

To use Spark’s ML framework (currently s.MLRF()), you can install Spark from within R:

install.packages("sparklyr")
sparklyr::spark_install()

1.5.3 Keras + TensorFlow

You can easily install Keras for R and the TensorFlow library:

remotes::install_github("rstudio/keras")
library(keras)
install_keras()

Learn more on the RStudio website

1.6 Load rtemis

library(rtemis)

1.7 Set up project directories

rtemis includes a function and an RStudio addin that initialize a simple directory structure for your data analysis projects under the working directory, containing the following:

  • ./R/
    Directory to save your project .R code files
  • ./Data/
    Directory to save your project data files, e.g. .rds, .csv, etc
  • ./Results/
    Directory to save your output, e.g. rtemis supervised learning output directories (define using outdir, e.g. outdir = "./Results/Dataset_Algorithm")
  • ./rtInit.log
    Log file with R session info

Call the function directly or use RStudio’s Addins drop-down menu:

rtInitProjectDir()
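To make the resulting layout concrete, here is a rough base R sketch that creates the same directories and log file. It is an illustration only, not the rtemis implementation: init_project_dirs() and demo_root are hypothetical names, and the actual contents of rtInit.log written by rtInitProjectDir() may differ.

```r
# Hypothetical helper (not the rtemis implementation):
# creates R/, Data/, Results/ and a session-info log under `root`
init_project_dirs <- function(root = ".") {
  dirs <- file.path(root, c("R", "Data", "Results"))
  for (d in dirs) {
    # recursive = TRUE also creates `root` if it does not exist yet
    dir.create(d, showWarnings = FALSE, recursive = TRUE)
  }
  # Record R session info, mirroring the purpose of rtInit.log
  writeLines(capture.output(sessionInfo()), file.path(root, "rtInit.log"))
  invisible(dirs)
}

# Try it in a temporary directory so the working directory is left untouched
demo_root <- file.path(tempdir(), "rtemis_demo")
init_project_dirs(demo_root)
list.files(demo_root)
```

Running the sketch against a temporary directory first is a cheap way to inspect the layout before initializing a real project.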

rtemis RStudio addin

rtemis running in RStudio with the rtemis-dark RStudio theme

rtemis running in VS Code with the rtemis-dark VS Code theme