Reproducible and scalable data analysis workflows with targets

Dynamic Function-Oriented Make-Like Declarative Pipelines for R

Matthew T. Warkentin, MSc, PhD(c), Lunenfeld-Tanenbaum Research Institute, Sinai Health System
November 12, 2020
Package: targets
Title: Dynamic Function-Oriented 'Make'-Like Declarative Workflows
Description: The 'targets' package is a pipeline toolkit...
Authors@R: c(
  person(
    given = c("William", "Michael"),
    family = "Landau",
    role = c("aut", "cre"),
    email = "will.landau@gmail.com",
    comment = c(ORCID = "0000-0003-1878-3253")
  ),
  person(
    given = c("Matthew", "T."),
    family = "Warkentin",
    role = "ctb"
  ),
  person(
    family = "Eli Lilly and Company",
    role = "cph"
  ))

From: Lepore, Mauro
Subject: Would you be willing to review a package for rOpenSci?
To: warkentin@lunenfeld.ca


Dear Matthew,

Hi, this is Mauro. I hope you and your loved ones are safe. I'm writing to ask if you would be willing to review a package for rOpenSci. As you probably know, rOpenSci conducts peer review of R packages contributed to our collection in a manner similar to journals.

The package targets by Will Landau provides make-like pipelines for R. targets supersedes drake, and is submitted to rOpenSci jointly with the package tarchetypes. You can find targets and tarchetypes on GitHub, and we conduct our open review process via GitHub as well.

...

Thank you for your time.

Sincerely, Mauro


What is {targets}?

The targets package is a Make-like pipeline toolkit for Statistics and data science in R. With targets, you can maintain a reproducible workflow without repeating yourself. targets learns how your pipeline fits together, skips costly runtime for tasks that are already up to date, runs only the necessary computation, supports implicit parallel computing, abstracts files as R objects, and shows tangible evidence that the results match the underlying code and data.

  • {targets} is a project workflow tool that is very R-centric

    • Similar tools exist for other languages, such as {GNU make} and {snakemake}
  • It allows you to effectively modularize your data analysis projects to create obvious and reproducible workflows

  • Can easily extend your workflow to massively parallelize tasks

    • With some setup, you can use external compute resources (e.g. HPC) as a computational back-end for your pipeline

What about {drake}?

The drake package is an older and more established R-focused pipeline toolkit. It has become a key piece of the R ecosystem, and development and support will continue.

The targets package borrows from past learnings, user suggestions, discussions, complaints, success stories, and feature requests, and it improves the user experience in ways that will never be possible in drake.

targets is more...

  • Efficient

  • Reproducible

  • Maintainable

  • Portable

  • Domain specific

See the Statement of Need for details.

Why should I use {targets}?

  • Organization

    • Explicitly building your projects as a cohesive pipeline keeps your project more organized and focused
  • Modularity

    • Breaking tasks into small digestible code chunks makes it easier to debug your code and easier to see how all of the parts fit together
  • Transparency and Reproducibility

    • Out of the box, you get a transparent and reproducible data analysis workflow
  • Caching and History

    • Re-running code that doesn't change often is tedious and time-consuming. Caching results means you only run what is absolutely necessary to get up-to-date results
  • Scalability and Parallel Computing

    • Mental models of projects break down at scale. Building projects as workflows scales well and facilitates parallel computing

Using {targets}

  • All functions in {targets} are prefixed by tar_*, which makes it easy to work with the package due to low cognitive friction

  • Your 80/20 functions...

    • tar_target() - The unit of interest; targets are the building blocks of your pipeline and represent meaningful components of your project

    • tar_pipeline() - Contains the complete set of targets to be included in the pipeline

    • tar_option_set() - Set global configuration options, such as default storage formats, packages, memory allocation, storage, deployment...etc.

    • tar_make() - Inspects your code/pipeline to understand the dependencies, and builds the pipeline in a separate clean R session

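Putting those four together, a minimal _targets.R might look like the sketch below (get_raw_data(), clean_data(), and fit_model() are hypothetical helper functions, not part of {targets}):

# _targets.R -- minimal sketch wiring the 80/20 functions together
library(targets)
source("R/functions.R")                 # defines the hypothetical helpers used below
tar_option_set(packages = c("dplyr"))   # packages loaded before any target is built
tar_pipeline(
  tar_target(raw, get_raw_data()),      # hypothetical helper returning a data frame
  tar_target(clean, clean_data(raw)),
  tar_target(fit, fit_model(clean))
)
# Then build everything from the console (or run.R) with tar_make()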
The full tar_target() signature, with the two arguments you will use every time annotated:

tar_target(
  name,     # Unique name given to a target. TIP: A common prefix can make it easier to refer to families of targets
  command,  # R code that produces the target's value
  pattern = NULL,
  tidy_eval = targets::tar_option_get("tidy_eval"),
  packages = targets::tar_option_get("packages"),
  library = targets::tar_option_get("library"),
  format = targets::tar_option_get("format"),
  iteration = targets::tar_option_get("iteration"),
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue")
)
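In practice most of these arguments stay at their defaults; as a sketch, a single target overriding one of them (analysis_model, fit_model(), and analysis_data are hypothetical names):

tar_target(
  analysis_model,            # name: how this target is referred to elsewhere in the pipeline
  fit_model(analysis_data),  # command: depends on the upstream target analysis_data
  format = "qs"              # store with the qs format (requires the {qs} package) instead of the default rds
)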
The matching tar_option_set() signature, which sets the project-wide defaults that tar_target() falls back on:

tar_option_set(
  tidy_eval = NULL,
  packages = NULL,  # Character vector of packages that will be loaded before building your pipeline
  library = NULL,
  envir = NULL,
  format = NULL,
  iteration = NULL,
  error = NULL,
  memory = NULL,
  garbage_collection = NULL,
  deployment = NULL,
  priority = NULL,
  resources = NULL,
  storage = NULL,
  retrieval = NULL,
  cue = NULL,
  debug = NULL
)
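As a sketch, a few global defaults set near the top of _targets.R (the values here are purely illustrative):

tar_option_set(
  packages = c("dplyr", "readr"),  # loaded before any target is built
  format = "qs",                   # default storage format for every target
  error = "continue"               # keep building the rest of the pipeline if one target errors
)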
tar_pipeline(...)
  • tar_pipeline() simply accepts an arbitrary number of tar_target() objects, or a list thereof.

Example:

# _targets.R
tar_pipeline(
  tar_target(first, f1()),
  tar_target(second, f2()),
  tar_target(third, f3(first, second))
)
  • Note: The order of targets inside tar_pipeline() does NOT matter. {targets} is smart enough to infer the topology and learn dependencies


Imperative scripting

R/
├── 01-data.R
├── 02-clean.R
├── 03-fit-model.R
├── 04-summarize-results.R
└── 05-tables-figs.R
run_scripts.R
  • Does not scale well to larger/complicated projects

  • You are in charge of storing/loading important objects

  • Everything needs to be run every time


Guiding Design Principles

Defining good targets is more of an art than a science, and it requires personal judgement and context specific to your use case.

  • Generally speaking, a good target is...

    1. Long enough to eat up a decent chunk of runtime, and

    2. Small enough that tar_make() frequently skips it, and

    3. Meaningful to your project, and

    4. A well-behaved R object that can be stored.

  • A {targets} pipeline is a directed acyclic graph (DAG) showing all of the tasks (nodes) and their interrelationships (edges)


Think Functional

  • A key design consideration when working with {targets} is to embrace functions

  • Try to abstract important steps in your workflow into functions that do a single obvious task

  • At first, this may seem like extra work, but the downstream payoff is huge

find_outcomes <- function(data, icd_code) {
  # <<some R code>>
  return(data_with_outcomes)
}
find_outcomes(my_data, "C34")
  • We now have a function that is easy to maintain and can be used in our targets pipeline
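And, as a sketch, that function drops straight into the pipeline as a target (my_data stands in for an upstream data target):

# Inside tar_pipeline() in _targets.R
tar_target(outcomes, find_outcomes(my_data, "C34"))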


Suggested Project Structure

  • Everyone has their own preferred way of organizing their files and projects. This is only a suggested way based on my workflow using {targets} for building R-centric projects

However,

  • _targets.R must exist at the root of the project
├── R/
│   └── functions.R
├── _targets.R
├── run.R
└── project-name.Rproj
  • A more mature project may have more subdirectories and files, but this serves as the skeleton for most/all of my {targets} projects
├── R/
├── _targets.R
├── run.R
├── project-name.Rproj
# _targets.R
library(targets)
# Load functions
source("functions.R")
# Set global options
tar_option_set(...)
# Define targets/pipeline
tar_pipeline(...)
# run.R
targets::tar_make()
├── R/
│   ├── clean-data.R
│   ├── cv-splits.R
│   ├── fit-model.R
│   ├── summarize-results.R
│   └── build-report.R
  • I suggest having one script per function/task

  • Name the script the same name as the function contained therein

# clean-data.R
clean_data <- function(data) {
  # <<some R code...>>
  return(data_clean)
}
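With one script per function, _targets.R needs to source all of them; one way to do that, sketched under the assumption that every file in R/ only defines functions:

# Near the top of _targets.R
invisible(lapply(list.files("R", pattern = "\\.R$", full.names = TRUE), source))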


Data Store

  • After you run your pipeline for the first time, the data store is created...
├── _targets/
│   ├── meta/
│   │   ├── meta
│   │   └── progress
│   └── objects/
│       ├── target_name_1
│       └── target_name_2
├── R/
├── _targets.R
├── run.R
└── project-name.Rproj
  • It is NOT recommended to directly interact with the _targets/ directory. Instead, inspect the data store and load objects using the suite of available helper functions
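A few of those helpers, as a sketch (target_name_1 is the example target from the tree above):

targets::tar_meta()               # metadata for every target: hashes, warnings, build times
targets::tar_progress()           # build status of each target from the last tar_make()
targets::tar_read(target_name_1)  # read one stored object back into memory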


A simple example...

Artwork by @allison_horst
# _targets.R
tar_pipeline(
  tar_target(
    data,
    palmerpenguins::penguins
  ),
  tar_target(
    model,
    lm(bill_length_mm ~ species, data = data)
  )
)
tar_make()
● run target data
● run target model

A simple example...

tar_make()
✓ skip target data
✓ skip target model
✓ Already up to date.

A look at the data store...

_targets/
├── meta
│   ├── meta
│   └── progress
└── objects
    ├── data
    └── model
tar_read(data) # compare with tar_load()
#> # A tibble: 344 x 8
#> species island bill_length_mm bill_depth_mm flipper_length_…
#> <fct> <fct> <dbl> <dbl> <int>
#> 1 Adelie Torge… 39.1 18.7 181
#> 2 Adelie Torge… 39.5 17.4 186
#> 3 Adelie Torge… 40.3 18 195
#> 4 Adelie Torge… NA NA NA
#> 5 Adelie Torge… 36.7 19.3 193
#> 6 Adelie Torge… 39.3 20.6 190
#> 7 Adelie Torge… 38.9 17.8 181
#> 8 Adelie Torge… 39.2 19.6 195
#> 9 Adelie Torge… 34.1 18.1 193
#> 10 Adelie Torge… 42 20.2 190
#> # … with 334 more rows, and 3 more variables: body_mass_g <int>,
#> # sex <fct>, year <int>

NOTE: tar_read() reads objects into memory, but the user must assign the object into a variable for persistence; tar_load() reads and assigns objects into a variable of the same name

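The difference in one snippet (a sketch using the data target above; penguins_df is an arbitrary name):

penguins_df <- tar_read(data)  # returns the object; you pick the variable name
tar_load(data)                 # creates an object named data in the calling environment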
tar_read(model) # compare with tar_load()
#> Call:
#> lm(formula = bill_length_mm ~ species, data = data)
#>
#> Coefficients:
#> (Intercept) speciesChinstrap speciesGentoo
#> 38.791 10.042 8.713

NOTE: tar_read() reads objects into memory, but the user must assign the object into a variable for persistence; tar_load() reads and assigns objects into a variable of the same name

# _targets.R
tar_pipeline(
  tar_target(
    data,
    palmerpenguins::penguins
  ),
  tar_target(
    model,
    lm(bill_length_mm ~ species, data = data)
  ),
  tar_target(
    summary,
    summary(model)
  )
)
tar_make()
✓ skip target data
✓ skip target model
● run target summary
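Before rebuilding, you can also ask which targets are out of date; as a sketch, after adding the summary target above:

tar_outdated()  # would list only "summary" here, since data and model are unchanged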


Depending on external files

  • Sometimes your pipeline might need to look outward and depend on "external" files
    • A common example is depending on a CSV/Excel or PLINK file
    • The same technique applies if your code produces a file
# _targets.R
tar_pipeline(
  tar_target(
    data,
    read_csv("path/to/data.csv")
  )
)
  • Written this way, {targets} only tracks the R command, not the file itself, so editing data.csv would not trigger a rebuild

  • Instead, declare the path as its own target with format = "file", so {targets} hashes the file's contents and reruns downstream targets whenever the file changes

# _targets.R
tar_pipeline(
  tar_target(
    data_file,
    "path/to/data.csv",
    format = "file"
  ),
  tar_target(
    data,
    read_csv(data_file)
  )
)
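The same idea works in reverse for targets that produce files; a sketch, assuming {ggplot2} and a hypothetical upstream plot target my_plot:

tar_target(
  bill_plot_file,
  {
    ggplot2::ggsave("figures/bill-length.png", my_plot)  # write the file...
    "figures/bill-length.png"                            # ...and return its path
  },
  format = "file"  # {targets} hashes the file at this path and rebuilds downstream targets when it changes
)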


Dynamic branching

The targets package supports shorthand to create large pipelines. Dynamic branching defines new targets (i.e. branches) while the pipeline is running, and those definitions can be based on prior results from upstream targets.

Patterns: Creates branches (i.e. sub-targets) by repeating a task over a set of arguments

# _targets.R
# Draws from random Normal of various sizes
tar_pipeline(
  tar_target(size, seq(1, 1000, by = 100)),
  tar_target(draws, rnorm(size), pattern = map(size))
)
  • By default, targets will aggregate each of the sub-targets of draws using vctrs::vec_c(). In this example, this will combine all of our draws into one single vector.
    • We could combine into a list by adding iteration = "list", instead
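The list variant mentioned above looks like this (a sketch of the same pattern with per-branch results kept separate):

tar_target(
  draws,
  rnorm(size),
  pattern = map(size),
  iteration = "list"  # one list element per branch instead of one combined vector
)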


Dynamic branching

Iteration: patterns repeat tasks and iterate over arguments (e.g. using map()), and there are two important aspects of iteration...

  1. Branching

    • How should targets slice the data when creating branches?
  2. Aggregation

    • How should targets combine the results after completing branches?
  • iteration = "vector"

    • Branches are created with vctrs::vec_slice(x, i) and aggregated with vctrs::vec_c(...)
  • iteration = "list"

    • Branches are created with list[[i]] and aggregated with list(...)
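Outside of {targets}, those two verbs behave like this (illustration only):

library(vctrs)
x <- c(a = 1, b = 2, c = 3)
vec_slice(x, 2)                          # how a single branch sees its slice of the data
vec_c(vec_slice(x, 1), vec_slice(x, 3))  # how branch results are recombined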

Advanced Example: Dynamic Branching

  • We will set up a more advanced example using dynamic branching

  • Let's fit a model for how life expectancy has changed over time for each country in the gapminder data set

    • How many countries are in our data set? Not sure and it doesn't matter

lifeExpectancy = time + ϵ

  • After, we will adjust our life expectancy model for GDP per capita
# _targets.R
tar_pipeline(
  tar_target(
    gapminder_file,
    "/Users/matt/Library/R/4.0/library/gapminder/extdata/gapminder.tsv",
    format = "file"
  ),
  tar_target(
    gapminder,
    read_tsv(gapminder_file)
  ),
  tar_target(
    country,
    group_split(gapminder, country), # returns a list of data frames
    iteration = "list"
  ),
  tar_target(
    model,
    lm(lifeExp ~ year, data = country),
    pattern = map(country),
    iteration = "list"
  )
)
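After tar_make(), individual branches can be pulled back out; a sketch:

tar_read(model, branches = 1)  # the fitted model for the first country
length(tar_read(model))        # with iteration = "list", one element per country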

Advanced Example

tar_visnetwork()


Advanced Example

tar_make()
● run target gapminder_file
● run target gapminder
● run target country
● run branch model_55cab078
● run branch model_137fa27c
● run branch model_11168a7a
● run branch model_4fa278a7
● run branch model_f0b9128a
...
● run branch model_b1ee577e
● run branch model_af9237c1
● run branch model_ab0302f4
● run branch model_0b45f3c6
● run branch model_f4b18a5b
# _targets.R
tar_pipeline(
  tar_target(
    gapminder_file,
    "/Users/matt/Library/R/4.0/library/gapminder/extdata/gapminder.tsv",
    format = "file"
  ),
  tar_target(
    gapminder,
    read_tsv(gapminder_file)
  ),
  tar_target(
    country,
    group_split(gapminder, country),
    iteration = "list"
  ),
  tar_target(
    model,
    lm(lifeExp ~ year + gdpPercap, data = country),
    pattern = map(country),
    iteration = "list"
  )
)

Advanced Example

tar_visnetwork(label = c("branches", "time"))



Advanced Example

# _targets.R
options(clustermq.scheduler = "multiprocess")
# From the R console (or run.R)
tar_make_clustermq(workers = 6)
✓ skip target gapminder_file
✓ skip target gapminder
✓ skip target country
● run branch model_137fa27c
● run branch model_11168a7a
● run branch model_4fa278a7
...
● run branch model_0b45f3c6
● run branch model_f4b18a5b
● run branch model_55cab078
Master: [11.1s 77.9% CPU]; Worker: [avg 20.9% CPU, max 8110571.0 Mb]

High-performance Computing

targets supports high-performance computing with the tar_make_clustermq() and tar_make_future() functions. These functions are like tar_make(), but they allow multiple targets to run simultaneously over parallel workers. These workers can be processes on your local machine, or they can be jobs on a computing cluster. The main process automatically sends a target to a worker as soon as

  1. The worker is available, and
  2. All the target’s upstream dependency targets have been checked or built.

High-performance Computing

  • Assuming you have installed the {clustermq} R package, enabling parallel processing of targets is as simple as...
# Add this line to your _targets.R file
options(clustermq.scheduler = "multiprocess")
# Instead of tar_make()
targets::tar_make_clustermq(workers = 6)
  • Keep in mind that these persistent R processes run locally, so a good rule of thumb is that the number of workers should be no more than the number of available cores
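One way to respect that rule of thumb programmatically (a sketch):

n_workers <- max(1, parallel::detectCores() - 1)  # leave one core for the main process
targets::tar_make_clustermq(workers = n_workers)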


High-performance Computing

  • If your project lives on our cluster NFS, you can change the scheduler to "slurm", and targets will run the worker processes as jobs on compute nodes
# Add these lines to your _targets.R file
options(
  clustermq.scheduler = "slurm",
  clustermq.template = "/path/to/slurm_clustermq.tmpl"
)
  • You also need to point clustermq toward a file which provides a template for submitting batch jobs
targets::tar_make_clustermq(workers = 6)
  • Now these worker processes will run on compute nodes (caveat: currently only a few nodes support ZeroMQ socket communication, but this should be widely available after the next upgrade)

Slurm Job Template

# slurm_clustermq.tmpl
#!/bin/sh
#SBATCH --job-name={{ job_name }}
#SBATCH --partition=default
#SBATCH --output={{ log_file | /dev/null }}
#SBATCH --error={{ log_file | /dev/null }}
#SBATCH --mem-per-cpu={{ memory | 4096 }}
#SBATCH --array=1-{{ n_jobs }}
ulimit -v $(( 1024 * {{ memory | 4096 }} ))
CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'
  • Values in double curly-braces will be automatically populated by clustermq (or fall back to default values, when available)

    • Additional job options may be defined in this template (e.g. loading modules, activating environments)
    • You may also choose to pass these wildcards using the resources argument to tar_option_set()
  • Other job schedulers are supported, including: LSF, SGE, PBS, Torque

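A sketch of that resources route, assuming the interface current at the time of this talk, where resources is a plain named list whose entries fill the template wildcards (newer {targets} releases wrap this in tar_resources()):

tar_option_set(
  resources = list(
    memory   = 8192,            # fills {{ memory }} in the template above
    log_file = "worker-%a.log"  # fills {{ log_file }}; %a is the Slurm array index
  )
)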

Topics I didn't cover...

  • Seamless integration with AWS/cloud storage capabilities

    • You can store and load objects from AWS S3 buckets as if they existed locally
  • targets offers lots of different storage formats, some of which are faster and more memory efficient

  • tarchetypes is a helper package that "is a collection of target and pipeline archetypes for the targets package"

  • Submitting HPC jobs using SSH - Develop locally and deploy remotely

  • Lots more; the package is in rapid development, though its API is fairly stable as it prepares for peer review and a subsequent CRAN release


Acknowledgements

  • Will Landau

    • Author of targets and drake
  • Michael Schubert

    • Author of clustermq
  • Yihui Xie, Garrick Aden-Buie



Kohn's Second Law: An experiment is reproducible until another laboratory tries to repeat it
When we don't use workflow tools... Matt's Second Law: A statistical analysis is reproducible until you try to repeat it