+ "markdown": "---\ntitle: \"1 - Introduction\"\nsubtitle: \"Machine learning with tidymodels\"\nformat:\n revealjs: \n slide-number: true\n footer: <https://workshops.tidymodels.org>\n include-before-body: header.html\n include-after-body: footer-annotations.html\n theme: [default, tidymodels.scss]\n width: 1280\n height: 720\nknitr:\n opts_chunk: \n echo: true\n collapse: true\n comment: \"#>\"\n---\n\n\n\n\n::: r-fit-text\nWelcome!\n:::\n\n## Who are you?\n\n- You can use the magrittr `%>%` or base R `|>` pipe\n\n- You are familiar with functions from dplyr, tidyr, ggplot2\n\n- You have exposure to basic statistical concepts\n\n- You do **not** need intermediate or expert familiarity with modeling or ML\n\n## Who are tidymodels?\n\n- Simon Couch\n- Hannah Frick\n- Emil Hvitfeldt\n- Max Kuhn\n\n. . .\n\nMany thanks to Davis Vaughan, Julia Silge, David Robinson, Julie Jung, Alison Hill, and Desirée De Leon for their role in creating these materials!\n\n## Asking for help\n\n. . .\n\n🟪 \"I'm stuck and need help!\"\n\n. . 
.\n\n🟩 \"I finished the exercise\"\n\n\n## 👀 {.annotation}\n\n{.absolute top=\"0\" right=\"0\"}\n\n## Tentative plan for this workshop\n\n::: columns\n::: {.column width=\"50%\"}\n- *Today:* \n\n - Your data budget\n - What makes a model\n - Evaluating models\n:::\n::: {.column width=\"50%\"}\n- *Tomorrow:*\n \n - Feature engineering\n - Tuning hyperparameters\n - Racing methods\n - Iterative search methods\n:::\n:::\n\n## {.center}\n\n### Introduce yourself to your neighbors 👋\n\n<br></br>\n\nCheck Slack (`#ml-ws-2023`) for an RStudio Cloud link.\n\n## What is machine learning?\n\n{fig-align=\"center\"}\n\n::: footer\n<https://xkcd.com/1838/>\n:::\n\n## What is machine learning?\n\n{fig-align=\"center\"}\n\n::: footer\nIllustration credit: <https://vas3k.com/blog/machine_learning/>\n:::\n\n## What is machine learning?\n\n{fig-align=\"center\"}\n\n::: footer\nIllustration credit: <https://vas3k.com/blog/machine_learning/>\n:::\n\n## Your turn {transition=\"slide-in\"}\n\n{.absolute top=\"0\" right=\"0\" width=\"150\" height=\"150\"}\n\n. . .\n\n*How are statistics and machine learning related?*\n\n*How are they similar? Different?*\n\n\n::: {.cell}\n::: {.cell-output-display}\n```{=html}\n<div class=\"countdown\" id=\"statistics-vs-ml\" data-update-every=\"1\" tabindex=\"0\" style=\"right:0;bottom:0;\">\n<div class=\"countdown-controls\"><button class=\"countdown-bump-down\">−</button><button class=\"countdown-bump-up\">+</button></div>\n<code class=\"countdown-time\"><span class=\"countdown-digits minutes\">03</span><span class=\"countdown-digits colon\">:</span><span class=\"countdown-digits seconds\">00</span></code>\n</div>\n```\n:::\n:::\n\n\n::: notes\nthe \"two cultures\"\n\nmodel first vs. data first\n\ninference vs. prediction\n:::\n\n## What is tidymodels? 
{.absolute top=-20 right=0 width=\"64\" height=\"74.24\"}\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(tidymodels)\n#> ── Attaching packages ──────────────────────────── tidymodels 1.1.0 ──\n#> ✔ broom 1.0.5 ✔ rsample 1.1.1.9000\n#> ✔ dials 1.2.0 ✔ tibble 3.2.1 \n#> ✔ dplyr 1.1.2 ✔ tidyr 1.3.0 \n#> ✔ infer 1.0.4 ✔ tune 1.1.1.9001\n#> ✔ modeldata 1.1.0 ✔ workflows 1.1.3 \n#> ✔ parsnip 1.1.0.9003 ✔ workflowsets 1.0.1 \n#> ✔ purrr 1.0.1 ✔ yardstick 1.2.0.9001\n#> ✔ recipes 1.0.6\n#> ── Conflicts ─────────────────────────────── tidymodels_conflicts() ──\n#> ✖ purrr::discard() masks scales::discard()\n#> ✖ dplyr::filter() masks stats::filter()\n#> ✖ dplyr::lag() masks stats::lag()\n#> ✖ recipes::step() masks stats::step()\n#> • Use tidymodels_prefer() to resolve common conflicts.\n```\n:::\n\n\n## {background-image=\"images/tm-org.png\" background-size=\"contain\"}\n\n## The whole game\n\nAny modelling process involves\n\n* Splitting your data into training and test sets\n* Using a resampling scheme\n* Fitting models\n* Assessing performance\n* Choosing a model\n* Fitting and assessing the final model\n\n\n## The whole game\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n{fig-align='center' width=3543}\n:::\n:::\n\n\n## The whole game\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n{fig-align='center' width=3543}\n:::\n:::\n\n\n:::notes\nStress that we are **not** fitting a model on the entire training set other than for illustrative purposes in deck 2.\n:::\n\n## The whole game\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n{fig-align='center' width=3543}\n:::\n:::\n\n\n## The whole game\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n{fig-align='center' width=3543}\n:::\n:::\n\n\n## The whole game\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n{fig-align='center' width=3543}\n:::\n:::\n\n\n## The whole game\n\n\n::: {.cell layout-align=\"center\"}\n::: 
{.cell-output-display}\n{fig-align='center' width=3543}\n:::\n:::\n\n\n## The whole game\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n{fig-align='center' width=3543}\n:::\n:::\n\n\n\n## Let's install some packages\n\nIf you are using your own laptop instead of RStudio Cloud:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ninstall.packages(\"pak\")\n\npkgs <- c(\"bonsai\", \"doParallel\", \"embed\", \"finetune\", \"lightgbm\", \"lme4\", \n \"parallelly\", \"plumber\", \"probably\", \"ranger\", \"rpart\", \"rpart.plot\", \n \"stacks\", \"textrecipes\", \"tidymodels\", \"tidymodels/modeldatatoo\", \n \"vetiver\")\npak::pak(pkgs)\n```\n:::\n\n\n. . .\n\nCheck Slack (`#ml-ws-2023`) for an RStudio Cloud link.\n\n\n## Our versions\n\n\n::: {.cell}\n\n:::\n\n\nbonsai (0.2.1.9000, Github (tidymodels/bonsai@aab79), broom (1.0.5, local), dials (1.2.0, CRAN), doParallel (1.0.17, CRAN), dplyr (1.1.2, CRAN), embed (1.0.0, CRAN), finetune (1.1.0.9000, Github (tidymodels/finetune@52d), ggplot2 (3.4.2, CRAN), lightgbm (3.3.5, CRAN), lme4 (1.1-33, CRAN), modeldata (1.1.0, CRAN), modeldatatoo (0.1.0.9000, Github (tidymodels/modeldatatoo), parallelly (1.36.0, CRAN), parsnip (1.1.0.9003, Github (tidymodels/parsnip@e627), plumber (1.2.1, CRAN), probably (1.0.2, CRAN), purrr (1.0.1, CRAN), ranger (0.15.1, CRAN), recipes (1.0.6, CRAN), rpart (4.1.19, CRAN), rpart.plot (3.1.1, CRAN), rsample (1.1.1.9000, Github (tidymodels/rsample@afc4), scales (1.2.1, CRAN), stacks (1.0.2.9000, local), textrecipes (1.0.2, CRAN), tibble (3.2.1, CRAN), tidymodels (1.1.0, CRAN), tidyr (1.3.0, CRAN), tune (1.1.1.9001, Github (tidymodels/tune@fea8b02), vetiver (0.2.0, CRAN), workflows (1.1.3, CRAN), workflowsets (1.0.1, CRAN), yardstick (1.2.0.9001, Github (tidymodels/yardstick@6c), and Quarto (1.3.433)\n",