From ba953934ac8429994e1d0b5b5a0c7e1c5f18d4a6 Mon Sep 17 00:00:00 2001
From: Martin Kunz
Date: Thu, 20 Nov 2025 09:26:22 +0100
Subject: [PATCH 1/5] fix: Add missing "be" in `cross-validation-curve` notebook

---
 notebooks/cross_validation_validation_curve.ipynb | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/notebooks/cross_validation_validation_curve.ipynb b/notebooks/cross_validation_validation_curve.ipynb
index e9c32c1fa..bd4433a0f 100644
--- a/notebooks/cross_validation_validation_curve.ipynb
+++ b/notebooks/cross_validation_validation_curve.ipynb
@@ -255,13 +255,13 @@
  "errors made during the data collection process (besides not measuring the\n",
  "unobserved input feature).\n",
  "\n",
- "One extreme case could happen if there where samples in the dataset with\n",
- "exactly the same input feature values but different values for the target\n",
- "variable. That is very unlikely in real life settings, but could the case if\n",
- "all features are categorical or if the numerical features were discretized\n",
- "or rounded up naively. In our example, we can imagine two houses having\n",
- "the exact same features in our dataset, but having different prices because\n",
- "of the (unmeasured) seller's rush.\n",
+ "One extreme case could happen if there where samples in the dataset with exactly\n",
+ "the same input feature values but different values for the target variable. That\n",
+ "is very unlikely in real life settings, but could be the case if all features\n",
+ "are categorical or if the numerical features were discretized or rounded up\n",
+ "naively. In our example, we can imagine two houses having the exact same\n",
+ "features in our dataset, but having different prices because of the (unmeasured)\n",
+ "seller's rush.\n",
  "\n",
  "Apart from these extreme case, it's hard to know for sure what should qualify\n",
  "or not as noise and which kind of \"noise\" as introduced above is dominating.\n",

From 1785e3bb500bb77f17036739388c73f1ce2d6edd Mon Sep 17 00:00:00 2001
From: Martin Kunz
Date: Thu, 20 Nov 2025 09:31:01 +0100
Subject: [PATCH 2/5] fix: reword "these" -> "this"

---
 notebooks/cross_validation_validation_curve.ipynb | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/notebooks/cross_validation_validation_curve.ipynb b/notebooks/cross_validation_validation_curve.ipynb
index bd4433a0f..58524a8cb 100644
--- a/notebooks/cross_validation_validation_curve.ipynb
+++ b/notebooks/cross_validation_validation_curve.ipynb
@@ -263,10 +263,10 @@
  "features in our dataset, but having different prices because of the (unmeasured)\n",
  "seller's rush.\n",
  "\n",
- "Apart from these extreme case, it's hard to know for sure what should qualify\n",
- "or not as noise and which kind of \"noise\" as introduced above is dominating.\n",
- "But in practice, the best ways to make our predictive models robust to noise\n",
- "are to avoid overfitting models by:\n",
+ "Apart from this extreme case, it's hard to know for sure what should qualify or\n",
+ "not as noise and which kind of \"noise\" as introduced above is dominating. But in\n",
+ "practice, the best ways to make our predictive models robust to noise are to\n",
+ "avoid overfitting models by:\n",
  "\n",
  "- selecting models that are simple enough or with tuned hyper-parameters as\n",
  "  explained in this module;\n",

From 6069c7c6cd98e6eaf54f9c1c202d27e6cff2c3c6 Mon Sep 17 00:00:00 2001
From: Martin Kunz
Date: Thu, 20 Nov 2025 12:36:50 +0100
Subject: [PATCH 3/5] Update notebooks/cross_validation_validation_curve.ipynb

Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
---
 notebooks/cross_validation_validation_curve.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/notebooks/cross_validation_validation_curve.ipynb b/notebooks/cross_validation_validation_curve.ipynb
index 58524a8cb..5c0635f17 100644
--- a/notebooks/cross_validation_validation_curve.ipynb
+++ b/notebooks/cross_validation_validation_curve.ipynb
@@ -265,7 +265,7 @@
  "\n",
  "Apart from this extreme case, it's hard to know for sure what should qualify or\n",
  "not as noise and which kind of \"noise\" as introduced above is dominating. But in\n",
- "practice, the best ways to make our predictive models robust to noise are to\n",
+ "practice, the best way to make our predictive models robust to noise is to\n",
  "avoid overfitting models by:\n",
  "\n",
  "- selecting models that are simple enough or with tuned hyper-parameters as\n",

From 3663ce16ca4bfb21e21d3ef172ec130e4f1d14cc Mon Sep 17 00:00:00 2001
From: Martin Kunz
Date: Thu, 20 Nov 2025 12:41:52 +0100
Subject: [PATCH 4/5] fix: Update the python script file

---
 .../cross_validation_validation_curve.py | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/python_scripts/cross_validation_validation_curve.py b/python_scripts/cross_validation_validation_curve.py
index 7d400b4c8..9549b7494 100644
--- a/python_scripts/cross_validation_validation_curve.py
+++ b/python_scripts/cross_validation_validation_curve.py
@@ -198,16 +198,16 @@
 #
 # One extreme case could happen if there where samples in the dataset with
 # exactly the same input feature values but different values for the target
-# variable. That is very unlikely in real life settings, but could the case if
-# all features are categorical or if the numerical features were discretized
-# or rounded up naively. In our example, we can imagine two houses having
-# the exact same features in our dataset, but having different prices because
-# of the (unmeasured) seller's rush.
+# variable. That is very unlikely in real life settings, but could be the case
+# if all features are categorical or if the numerical features were discretized
+# or rounded up naively. In our example, we can imagine two houses having the
+# exact same features in our dataset, but having different prices because of the
+# (unmeasured) seller's rush.
 #
-# Apart from these extreme case, it's hard to know for sure what should qualify
+# Apart from this extreme case, it's hard to know for sure what should qualify
 # or not as noise and which kind of "noise" as introduced above is dominating.
 # But in practice, the best ways to make our predictive models robust to noise
-# are to avoid overfitting models by:
+# is to avoid overfitting models by:
 #
 # - selecting models that are simple enough or with tuned hyper-parameters as
 #   explained in this module;

From 28b47d69b3d816f680c663d29c9f0b408a04ef04 Mon Sep 17 00:00:00 2001
From: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
Date: Thu, 20 Nov 2025 15:22:29 +0100
Subject: [PATCH 5/5] Update python_scripts/cross_validation_validation_curve.py

---
 python_scripts/cross_validation_validation_curve.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python_scripts/cross_validation_validation_curve.py b/python_scripts/cross_validation_validation_curve.py
index 9549b7494..0a0743027 100644
--- a/python_scripts/cross_validation_validation_curve.py
+++ b/python_scripts/cross_validation_validation_curve.py
@@ -206,7 +206,7 @@
 #
 # Apart from this extreme case, it's hard to know for sure what should qualify
 # or not as noise and which kind of "noise" as introduced above is dominating.
-# But in practice, the best ways to make our predictive models robust to noise
+# But in practice, the best way to make our predictive models robust to noise
 # is to avoid overfitting models by:
 #
 # - selecting models that are simple enough or with tuned hyper-parameters as
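The paragraph being rewritten across these patches describes an extreme case of label noise: duplicated inputs with different targets. To make that concrete, here is a small self-contained Python sketch (the feature tuples and prices are made up for illustration, not taken from the course dataset, and `best_reachable_mse` is a name chosen here): when identical feature rows carry different targets, even a model that memorizes the training data cannot reach zero mean squared error, so some error is irreducible.

```python
from collections import defaultdict

# Hypothetical samples: (bedrooms, surface in m^2) -> price.
# The first two rows share identical features but different targets,
# e.g. because of an unmeasured "seller's rush".
samples = [
    ((3, 120), 250_000),
    ((3, 120), 290_000),
    ((4, 150), 330_000),
]


def best_reachable_mse(samples):
    """Lowest MSE any deterministic predictor can reach on this data:
    predict the mean target within each group of identical feature rows."""
    groups = defaultdict(list)
    for features, target in samples:
        groups[features].append(target)
    squared_error = 0.0
    for targets in groups.values():
        mean = sum(targets) / len(targets)
        squared_error += sum((t - mean) ** 2 for t in targets)
    return squared_error / len(samples)


# Nonzero: the duplicated rows set an error floor for every model.
print(best_reachable_mse(samples))
```

This floor is exactly the "noise" the patched text says we cannot remove; the practical advice that follows (simple enough models, tuned hyper-parameters) is about not fitting that noise, not about eliminating it.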