.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/plot_paths_cv_curves.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note Click :ref:`here ` to download the full example code .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_plot_paths_cv_curves.py: ============================================= Coefficient paths and cross-validation curves ============================================= This example demonstrates some of the properties of the FRR approach and compares it to the use of standard ridge regression (RR) on the diabetes dataset that is included in sckit-learn. In standard ridge regression, it is common to select alpha by testing a range of log-spaced values between very minimal regularization and very strong regularization. In fractional ridge regression, we instead select a set of ``frac`` values that represent the desired reduction in the L2-norm of the coefficients. .. GENERATED FROM PYTHON SOURCE LINES 20-22 Imports: .. GENERATED FROM PYTHON SOURCE LINES 22-26 .. code-block:: default import numpy as np import matplotlib.pyplot as plt .. GENERATED FROM PYTHON SOURCE LINES 27-29 This is the Fracridge sklearn-style estimator: .. GENERATED FROM PYTHON SOURCE LINES 29-36 .. code-block:: default from fracridge import FracRidgeRegressor from sklearn import datasets from sklearn.linear_model import Ridge from sklearn.model_selection import cross_val_predict from sklearn.metrics import r2_score .. GENERATED FROM PYTHON SOURCE LINES 37-39 Get example data from scikit learn: .. GENERATED FROM PYTHON SOURCE LINES 39-42 .. code-block:: default X, y = datasets.load_diabetes(return_X_y=True) .. GENERATED FROM PYTHON SOURCE LINES 43-46 Values of alpha for the standard approach are set to be 20 values That are log-spaced from a very small value to a very large value: .. GENERATED FROM PYTHON SOURCE LINES 46-52 .. code-block:: default n_alphas = 20 rr_alphas = np.logspace(-10, 10, n_alphas) rr_coefs = [] rr_coefs = np.zeros((X.shape[-1], n_alphas)) rr_pred = np.zeros((y.shape[-1], n_alphas)) .. GENERATED FROM PYTHON SOURCE LINES 53-55 We calculate the fit and cross-validated prediction for each value of alpha: .. GENERATED FROM PYTHON SOURCE LINES 55-62 .. code-block:: default for aa in range(len(rr_alphas)): RR = Ridge(alpha=rr_alphas[aa], fit_intercept=True) RR.fit(X, y) rr_coefs[:, aa] = RR.coef_ rr_pred[:, aa] = cross_val_predict(RR, X, y) .. GENERATED FROM PYTHON SOURCE LINES 63-68 In contrast, FRR takes as inputs fractions, rather than arbitrarily-chosen values of alpha. The alphas that are generated are selected to produce solutions whos fractional L2-norm relative to the L2-norm of the unregularized solution are these values. Here too, cross-validated predictions are generated: .. GENERATED FROM PYTHON SOURCE LINES 68-74 .. code-block:: default fracs = np.linspace(0, 1, n_alphas) FR = FracRidgeRegressor(fracs=fracs, fit_intercept=True) FR.fit(X, y) fr_pred = cross_val_predict(FR, X, y) .. GENERATED FROM PYTHON SOURCE LINES 75-78 We plot the results. First, the FRR coefficients as a function of requested fractions and then the RR coefficient as a function of the requested log-spaced alpha: .. GENERATED FROM PYTHON SOURCE LINES 78-90 .. code-block:: default fig, ax = plt.subplots(1, 2) ax[0].plot(fracs, FR.coef_.T) ylims = ax[0].get_ylim() ax[0].vlines(fracs, ylims[0], ylims[1], linewidth=0.5, color='gray') ax[0].set_ylim(*ylims) ax[1].plot(np.log(rr_alphas[::-1]), rr_coefs.T) ylims = ax[1].get_ylim() ax[1].vlines(np.log(rr_alphas[::-1]), ylims[0], ylims[1], linewidth=0.5, color='gray') ax[1].set_ylim(*ylims) .. image-sg:: /auto_examples/images/sphx_glr_plot_paths_cv_curves_001.png :alt: plot paths cv curves :srcset: /auto_examples/images/sphx_glr_plot_paths_cv_curves_001.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out Out: .. code-block:: none (-869.3480959251679, 828.4461624666238) .. GENERATED FROM PYTHON SOURCE LINES 91-94 In a second plot, we compare the cross-validated predictions with the original data. This is appropriate as each prediction was generated in a sample that did not include that observation. .. GENERATED FROM PYTHON SOURCE LINES 94-115 .. code-block:: default test_y = np.tile(y, (fr_pred.shape[-1], 1)).T rr_r2 = r2_score(test_y, rr_pred, multioutput="raw_values") fr_r2 = r2_score(test_y, fr_pred, multioutput="raw_values") fig, ax = plt.subplots(1, 2, sharey=True) ax[0].plot(fracs, fr_r2, 'o-') ylims = ax[0].get_ylim() ax[0].vlines(fracs, ylims[0], ylims[1], linewidth=0.5, color='gray') ax[0].set_ylim(*ylims) ax[1].plot(np.log(rr_alphas[::-1]), rr_r2, 'o-') ylims = ax[1].get_ylim() ax[1].vlines(np.log(rr_alphas[::-1]), ylims[0], ylims[1], linewidth=0.5, color='gray') ax[1].set_ylim(*ylims) plt.show() .. image-sg:: /auto_examples/images/sphx_glr_plot_paths_cv_curves_002.png :alt: plot paths cv curves :srcset: /auto_examples/images/sphx_glr_plot_paths_cv_curves_002.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-timing **Total running time of the script:** ( 0 minutes 0.313 seconds) .. _sphx_glr_download_auto_examples_plot_paths_cv_curves.py: .. only :: html .. container:: sphx-glr-footer :class: sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_paths_cv_curves.py ` .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_paths_cv_curves.ipynb ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_