Comparing alpha and fracs
Here we compare parameterization using fractional ridge regression (FRR) and standard ridge regression (SRR). We will use the cross-validation objects implemented for both of these methods. In the case of SRR, we will use the scikit-learn implementation in the sklearn.linear_model.RidgeCV object. For FRR, we use the FracRidgeRegressorCV object, which implements a similar API.
Imports:
import numpy as np
from numpy.linalg import norm
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from sklearn.linear_model import RidgeCV, LinearRegression
from fracridge import FracRidgeRegressorCV
Here, we use a synthetic dataset. We generate a regression dataset with multiple targets, multiple samples, a large number of features and plenty of redundancy between them (set through the relatively small effective_rank of the design matrix):
np.random.seed(1984)
n_targets = 15
n_features = 80
effective_rank = 20
X, y, coef_true = make_regression(
    n_samples=250,
    n_features=n_features,
    effective_rank=effective_rank,
    n_targets=n_targets,
    coef=True,
    noise=5)
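As an aside (not part of the original script), we can check the redundancy that a small effective_rank induces by inspecting the singular values of a design matrix generated with the same settings:

```python
import numpy as np
from sklearn.datasets import make_regression

# Regenerate a design matrix with the same settings as above
np.random.seed(1984)
X, y, coef_true = make_regression(
    n_samples=250, n_features=80, effective_rank=20,
    n_targets=15, coef=True, noise=5)

# Singular values decay quickly past the effective rank, so many
# feature directions carry little independent information.
s = np.linalg.svd(X, compute_uv=False)
print(X.shape, y.shape, coef_true.shape)
```

The coefficient array has one column per target, which is why the CV estimators below report per-target quantities.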
To evaluate and compare the performance of the two algorithms, we split the data into train and test sets:
X_train, X_test, y_train, y_test = train_test_split(X, y)
We will start with SRR. We use a dense grid of 20 log-spaced alpha values, a common heuristic meant to ensure a wide sampling of regularization strengths:
n_alphas = 20
srr_alphas = np.logspace(-10, 10, n_alphas)
srr = RidgeCV(alphas=srr_alphas)
srr.fit(X_train, y_train)
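One property of RidgeCV worth noting (shown here on a small toy problem, not the data above): the selected alpha_ is always one of the supplied grid values, so a grid that is too coarse or too narrow directly limits the fit:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV

X_toy, y_toy = make_regression(n_samples=100, n_features=10,
                               noise=1.0, random_state=0)
alphas = np.logspace(-10, 10, 20)
ridge = RidgeCV(alphas=alphas)
ridge.fit(X_toy, y_toy)

# The chosen regularization strength comes from the grid itself
print(ridge.alpha_ in alphas)
```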
We sample the same number of evenly spaced fractions for FRR, starting at 1/n_alphas, and fit the FRR cross-validation object (the fractions are passed through the fit method's frac_grid keyword):
fracs = np.linspace(1 / n_alphas, 1 + 1 / n_alphas, n_alphas)
frr = FracRidgeRegressorCV()
frr.fit(X_train, y_train, frac_grid=fracs)
Both models are used to predict the left-out test set, and their performance is compared using the sklearn.metrics.r2_score() function (coefficient of determination):
pred_frr = frr.predict(X_test)
pred_srr = srr.predict(X_test)
print(r2_score(y_test, pred_frr))
print(r2_score(y_test, pred_srr))
Out:
0.46972013119095474
0.45299587544570136
In addition to a direct comparison of performance, we can ask how each model arrived at its solution. The FRR CV estimator has a property that tells us the best fraction (or 'gamma') that was found:
print(frr.best_frac_)
Out:
0.5763157894736842
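The "fraction" is the ratio between the norm of the regularized coefficients and the norm of the unregularized (OLS) coefficients. A minimal numpy sketch of that mapping on a toy problem illustrates the idea (fracridge's actual algorithm is far more efficient, using an SVD-based interpolation rather than a search):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
y = X @ rng.standard_normal(10) + rng.standard_normal(50)

def ridge_coef(alpha):
    # Closed-form ridge solution: (X'X + alpha I)^(-1) X'y
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

ols_norm = np.linalg.norm(ridge_coef(0.0))

def frac(alpha):
    # Fraction of the OLS coefficient norm retained at this alpha
    return np.linalg.norm(ridge_coef(alpha)) / ols_norm

# frac(alpha) decreases monotonically in alpha, so a log-space
# bisection finds the alpha matching any desired fraction, e.g. 0.5:
lo, hi = 1e-6, 1e6
for _ in range(200):
    mid = np.sqrt(lo * hi)
    if frac(mid) > 0.5:
        lo = mid
    else:
        hi = mid
alpha_half = np.sqrt(lo * hi)
print(round(frac(alpha_half), 3))
```

This one-to-one, monotonic relationship between alpha and the fraction is what makes the fraction a well-behaved hyperparameter to cross-validate over.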
We can also ask what alpha value was deemed best. For the multi-target case presented here, this will be a vector of values, one for each target:
print(frr.alpha_)
Out:
[[0.12555081 0.13940575 0.13794834 0.14893141 0.12563128 0.11981233
0.13679139 0.11238843 0.15308675 0.13794939 0.0903119 0.09080746
0.11676323 0.09915391 0.08304413]]
In contrast, the SRR estimator has just one value of alpha:
print(srr.alpha_)
Out:
0.026366508987303555
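The scalar here reflects a scikit-learn RidgeCV default: one alpha is selected jointly for all targets. A quick check on a toy multi-target problem (illustrative values, not the data above):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV

X_toy, y_toy = make_regression(n_samples=100, n_features=10,
                               n_targets=3, noise=1.0, random_state=0)
ridge = RidgeCV(alphas=np.logspace(-3, 3, 7))
ridge.fit(X_toy, y_toy)

# Even with 3 targets, alpha_ is a single scalar (ndim 0)
print(np.ndim(ridge.alpha_))
```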
But this one value of alpha causes very different amounts of change in the coefficients of the different targets:
Out:
[0.69947223 0.69390466 0.69775046 0.70206726 0.6966108 0.69672747
0.69795199 0.70141742 0.69455634 0.69670361 0.68722928 0.6962125
0.69889076 0.70062607 0.70154649]
[0.88964723 0.88712935 0.89721432 0.89815576 0.89318108 0.88262288
0.89403994 0.88681387 0.9010228 0.91220024 0.87434107 0.87530769
0.88111432 0.89198447 0.88295353]
-35.00654307428932
0.4427244248624048
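The point can be reproduced in isolation: with a single alpha, plain ridge shrinks each target's coefficients by a different fraction of its OLS norm. A sketch on a fresh synthetic problem (not the script's data):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge

X_toy, y_toy = make_regression(n_samples=100, n_features=30, n_targets=4,
                               effective_rank=10, noise=5, random_state=0)

ols = LinearRegression(fit_intercept=False).fit(X_toy, y_toy)
ridge = Ridge(alpha=1.0, fit_intercept=False).fit(X_toy, y_toy)

# Per-target ratio of ridge to OLS coefficient norm: a single
# alpha produces a different shrinkage fraction for each target.
ratios = (np.linalg.norm(ridge.coef_, axis=1) /
          np.linalg.norm(ols.coef_, axis=1))
print(ratios.round(3))
```

FRR sidesteps this by parameterizing the shrinkage fraction directly, so the amount of regularization is comparable across targets.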
Total running time of the script: (0 minutes 0.786 seconds)