User guide
MATLAB
The MATLAB implementation provides a single entry-point function called fracridge.
To use it, arrange your data y to have d rows by t columns, where d represents the number of observations in each target and t is the number of targets (which may be 1). Correspondingly, your design matrix X should have d rows and p columns, where p is the number of parameters. The fractions frac are the requested fractions of the L2-norm of the regularized solutions relative to the unregularized solution.
For detailed documentation, see also the API documentation.
Python
The Python implementation can be used in several different ways. The first is a function, fracridge.fracridge(), which is called in a similar manner to the MATLAB function, with X, y and frac inputs.
The second way is using a Scikit-learn-compatible fracridge.FracRidgeRegressor object. This object is set up with the desired fractions frac as input and implements a fit method that receives the X and y inputs to produce coef_ and alpha_ attributes. Finally, a cross-validation fracridge.FracRidgeRegressorCV object implements a grid search over provided values of frac to determine the best value of frac given the data.
For detailed documentation, see also the API documentation.
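The array layout described in this guide (d observations, p parameters, t targets) can be sketched with plain NumPy. This is only an illustration of the expected shapes and of what a "fraction of the unregularized L2-norm" refers to; it does not call fracridge itself, and the variable names (true_coef, ols_coef, target_norms) are chosen here for illustration.

```python
import numpy as np

d, p, t = 200, 8, 3                      # observations, parameters, targets
rng = np.random.default_rng(0)

X = rng.standard_normal((d, p))          # design matrix: d rows, p columns
true_coef = rng.standard_normal((p, t))  # hypothetical ground-truth weights
y = X @ true_coef + 0.1 * rng.standard_normal((d, t))  # data: d rows, t columns

# Requested fractions of the unregularized coefficient norm.
fracs = np.array([0.25, 0.5, 0.75, 1.0])

# Unregularized (ordinary least-squares) solution, one column per target.
ols_coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# L2-norm of the unregularized solution for each target, shape (t,).
ols_norms = np.linalg.norm(ols_coef, axis=0)

# The coefficient norms that each requested fraction corresponds to,
# shape (len(fracs), t): one row per fraction, one column per target.
target_norms = fracs[:, None] * ols_norms[None, :]

print(X.shape, y.shape, ols_coef.shape, target_norms.shape)
```

With a single target, y can be given one column (t = 1) and the same shapes apply.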
Frequently asked questions
How should I set the frac input?
The frac input determines the L2-norm of the regularized coefficients for the linear regression problem, relative to the L2-norm of the unregularized coefficients for the same problem. If you are not sure how to set this, choose an equally spaced set of fractions between 0 and 1 (not including 0). This should give you a range of solutions between highly regularized (small values) and completely unregularized (1). To choose the ‘right’ solution, you will want to use a strategy such as cross-validation (see the Scikit-learn documentation for an explanation of this concept).
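As a rough illustration of the advice above, the following NumPy sketch builds an equally spaced grid of fractions that excludes 0 and shows how the norm ratio behaves for ordinary ridge regression. The helper names ridge_coef and norm_fraction are made up for this example, and the closed-form SVD solution is a textbook ridge formula, not necessarily how fracridge computes its results.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p = 100, 10                           # observations, parameters
X = rng.standard_normal((d, p))
y = X @ rng.standard_normal(p) + rng.standard_normal(d)

# Equally spaced fractions in (0, 1]: 0.1, 0.2, ..., 1.0 (0 is excluded).
fracs = np.linspace(0.1, 1.0, 10)


def ridge_coef(X, y, alpha):
    """Closed-form ridge solution via the SVD of X (alpha=0 gives OLS).

    Assumes X has full column rank, so all singular values are positive.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt.T @ ((s / (s**2 + alpha)) * (U.T @ y))


ols = ridge_coef(X, y, 0.0)


def norm_fraction(alpha):
    """L2-norm of the ridge coefficients relative to the OLS coefficients."""
    return np.linalg.norm(ridge_coef(X, y, alpha)) / np.linalg.norm(ols)


# Stronger regularization (larger alpha) gives a smaller fraction.
for alpha in (0.0, 10.0, 1000.0):
    print(alpha, norm_fraction(alpha))
```

A fraction of 1 corresponds to no regularization at all, and fractions near 0 correspond to very strong shrinkage of the coefficients.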
How do I interpret the alphas output?
The values of alpha generated by the Python fracridge.fracridge() function or the MATLAB fracridge function represent the degree of regularization corresponding to the fraction values chosen by the user.
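To make the correspondence between fractions and alpha values concrete, here is a minimal sketch that, for a single target, searches for the ridge penalty alpha whose solution has a requested fraction of the unregularized norm. It uses simple bisection and assumes a full-column-rank X; the function names coef_norm_fraction and alpha_for_fraction are hypothetical, and the library's own algorithm may differ.

```python
import numpy as np


def coef_norm_fraction(X, y, alpha):
    """Norm of the ridge solution at `alpha` divided by the OLS solution norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    uty = U.T @ y
    # The right singular vectors are orthonormal, so norms can be computed
    # in the rotated space without forming the coefficients explicitly.
    return np.linalg.norm(s / (s**2 + alpha) * uty) / np.linalg.norm(uty / s)


def alpha_for_fraction(X, y, frac, n_iter=200):
    """Find alpha >= 0 such that the coefficient-norm fraction equals `frac`."""
    if frac >= 1.0:
        return 0.0                        # fraction 1 means no regularization
    lo, hi = 0.0, 1.0
    while coef_norm_fraction(X, y, hi) > frac:
        hi *= 10.0                        # grow the upper bound until it brackets
    for _ in range(n_iter):               # bisection: the fraction decreases in alpha
        mid = 0.5 * (lo + hi)
        if coef_norm_fraction(X, y, mid) > frac:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
y = X @ rng.standard_normal(5) + rng.standard_normal(50)

fracs = [0.25, 0.5, 0.75, 1.0]
alphas = [alpha_for_fraction(X, y, f) for f in fracs]
for f, a in zip(fracs, alphas):
    print(f, a)
```

Smaller fractions map to larger alpha values, and a fraction of 1 maps to an alpha of 0. The sketch recomputes the SVD on every call for simplicity; a practical implementation would compute it once.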
To ascertain which value of the fraction parameter best fits the data, the Python software provides a fracridge.FracRidgeRegressorCV object that automatically uses cross-validation to determine a best fraction for the data provided.
How do I compare the best fractions across targets?
The best fraction might differ across targets. The best fraction for a particular target depends on many factors, including the noise in the data and the degree of redundancy among regressors. However, it is independent of the scale of the target data. As a result, it provides a relatively stable assessment of the quality of a model setting: a high selected fraction (close to 1) indicates that the signal-to-noise ratio in the target data is relatively high and that the redundancy between regressors is relatively low.
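The scale independence mentioned above can be checked with a small NumPy experiment. The sketch below is an assumption-laden illustration, not the library's cross-validation: it uses a hypothetical frac_ridge helper based on bisection, a single train/validation split instead of full cross-validation, and synthetic data. It selects the fraction with the lowest validation error for a target and for the same target multiplied by 1000.

```python
import numpy as np


def frac_ridge(X, y, fracs, n_iter=200):
    """Ridge coefficients, shape (p, len(fracs)), one column per fraction.

    Each column has an L2-norm equal to the requested fraction of the OLS
    coefficient norm. Assumes X has full column rank.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    uty = U.T @ y
    ols_norm = np.linalg.norm(uty / s)
    coefs = []
    for frac in fracs:
        lo, hi = 0.0, 1.0
        while np.linalg.norm(s / (s**2 + hi) * uty) > frac * ols_norm:
            hi *= 10.0                    # expand until the target norm is bracketed
        for _ in range(n_iter):           # bisection on the penalty alpha
            mid = 0.5 * (lo + hi)
            if np.linalg.norm(s / (s**2 + mid) * uty) > frac * ols_norm:
                lo = mid
            else:
                hi = mid
        coefs.append(Vt.T @ (s / (s**2 + hi) * uty))
    return np.column_stack(coefs)


def best_fraction(X, y, fracs, n_train=80):
    """Pick the fraction with the lowest mean squared error on held-out rows."""
    coefs = frac_ridge(X[:n_train], y[:n_train], fracs)
    pred = X[n_train:] @ coefs                        # (n_val, len(fracs))
    mse = np.mean((pred - y[n_train:, None]) ** 2, axis=0)
    return fracs[np.argmin(mse)]


rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
y = X @ rng.standard_normal(10) + 2.0 * rng.standard_normal(100)
fracs = np.linspace(0.1, 1.0, 10)

best_original = best_fraction(X, y, fracs)
best_rescaled = best_fraction(X, 1000.0 * y, fracs)   # same target, larger scale
print(best_original, best_rescaled)
```

Because the ridge solution is linear in the target, rescaling the target rescales both the coefficients and the validation errors by constant factors, so the selected fraction should not change. A penalty expressed as alpha does not have this property in general, which is one reason fractions are easier to compare across targets.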
How do I cite fracridge if I used it in a publication?
If you use fracridge, please cite our paper: “Fractional ridge regression: a fast, interpretable reparameterization of ridge regression” (2020) GigaScience, Volume 9, Issue 12, December 2020, https://doi.org/10.1093/gigascience/giaa133
For your convenience, here is the bibtex entry:
@ARTICLE{fracridge2020,
  title = "Fractional ridge regression: a fast, interpretable
           reparameterization of ridge regression",
  author = "Rokem, Ariel and Kay, Kendrick",
  journal = "GigaScience",
  volume = 9,
  number = 12,
  month = nov,
  year = 2020,
  doi = "10.1093/gigascience/giaa133"
}