User guide

MATLAB

The MATLAB implementation provides a single entry-point function called fracridge.

To use it, arrange your data y to have d rows and t columns, where d is the number of observations in each target and t is the number of targets (which may be 1). Correspondingly, your design matrix X should have d rows and p columns, where p is the number of parameters. The input frac holds the requested fractions: each one is the ratio of the L2-norm of a regularized solution to the L2-norm of the unregularized solution.

For detailed documentation, see the API documentation.

Python

The Python implementation can be used in several different ways. The first is a function, fracridge.fracridge(), which is called in a similar manner to the MATLAB function, with X, y and frac inputs. The second is a Scikit-Learn-compatible fracridge.FracRidgeRegressor object. This object is set up with the desired fractions frac as input and implements a fit method that receives the X and y inputs and produces the coef_ and alpha_ attributes. Finally, a cross-validation object, fracridge.FracRidgeRegressorCV, implements a grid search over provided values of frac to determine the best value of frac given the data.

For detailed documentation, see the API documentation.

Frequently asked questions

How should I set the frac input?

The frac input determines the L2-norm of the regularized coefficients for the linear regression problem, relative to the L2-norm of the unregularized coefficients for the same problem. If you are not sure how to set this, choose an equally spaced set of fractions between 0 and 1 (not including 0). This should give you a range of solutions between highly regularized (small fractions) and completely unregularized (a fraction of 1). To choose the ‘right’ solution, you will want to use a strategy such as cross-validation (see the Scikit-Learn documentation for an explanation of this concept).
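As a minimal sketch of such a grid, the snippet below builds n equally spaced fractions that exclude 0 and end at 1. The helper name fraction_grid is hypothetical and is not part of fracridge; it uses plain Python lists only.

```python
def fraction_grid(n):
    """Return n equally spaced fractions in (0, 1]: 1/n, 2/n, ..., 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Dividing integers by n avoids the drift of repeated float addition.
    return [(i + 1) / n for i in range(n)]


fracs = fraction_grid(10)
print(fracs)
```

A coarse grid like this is usually a reasonable starting point; a finer grid can be used around the fractions that cross-validation favors.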

How do I interpret the alphas output?

The values of alpha generated by the Python fracridge.fracridge() function or the MATLAB fracridge function represent the degree of regularization (the ridge penalty) that corresponds to each fraction chosen by the user.
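To make the link between a fraction and its alpha concrete, here is a rough sketch that works in the basis of the singular value decomposition of X. It assumes you already have the singular values s and the target projected onto the left singular vectors, yt. In that basis the unregularized coefficients are yt_i / s_i and the ridge coefficients are s_i * yt_i / (s_i^2 + alpha), so the norm ratio decreases steadily as alpha grows and a simple bisection can find the alpha for a requested fraction. The function names are hypothetical, and this is not necessarily how fracridge computes alpha internally.

```python
import math


def fraction_for_alpha(s, yt, alpha):
    """Ratio of the ridge L2-norm to the unregularized L2-norm.

    s  : singular values of the design matrix (all > 0)
    yt : target projected onto the left singular vectors (not all zero)
    """
    ridge = math.sqrt(sum((si * yi / (si ** 2 + alpha)) ** 2
                          for si, yi in zip(s, yt)))
    ols = math.sqrt(sum((yi / si) ** 2 for si, yi in zip(s, yt)))
    return ridge / ols


def alpha_for_fraction(s, yt, frac, tol=1e-12):
    """Find alpha >= 0 whose ridge solution has the requested norm fraction."""
    if not 0 < frac <= 1:
        raise ValueError("frac must be in (0, 1]")
    if frac == 1:
        return 0.0  # no regularization
    lo, hi = 0.0, 1.0
    # Grow the upper bracket until the fraction there is at or below the target.
    while fraction_for_alpha(s, yt, hi) > frac:
        hi *= 2
    # Bisection: the fraction decreases monotonically as alpha increases.
    for _ in range(200):
        mid = (lo + hi) / 2
        if fraction_for_alpha(s, yt, mid) > frac:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * max(1.0, hi):
            break
    return (lo + hi) / 2


# Illustrative values only: two singular values and a projected target.
s, yt = [3.0, 1.0], [1.0, 2.0]
for frac in (0.25, 0.5, 0.75, 1.0):
    print(frac, alpha_for_fraction(s, yt, frac))
```

Smaller fractions map to larger alphas, and a fraction of 1 maps to an alpha of 0.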

To ascertain which value of the fraction parameter best fits the data, the Python software provides a fracridge.FracRidgeRegressorCV object that automatically uses cross-validation to determine a best fraction for the data provided.
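The idea behind cross-validated selection can be sketched for the simplest case of a single regressor with no intercept, where shrinking the coefficient norm to a given fraction amounts to multiplying the least-squares slope by that fraction. The helpers below (ols_slope, best_fraction_cv) and the toy data are hypothetical; this is an illustration of the concept, not the implementation of fracridge.FracRidgeRegressorCV.

```python
def ols_slope(x, y):
    """Least-squares slope for a single regressor with no intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def best_fraction_cv(x, y, fracs, n_folds=5):
    """Return the fraction with the lowest summed held-out squared error."""
    errors = [0.0] * len(fracs)
    for fold in range(n_folds):
        # Hold out every n_folds-th sample, starting at index `fold`.
        test = set(range(fold, len(x), n_folds))
        x_tr = [v for i, v in enumerate(x) if i not in test]
        y_tr = [v for i, v in enumerate(y) if i not in test]
        slope = ols_slope(x_tr, y_tr)
        for j, frac in enumerate(fracs):
            # With one regressor, a norm fraction of `frac` simply
            # scales the unregularized slope by `frac`.
            errors[j] += sum((y[i] - frac * slope * x[i]) ** 2 for i in test)
    return fracs[min(range(len(fracs)), key=errors.__getitem__)]


fracs = [i / 10 for i in range(1, 11)]
x = [float(i) for i in range(1, 11)]
y_clean = [2.0 * v for v in x]  # noiseless target
noise = [4.0, -5.0, 3.0, -4.0, 5.0, -3.0, 4.0, -5.0, 3.0, -4.0]
y_noisy = [c + n for c, n in zip(y_clean, noise)]

print("noiseless target:", best_fraction_cv(x, y_clean, fracs))
print("noisy target:    ", best_fraction_cv(x, y_noisy, fracs))
```

With a noiseless target the unregularized solution predicts the held-out samples exactly, so the selected fraction is 1.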

How do I compare the best fractions across targets?

The best fraction might differ across targets. The best fraction for a particular target depends on many factors, including the noise in the data and the degree of redundancy among regressors. However, it is independent of the scale of the target data. This makes it a relatively stable measure that can be compared across targets: when the selected fraction is high (close to 1), the signal-to-noise ratio in the target data is relatively high and the redundancy between regressors is relatively low.
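The scale independence can be checked on a toy example. The sketch below, again for a single regressor with no intercept and with made-up data and hypothetical helper names, evaluates held-out error for each fraction before and after multiplying the target by a constant. Every error grows by roughly the square of that constant, so the fraction with the lowest error stays the same.

```python
def holdout_errors(x_tr, y_tr, x_te, y_te, fracs):
    """Held-out squared error for each fraction (one regressor, no intercept)."""
    slope = sum(a * b for a, b in zip(x_tr, y_tr)) / sum(a * a for a in x_tr)
    return [sum((yt - f * slope * xt) ** 2 for xt, yt in zip(x_te, y_te))
            for f in fracs]


def argmin(values):
    """Index of the smallest value."""
    return min(range(len(values)), key=values.__getitem__)


fracs = [i / 10 for i in range(1, 11)]
x_tr, y_tr = [1.0, 2.0, 3.0, 4.0], [1.5, 1.0, 4.5, 3.0]
x_te, y_te = [1.0, 2.0, 3.0], [0.5, 1.5, 2.0]

scale = 1000.0  # arbitrary rescaling of the target
errs = holdout_errors(x_tr, y_tr, x_te, y_te, fracs)
errs_scaled = holdout_errors(x_tr, [scale * v for v in y_tr],
                             x_te, [scale * v for v in y_te], fracs)

best, best_scaled = argmin(errs), argmin(errs_scaled)
print("best fraction, original target:", fracs[best])
print("best fraction, rescaled target:", fracs[best_scaled])
```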

How do I cite fracridge if I used it in a publication?

If you use fracridge, please cite our paper: “Fractional ridge regression: a fast, interpretable reparameterization of ridge regression” (2020) GigaScience, Volume 9, Issue 12, December 2020, https://doi.org/10.1093/gigascience/giaa133

For your convenience, here is the BibTeX entry:

@ARTICLE{fracridge2020,
title    = "Fractional ridge regression: a fast, interpretable
            reparameterization of ridge regression",
author   = "Rokem, Ariel and Kay, Kendrick",
journal  = "GigaScience",
volume   =  9,
number   =  12,
month    =  nov,
year     =  2020,
doi      = "10.1093/gigascience/giaa133"
}