Title: Generalized Elastic Nets
Description: Implements several extensions of the elastic net regularization scheme. These extensions include individual feature penalties for the L1 term, feature-feature penalties for the L2 term, as well as translation coefficients for the latter.
Authors: Artem Sokolov
Maintainer: Artem Sokolov <[email protected]>
License: GPL (>= 3)
Version: 1.3.0
Built: 2025-03-02 06:20:56 UTC
Source: https://github.com/artemsokolov/gelnet
Composition operator for GELnet model definition
## S3 method for class 'geldef' lhs + rhs
lhs: left-hand side of composition (current chain)
rhs: right-hand side of composition (new module)
Generates a graph Laplacian from the graph adjacency matrix.
adj2lapl(A)
A: n-by-n adjacency matrix for a graph with n nodes
A graph Laplacian is defined as:

L_ij = deg(v_i), if i = j;
L_ij = -1, if i ≠ j and v_i is adjacent to v_j;
L_ij = 0, otherwise.
The n-by-n Laplacian matrix of the graph
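The definition above can be reproduced directly in base R. This is an illustrative sketch for a 0/1 adjacency matrix, not the package's own adj2lapl() implementation:

```r
# Build the unnormalized graph Laplacian L = D - A from a 0/1 adjacency matrix.
lapl_sketch <- function(A) {
  deg <- rowSums(A)   # degree of each node
  L <- -A             # off-diagonal entries: -1 for adjacent node pairs
  diag(L) <- deg      # diagonal entries: node degrees
  L
}

# Path graph on 3 nodes: 1 - 2 - 3
A <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3, byrow = TRUE)
L <- lapl_sketch(A)
```

Each row of a graph Laplacian sums to zero, which is a quick sanity check on the result.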
Generates a normalized graph Laplacian from the graph adjacency matrix.
adj2nlapl(A)
A: n-by-n adjacency matrix for a graph with n nodes
A normalized graph Laplacian is defined as:

L_ij = 1, if i = j and deg(v_i) ≠ 0;
L_ij = -1 / sqrt(deg(v_i) deg(v_j)), if i ≠ j and v_i is adjacent to v_j;
L_ij = 0, otherwise.
The n-by-n Laplacian matrix of the graph
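The normalized variant can likewise be sketched in base R; this assumes a 0/1 adjacency matrix and is not the package's adj2nlapl() itself:

```r
# Build the normalized Laplacian I - D^{-1/2} A D^{-1/2}, guarding isolated nodes.
nlapl_sketch <- function(A) {
  deg <- rowSums(A)
  s <- ifelse(deg > 0, 1 / sqrt(deg), 0)   # entries of D^{-1/2}
  L <- -A * outer(s, s)                    # off-diagonal: -1/sqrt(deg_i * deg_j)
  diag(L) <- as.numeric(deg > 0)           # 1 on the diagonal for non-isolated nodes
  L
}

# Path graph on 3 nodes: 1 - 2 - 3
A <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3, byrow = TRUE)
Ln <- nlapl_sketch(A)
```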
Defines initial values for model weights and the bias term
gel_init(w_init = NULL, b_init = NULL)
w_init: p-by-1 vector of initial weight values
b_init: scalar, initial value for the bias term
If an initializer is NULL, the values are computed automatically during training
An initializer that can be combined with a model definition using the + operator
Starting building block for defining a GELnet model
gelnet(X)
X: n-by-p matrix of n samples in p dimensions
A GELnet model definition
Evaluates the logistic regression objective function value for a given model. See details.
gelnet_blr_obj(w, b, X, y, l1, l2, balanced = FALSE, d = NULL, P = NULL, m = NULL)
w: p-by-1 vector of model weights
b: the model bias term
X: n-by-p matrix of n samples in p dimensions
y: n-by-1 binary response vector sampled from {0, 1}
l1: L1-norm penalty scaling factor
l2: L2-norm penalty scaling factor
balanced: boolean specifying whether the balanced model is being evaluated
d: p-by-1 vector of feature weights
P: p-by-p feature-feature penalty matrix
m: p-by-1 vector of translation coefficients
Computes the objective function value according to

-1/n Σ_i [ y_i s_i - log(1 + exp(s_i)) ] + λ1 Σ_j d_j |w_j| + (λ2/2) (w - m)^T P (w - m)

where s_i = w^T x_i + b.

When balanced is TRUE, the loss average over the entire data is replaced with averaging over each class separately. The total loss is then computed as the mean over those per-class estimates.
The objective function value.
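The objective above is straightforward to compute in base R. The following is an illustrative re-implementation under the formula given here, not the package's gelnet_blr_obj() itself:

```r
# Negative average log-likelihood plus the GELnet L1/L2 penalties.
blr_obj_sketch <- function(w, b, X, y, l1, l2,
                           d = rep(1, length(w)),
                           P = diag(length(w)),
                           m = rep(0, length(w))) {
  s <- as.vector(X %*% w) + b
  loss <- -mean(y * s - log(1 + exp(s)))
  penalty <- l1 * sum(d * abs(w)) +
    (l2 / 2) * as.numeric(t(w - m) %*% P %*% (w - m))
  loss + penalty
}

set.seed(1)
X <- matrix(rnorm(20), nrow = 5)
y <- c(0, 1, 1, 0, 1)
obj <- blr_obj_sketch(rep(0, 4), 0, X, y, l1 = 0.1, l2 = 1)
```

At w = 0, b = 0 the penalties vanish and every s_i = 0, so the value reduces to log(2).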
Constructs a GELnet model for logistic regression using the Newton method.
gelnet_blr_opt(X, y, l1, l2, max_iter = 100L, eps = 1e-05, silent = FALSE, verbose = FALSE, balanced = FALSE, nonneg = FALSE, w_init = NULL, b_init = NULL, d = NULL, P = NULL, m = NULL)
X: n-by-p matrix of n samples in p dimensions
y: n-by-1 vector of binary response labels (must be in {0, 1})
l1: coefficient for the L1-norm penalty
l2: coefficient for the L2-norm penalty
max_iter: maximum number of iterations
eps: convergence precision
silent: set to TRUE to suppress run-time output to stdout; overrides verbose (default: FALSE)
verbose: set to TRUE to see extra output; is overridden by silent (default: FALSE)
balanced: boolean specifying whether the balanced model is being trained
nonneg: set to TRUE to enforce non-negativity constraints on the weights (default: FALSE)
w_init: initial parameter estimate for the weights
b_init: initial parameter estimate for the bias term
d: p-by-1 vector of feature weights
P: p-by-p feature association penalty matrix
m: p-by-1 vector of translation coefficients
The method operates by constructing iteratively re-weighted least squares approximations of the log-likelihood loss function and then calling the linear regression routine to solve those approximations. The least squares approximations are obtained via the Taylor series expansion about the current parameter estimates.
A list with two elements:

w: p-by-1 vector of p model weights
b: scalar, bias term for the linear model

See also: gelnet.lin
Performs k-fold cross-validation to select the best pair of the L1- and L2-norm penalty values.
gelnet_cv(X, y, nL1, nL2, nFolds = 5, a = rep(1, n), d = rep(1, p), P = diag(p), m = rep(0, p), max.iter = 100, eps = 1e-05, w.init = rep(0, p), b.init = 0, fix.bias = FALSE, silent = FALSE, balanced = FALSE)
X: n-by-p matrix of n samples in p dimensions
y: n-by-1 vector of response values. Must be a numeric vector for regression, a factor with 2 levels for binary classification, or NULL for a one-class task.
nL1: number of values to consider for the L1-norm penalty
nL2: number of values to consider for the L2-norm penalty
nFolds: number of cross-validation folds (default: 5)
a: n-by-1 vector of sample weights (regression only)
d: p-by-1 vector of feature weights
P: p-by-p feature association penalty matrix
m: p-by-1 vector of translation coefficients
max.iter: maximum number of iterations
eps: convergence precision
w.init: initial parameter estimate for the weights
b.init: initial parameter estimate for the bias term
fix.bias: set to TRUE to prevent the bias term from being updated (regression only) (default: FALSE)
silent: set to TRUE to suppress run-time output to stdout (default: FALSE)
balanced: boolean specifying whether the balanced model is being trained (binary classification only) (default: FALSE)
Cross-validation is performed on a grid of parameter values. The user specifies the number of values to consider for both the L1- and the L2-norm penalties. The L1 grid values are equally spaced on [0, L1s], where L1s is the smallest value of the L1-norm penalty at which all the model weights become zero. The L2 grid values are on a logarithmic scale centered on 1.
A list with the following elements:

l1: the best value of the L1-norm penalty
l2: the best value of the L2-norm penalty
w: p-by-1 vector of p model weights associated with the best (l1, l2) pair
b: scalar, bias term for the linear model associated with the best (l1, l2) pair (omitted for one-class models)
perf: performance value associated with the best model (likelihood of data for one-class, AUC for binary classification, and -RMSE for regression)
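The grid described above can be sketched in base R. The exact spacing used internally by gelnet_cv is an assumption here; this only illustrates the stated design (L1 equally spaced on [0, L1s], L2 log-spaced around 1):

```r
# Hypothetical reconstruction of the cross-validation parameter grid.
L1s <- 0.5   # e.g., obtained from L1_ceiling()
nL1 <- 5
nL2 <- 5

grid_l1 <- seq(0, L1s, length.out = nL1)                       # linear scale
grid_l2 <- 2^seq(-(nL2 %/% 2), nL2 %/% 2, length.out = nL2)    # log scale, centered on 1
```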
Evaluates the linear regression objective function value for a given model. See details.
gelnet_lin_obj(w, b, X, z, l1, l2, a = NULL, d = NULL, P = NULL, m = NULL)
w: p-by-1 vector of model weights
b: the model bias term
X: n-by-p matrix of n samples in p dimensions
z: n-by-1 response vector
l1: L1-norm penalty scaling factor
l2: L2-norm penalty scaling factor
a: n-by-1 vector of sample weights
d: p-by-1 vector of feature weights
P: p-by-p feature-feature penalty matrix
m: p-by-1 vector of translation coefficients
Computes the objective function value according to

1/(2n) Σ_i a_i (z_i - (w^T x_i + b))^2 + λ1 Σ_j d_j |w_j| + (λ2/2) (w - m)^T P (w - m)

where n is the number of samples.
The objective function value.
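The linear objective can be evaluated directly in base R; this is an illustrative re-implementation of the formula above, not the package's gelnet_lin_obj() itself:

```r
# Weighted squared-error loss plus the GELnet L1/L2 penalties.
lin_obj_sketch <- function(w, b, X, z, l1, l2,
                           a = rep(1, nrow(X)),
                           d = rep(1, length(w)),
                           P = diag(length(w)),
                           m = rep(0, length(w))) {
  r <- z - (as.vector(X %*% w) + b)
  sum(a * r^2) / (2 * nrow(X)) +
    l1 * sum(d * abs(w)) +
    (l2 / 2) * as.numeric(t(w - m) %*% P %*% (w - m))
}

X <- matrix(c(1, 2, 3, 4), nrow = 2)
z <- c(1, -1)
obj0 <- lin_obj_sketch(rep(0, 2), 0, X, z, l1 = 0.1, l2 = 1)
```

At w = 0, b = 0 the penalties vanish and the value is just the mean squared response over 2, here (1 + 1)/4 = 0.5.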
Constructs a GELnet model for linear regression using coordinate descent.
gelnet_lin_opt(X, z, l1, l2, max_iter = 100L, eps = 1e-05, fix_bias = FALSE, silent = FALSE, verbose = FALSE, nonneg = FALSE, w_init = NULL, b_init = NULL, a = NULL, d = NULL, P = NULL, m = NULL)
X: n-by-p matrix of n samples in p dimensions
z: n-by-1 vector of response values
l1: coefficient for the L1-norm penalty
l2: coefficient for the L2-norm penalty
max_iter: maximum number of iterations
eps: convergence precision
fix_bias: set to TRUE to prevent the bias term from being updated (default: FALSE)
silent: set to TRUE to suppress run-time output; overrides verbose (default: FALSE)
verbose: set to TRUE to see extra output; is overridden by silent (default: FALSE)
nonneg: set to TRUE to enforce non-negativity constraints on the weights (default: FALSE)
w_init: initial parameter estimate for the weights
b_init: initial parameter estimate for the bias term
a: n-by-1 vector of sample weights
d: p-by-1 vector of feature weights
P: p-by-p feature association penalty matrix
m: p-by-1 vector of translation coefficients
The method operates through cyclical coordinate descent. The optimization is terminated after the desired tolerance is achieved, or after a maximum number of iterations.
A list with two elements:

w: p-by-1 vector of p model weights
b: scalar, bias term for the linear model
Evaluates the one-class objective function value for a given model. See details.
gelnet_oclr_obj(w, X, l1, l2, d = NULL, P = NULL, m = NULL)
w: p-by-1 vector of model weights
X: n-by-p matrix of n samples in p dimensions
l1: L1-norm penalty scaling factor
l2: L2-norm penalty scaling factor
d: p-by-1 vector of feature weights
P: p-by-p feature-feature penalty matrix
m: p-by-1 vector of translation coefficients
Computes the objective function value according to

-1/n Σ_i [ s_i - log(1 + exp(s_i)) ] + λ1 Σ_j d_j |w_j| + (λ2/2) (w - m)^T P (w - m)

where s_i = w^T x_i.
The objective function value.
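As with the other objectives, this is easy to verify in base R. The following is an illustrative re-implementation of the formula above, not the package's gelnet_oclr_obj() itself:

```r
# One-class negative average log-likelihood plus the GELnet L1/L2 penalties.
oclr_obj_sketch <- function(w, X, l1, l2,
                            d = rep(1, length(w)),
                            P = diag(length(w)),
                            m = rep(0, length(w))) {
  s <- as.vector(X %*% w)
  -mean(s - log(1 + exp(s))) +
    l1 * sum(d * abs(w)) +
    (l2 / 2) * as.numeric(t(w - m) %*% P %*% (w - m))
}

set.seed(1)
X <- matrix(rnorm(12), nrow = 3)
obj0 <- oclr_obj_sketch(rep(0, 4), X, l1 = 0.1, l2 = 1)
```

At w = 0 every s_i = 0 and the penalties vanish, so the value reduces to log(2) regardless of the data.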
Constructs a GELnet model for one-class regression using the Newton method.
gelnet_oclr_opt(X, l1, l2, max_iter = 100L, eps = 1e-05, silent = FALSE, verbose = FALSE, nonneg = FALSE, w_init = NULL, d = NULL, P = NULL, m = NULL)
X: n-by-p matrix of n samples in p dimensions
l1: coefficient for the L1-norm penalty
l2: coefficient for the L2-norm penalty
max_iter: maximum number of iterations
eps: convergence precision
silent: set to TRUE to suppress run-time output to stdout; overrides verbose (default: FALSE)
verbose: set to TRUE to see extra output; is overridden by silent (default: FALSE)
nonneg: set to TRUE to enforce non-negativity constraints on the weights (default: FALSE)
w_init: initial parameter estimate for the weights
d: p-by-1 vector of feature weights
P: p-by-p feature association penalty matrix
m: p-by-1 vector of translation coefficients
The function optimizes the following objective:

-1/n Σ_i [ s_i - log(1 + exp(s_i)) ] + λ1 Σ_j d_j |w_j| + (λ2/2) (w - m)^T P (w - m)

where s_i = w^T x_i.
The method operates by constructing iteratively re-weighted least squares approximations of the log-likelihood loss function and then calling the linear regression routine to solve those approximations. The least squares approximations are obtained via the Taylor series expansion about the current parameter estimates.
A list with one element:

w: p-by-1 vector of p model weights
Trains a model on the definition constructed by gelnet()
gelnet_train(modeldef, max_iter = 100L, eps = 1e-05, silent = FALSE, verbose = FALSE)
modeldef: model definition constructed through gelnet() arithmetic
max_iter: maximum number of iterations
eps: convergence precision
silent: set to TRUE to suppress run-time output to stdout; overrides verbose (default: FALSE)
verbose: set to TRUE to see extra output; is overridden by silent (default: FALSE)
The training is performed through cyclical coordinate descent, and the optimization is terminated after the desired tolerance is achieved or after a maximum number of iterations.
A GELnet model, expressed as a list with two elements:

w: p-by-1 vector of p model weights
b: scalar, bias term for the linear model (omitted for one-class models)
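Putting the pieces together: a model definition is built by composing gelnet() with a task and regularizers via the + operator, then passed to gelnet_train(). This sketch assumes the gelnet package is installed and uses only the function signatures documented above:

```r
library(gelnet)

# Simulated data: 100 samples in 20 dimensions with a numeric response.
set.seed(42)
X <- matrix(rnorm(100 * 20), nrow = 100)
y <- rnorm(100)

# Compose a linear-regression task with L1 and L2 regularizers, then train.
mdef <- gelnet(X) + model_lin(y) + rglz_L1(0.1) + rglz_L2(1)
fit <- gelnet_train(mdef, max_iter = 100L, eps = 1e-5, silent = TRUE)

# fit$w holds the p-by-1 weight vector; fit$b the bias term.
```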
Infers the problem type and learns the appropriate kernel model.
gelnet.ker(K, y, lambda, a, max.iter = 100, eps = 1e-05, v.init = rep(0, nrow(K)), b.init = 0, fix.bias = FALSE, silent = FALSE, balanced = FALSE)
K: n-by-n matrix of pairwise kernel values over a set of n samples
y: n-by-1 vector of response values. Must be a numeric vector for regression, a factor with 2 levels for binary classification, or NULL for a one-class task.
lambda: scalar, regularization parameter
a: n-by-1 vector of sample weights (regression only)
max.iter: maximum number of iterations (binary classification and one-class problems only)
eps: convergence precision (binary classification and one-class problems only)
v.init: initial parameter estimate for the kernel weights (binary classification and one-class problems only)
b.init: initial parameter estimate for the bias term (binary classification only)
fix.bias: set to TRUE to prevent the bias term from being updated (regression only) (default: FALSE)
silent: set to TRUE to suppress run-time output to stdout (default: FALSE)
balanced: boolean specifying whether the balanced model is being trained (binary classification only) (default: FALSE)
The entries in the kernel matrix K can be interpreted as dot products K_ij = (φ(x_i), φ(x_j)) in some feature space φ. The corresponding weight vector can be retrieved via w = Σ_i v_i φ(x_i). However, new samples can be classified without explicit access to the underlying feature space: f(x) = Σ_i v_i K(x_i, x) + b.

The method determines the problem type from the labels argument y. If y is a numeric vector, then a ridge regression model is trained by optimizing the following objective function:

1/(2n) Σ_i a_i (y_i - (w, φ(x_i)) - b)^2 + (λ/2) (w, w)

If y is a factor with two levels, then the function returns a binary classification model, obtained by optimizing the following objective function:

-1/n Σ_i [ y_i s_i - log(1 + exp(s_i)) ] + (λ/2) (w, w), where s_i = (w, φ(x_i)) + b

Finally, if no labels are provided (y == NULL), then a one-class model is constructed using the following objective function:

-1/n Σ_i [ s_i - log(1 + exp(s_i)) ] + (λ/2) (w, w), where s_i = (w, φ(x_i))

In all cases, w = Σ_i v_i φ(x_i), and the method solves for v.
A list with two elements:

v: n-by-1 vector of kernel weights
b: scalar, bias term for the linear model (omitted for one-class models)
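The prediction rule f(x) = Σ_i v_i K(x_i, x) + b can be demonstrated in base R with a linear kernel, where it coincides with the explicit weight vector w = Σ_i v_i x_i. The kernel weights v and bias b below are made-up values standing in for gelnet.ker() output:

```r
# Three 2-dimensional training samples (rows).
X <- matrix(c(1, 0,
              0, 1,
              1, 1), nrow = 3, byrow = TRUE)
v <- c(0.5, -0.25, 0.1)   # hypothetical kernel weights
b <- 0.2                  # hypothetical bias term

# Score a new sample via kernel evaluations (linear kernel: K(x_i, x) = x_i . x).
x_new <- c(2, 1)
k_new <- as.vector(X %*% x_new)
f <- sum(v * k_new) + b

# Equivalently, recover the explicit weight vector and score directly.
w <- as.vector(t(X) %*% v)
f_explicit <- sum(w * x_new) + b
```

Both routes give the same score, which is the point of the kernel formulation: only the second requires access to the feature space.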
Computes the smallest value of the LASSO coefficient L1 that leads to an all-zero weight vector for a given gelnet model
L1_ceiling(modeldef)
modeldef: model definition constructed through gelnet() arithmetic
The cyclic coordinate descent updates the j-th model weight using a soft-threshold operator S(·, λ1 d_j) that clips the value of the weight to zero whenever the absolute value of the first argument falls below λ1 d_j. From here, it is straightforward to compute the smallest value of λ1 such that all weights are clipped to zero.
The largest meaningful value of the L1 parameter (i.e., the smallest value that yields a model with all zero weights)
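The soft-threshold operator and the resulting ceiling can be sketched in base R. This is an illustration of the logic described above for a plain linear model (unit feature weights, zero bias), not the package's L1_ceiling() implementation:

```r
# Soft threshold: clips x to zero whenever |x| falls below lambda.
soft_threshold <- function(x, lambda) sign(x) * pmax(abs(x) - lambda, 0)

# For linear regression at w = 0 (bias fixed at 0), the coordinate update's
# first argument is the per-feature gradient magnitude; the ceiling is its max.
X <- matrix(c(1, 2, 0, 1, -1, 3), nrow = 3)
z <- c(1, 0, -1)
g <- abs(t(X) %*% z) / nrow(X)   # coordinate-wise gradient magnitude at w = 0
l1_max <- max(g)                 # smallest l1 that clips every weight to zero
```

Any l1 at or above l1_max yields an all-zero weight vector, which is why it is the largest meaningful grid value.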
Defines a binary logistic regression task
model_blr(y, nonneg = FALSE, balanced = FALSE)
y: n-by-1 factor with two levels
nonneg: set to TRUE to enforce non-negativity constraints on the weights (default: FALSE)
balanced: boolean specifying whether the balanced model is being trained (default: FALSE)
The binary logistic regression objective function is defined as

-1/n Σ_i [ y_i s_i - log(1 + exp(s_i)) ]

where s_i = w^T x_i + b.
A GELnet task definition that can be combined with gelnet() output
Defines a linear regression task
model_lin(y, a = NULL, nonneg = FALSE, fix_bias = FALSE)
y: n-by-1 numeric vector of response values
a: n-by-1 vector of sample weights
nonneg: set to TRUE to enforce non-negativity constraints on the weights (default: FALSE)
fix_bias: set to TRUE to prevent the bias term from being updated (default: FALSE)
The objective function is given by

1/(2n) Σ_i a_i (y_i - (w^T x_i + b))^2
A GELnet task definition that can be combined with gelnet() output
Defines a one-class logistic regression (OCLR) task
model_oclr(nonneg = FALSE)
nonneg: set to TRUE to enforce non-negativity constraints on the weights (default: FALSE)
The OCLR objective function is defined as

-1/n Σ_i [ s_i - log(1 + exp(s_i)) ]

where s_i = w^T x_i.
A GELnet task definition that can be combined with gelnet() output
Given a linear model, perturbs its i^th coefficient by delta
perturb.gelnet(model, i, delta)
model: the model to perturb
i: index of the coefficient to modify, or 0 for the bias term
delta: the value to perturb by
Modified GELnet model
Defines an L1 regularizer with optional per-feature weights
rglz_L1(l1, d = NULL)
l1: coefficient for the L1-norm penalty
d: p-by-1 vector of feature weights
The L1 regularization term is defined by

λ1 Σ_j d_j |w_j|
A regularizer definition that can be combined with a model definition using the + operator
Defines an L2 regularizer with optional feature-feature penalties and translation coefficients
rglz_L2(l2, P = NULL, m = NULL)
l2: coefficient for the L2-norm penalty
P: p-by-p feature association penalty matrix
m: p-by-1 vector of translation coefficients
The L2 regularizer term is defined by

(λ2/2) (w - m)^T P (w - m)
A regularizer definition that can be combined with a model definition using the + operator
Defines an L1 regularizer that results in the desired number of non-zero feature weights
rglz_nf(nFeats, d = NULL)
nFeats: desired number of features with non-zero weights in the model
d: p-by-1 vector of feature weights
The corresponding regularization coefficient is determined through binary search.
A regularizer definition that can be combined with a model definition using the + operator