Obtain polynomial representation
nn2poly.Rd
Implements the main NN2Poly algorithm to obtain a polynomial representation of a trained neural network, using its weights and the Taylor expansion of its activation functions.
Usage
nn2poly(
  object,
  max_order = 2,
  keep_layers = FALSE,
  taylor_orders = 8,
  ...,
  all_partitions = NULL
)
Arguments
- object
An object for which the computation of the NN2Poly algorithm is desired. Currently supports models from the following deep learning frameworks:
- tensorflow/keras models built as a sequential model.
- torch/luz models built as a sequential model.
It also supports a named list as input, which allows a model from any other source to be introduced by hand. This list should be of length L (number of hidden layers + 1) and contain the weights matrix for each layer. Each element of the list should be named after the activation function used at that layer. Currently supported activation functions are "tanh", "softplus", "sigmoid" and "linear".
At any layer \(l\), the expected shape of such matrices is \((h_{l-1} + 1) \times h_l\), that is, the number of rows is the number of neurons in the previous layer plus one for the bias, and the number of columns is the number of neurons in the current layer \(l\). Therefore, each column corresponds to the weight vector affecting one neuron in that layer, with the bias in the first row. A minimal sketch of this list format is given after this argument list.
- max_order
integer that determines the maximum order that will be forced in the final polynomial, discarding terms of higher order that would naturally arise when considering all the Taylor expansions allowed by taylor_orders.
- keep_layers
Boolean that determines whether all polynomials computed in the internal layers should be stored and returned in the output (TRUE), or only the polynomials from the last layer (FALSE). Default set to FALSE.
- taylor_orders
integer or vector of length L that sets the degree at which the Taylor expansion is truncated at each layer. If a single value is given, it is used for every nonlinear activation function, and 1 is used for linear activation functions (see the sketch after this argument list). Default set to 8.
- ...
Ignored.
- all_partitions
Optional argument containing the needed multipartitions as a list of lists of lists. If set to NULL, nn2poly will compute said multipartitions. This step can be computationally expensive when the chosen polynomial order or the dimension is too high; in such cases, it is encouraged that the multipartitions be stored and reused whenever possible. Default set to NULL.
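As a minimal sketch of the named-list input and of a per-layer taylor_orders vector (the weights are random and the layer sizes are illustrative assumptions, not requirements):

# One hidden layer with 2 inputs and 3 "tanh" neurons, plus a "linear" output.
# Rows: bias first, then one row per input; columns: one per neuron.
w1 <- matrix(rnorm(9), nrow = 3, ncol = 3)   # (2 inputs + bias) x 3 neurons
w2 <- matrix(rnorm(4), nrow = 4, ncol = 1)   # (3 neurons + bias) x 1 output
nn <- list("tanh" = w1, "linear" = w2)

# Truncate the Taylor expansion at order 8 for the tanh layer and at
# order 1 for the linear output layer (a vector of length L = 2)
poly <- nn2poly(nn, max_order = 2, taylor_orders = c(8, 1))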
Value
Returns an object of class nn2poly.
If keep_layers = FALSE (default case), it returns a list with two
items:
- An item named labels that is a list of integer vectors. Each vector represents a monomial in the polynomial, where each integer in the vector denotes one appearance of one of the original variables in that term. As an example, the vector c(1,1,2) represents the term \(x_1^2 x_2\). Note that the variables are numbered from 1 to p, and the intercept is represented by 0.
- An item named values which contains a matrix in which each column contains the coefficients of the polynomial associated with an output neuron. That is, if the neural network has a single output unit, the matrix values will have a single column, and if it has multiple output units, the matrix values will have several columns. Each row holds the coefficient associated with the label in the same position in the labels list (a short reading sketch is given at the end of this section).
If keep_layers = TRUE, it returns a list whose length is the number of layers (each element named layer_i), where each element is another list with input and output elements. Each of those elements contains the items explained above. The output item of the last layer is the same element returned when keep_layers = FALSE.
The polynomials obtained at the hidden layers are not needed to represent the NN, but they can be used to explore further insights into its behavior.
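For instance, a short sketch of reading the returned object (the index 2 is arbitrary, and final_poly is assumed to be an object returned by nn2poly as in the Examples below):

final_poly$labels[[2]]   # integer vector naming one monomial, e.g. c(1, 1) is x_1^2
final_poly$values[2, ]   # coefficient(s) of that monomial, one per output neuron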
See also
Predict method for nn2poly output: predict.nn2poly().
Examples
# Build a NN structure with random weights, with 2 (+ bias) inputs,
# 4 (+ bias) neurons in the first hidden layer with "tanh" activation
# function, 4 (+ bias) neurons in the second hidden layer with "softplus",
# and 1 "linear" output unit
weights_layer_1 <- matrix(rnorm(12), nrow = 3, ncol = 4)
weights_layer_2 <- matrix(rnorm(20), nrow = 5, ncol = 4)
weights_layer_3 <- matrix(rnorm(5), nrow = 5, ncol = 1)
# Set it as a list with activation functions as names
nn_object <- list("tanh" = weights_layer_1,
                  "softplus" = weights_layer_2,
                  "linear" = weights_layer_3)
# Obtain the polynomial representation (order = 3) of that neural network
final_poly <- nn2poly(nn_object, max_order = 3)
# Change the last layer to have 3 outputs (as in a multiclass
# classification problem)
weights_layer_4 <- matrix(rnorm(15), nrow = 5, ncol = 3)
# Set it as a list with activation functions as names
nn_object <- list("tanh" = weights_layer_1,
                  "softplus" = weights_layer_2,
                  "linear" = weights_layer_4)
# Obtain the polynomial representation of that neural network
# In this case the output is formed by several polynomials with the same
# structure but different coefficient values
final_poly <- nn2poly(nn_object, max_order = 3)
# Polynomial representation of each hidden neuron is given by
final_poly <- nn2poly(nn_object, max_order = 3, keep_layers = TRUE)
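# A hedged sketch: with keep_layers = TRUE, the polynomials at each layer
# can be inspected through names of the form layer_i (see Value above)
final_poly$layer_1$input    # polynomial(s) entering the first hidden layer
final_poly$layer_1$output   # polynomial(s) leaving it

# The resulting polynomial can be evaluated on new data with
# predict.nn2poly(); newdata is assumed here to be a matrix with one
# column per input variable (p = 2 in this example)
newdata <- matrix(rnorm(10 * 2), ncol = 2)
predict(final_poly, newdata)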