Documentation¶
API documentation for PyDeep.
pydeep¶
Root package directory containing all subpackages of the library.
Version: | 1.1.0 |
---|---|
Date: | 19.03.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
ae¶
Module initializer includes all sub-modules for the autoencoder module.
Version: | 1.0 |
---|---|
Date: | 21.01.2018 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2018 Jan Melchior This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
model¶
This module provides a general implementation of a 3-layer tied-weights auto-encoder (x-h-y). The code focuses on readability and clarity while keeping efficiency and flexibility high. Several activation functions are available for visible and hidden units and can be mixed arbitrarily. The code can easily be adapted to AEs without tied weights; for deep AEs the FFN code can be adapted.
Implemented: |
|
---|---|
Info: | http://ufldl.stanford.edu/wiki/index.php/Sparse_Coding:_Autoencoder_Interpretation |
Version: | 1.0 |
Date: | 08.02.2016 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2016 Jan Melchior This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
AutoEncoder¶
-
class
pydeep.ae.model.
AutoEncoder
(number_visibles, number_hiddens, data=None, visible_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, hidden_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, cost_function=<class 'pydeep.base.costfunction.CrossEntropyError'>, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ Class for a 3 Layer Auto-encoder (x-h-y) with tied weights.
-
_AutoEncoder__get_sparse_penalty_gradient_part
(h, desired_sparseness)¶ This function computes the desired part of the gradient for the sparse penalty term. Only used for efficiency.
Parameters: - h: hidden activations
-type: numpy array [num samples, hidden dim]
- desired_sparseness: Desired average hidden activation.
-type: float
Returns: The computed gradient part.
-type: numpy array [1, hidden dim]
-
__init__
(number_visibles, number_hiddens, data=None, visible_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, hidden_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, cost_function=<class 'pydeep.base.costfunction.CrossEntropyError'>, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.
Parameters: - number_visibles: Number of the visible variables.
-type: int
- number_hiddens: Number of hidden variables.
-type: int
- data: The training data for parameter
initialization if ‘AUTO’ is chosen.
- -type: None or
numpy array [num samples, input dim] or List of numpy arrays [num samples, input dim]
- visible_activation_function: A non linear transformation function
for the visible units (default: Sigmoid)
-type: Subclass of ActivationFunction()
- hidden_activation_function: A non linear transformation function
for the hidden units (default: Sigmoid)
-type: Subclass of ActivationFunction
- cost_function: A cost function (default: CrossEntropyError())
-type: Subclass of FNNCostFunction()
- initial_weights: Initial weights. ‘AUTO’ is random.
- -type: ‘AUTO’, scalar or
numpy array [input dim, output_dim]
- initial_visible_bias: Initial visible bias.
‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of
the visible mean.
- -type: ‘AUTO’,’INVERSE_SIGMOID’, scalar or
numpy array [1, input dim]
- initial_hidden_bias: Initial hidden bias.
‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean.
- -type: ‘AUTO’,’INVERSE_SIGMOID’, scalar or
numpy array [1, output_dim]
- initial_visible_offsets: Initial visible mean values.
AUTO = data mean, or 0.5 if no data is given.
- -type: ‘AUTO’, scalar or
numpy array [1, input dim]
- initial_hidden_offsets: Initial hidden mean values.
AUTO = 0.5
- -type: ‘AUTO’, scalar or
numpy array [1, output_dim]
- dtype: Used data type, e.g. numpy.float64.
- -type: numpy.float32 or numpy.float64 or
numpy.longdouble
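The tied-weights structure documented above can be sketched in plain NumPy. This is an illustrative sketch only (variable names and the sample-major `xW` convention follow this page, not PyDeep's internal code): encoding computes h = f(xW + c), and decoding reuses the same weight matrix transposed, y = f(hWᵀ + b).

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
num_visibles, num_hiddens = 4, 3
W = rng.standard_normal((num_visibles, num_hiddens)) * 0.1  # tied weight matrix
b = np.zeros((1, num_visibles))  # visible bias
c = np.zeros((1, num_hiddens))   # hidden bias

x = rng.random((2, num_visibles))  # data: [num samples, input dim]
a_h = x.dot(W) + c                 # pre-synaptic hidden activation
h = sigmoid(a_h)                   # post-synaptic hidden activation (encode)
a_y = h.dot(W.T) + b               # tied weights: decoding uses W transposed
y = sigmoid(a_y)                   # reconstruction of x (decode)
```

With tied weights only one weight matrix is stored; adapting the sketch to untied weights means introducing a second, independent decoding matrix.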
-
_decode
(h)[source]¶ The function propagates the activation of the hidden layer backwards through the network to the input layer.
Parameters: - h: Output of the network
-type: numpy array [num samples, hidden dim]
Returns: Input of the network.
-type: array [num samples, input dim]
-
_encode
(x)[source]¶ The function propagates the activation of the input layer through the network to the hidden/output layer.
Parameters: - x: Input of the network.
-type: numpy array [num samples, input dim]
Returns: Pre- and post-synaptic output.
-type: List of arrays [num samples, hidden dim]
-
_get_contractive_penalty
(a_h, factor)[source]¶ Calculates contractive penalty cost for a data point x.
Parameters: - a_h: Pre-synaptic activation of h: a_h = (Wx+c).
-type: numpy array [num samples, hidden dim]
- factor: Influence factor (lambda) for the penalty.
-type: float
Returns: Contractive penalty costs for x.
-type: numpy array [num samples]
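One common formulation of the contractive penalty (Rifai et al., 2011) matches the inputs documented here: for each sample, λ · Σⱼ f′(a_hⱼ)² · ‖W₍:,j₎‖². The sketch below is illustrative, assumes Sigmoid hidden units, and is not PyDeep's exact implementation:

```python
import numpy as np

def sigmoid_df(a_h):
    # derivative of the Sigmoid applied to the pre-synaptic activation
    s = 1.0 / (1.0 + np.exp(-a_h))
    return s * (1.0 - s)

def contractive_penalty(a_h, W, factor):
    # per-sample penalty: factor * sum_j f'(a_h_j)^2 * ||W[:, j]||^2
    return factor * (sigmoid_df(a_h) ** 2).dot((W ** 2).sum(axis=0))

rng = np.random.default_rng(1)
W = rng.standard_normal((4, 3))      # [input dim, hidden dim]
a_h = rng.standard_normal((5, 3))    # [num samples, hidden dim]
costs = contractive_penalty(a_h, W, factor=0.1)  # one cost per sample
```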
-
_get_contractive_penalty_gradient
(x, a_h, df_a_h)[source]¶ This function computes the gradient for the contractive penalty term.
Parameters: - x: Training data.
-type: numpy array [num samples, input dim]
- a_h: Untransformed (pre-synaptic) hidden activations.
-type: numpy array [num samples, hidden dim]
- df_a_h: Derivative of the untransformed hidden activations.
-type: numpy array [num samples, hidden dim]
Returns: The computed gradient.
-type: numpy array [input dim, hidden dim]
-
_get_gradients
(x, a_h, h, a_y, y, reg_contractive, reg_sparseness, desired_sparseness, reg_slowness, x_next, a_h_next, h_next)[source]¶ Computes the gradients of weights, visible and the hidden bias. Depending on whether contractive penalty and or sparse penalty is used the gradient changes.
Parameters: - x: Training data.
-type: numpy array [num samples, input dim]
- a_h: Pre-synaptic activation of h: a_h = (Wx+c).
-type: numpy array [num samples, output dim]
- h: Post-synaptic activation of h: h = f(a_h).
-type: numpy array [num samples, output dim]
- a_y: Pre-synaptic activation of y: a_y = (Wh+b).
-type: numpy array [num samples, input dim]
- y: Post-synaptic activation of y: y = f(a_y).
-type: numpy array [num samples, input dim]
- reg_contractive: Contractive influence factor (lambda).
-type: float
- reg_sparseness: Sparseness influence factor (lambda).
-type: float
- desired_sparseness: Desired average hidden activation.
-type: float
- reg_slowness: Slowness influence factor.
-type: float
- x_next: Next training data in the sequence.
-type: numpy array [num samples, input dim]
- a_h_next: Next pre-synaptic activation of h: a_h = (Wx+c).
-type: numpy array [num samples, output dim]
- h_next: Next post-synaptic activation of h: h = f(a_h).
-type: numpy array [num samples, output dim]
-
_get_slowness_penalty
(h, h_next, factor)[source]¶ Calculates the slowness penalty cost for a data point x.
Warning
Different penalties are used depending on the hidden activation function.
Parameters: - h: hidden activation.
-type: numpy array [num samples, hidden dim]
- h_next: hidden activation of the next data point in a sequence.
-type: numpy array [num samples, hidden dim]
- factor: Influence factor (beta) for the penalty.
-type: float
Returns: Slowness penalty costs for x.
-type: numpy array [num samples]
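A typical slowness penalty is the squared difference between consecutive hidden representations, β · ‖h − h_next‖² per sample. Note the warning above: the actual penalty used depends on the hidden activation function, so this is an illustrative sketch only:

```python
import numpy as np

def slowness_penalty(h, h_next, factor):
    # beta * ||h - h_next||^2 per sample: small when consecutive
    # hidden representations change slowly along the sequence
    return factor * ((h - h_next) ** 2).sum(axis=1)

h = np.array([[0.2, 0.8], [0.5, 0.5]])
h_next = np.array([[0.2, 0.6], [0.5, 0.5]])
costs = slowness_penalty(h, h_next, factor=1.0)  # [0.04, 0.0]
```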
-
_get_slowness_penalty_gradient
(x, x_next, h, h_next, df_a_h, df_a_h_next)[source]¶ This function computes the gradient for the slowness penalty term.
Parameters: - x: Training data.
-type: numpy array [num samples, input dim]
- x_next: Next training data points in Sequence.
-type: numpy array [num samples, input dim]
- h: Corresponding hidden activations.
-type: numpy array [num samples, output dim]
- h_next: Corresponding next hidden activations.
-type: numpy array [num samples, output dim]
- df_a_h: Derivative of the untransformed hidden activations.
-type: numpy array [num samples, output dim]
- df_a_h_next: Derivative of the untransformed next hidden activations.
-type: numpy array [num samples, output dim]
Returns: The computed gradient.
-type: numpy array [input dim, hidden dim]
-
_get_sparse_penalty
(h, factor, desired_sparseness)[source]¶ Calculates the sparseness penalty cost for a data point x.
Warning
Different penalties are used depending on the hidden activation function.
Parameters: - h: hidden activation.
-type: numpy array [num samples, hidden dim]
- factor: Influence factor (beta) for the penalty.
-type: float
- desired_sparseness: Desired average hidden activation.
-type: float
Returns: Sparseness penalty costs for x.
-type: numpy array [num samples]
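The UFLDL page linked above defines the sparseness penalty as a KL divergence between the desired average activation p and the observed mean activation of each hidden unit. The sketch below returns the batch total rather than PyDeep's per-sample form, and is illustrative only:

```python
import numpy as np

def kl_sparse_penalty(h, factor, desired_sparseness):
    p = desired_sparseness
    # mean activation per hidden unit, clipped for numerical stability
    q = np.clip(h.mean(axis=0), 1e-10, 1.0 - 1e-10)
    # KL(p || q) for Bernoulli distributions, summed over hidden units
    kl = p * np.log(p / q) + (1.0 - p) * np.log((1.0 - p) / (1.0 - q))
    return factor * kl.sum()

h = np.full((10, 3), 0.01)  # hidden units already at the desired activation
penalty = kl_sparse_penalty(h, factor=1.0, desired_sparseness=0.01)  # ~0
```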
-
_get_sparse_penalty_gradient
(h, df_a_h, desired_sparseness)[source]¶ This function computes the gradient for the sparse penalty term.
Parameters: - h: hidden activations
-type: numpy array [num samples, hidden dim]
- df_a_h: Derivative of the untransformed hidden activations.
-type: numpy array [num samples, hidden dim]
- desired_sparseness: Desired average hidden activation.
-type: float
Returns: The computed gradient part.
-type: numpy array [1, hidden dim]
-
decode
(h)[source]¶ The function propagates the activation of the hidden layer backwards through the network to the input layer.
Parameters: - h: Output of the network
-type: numpy array [num samples, hidden dim]
Returns: Pre- and post-synaptic input.
-type: List of arrays [num samples, input dim]
-
encode
(x)[source]¶ The function propagates the activation of the input layer through the network to the hidden/output layer.
Parameters: - x: Input of the network.
-type: numpy array [num samples, input dim]
Returns: Output of the network.
-type: array [num samples, hidden dim]
-
energy
(x, contractive_penalty=0.0, sparse_penalty=0.0, desired_sparseness=0.01, x_next=None, slowness_penalty=0.0)[source]¶ Calculates the energy/cost for a data point x.
Parameters: - x: Data points.
-type: numpy array [num samples, input dim]
- contractive_penalty: If a value > 0.0 is given, the
contractive penalty is included in the cost.
-type: float
- sparse_penalty: If a value > 0.0 is given, the
sparseness penalty is included in the cost.
-type: float
- desired_sparseness: Desired average hidden activation.
-type: float
- x_next: Next data points.
-type: None or numpy array [num samples, input dim]
- slowness_penalty: If a value > 0.0 is given, the
slowness penalty is included in the cost.
-type: float
Returns: Costs for x.
-type: numpy array [num samples]
-
finit_differences
(data, delta, reg_sparseness, desired_sparseness, reg_contractive, reg_slowness, data_next)[source]¶ Finite differences test for AEs. The finite differences test involves all functions of the model except init and reconstruction_error
Parameters: - data: The training data
- -type: numpy array [num samples, input dim]
- delta: The learning rate.
- -type: numpy array[num parameters]
- reg_sparseness: The parameter (epsilon) for the sparseness regularization.
- -type: float
- desired_sparseness: Desired average hidden activation.
- -type: float
- reg_contractive: The parameter (epsilon) for the contractive regularization.
- -type: float
- reg_slowness: The parameter (epsilon) for the slowness regularization.
- -type: float
- data_next: The next training data in the sequence.
- -type: numpy array [num samples, input dim]
-
reconstruction_error
(x, absolut=False)[source]¶ Calculates the reconstruction error for given training data.
Parameters: - x: Data points.
-type: numpy array [num samples, input dim]
- absolut: If True, the absolute error is calculated.
-type: bool
Returns: Reconstruction error.
-type: List of arrays [num samples, 1]
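The squared/absolute distinction controlled by the `absolut` flag can be sketched as a per-sample error between the data and its reconstruction (illustrative; the sketch takes the reconstruction as an explicit argument rather than running the model):

```python
import numpy as np

def reconstruction_error(x, y, absolut=False):
    # per-sample error between data x and reconstruction y:
    # sum of squared differences, or of absolute differences if absolut=True
    diff = y - x
    err = np.abs(diff) if absolut else diff ** 2
    return err.sum(axis=1)

x = np.array([[0.0, 1.0], [1.0, 1.0]])
y = np.array([[0.5, 1.0], [1.0, 0.0]])
squared = reconstruction_error(x, y)                 # [0.25, 1.0]
absolute = reconstruction_error(x, y, absolut=True)  # [0.5, 1.0]
```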
-
sae¶
Helper class for stacked auto-encoder networks.
Version: | 1.1.0 |
---|---|
Date: | 21.01.2018 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2018 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
SAE¶
-
class
pydeep.ae.sae.
SAE
(list_of_autoencoders)[source]¶ Stack of auto encoders.
-
__init__
(list_of_autoencoders)[source]¶ Initializes the network with auto encoders.
Parameters: list_of_autoencoders (list) – List of auto-encoders
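A stacked auto-encoder feeds each layer's hidden activation into the next auto-encoder in the list. The sketch below stands in for the stacked encode pass using plain (weight, bias) pairs in place of AutoEncoder objects; it is illustrative, not the SAE class's actual code:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sae_encode(x, weight_bias_pairs):
    # propagate the data upwards through the stack: the hidden
    # activation of each layer becomes the input of the next
    h = x
    for W, c in weight_bias_pairs:
        h = sigmoid(h.dot(W) + c)
    return h

rng = np.random.default_rng(2)
layers = [(rng.standard_normal((6, 4)), np.zeros((1, 4))),   # 6 -> 4
          (rng.standard_normal((4, 2)), np.zeros((1, 2)))]   # 4 -> 2
h_top = sae_encode(rng.random((3, 6)), layers)  # top-layer representation
```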
-
trainer¶
This module provides implementations for training different variants of auto-encoders. Modifications on standard gradient descent are provided (centering, denoising, dropout, sparseness, contractiveness, slowness, L1-decay, L2-decay, momentum, gradient restriction).
Implemented: |
|
---|---|
Info: | http://ufldl.stanford.edu/wiki/index.php/Sparse_Coding:_Autoencoder_Interpretation |
Version: | 1.0 |
Date: | 21.01.2018 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2018 Jan Melchior This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
GDTrainer¶
-
class
pydeep.ae.trainer.
GDTrainer
(model)[source]¶ Auto encoder trainer using gradient descent.
-
__init__
(model)[source]¶ The constructor takes the model as input.
Parameters: - model: An auto-encoder object which should be trained.
-type: AutoEncoder
-
_train
(data, epsilon, momentum, update_visible_offsets, update_hidden_offsets, corruptor, reg_L1Norm, reg_L2Norm, reg_sparseness, desired_sparseness, reg_contractive, reg_slowness, data_next, restrict_gradient, restriction_norm)[source]¶ The training for one batch is performed using gradient descent.
Parameters: - data: The training data
-type: numpy array [num samples, input dim]
- epsilon: The learning rate.
-type: numpy array[num parameters]
- momentum: The momentum term.
-type: numpy array[num parameters]
- update_visible_offsets: The update step size for the model's
visible offsets. A good value if this functionality is used: 0.001
-type: float
- update_hidden_offsets: The update step size for the model's hidden
offsets. A good value if this functionality is used: 0.001
-type: float
- corruptor: Defines if and how the data gets corrupted.
(e.g. Gauss noise, dropout, Max out)
-type: corruptor
- reg_L1Norm: The parameter for the L1 regularization
-type: float
- reg_L2Norm: The parameter for the L2 regularization,
also known as weight decay.
-type: float
- reg_sparseness: The parameter (epsilon) for the sparseness regularization.
-type: float
- desired_sparseness: Desired average hidden activation.
-type: float
- reg_contractive: The parameter (epsilon) for the contractive regularization.
-type: float
- reg_slowness: The parameter (epsilon) for the slowness regularization.
-type: float
- data_next: The next training data in the sequence.
-type: numpy array [num samples, input dim]
- restrict_gradient: If a scalar is given the norm of the
weight gradient is restricted to stay below this value.
-type: None, float
- restriction_norm: Restricts the column norm, row norm or
matrix norm.
-type: string: ‘Cols’,’Rows’, ‘Mat’
-
train
(data, num_epochs=1, epsilon=0.1, momentum=0.0, update_visible_offsets=0.0, update_hidden_offsets=0.0, corruptor=None, reg_L1Norm=0.0, reg_L2Norm=0.0, reg_sparseness=0.0, desired_sparseness=0.01, reg_contractive=0.0, reg_slowness=0.0, data_next=None, restrict_gradient=False, restriction_norm='Mat')[source]¶ Training is performed for the given number of epochs using gradient descent.
Parameters: - data: The data used for training.
- -type: list of numpy arrays
[num samples, input dimension]
- num_epochs: Number of epochs to train.
-type: int
- epsilon: The learning rate.
-type: numpy array[num parameters]
- momentum: The momentum term.
-type: numpy array[num parameters]
- update_visible_offsets: The update step size for the model's
visible offsets. A good value if this functionality is used: 0.001
-type: float
- update_hidden_offsets: The update step size for the model's hidden
offsets. A good value if this functionality is used: 0.001
-type: float
- corruptor: Defines if and how the data gets corrupted.
-type: corruptor
- reg_L1Norm: The parameter for the L1 regularization
-type: float
- reg_L2Norm: The parameter for the L2 regularization,
also known as weight decay.
-type: float
- reg_sparseness: The parameter (epsilon) for the sparseness regularization.
-type: float
- desired_sparseness: Desired average hidden activation.
-type: float
- reg_contractive: The parameter (epsilon) for the contractive regularization.
-type: float
- reg_slowness: The parameter (epsilon) for the slowness regularization.
-type: float
- data_next: The next training data in the sequence.
-type: numpy array [num samples, input dim]
- restrict_gradient: If a scalar is given the norm of the
weight gradient is restricted to stay below this value.
-type: None, float
- restriction_norm: Restricts the column norm, row norm or
matrix norm.
-type: string: ‘Cols’,’Rows’, ‘Mat’
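The interplay of momentum, L2-decay, and gradient restriction described above can be sketched for a single parameter update (an illustrative sketch, not GDTrainer's actual code; `restrict_norm` here uses the matrix norm, i.e. the ‘Mat’ option):

```python
import numpy as np

def restrict_norm(grad, max_norm):
    # scale the gradient down if its matrix (Frobenius) norm exceeds max_norm
    norm = np.linalg.norm(grad)
    return grad * (max_norm / norm) if norm > max_norm else grad

def gd_step(param, grad, velocity, epsilon, momentum, reg_L2Norm,
            restrict_gradient=None):
    grad = grad + reg_L2Norm * param          # L2 decay adds a pull toward zero
    if restrict_gradient is not None:
        grad = restrict_norm(grad, restrict_gradient)
    velocity = momentum * velocity - epsilon * grad  # momentum accumulates
    return param + velocity, velocity

W = np.ones((2, 2))
v = np.zeros((2, 2))
W, v = gd_step(W, np.full((2, 2), 0.5), v, epsilon=0.1, momentum=0.0,
               reg_L2Norm=0.0)  # plain step: W decreases by 0.1 * 0.5
```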
-
base¶
Package providing basic/fundamental functions/structures such as cost-functions, activation-functions, preprocessing …
Version: | 1.1.0 |
---|---|
Date: | 13.03.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
activationfunction¶
Different kinds of non-linear activation functions and their derivatives.
Implemented: |
---|
- # Unbounded
- # Linear
- Identity
- # Piecewise-linear
- Rectifier
- RestrictedRectifier (hard bounded)
- LeakyRectifier
- # Soft-linear
- ExponentialLinear
- SigmoidWeightedLinear
- SoftPlus
- # Bounded
- # Step
- Step
- # Soft-Step
- Sigmoid
- SoftSign
- HyperbolicTangent
- SoftMax
- K-Winner takes all
- # Symmetric, periodic
- Radial Basis function
- Sinus
Info: | |
---|---|
Version: | 1.1.1 |
Date: | 16.01.2018 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2018 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
Identity¶
-
class
pydeep.base.activationfunction.
Identity
[source]¶ Identity function.
Info: http://www.wolframalpha.com/input/?i=line -
classmethod
ddf
(x)[source]¶ Calculates the second derivative of the identity function value for a given input x.
Parameters: x (scalar or numpy array.) – Input data. Returns: Value of the second derivative of the identity function for x. Return type: scalar or numpy array with the same shape as x.
-
classmethod
df
(x)[source]¶ Calculates the derivative of the identity function value for a given input x.
Parameters: x (scalar or numpy array.) – Input data. Returns: Value of the derivative of the identity function for x. Return type: scalar or numpy array with the same shape as x.
-
classmethod
dg
(y)[source]¶ Calculates the derivative of the inverse identity function value for a given input y.
Parameters: y (scalar or numpy array.) – Input data. Returns: Value of the derivative of the inverse identity function for y. Return type: scalar or numpy array with the same shape as y.
Rectifier¶
-
class
pydeep.base.activationfunction.
Rectifier
[source]¶ Rectifier activation function.
Info: http://www.wolframalpha.com/input/?i=max%280%2Cx%29&dataset=&asynchronous=false&equal=Submit -
classmethod
ddf
(x)[source]¶ Calculates the second derivative of the Rectifier function value for a given input x.
Parameters: x (scalar or numpy array.) – Input data. Returns: Value of the 2nd derivative of the Rectifier function for x. Return type: scalar or numpy array with the same shape as x.
RestrictedRectifier¶
-
class
pydeep.base.activationfunction.
RestrictedRectifier
(restriction=1.0)[source]¶ Restricted Rectifier activation function.
Info: http://www.wolframalpha.com/input/?i=max%280%2Cx%29&dataset=&asynchronous=false&equal=Submit -
__init__
(restriction=1.0)[source]¶ Constructor.
Parameters: restriction (float.) – Restriction value / upper limit value.
-
LeakyRectifier¶
-
class
pydeep.base.activationfunction.
LeakyRectifier
(negativeSlope=0.01, positiveSlope=1.0)[source]¶ Leaky Rectifier activation function.
Info: https://en.wikipedia.org/wiki/Activation_function -
__init__
(negativeSlope=0.01, positiveSlope=1.0)[source]¶ Constructor.
Parameters: - negativeSlope (scalar) – Slope when x < 0
- positiveSlope (scalar) – Slope when x >= 0
-
ExponentialLinear¶
-
class
pydeep.base.activationfunction.
ExponentialLinear
(alpha=1.0)[source]¶ Exponential Linear activation function.
Info: https://en.wikipedia.org/wiki/Activation_function
SigmoidWeightedLinear¶
-
class
pydeep.base.activationfunction.
SigmoidWeightedLinear
(beta=1.0)[source]¶ Sigmoid weighted linear units (also named Swish)
Info: https://arxiv.org/pdf/1702.03118v1.pdf and for Swish: https://arxiv.org/pdf/1710.05941.pdf
SoftPlus¶
-
class
pydeep.base.activationfunction.
SoftPlus
[source]¶ Soft Plus function.
Info: http://www.wolframalpha.com/input/?i=log%28exp%28x%29%2B1%29 -
classmethod
ddf
(x)[source]¶ Calculates the second derivative of the SoftPlus function value for a given input x.
Parameters: x (scalar or numpy array) – Input data. Returns: Value of the 2nd derivative of the SoftPlus function for x. Return type: scalar or numpy array with the same shape as x.
-
classmethod
df
(x)[source]¶ Calculates the derivative of the SoftPlus function value for a given input x.
Parameters: x (scalar or numpy array.) – Input data. Returns: Value of the derivative of the SoftPlus function for x. Return type: scalar or numpy array with the same shape as x.
-
classmethod
dg
(y)[source]¶ Calculates the derivative of the inverse SoftPlus function value for a given input y.
Parameters: y (scalar or numpy array.) – Input data. Returns: Value of the derivative of the inverse SoftPlus function for y. Return type: scalar or numpy array with the same shape as y.
Step¶
-
class
pydeep.base.activationfunction.
Step
[source]¶ Step activation function.
-
classmethod
ddf
(x)[source]¶ Calculates the second derivative of the step function value for a given input x.
Parameters: x (scalar or numpy array.) – Input data. Returns: Value of the second derivative of the Step function for x. Return type: scalar or numpy array with the same shape as x.
Sigmoid¶
-
class
pydeep.base.activationfunction.
Sigmoid
[source]¶ Sigmoid function.
Info: http://www.wolframalpha.com/input/?i=sigmoid -
classmethod
ddf
(x)[source]¶ Calculates the second derivative of the Sigmoid function value for a given input x.
Parameters: x (scalar or numpy array.) – Input data. Returns: Value of the second derivative of the Sigmoid function for x. Return type: scalar or numpy array with the same shape as x.
-
classmethod
df
(x)[source]¶ Calculates the derivative of the Sigmoid function value for a given input x.
Parameters: x (scalar or numpy array.) – Input data. Returns: Value of the derivative of the Sigmoid function for x. Return type: scalar or numpy array with the same shape as x.
-
classmethod
dg
(y)[source]¶ Calculates the derivative of the inverse Sigmoid function value for a given input y.
Parameters: y (scalar or numpy array.) – Input data. Returns: Value of the derivative of the inverse Sigmoid function for y. Return type: scalar or numpy array with the same shape as y.
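The three methods above satisfy standard Sigmoid identities: the derivative is σ(x)(1 − σ(x)) and the derivative of the inverse (the logit) is 1 / (y(1 − y)). A self-contained NumPy sketch (illustrative, not PyDeep's code):

```python
import numpy as np

def f(x):
    # Sigmoid: sigma(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

def df(x):
    # sigma'(x) = sigma(x) * (1 - sigma(x))
    s = f(x)
    return s * (1.0 - s)

def dg(y):
    # derivative of the inverse Sigmoid (logit): 1 / (y * (1 - y))
    return 1.0 / (y * (1.0 - y))
```

By the inverse-function rule, `df(x) * dg(f(x))` equals 1 for any x, which makes a quick sanity check for the pair.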
SoftSign¶
-
class
pydeep.base.activationfunction.
SoftSign
[source]¶ SoftSign function.
Info: http://www.wolframalpha.com/input/?i=x%2F%281%2Babs%28x%29%29 -
classmethod
ddf
(x)[source]¶ Calculates the second derivative of the SoftSign function value for a given input x.
Parameters: x (scalar or numpy array.) – Input data. Returns: Value of the 2nd derivative of the SoftSign function for x. Return type: scalar or numpy array with the same shape as x.
HyperbolicTangent¶
-
class
pydeep.base.activationfunction.
HyperbolicTangent
[source]¶ HyperbolicTangent function.
Info: http://www.wolframalpha.com/input/?i=tanh -
classmethod
ddf
(x)[source]¶ Calculates the second derivative of the Hyperbolic Tangent function value for a given input x.
Parameters: x (scalar or numpy array.) – Input data. Returns: Value of the second derivative of the Hyperbolic Tangent function for x. Return type: scalar or numpy array with the same shape as x.
-
classmethod
df
(x)[source]¶ Calculates the derivative of the Hyperbolic Tangent function value for a given input x.
Parameters: x (scalar or numpy array.) – Input data. Returns: Value of the derivative of the Hyperbolic Tangent function for x. Return type: scalar or numpy array with the same shape as x.
-
classmethod
dg
(y)[source]¶ Calculates the derivative of the inverse Hyperbolic Tangent function value for a given input y.
Parameters: y (scalar or numpy array.) – Input data. Returns: Value of the derivative of the inverse Hyperbolic Tangent function for y. Return type: scalar or numpy array with the same shape as y.
SoftMax¶
-
class
pydeep.base.activationfunction.
SoftMax
[source]¶ Soft Max function.
Info: https://en.wikipedia.org/wiki/Activation_function
RadialBasis¶
-
class
pydeep.base.activationfunction.
RadialBasis
(mean=0.0, variance=1.0)[source]¶ Radial Basis function.
Info: http://www.wolframalpha.com/input/?i=Gaussian -
__init__
(mean=0.0, variance=1.0)[source]¶ Constructor.
Parameters: - mean (scalar or numpy array) – Mean of the function.
- variance (scalar or numpy array) – Variance of the function.
-
ddf
(x)[source]¶ Calculates the second derivative of the Radial Basis function value for a given input x.
Parameters: x (scalar or numpy array) – Input data. Returns: Value of the second derivative of the Radial Basis function for x. Return type: scalar or numpy array with the same shape as x.
-
Sinus¶
-
class
pydeep.base.activationfunction.
Sinus
[source]¶ Sinus function.
Info: http://www.wolframalpha.com/input/?i=sin(x) -
classmethod
ddf
(x)[source]¶ Calculates the second derivative of the Sinus function value for a given input x.
Parameters: x (scalar or numpy array) – Input data. Returns: Value of the second derivative of the Sinus function for x. Return type: scalar or numpy array with the same shape as x.
KWinnerTakeAll¶
-
class
pydeep.base.activationfunction.
KWinnerTakeAll
(k, axis=1, activation_function=<pydeep.base.activationfunction.Identity object>)[source]¶ K Winner take all activation function.
WARNING: The derivative is already calculated in the forward pass. Thus, for the same data point the order should always be forward_pass, backward_pass! -
__init__
(k, axis=1, activation_function=<pydeep.base.activationfunction.Identity object>)[source]¶ Constructor.
Parameters: - k (int) – Number of active units.
- axis (int) – Axis along which the maximum is computed.
- activation_function – Activation function applied to the input before the k-winner-take-all selection.
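K-winner-take-all keeps only the k largest activations along the given axis and zeroes the rest. The sketch below is illustrative (row-wise, i.e. axis=1 as in the default above) and is not KWinnerTakeAll's actual code:

```python
import numpy as np

def k_winner_take_all(x, k, axis=1):
    # k-th largest value per slice along `axis` becomes the threshold
    thresh = np.sort(x, axis=axis).take([-k], axis=axis)
    # keep activations at or above the threshold, zero the rest
    return np.where(x >= thresh, x, 0.0)

x = np.array([[1.0, 3.0, 2.0],
              [5.0, 4.0, 6.0]])
out = k_winner_take_all(x, k=2)  # [[0, 3, 2], [5, 0, 6]]
```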
-
basicstructure¶
This module provides basic structural elements, which different models have in common.
Implemented: |
|
---|---|
Version: | 1.1.0 |
Date: | 06.04.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
BipartiteGraph¶
-
class
pydeep.base.basicstructure.
BipartiteGraph
(number_visibles, number_hiddens, data=None, visible_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, hidden_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ Implementation of a bipartite graph structure.
-
__init__
(number_visibles, number_hiddens, data=None, visible_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, hidden_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.
Parameters: - number_visibles (int) – Number of the visible variables.
- number_hiddens (int) – Number of the hidden variables.
- data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
- visible_activation_function (pydeep.base.activationFunction) – Activation function for the visible units.
- hidden_activation_function (pydeep.base.activationFunction) – Activation function for the hidden units.
- initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
- initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visible mean. If a scalar is passed all values are initialized with it.
- initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
- initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it
- initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5 If a scalar is passed all values are initialized with it.
- dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type i.e. numpy.float64.
This function adds new hidden units at the given position to the model.
Warning
If the parameters are changed, the trainer needs to be reinitialized.
Parameters: - num_new_hiddens (int) – The number of new hidden units to add.
- position (int) – Position where the units should be added.
- initial_weights ('AUTO' or scalar or numpy array [input_dim, num_new_hiddens]) – The initial weight values for the hidden units.
- initial_bias ('AUTO' or scalar or numpy array [1, num_new_hiddens]) – The initial hidden bias values.
- initial_offsets ('AUTO' or scalar or numpy array [1, num_new_hiddens]) – The initial hidden mean values.
-
_add_visible_units
(num_new_visibles, position=0, initial_weights='AUTO', initial_bias='AUTO', initial_offsets='AUTO', data=None)[source]¶ - This function adds new visible units at the given position to the model.
Warning
If the parameters are changed, the trainer needs to be reinitialized.
Parameters: - num_new_visibles (int) – The number of new visible units to add.
- position (int) – Position where the units should be added.
- initial_weights ('AUTO' or scalar or numpy array [num_new_visibles, output_dim]) – The initial weight values for the visible units.
- initial_bias (numpy array [1, num_new_visibles]) – The initial visible bias values.
- initial_offsets (numpy array [1, num_new_visibles]) – The initial visible offset values.
- data (numpy array [num datapoints, num_new_visibles]) – Data for AUTO initialization.
Computes the Hidden (post) activations from hidden pre-activations.
Parameters: pre_act_h (numpy array [num data points, output_dim]) – Hidden pre-activations. Returns: Hidden activations. Return type: numpy array [num data points, output_dim]
Computes the Hidden pre-activations from visible activations.
Parameters: v (numpy array [num data points, input_dim]) – Visible activations. Returns: Hidden pre-synaptic activations. Return type: numpy array [num data points, output_dim]
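The two-step hidden computation described above (affine pre-activation, then the activation function) can be sketched in plain numpy. The weight and bias names here are illustrative, assuming sigmoid hidden units; they are not PyDeep's internals:

```python
import numpy as np

rng = np.random.RandomState(42)
W = rng.randn(4, 3) * 0.01     # weights [input_dim, output_dim]
b_h = np.zeros((1, 3))         # hidden bias
v = rng.rand(5, 4)             # visible activations [num data points, input_dim]

pre_act_h = v @ W + b_h                # hidden pre-activations
h = 1.0 / (1.0 + np.exp(-pre_act_h))   # sigmoid post-activation

print(h.shape)  # (5, 3)
```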
This function removes the hidden units whose indices are given.
Warning
If the parameters are changed, the trainer needs to be reinitialized.
Parameters: indices (int or list of int or numpy array of int) – Indices to remove.
-
_remove_visible_units
(indices)[source]¶ - This function removes the visible units whose indices are given.
Warning
If the parameters are changed, the trainer needs to be reinitialized.
Parameters: indices (int or list of int or numpy array of int) – Indices of the units to be removed.
-
_visible_post_activation
(pre_act_v)[source]¶ Computes the visible (post) activations from visible pre-activations.
Parameters: pre_act_v (numpy array [num data points, input_dim]) – Visible pre-activations. Returns: Visible activations. Return type: numpy array [num data points, input_dim]
-
_visible_pre_activation
(h)[source]¶ Computes the visible pre-activations from hidden activations.
Parameters: h (numpy array [num data points, output_dim]) – Hidden activations. Returns: Visible pre-synaptic activations. Return type: numpy array [num data points, input_dim]
-
get_parameters
()[source]¶ This function returns all model parameters in a list.
Returns: The parameter references in a list. Return type: list
Computes the Hidden (post) activations from visible activations.
Parameters: v (numpy array [num data points, input_dim]) – Visible activations. Returns: Hidden activations. Return type: numpy array [num data points, output_dim]
-
update_offsets
(new_visible_offsets=0.0, new_hidden_offsets=0.0, update_visible_offsets=1.0, update_hidden_offsets=1.0)[source]¶ This function updates the visible and hidden offsets. For example, update_offsets(0, 0, 1, 1) reparameterizes the model to the normal binary RBM.
Parameters:
-
StackOfBipartiteGraphs¶
-
class
pydeep.base.basicstructure.
StackOfBipartiteGraphs
(list_of_layers)[source]¶ Stacked network layers
-
__init__
(list_of_layers)[source]¶ Initializes the network with auto encoders.
Parameters: list_of_layers (list) – List of Layers i.e. BipartiteGraph.
-
_check_network
()[source]¶ Check whether the network is consistent and raise an exception if it is not the case.
-
append_layer
(layer)[source]¶ Appends the layer to the network.
Parameters: layer (Layer object i.e. BipartiteGraph.) – Layer object.
-
backward_propagate
(output_data)[source]¶ Propagates the output data backwards through the network.
Parameters: output_data (numpy array [batchsize x output dim]) – Output data. Returns: Input of the network. Return type: numpy array [batchsize x input dim]
-
depth
¶ Network's depth / number of layers.
-
forward_propagate
(input_data)[source]¶ Propagates the data through the network.
Parameters: input_data (numpy array [batchsize x input dim]) – Input data. Returns: Output of the network. Return type: numpy array [batchsize x output dim]
-
num_layers
¶ Network's depth / number of layers.
-
corruptor¶
This module provides implementations for corrupting the training data.
Info: | http://ufldl.stanford.edu/wiki/index.php/Sparse_Coding:_Autoencoder_Interpretation |
Version: | 1.1.0 |
Date: | 13.03.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
Identity¶
AdditiveGaussNoise¶
MultiGaussNoise¶
SamplingBinary¶
Dropout¶
RandomPermutation¶
-
class
pydeep.base.corruptor.
RandomPermutation
(permutation_percentage=0.2)[source]¶ RandomPermutation corruption: a fixed number of units change their activation values.
KeepKWinner¶
costfunction¶
Different kinds of cost functions and their derivatives.
Version: | 1.1.0 |
Date: | 13.03.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
SquaredError¶
-
class
pydeep.base.costfunction.
SquaredError
[source]¶ Mean Squared error.
-
classmethod
df
(x, t)[source]¶ Calculates the derivative of the Squared Error value for a given input x and target t.
Parameters: - x (scalar or numpy array) – Input data.
- t (scalar or numpy array) – Target values.
Returns: Value of the derivative of the cost function for x and t.
Return type: scalar or numpy array with the same shape as x and t.
-
classmethod
f
(x, t)[source]¶ Calculates the Squared Error value for a given input x and target t.
Parameters: - x (scalar or numpy array) – Input data.
- t (scalar or numpy array) – Target values.
Returns: Value of the cost function for x and t.
Return type: scalar or numpy array with the same shape as x and t.
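A minimal numpy sketch of the squared error and its derivative. The 0.5 factor is this sketch's assumption (it makes the derivative exactly x - t); PyDeep's scaling may differ:

```python
import numpy as np

def squared_error(x, t):
    # 0.5 factor makes the derivative exactly (x - t)
    return 0.5 * (x - t) ** 2

def squared_error_df(x, t):
    return x - t

x = np.array([0.2, 0.8])
t = np.array([0.0, 1.0])
print(squared_error(x, t))     # [0.02 0.02]
print(squared_error_df(x, t))  # [ 0.2 -0.2]
```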
AbsoluteError¶
-
class
pydeep.base.costfunction.
AbsoluteError
[source]¶ Absolute error.
-
classmethod
df
(x, t)[source]¶ Calculates the derivative of the absolute error value for a given input x and target t.
Parameters: - x (scalar or numpy array) – Input data.
- t (scalar or numpy array) – Target values.
Returns: Value of the derivative of the cost function for x and t.
Return type: scalar or numpy array with the same shape as x and t.
-
classmethod
f
(x, t)[source]¶ Calculates the absolute error value for a given input x and target t.
Parameters: - x (scalar or numpy array) – Input data.
- t (scalar or numpy array) – Target values.
Returns: Value of the cost function for x and t.
Return type: scalar or numpy array with the same shape as x and t.
CrossEntropyError¶
-
class
pydeep.base.costfunction.
CrossEntropyError
[source]¶ Cross entropy functions.
-
classmethod
df
(x, t)[source]¶ Calculates the derivative of the cross entropy value for a given input x and target t.
Parameters: - x (scalar or numpy array) – Input data.
- t (scalar or numpy array) – Target values.
Returns: Value of the derivative of the cost function for x and t.
Return type: scalar or numpy array with the same shape as x and t.
-
classmethod
f
(x, t)[source]¶ Calculates the cross entropy value for a given input x and target t.
Parameters: - x (scalar or numpy array) – Input data.
- t (scalar or numpy array) – Target values.
Returns: Value of the cost function for x and t.
Return type: scalar or numpy array with the same shape as x and t.
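For binary targets, the element-wise cross entropy and its derivative can be sketched as follows. This is a hedged illustration; the eps guard against log(0) is this sketch's addition, not necessarily PyDeep's:

```python
import numpy as np

def cross_entropy(x, t, eps=1e-12):
    # Element-wise binary cross entropy; eps avoids log(0)
    return -(t * np.log(x + eps) + (1.0 - t) * np.log(1.0 - x + eps))

def cross_entropy_df(x, t):
    # d/dx of the expression above (ignoring eps): (x - t) / (x * (1 - x))
    return (x - t) / (x * (1.0 - x))

x = np.array([0.9, 0.1])
t = np.array([1.0, 0.0])
print(np.round(cross_entropy(x, t), 4))  # [0.1054 0.1054]
```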
NegLogLikelihood¶
-
class
pydeep.base.costfunction.
NegLogLikelihood
[source]¶ Negative log likelihood function.
-
classmethod
df
(x, t)[source]¶ Calculates the derivative of the negative log-likelihood value for a given input x and target t.
Parameters: - x (scalar or numpy array) – Input data.
- t (scalar or numpy array) – Target values.
Returns: Value of the derivative of the cost function for x and t.
Return type: scalar or numpy array with the same shape as x and t.
-
classmethod
f
(x, t)[source]¶ Calculates the negative log-likelihood value for a given input x and target t.
Parameters: - x (scalar or numpy array) – Input data.
- t (scalar or numpy array) – Target values.
Returns: Value of the cost function for x and t.
Return type: scalar or numpy array with the same shape as x and t.
numpyextension¶
This module provides different math functions that extend the numpy library.
Version: | 1.1.0 |
Date: | 13.03.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
log_sum_exp¶
-
numpyextension.
log_sum_exp
(axis=0)¶ Calculates the logarithm of the sum of e to the power of input ‘x’. The method tries to avoid overflows by using the relationship: log(sum(exp(x))) = alpha + log(sum(exp(x-alpha))).
Parameters: Returns: Logarithm of the sum of exp of x.
Return type: float or numpy array.
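The shift-by-the-maximum trick stated above can be sketched directly (a simplified version; `scipy.special.logsumexp` provides the same stability off the shelf):

```python
import numpy as np

def log_sum_exp(x, axis=0):
    # log(sum(exp(x))) = alpha + log(sum(exp(x - alpha))) with alpha = max(x),
    # which avoids overflow in exp for large inputs
    alpha = x.max(axis=axis, keepdims=True)
    return np.squeeze(alpha, axis=axis) + np.log(np.sum(np.exp(x - alpha), axis=axis))

x = np.array([1000.0, 1000.0])
print(log_sum_exp(x))  # ~1000.693; the naive np.log(np.sum(np.exp(x))) overflows to inf
```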
log_diff_exp¶
-
numpyextension.
log_diff_exp
(axis=0)¶ Calculates the logarithm of the diffs of e to the power of input ‘x’. The method tries to avoid overflows by using the relationship: log(diff(exp(x))) = alpha + log(diff(exp(x-alpha))).
Parameters: Returns: Logarithm of the diff of exp of x.
Return type: float or numpy array.
multinominal_batch_sampling¶
-
numpyextension.
multinominal_batch_sampling
(isnormalized=True)¶ Samples states where only one entry is one and the rest are zero, according to the given probabilities.
Parameters: - probabilties (numpy array [batchsize, number of states]) – Matrix containing probabilities; the rows have to sum to one, otherwise set isnormalized=False.
- isnormalized (bool) – If True the probabilities are assumed to be normalized. If False the probabilities are normalized.
Returns: Sampled multinomial states.
Return type: numpy array [batchsize, number of states]
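One way to sketch such batch-wise multinomial sampling is via the inverse CDF per row. This is an illustration under assumed semantics, not PyDeep's exact routine:

```python
import numpy as np

def multinomial_batch_sampling(probabilities, isnormalized=True, rng=None):
    rng = rng or np.random.RandomState(0)
    probs = probabilities
    if not isnormalized:
        probs = probs / probs.sum(axis=1, keepdims=True)
    # Inverse-CDF sampling: exactly one active state per row
    cdf = np.cumsum(probs, axis=1)
    u = rng.rand(probs.shape[0], 1)
    idx = (u > cdf).sum(axis=1)
    states = np.zeros_like(probs)
    states[np.arange(probs.shape[0]), idx] = 1.0
    return states

probs = np.array([[0.1, 0.7, 0.2], [0.5, 0.25, 0.25]])
print(multinomial_batch_sampling(probs).sum(axis=1))  # [1. 1.]
```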
get_norms¶
restrict_norms¶
-
numpyextension.
restrict_norms
(max_norm, axis=0)¶ This function restricts a matrix, its columns or rows to a given norm.
Parameters: Returns: Restricted matrix
Return type: numpy array [num rows, num columns]
resize_norms¶
-
numpyextension.
resize_norms
(norm, axis=0)¶ This function resizes a matrix, its columns or rows to a given norm.
Parameters: Returns: Resized matrix (note: the matrix is modified in place).
Return type: numpy array [num rows, num columns]
angle_between_vectors¶
get_2d_gauss_kernel¶
-
numpyextension.
get_2d_gauss_kernel
(height, shift=0, var=[1.0, 1.0])¶ Creates a 2D Gauss kernel of size NxM with variance 1.
Parameters: - width (int) – Number of pixels first dimension.
- height (int) – Number of pixels second dimension.
- shift (int, 1D numpy array) – The Gaussian is shifted by this amount from the center of the image. Passing a scalar shifts x and y by the same value; passing a vector shifts x and y accordingly.
- var (int, 1D numpy array or 2D numpy array) – Variances or covariance matrix. Passing a scalar gives an isotropic Gaussian; passing a vector gives a spherical covariance with the vector values on the diagonal; passing a matrix gives a full Gaussian.
Returns: 2D Gauss kernel.
Return type: numpy array [width, height]
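A simplified sketch for the scalar-variance (isotropic) case; the full function also accepts vector and matrix variances as described above, and the normalization to sum 1 is this sketch's choice:

```python
import numpy as np

def get_2d_gauss_kernel(width, height, shift=0, var=1.0):
    # Isotropic 2D Gaussian centered in the image, shifted by `shift`
    xs = np.arange(width) - (width - 1) / 2.0 - shift
    ys = np.arange(height) - (height - 1) / 2.0 - shift
    xx, yy = np.meshgrid(xs, ys, indexing="ij")
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * var))
    return kernel / kernel.sum()

k = get_2d_gauss_kernel(5, 5)
print(k.shape, round(float(k.sum()), 6))  # (5, 5) 1.0
```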
generate_binary_code¶
-
numpyextension.
generate_binary_code
(batch_size_exp=None, batch_number=0)¶ This function can be used to generate all possible binary vectors of length ‘bit_length’. It is possible to generate only a particular batch of the data, where ‘batch_size_exp’ controls the size of the batch (batch_size = 2**batch_size_exp) and ‘batch_number’ is the index of the batch that should be generated.
Example: bit_length = 2, batchSize = 2
-> All combinations = 2^bit_length = 2^2 = 4
-> All combinations / batchSize = 4 / 2 = 2 batches
-> _generate_bit_array(2, 2, 0) = [0,0],[0,1]
-> _generate_bit_array(2, 2, 1) = [1,0],[1,1]
Parameters: Returns: Bit array containing the states.
Return type: numpy array [num samples, bit_length]
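A compact sketch of the idea via bit extraction. Note that the bit ordering in this sketch may differ from PyDeep's; treat the names and ordering as illustrative:

```python
import numpy as np

def generate_binary_code(bit_length, batch_size_exp=None, batch_number=0):
    # All 2**bit_length binary vectors, or one batch of 2**batch_size_exp rows
    if batch_size_exp is None:
        batch_size_exp = bit_length
    batch_size = 2 ** batch_size_exp
    offset = batch_number * batch_size
    rows = np.arange(offset, offset + batch_size)
    # Bit i of each row index becomes column i
    return ((rows[:, None] >> np.arange(bit_length)) & 1).astype(np.float64)

print(generate_binary_code(2))
# [[0. 0.]
#  [1. 0.]
#  [0. 1.]
#  [1. 1.]]
```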
get_binary_label¶
-
numpyextension.
get_binary_label
()¶ This function converts a 1D-array with integers labels into a 2D-array containing binary labels.
Example: [3,1,0] -> [[1,0,0,0],[0,0,1,0],[0,0,0,1]]
Parameters: int_array (int) – 1D array containing integers.
Returns: 2D array with binary labels.
Return type: numpy array [num samples, num labels]
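A standard one-hot conversion can be sketched as follows. Note that the column ordering in this sketch (label i sets column i) may differ from the ordering PyDeep uses in its example:

```python
import numpy as np

def get_binary_label(int_array):
    # One 1 per row, placed at the column given by the integer label
    num_labels = int(np.max(int_array)) + 1
    binary = np.zeros((len(int_array), num_labels))
    binary[np.arange(len(int_array)), int_array] = 1.0
    return binary

print(get_binary_label(np.array([3, 1, 0])))
# [[0. 0. 0. 1.]
#  [0. 1. 0. 0.]
#  [1. 0. 0. 0.]]
```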
compare_index_of_max¶
-
numpyextension.
compare_index_of_max
(target)¶ Compares data rows by comparing the index of the maximal value, e.g. classifier output and true labels.
Example: [0.3,0.5,0.2],[0.2,0.6,0.2] -> 0
[0.3,0.5,0.2],[0.6,0.2,0.2] -> 1
Parameters: - output (numpy array [batchsize, output_dim]) – Vectors usually containing label probabilities.
- target (numpy array [batchsize, output_dim]) – Vectors usually containing true labels.
Returns: Int array containing 0 if the two rows have their maximum at the same index, 1 otherwise.
Return type: numpy array [num samples, num labels]
shuffle_dataset¶
-
numpyextension.
shuffle_dataset
(label)¶ Shuffles the data points and the labels correspondingly.
Parameters: - data (numpy array [num_datapoints, dim_datapoints]) – Datapoints.
- label (numpy array [num_datapoints]) – Labels.
Returns: Shuffled datapoints and labels.
Return type: List of numpy arrays
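The joint shuffle can be sketched with a single shared permutation (an illustrative stand-in, not PyDeep's exact code):

```python
import numpy as np

def shuffle_dataset(data, label, rng=None):
    # One permutation applied to both arrays keeps rows and labels aligned
    rng = rng or np.random.RandomState(0)
    perm = rng.permutation(data.shape[0])
    return data[perm], label[perm]

data = np.arange(10).reshape(5, 2)   # row i is [2i, 2i+1]
label = np.arange(5)
sdata, slabel = shuffle_dataset(data, label)
# Correspondence is preserved: row i of sdata still matches slabel[i]
print(all(sdata[i, 0] == 2 * slabel[i] for i in range(5)))  # True
```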
rotation_sequence¶
-
numpyextension.
rotation_sequence
(width, height, steps)¶ Rotates a 2D image given as a 1D vector with shape[width*height] in ‘steps’ number of steps.
Parameters: Returns: The rotated image sequence.
Return type: numpy array [steps, width*height]
generate_2d_connection_matrix¶
-
numpyextension.
generate_2d_connection_matrix
(input_y_dim, field_x_dim, field_y_dim, overlap_x_dim, overlap_y_dim, wrap_around=True)¶ This function constructs a connection matrix, which can be used to force the weights to have local receptive fields.
Example: input_x_dim = 3, input_y_dim = 3, field_x_dim = 2, field_y_dim = 2, overlap_x_dim = 1, overlap_y_dim = 1, wrap_around=False
leads to numx.array([[1,1,0,1,1,0,0,0,0],[0,1,1,0,1,1,0,0,0],[0,0,0,1,1,0,1,1,0],[0,0,0,0,1,1,0,1,1]]).T
Parameters: - input_x_dim (int) – Input dimension x.
- input_y_dim (int) – Input dimension y.
- field_x_dim (int) – Size of the receptive field in dimension x.
- field_y_dim (int) – Size of the receptive field in dimension y.
- overlap_x_dim (int) – Overlap of the receptive fields in dimension x.
- overlap_y_dim (int) – Overlap of the receptive fields in dimension y.
- wrap_around (bool) – If true the overlap wraps around in both dimensions.
Returns: Connection matrix.
Return type: numpy arrays [input dim, output dim]
misc¶
Package providing miscellaneous functionality such as datasets, input-output, visualization, and profiling methods …
Version: | 1.1.0 |
---|---|
Date: | 19.03.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
io¶
This module contains methods to read and write data.
Version: | 1.1.0 |
Date: | 29.03.2018 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2018 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
save_object¶
save_image¶
-
io.
save_image
(path, ext='bmp')¶ Saves a numpy array to an image file.
Parameters: - array (numpy array [width, height]) – Data to save
- path (string) – Path and name of the directory to save the image at.
- ext (string) – Extension for the image.
load_object¶
load_image¶
download_file¶
-
io.
download_file
(path, buffer_size=1048576)¶ Downloads and saves a dataset from a given URL.
Parameters:
load_mnist¶
-
io.
load_mnist
(binary=False)¶ Loads the MNIST digit data, either binary {0,1} or real-valued in [0,1].
Parameters: - path (string) – Path and name of the file to load.
- binary (bool) – If True returns binary images, real valued between [0,1] if False.
Returns: MNIST dataset [train_set, train_lab, valid_set, valid_lab, test_set, test_lab]
Return type: list of numpy arrays
load_caltech¶
-
io.
load_caltech
()¶ Loads the Caltech dataset.
Parameters: path (string) – Path and name of the file to load.
Returns: Caltech dataset [train_set, train_lab, valid_set, valid_lab, test_set, test_lab]
Return type: list of numpy arrays
load_cifar¶
load_natural_image_patches¶
-
io.
load_natural_image_patches
()¶ - Loads the natural image patches used in the publication ‘Gaussian-binary restricted Boltzmann machines for modeling natural image statistics’.
Parameters: path (string) – Path and name of the file to load. Returns: Natural image dataset Return type: numpy array
load_olivetti_faces¶
measuring¶
This module provides measuring functions, such as timing the execution of code.
Version: | 1.1.0 |
---|---|
Date: | 19.03.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
print_progress¶
-
measuring.
print_progress
(num_steps, gauge=False, length=50, decimal_place=1)¶ Prints the progress of a system at state ‘step’.
Parameters:
Stopwatch¶
-
class
pydeep.misc.measuring.
Stopwatch
[source]¶ This class provides a stop watch for measuring the execution time of code.
-
__init__
()[source]¶ Constructor sets the starting time to the current time.
Info: Will be overwritten by calling start()!
-
get_expected_end_time
(iteration, num_iterations)[source]¶ Returns the expected end time.
Parameters: Returns: Expected end time.
Return type: datetime
-
get_expected_interval
(iteration, num_iterations)[source]¶ Returns the expected interval/Time needed till ending.
Parameters: Returns: Expected interval.
Return type: timedelta
-
get_interval
()[source]¶ Returns the current interval.
Returns: Current interval: Return type: timedelta
-
update
(factor=1.0)[source]¶ Updates the internal variables. The factor can be used to sum up irregular events in a loop: assume you have a loop over 100 sets and only every 10th step executes a function; then use update(factor=0.1) to measure it.
Parameters: factor (float) – Sums up factor*current interval
-
sshthreadpool¶
Provides a thread/script pooling mechanism based on ssh + screen.
Version: | 1.1.0 |
---|---|
Date: | 19.03.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
SSHConnection¶
-
class
pydeep.misc.sshthreadpool.
SSHConnection
(hostname, username, password, max_cpus_usage=2)[source]¶ Handles a SSH connection.
-
__init__
(hostname, username, password, max_cpus_usage=2)[source]¶ Constructor takes hostname, username, password.
Parameters: - hostname (string) – Hostname or address of host.
- username (string) – SSH username.
- password (string) – SSH password.
- max_cpus_usage (int) – Maximal number of cores to be used
-
connect
()[source]¶ Connects to the server.
Returns: True if the connection was successful. Return type: bool
-
classmethod
decrypt
(connection, password)[source]¶ Decrypts a connection object and returns it.
Parameters: - connection (string) – SSHConnection to be decrypted
- password (string) – Encryption password
Returns: Decrypted object
Return type:
-
encrypt
(password)[source]¶ Encrypts the connection object.
Parameters: password (string) – Encryption password Returns: Encrypted object Return type: object
-
execute_command
(command)[source]¶ Executes a command on the server and returns stdin, stdout, and stderr
Parameters: command (string) – Command to be executed. Returns: stdin, stdout, and stderr Return type: list
-
execute_command_in_screen
(command)[source]¶ Executes a command in a screen on the server, which is automatically detached, and returns stdin, stdout, and stderr. The screen closes automatically when the job is done.
Parameters: command (string) – Command to be executed. Returns: stdin, stdout, and stderr Return type: list
-
get_number_users_processes
()[source]¶ Gets number of processes of the user on the server.
Returns: number of processes Return type: int or None
-
get_number_users_screens
()[source]¶ Gets the number of the user's screens on the server.
Returns: Number of the user's screens on the server. Return type: int or None
-
get_server_info
()[source]¶ Gets server info, such as the number of CPUs and the memory size, and stores it in the corresponding variables.
Returns: online or offline FLAG Return type: string
-
get_server_load
()[source]¶ Gets the current CPU and memory load of the server.
Returns: Average CPU usage over the last 1 min, average CPU usage over the last 5 min, average CPU usage over the last 15 min, average memory usage. Return type: list
-
kill_all_processes
()[source]¶ Kills all processes.
Returns: stdin, stdout, and stderr Return type: list
-
SSHJob¶
SSHPool¶
-
class
pydeep.misc.sshthreadpool.
SSHPool
(servers)[source]¶ Handles a pool of servers and allows distributing jobs over the pool.
-
__init__
(servers)[source]¶ Constructor takes a list of SSHConnections.
Parameters: servers (list) – List of SSHConnections.
-
broadcast_command
(command)[source]¶ Executes a command on all servers.
Parameters: command (string) – Command to be executed Returns: list of all stdin, stdout, and stderr Return type: list
-
broadcast_kill_all
()[source]¶ Kills all processes of the corresponding user on the servers.
Returns: list of all stdin, stdout, and stderr Return type: list
-
broadcast_kill_all_screens
()[source]¶ Kills all screens of the corresponding user on the servers.
Returns: list of all stdin, stdout, and stderr Return type: list
-
distribute_jobs
(jobs, status=False, ignore_load=False, sort_server=True)[source]¶ Distributes the jobs over the servers.
Parameters: - jobs (string or SSHConnection) – List of SSHJobs to be executed on the servers.
- status (bool) – If true prints info about which job was started on which server.
- ignore_load (bool) – If true starts the job without caring about the current load.
- sort_server (bool) – If True Servers will be sorted by load.
Returns: List of all started jobs and list of all remaining jobs
Return type:
-
execute_command
(host, command)[source]¶ Executes a command on a given server.
Parameters: - host (string or SSHConnection) – Hostname or connection object
- command (string) – Command to be executed
Returns: Return type:
-
execute_command_in_screen
(host, command)[source]¶ Executes a command in a screen on a given server.
Parameters: - host (string or SSHConnection) – Hostname or connection object
- command (string) – Command to be executed
Returns: list of all stdin, stdout, and stderr
Return type:
-
get_servers_info
(status=True)[source]¶ Reads the status of all servers; the information is stored in the SSHConnection objects. Additionally prints to the console if status == True.
Parameters: status (bool) – If true prints info.
-
get_servers_status
()[source]¶ Reads the status of all servers and returns it as a list. Additionally prints to the console if status == True.
Returns: list of header and list corresponding status information Return type: list, list
-
toyproblems¶
This module contains some example toy problems for RBMs.
Version: | 1.1.0 |
Date: | 19.03.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
generate_2d_mixtures¶
-
toyproblems.
generate_2d_mixtures
(mean=0.0, scale=0.7071067811865476)¶ Creates a dataset containing 2D data points from a random mixture of two independent Laplacian distributions.
Info: Every sample is a 2-dimensional mixture of two sources. The sources can either be super_gauss or sub_gauss. If x is one sample generated by mixing s, i.e. x = A*s, then the mixing_matrix is A.
Parameters: Returns: Data and mixing matrix.
Return type: list of numpy arrays ([num samples, 2], [2,2])
generate_bars_and_stripes¶
generate_bars_and_stripes_complete¶
generate_shifting_bars¶
-
toyproblems.
generate_shifting_bars
(bar_length, num_samples, random=False, flipped=False)¶ Creates a dataset containing random positions of a bar of length “bar_length” in a strip of “length” dimensions.
Parameters: Returns: Samples of the shifting bars dataset.
Return type: numpy array [samples, dimensions]
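Under the stated semantics, a hypothetical sketch of the random variant. The parameter names and the explicit `length` argument are this sketch's assumptions, not the exact PyDeep signature:

```python
import numpy as np

def generate_shifting_bars(length, bar_length, num_samples, rng=None):
    # Each sample: a bar of `bar_length` ones at a random position
    # inside a zero strip of `length` dimensions
    rng = rng or np.random.RandomState(0)
    samples = np.zeros((num_samples, length))
    starts = rng.randint(0, length - bar_length + 1, num_samples)
    for i, s in enumerate(starts):
        samples[i, s:s + bar_length] = 1.0
    return samples

data = generate_shifting_bars(length=8, bar_length=3, num_samples=4)
print(data.sum(axis=1))  # [3. 3. 3. 3.]
```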
generate_shifting_bars_complete¶
-
toyproblems.
generate_shifting_bars_complete
(bar_length, random=False, flipped=False)¶ Creates a dataset containing all possible positions a bar of length “bar_length” can take in a strip of “length” dimensions.
Parameters: Returns: Complete shifting bars dataset.
Return type: numpy array [samples, dimensions]
visualization¶
This module provides functions for displaying and visualizing data. It extends matplotlib.pyplot.
Version: | 1.1.0 |
Date: | 19.03.2017 |
Author: | Jan Melchior, Nan Wang |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
tile_matrix_columns¶
-
visualization.
tile_matrix_columns
(tile_width, tile_height, num_tiles_x, num_tiles_y, border_size=1, normalized=True)¶ Creates a matrix with tiles from columns.
Parameters: - matrix (numpy array 2D) – Matrix to display.
- tile_width (int) – Tile width dimension.
- tile_height (int) – Tile height dimension.
- num_tiles_x (int) – Number of tiles horizontally.
- num_tiles_y (int) – Number of tiles vertically.
- border_size (int) – Size of the border.
- normalized (bool) – If true, each image is normalized to the range 0..1.
Returns: Matrix showing the 2D patches.
Return type: 2D numpy array
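The tiling can be sketched as follows (a simplified, hypothetical version: it assumes the tiles are stored as flattened columns in row-major order and places them left-to-right, top-to-bottom):

```python
import numpy as np

def tile_matrix_columns_sketch(matrix, tile_width, tile_height,
                               num_tiles_x, num_tiles_y,
                               border_size=1, normalized=True):
    """Sketch: arrange flattened column-tiles in a bordered grid."""
    height = num_tiles_y * tile_height + (num_tiles_y - 1) * border_size
    width = num_tiles_x * tile_width + (num_tiles_x - 1) * border_size
    result = np.zeros((height, width))
    for ty in range(num_tiles_y):
        for tx in range(num_tiles_x):
            # Each column is one flattened tile.
            tile = matrix[:, ty * num_tiles_x + tx].reshape(tile_height, tile_width)
            if normalized and tile.max() > tile.min():
                tile = (tile - tile.min()) / (tile.max() - tile.min())
            y0 = ty * (tile_height + border_size)
            x0 = tx * (tile_width + border_size)
            result[y0:y0 + tile_height, x0:x0 + tile_width] = tile
    return result
```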
tile_matrix_rows¶
-
visualization.
tile_matrix_rows
(tile_width, tile_height, num_tiles_x, num_tiles_y, border_size=1, normalized=True)¶ Creates a matrix with tiles from rows.
Parameters: - matrix (numpy array 2D) – Matrix to display.
- tile_width (int) – Tile width dimension.
- tile_height (int) – Tile height dimension.
- num_tiles_x (int) – Number of tiles horizontally.
- num_tiles_y (int) – Number of tiles vertically.
- border_size (int) – Size of the border.
- normalized (bool) – If true, each image is normalized to the range 0..1.
Returns: Matrix showing the 2D patches.
Return type: 2D numpy array
imshow_matrix¶
-
visualization.
imshow_matrix
(windowtitle, interpolation='nearest')¶ Displays a matrix in gray-scale.
Parameters: - matrix (numpy array) – Data to display
- windowtitle (string) – Figure title
- interpolation (string) – Interpolation style
imshow_plot¶
-
visualization.
imshow_plot
(windowtitle)¶ Plots the columns of a matrix.
Parameters: - matrix (numpy array) – Data to plot
- windowtitle (string) – Figure title
imshow_histogram¶
-
visualization.
imshow_histogram
(windowtitle, num_bins=10, normed=False, cumulative=False, log_scale=False)¶ Shows an image of the histogram.
Parameters:
plot_2d_weights¶
-
visualization.
plot_2d_weights
(bias=array([[0., 0.]]), scaling_factor=1.0, color='random', bias_color='random')¶ Parameters: - weights (numpy array [2,2]) – Weight matrix (weights per column).
- bias (numpy array [1,2]) – Bias value.
- scaling_factor (float) – If not 1.0 the weights will be scaled by this factor.
- color (string) – Color for the weights.
- bias_color (string) – Color for the bias.
plot_2d_data¶
-
visualization.
plot_2d_data
(alpha=0.1, color='navy', point_size=5)¶ Plots the data into the current figure.
Parameters:
plot_2d_contour¶
-
visualization.
plot_2d_contour
(value_range=[-5.0, 5.0, -5.0, 5.0], step_size=0.01, levels=20, stylev=None, colormap='jet')¶ Plots the probability function as a contour plot into the current figure.
Parameters: - probability_function (python method) – Probability function must take 2D array [number of datapoint x 2]
- value_range (list with four float entries) – Min x, max x , min y, max y.
- step_size (float) – Step size for evaluating the pdf.
- levels (int) – Number of contour lines or array of contour height.
- stylev (string or None) – None as normal contour, ‘filled’ as filled contour, ‘image’ as contour image
- colormap (string) – Selected colormap. See also: http://www.scipy.org/Cookbook/Matplotlib/…/Show_colormaps
imshow_standard_rbm_parameters¶
-
visualization.
imshow_standard_rbm_parameters
(v1, v2, h1, h2, whitening=None, window_title='')¶ Saves the weights and biases of a given RBM at the given location.
Parameters: - rbm (RBM object) – RBM which weights and biases should be saved.
- v1 (int) – Visible bias and the single weights will be saved as an image with size v1 x v2.
- v2 (int) – Visible bias and the single weights will be saved as an image with size v1 x v2.
- h1 (int) – Hidden bias and the image containing all weights will be saved as an image with size h1 x h2.
- h2 (int) – Hidden bias and the image containing all weights will be saved as an image with size h1 x h2.
- whitening (preprocessing object or None) – If the data is PCA whitened, it is useful to dewhiten the filters to see the structure!
- window_title (string) – Title for this rbm.
generate_samples¶
-
visualization.
generate_samples
(data, iterations, stepsize, v1, v2, sample_states=False, whitening=None)¶ Generates samples from the given RBM model.
Parameters: - rbm (RBM model object.) – RBM model.
- data (numpy array [num samples, dimensions]) – Data to start sampling from.
- iterations (int) – Number of Gibbs sampling steps.
- stepsize (int) – After how many steps a sample should be plotted.
- v1 (int) – X-axis dimension of the reordered image patch.
- v2 (int) – Y-axis dimension of the reordered image patch.
- sample_states (bool) – If true returns the states, the probabilities otherwise.
- whitening (preprocessing object or None) – If the data has been preprocessed it needs to be undone.
Returns: Matrix with image patches ordered along the x-axis and their evolution along the y-axis.
Return type: numpy array
imshow_filter_tuning_curve¶
imshow_filter_optimal_gratings¶
imshow_filter_frequency_angle_histogram¶
filter_frequency_and_angle¶
-
visualization.
filter_frequency_and_angle
(num_of_angles=40)¶ Analyze the filters by calculating the responses when gratings, i.e. sinusoidal functions, are input to them.
Info: Hyvärinen, A. et al. (2009) Natural Image Statistics, pages 144-146
Parameters: - filters (numpy array) – Filters to analyze
- num_of_angles (int) – Number of angles steps to check
Returns: The optimal frequency (pixels/cycle) of the filters, the optimal orientation angle (rad) of the filters
Return type: numpy array, numpy array
filter_frequency_response¶
-
visualization.
filter_frequency_response
(num_of_angles=40)¶ Compute the response of filters w.r.t. different frequencies.
Parameters: - filters (numpy array) – Filters to analyze
- num_of_angles (int) – Number of angles steps to check
Returns: Frequency response as output_dim x max_wavelength-1, index of frequencies.
Return type: numpy array, numpy array
filter_angle_response¶
-
visualization.
filter_angle_response
(num_of_angles=40)¶ Compute the angle response of the given filter.
Parameters: - filters (numpy array) – Filters to analyze
- num_of_angles (int) – Number of angles steps to check
Returns: Angle response as output_dim x num_of_ang, index of angles
Return type: numpy array, numpy array
calculate_amari_distance¶
-
visualization.
calculate_amari_distance
(matrix_two, version=1)¶ Calculate the Amari distance between two input matrices.
Parameters: - matrix_one (numpy array) – the first matrix
- matrix_two (numpy array) – the second matrix
- version (int) – Variant to use.
Returns: The amari distance between two input matrices.
Return type:
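One common formulation of the Amari index can be sketched as below; which variant PyDeep's version parameter selects is not documented here, so treat this as an illustrative assumption. The distance is zero exactly when one matrix equals the other up to row/column permutation and scaling:

```python
import numpy as np

def amari_distance_sketch(matrix_one, matrix_two):
    """Sketch of one common Amari index formulation."""
    # P approaches a scaled permutation matrix for a perfect match.
    p = np.abs(matrix_one @ np.linalg.inv(matrix_two))
    n = p.shape[0]
    row_term = (p.sum(axis=1) / p.max(axis=1) - 1.0).sum()
    col_term = (p.sum(axis=0) / p.max(axis=0) - 1.0).sum()
    return (row_term + col_term) / (2.0 * n)
```

This makes the distance useful for judging ICA results: recovering the true mixing matrix up to the inherent permutation and scaling ambiguities still scores zero.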
preprocessing¶
This module contains several classes for data preprocessing.
Version: | 1.1.0 |
Date: | 04.04.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
binarize_data¶
-
preprocessing.
binarize_data
()¶ Converts data to binary values. For data in [a,b], a data point p becomes zero if p < 0.5*(b-a), and one otherwise.
Parameters: data (numpy array [num data point, data dimension]) – Data to be binarized. Returns: Binarized data. Return type: numpy array [num data point, data dimension]
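A minimal sketch of this behaviour, assuming the threshold is the midpoint of the data range [a, b] (not PyDeep's code):

```python
import numpy as np

def binarize_data_sketch(data):
    """Sketch: threshold each value at the midpoint of the data range."""
    a, b = data.min(), data.max()
    return (data >= a + 0.5 * (b - a)).astype(data.dtype)
```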
rescale_data¶
-
preprocessing.
rescale_data
(new_min=0.0, new_max=1.0)¶ Normalizes the values of a matrix, e.g. [min,max] -> [new_min,new_max].
Parameters: Returns: Return type: numpy array [num data point, data dimension]
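The rescaling is a plain affine map; a sketch of the transformation (not PyDeep's code):

```python
import numpy as np

def rescale_data_sketch(data, new_min=0.0, new_max=1.0):
    """Sketch: affine rescale from [min, max] to [new_min, new_max]."""
    old_min, old_max = data.min(), data.max()
    scaled = (data - old_min) / (old_max - old_min)  # now in [0, 1]
    return scaled * (new_max - new_min) + new_min
```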
remove_rows_means¶
-
preprocessing.
remove_rows_means
(return_means=False)¶ Remove the individual mean of each row.
Parameters: - data (numpy array [num data point, data dimension]) – Data to be normalized
- return_means (bool) – If True returns also the means
Returns: Data without row means, row means (optional).
Return type: numpy array [num data point, data dimension], Means of the data (optional)
remove_cols_means¶
-
preprocessing.
remove_cols_means
(return_means=False)¶ Remove the individual mean of each column.
Parameters: - data (numpy array [num data point, data dimension]) – Data to be normalized
- return_means (bool) – If True returns also the means
Returns: Data without column means, column means (optional).
Return type: numpy array [num data point, data dimension], Means of the data (optional)
STANDARIZER¶
-
class
pydeep.preprocessing.
STANDARIZER
(input_dim)[source]¶ Shifts the data to have zero mean and scales it to have unit variance along each axis.
-
project
(data)[source]¶ Projects the data to normalized space.
Parameters: data (numpy array [num data point, data dimension]) – Data to project. Returns: Projected data. Return type: numpy array [num data point, data dimension]
-
PCA¶
-
class
pydeep.preprocessing.
PCA
(input_dim, whiten=False)[source]¶ Principal component analysis (PCA) using singular value decomposition (SVD).
-
project
(data, num_components=None)[source]¶ Projects the data to Eigenspace.
Info: projection_matrix has its projected vectors as its columns. i.e. if we project x by W into y where W is the projection_matrix, then y = W.T * x
Parameters: Returns: Projected data.
Return type: numpy array [num data point, data dimension]
-
train
(data)[source]¶ Training the model (full batch).
Parameters: data (numpy array [num data point, data dimension]) – data for training.
-
unproject
(data, num_components=None)[source]¶ Projects the data from Eigenspace to normal space.
Parameters: - data (numpy array [num data point, data dimension]) – Data to be unprojected.
- num_components (int) – Number of components to project.
Returns: Unprojected data.
Return type: numpy array [num data point, num_components]
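PCA via SVD with project/unproject can be sketched as below. This is a hypothetical minimal version (row-vector convention y = x @ W, where the columns of W are the principal directions, matching the Info note above); it omits whitening and the input_dim bookkeeping:

```python
import numpy as np

class PCASketch:
    """Sketch: PCA via SVD with project/unproject."""

    def train(self, data):
        self.mean = data.mean(axis=0)
        # Rows of vt are the principal directions of the centered data.
        _, _, vt = np.linalg.svd(data - self.mean, full_matrices=False)
        self.projection_matrix = vt.T  # directions as columns

    def project(self, data, num_components=None):
        w = self.projection_matrix[:, :num_components]
        return (data - self.mean) @ w

    def unproject(self, projected, num_components=None):
        w = self.projection_matrix[:, :num_components]
        return projected @ w.T + self.mean
```

Projecting with all components and unprojecting recovers the data exactly, since W is orthogonal; with fewer components the round trip gives the best rank-k reconstruction.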
-
ZCA¶
ICA¶
-
class
pydeep.preprocessing.
ICA
(input_dim)[source]¶ Independent Component Analysis using FastICA.
-
log_likelihood
(data)[source]¶ Calculates the Log-Likelihood (LL) for the given data.
Parameters: data (numpy array [num data point, data dimension]) – data to calculate the Log-Likelihood for. Returns: log-likelihood. Return type: numpy array [num data point]
-
train
(data, iterations=1000, convergence=0.0, status=False)[source]¶ Training the model (full batch).
Parameters: - data (numpy array [num data point, data dimension]) – data for training.
- iterations (int) – Number of iterations
- convergence (double) – If the angle (in degrees) between filters of two updates is less than the given value, training is terminated.
- status (bool) – If true the progress is printed to the console.
-
rbm¶
Package providing RBM models and corresponding samplers, trainers, and estimators.
Version: | 1.1.0 |
---|---|
Date: | 04.04.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
dbn¶
Helper class for deep belief networks.
Version: | 1.1.0 |
---|---|
Date: | 06.04.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
DBN¶
-
class
pydeep.rbm.dbn.
DBN
(list_of_rbms)[source]¶ Deep belief network.
-
__init__
(list_of_rbms)[source]¶ Initializes the network with RBMs.
Parameters: list_of_rbms (list) – List of RBMs.
-
backward_propagate
(output_data, sample=False)[source]¶ Propagates the output data backward through the network to the input.
Parameters: - output_data (numpy array [batchsize x output dim]) – Output data.
- sample (bool) – If true the states are sampled, otherwise the probabilities are used.
Returns: Input of the network.
Return type: numpy array [batchsize x input dim]
-
forward_propagate
(input_data, sample=False)[source]¶ Propagates the data through the network.
Parameters: - input_data (numpy array [batchsize x input dim]) – Input data
- sample (bool) – If true the states are sampled, otherwise the probabilities are used.
Returns: Output of the network.
Return type: numpy array [batchsize x output dim]
-
reconstruct
(input_data, sample=False)[source]¶ Reconstructs the data by propagating the data to the output and back to the input.
Parameters: - input_data (numpy array [batchsize x input dim]) – Input data.
- sample (bool) – If true the states are sampled, otherwise the probabilities are used.
Returns: Reconstruction of the input.
Return type: numpy array [batchsize x input dim]
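The forward/backward/reconstruct trio can be sketched with plain sigmoid layers. The (weights, visible_bias, hidden_bias) tuples below are a hypothetical stand-in for PyDeep's RBM objects, and sampling is omitted (mean-field probabilities only):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward_propagate_sketch(layers, input_data):
    """Sketch: bottom-up pass through (weights, visible_bias, hidden_bias) layers."""
    h = input_data
    for weights, _, hidden_bias in layers:
        h = sigmoid(h @ weights + hidden_bias)
    return h

def backward_propagate_sketch(layers, output_data):
    """Sketch: top-down pass using the transposed (tied) weights."""
    v = output_data
    for weights, visible_bias, _ in reversed(layers):
        v = sigmoid(v @ weights.T + visible_bias)
    return v

def reconstruct_sketch(layers, input_data):
    """Sketch: propagate up to the top layer and back down."""
    return backward_propagate_sketch(layers, forward_propagate_sketch(layers, input_data))
```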
-
reconstruct_sample_top_layer
(input_data, sampling_steps=100, sample_forward_backward=False)[source]¶ Reconstructs data by propagating the data forward, sampling the topmost layer, and propagating the result backward.
Parameters: Returns: reconstruction of the network.
Return type: numpy array [batchsize x output dim]
-
sample_top_layer
(sampling_steps=100, initial_state=None, sample=True)[source]¶ Samples the topmost layer. If initial_state is None the current state is used; otherwise sampling starts from the given initial state.
Parameters: Returns: Output of the network.
Return type: numpy array [batchsize x output dim]
-
estimator¶
This module provides methods for estimating the model performance (running on the CPU). Provided performance measures are, for example, the reconstruction error (RE) and the log-likelihood (LL). For estimating the LL we need to know the value of the partition function Z. If at least one layer is binary, it is possible to calculate the value exactly by factorizing over the binary states. Since this involves enumerating all possible binary states, it is only feasible for small models, i.e. fewer than about 25 units in that layer (~2^25 = 33554432 states). For bigger models the partition function can be estimated using annealed importance sampling (AIS).
Info: | For the derivations .. seealso:: https://www.ini.rub.de/PEOPLE/wiskott/Reprints/Melchior-2012-MasterThesis-RBMs.pdf |
Version: | 1.1.0 |
Date: | 04.04.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
reconstruction_error¶
-
estimator.
reconstruction_error
(data, k=1, beta=None, use_states=False, absolut_error=False)¶ This function calculates the reconstruction errors for a given model and data.
Parameters: - model (Valid RBM model) – The model.
- data (numpy array [num samples, num dimensions] or numpy array [num batches, num samples in batch, num dimensions]) – The data as 2D array or 3D array.
- k (int) – Number of Gibbs sampling steps.
- beta (None, float or numpy array [batchsize,1]) – Temperature(s) for the model's energy.
- use_states (bool) – If false (default) the probabilities are used as reconstruction, if true states are sampled.
- absolut_error (bool) – If false (default) the squared error is used, the absolute error otherwise.
Returns: Reconstruction errors of the data.
Return type: numpy array [num samples]
log_likelihood_v¶
-
estimator.
log_likelihood_v
(logz, data, beta=None)¶ Computes the log-likelihood (LL) for a given model and visible data given its log partition function.
Info: logz needs to be the log partition function for the same beta (i.e. beta = 1.0)! Parameters: - model (Valid RBM model.) – The model.
- logz (float) – The logarithm of the partition function.
- data (2D array [num samples, num input dim] or 3D type numpy array [num batches, num samples in batch, num input dim]) – The visible data.
- beta (None, float, numpy array [batchsize,1]) – Inverse temperature(s) for the model's energy.
Returns: The log-likelihood for each sample.
Return type: numpy array [num samples]
log_likelihood_h¶
-
estimator.
log_likelihood_h
(logz, data, beta=None)¶ Computes the log-likelihood (LL) for a given model and hidden data given its log partition function.
Info: logz needs to be the log partition function for the same beta (i.e. beta = 1.0)! Parameters: - model (Valid RBM model.) – The model.
- logz (float) – The logarithm of the partition function.
- data (2D array [num samples, num output dim] or 3D type numpy array [num batches, num samples in batch, num output dim]) – The hidden data.
- beta (None, float, numpy array [batchsize,1]) – Inverse temperature(s) for the model's energy.
Returns: The log-likelihood for each sample.
Return type: numpy array [num samples]
partition_function_factorize_v¶
-
estimator.
partition_function_factorize_v
(beta=None, batchsize_exponent='AUTO', status=False)¶ Computes the true partition function for the given model by factoring over the visible units.
Info: Computation increases exponentially with the number of visible units (16 visible units usually take ~20 seconds). Parameters: Returns: Log partition function for the model.
Return type:
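For a binary-binary RBM the factorization works as follows: one sums over all binary visible states while the hidden units factorize analytically. A minimal sketch for an uncentered model (centering is omitted for brevity):

```python
import numpy as np
from itertools import product

def log_partition_factorize_v_sketch(weights, visible_bias, hidden_bias):
    """Sketch: exact log Z by enumerating all binary visible states.

    Z = sum_v exp(v . b) * prod_j (1 + exp(c_j + (v @ W)_j))
    """
    num_visibles = weights.shape[0]
    log_terms = []
    for bits in product([0.0, 1.0], repeat=num_visibles):
        v = np.array(bits)
        # Hidden units factorize: each contributes log(1 + exp(...)).
        log_terms.append(v @ visible_bias
                         + np.sum(np.logaddexp(0.0, hidden_bias + v @ weights)))
    # Log-sum-exp over all 2^num_visibles terms for numerical stability.
    m = max(log_terms)
    return m + np.log(np.sum(np.exp(np.array(log_terms) - m)))
```

For an all-zero model with 2 visible and 3 hidden units this gives log(2^2 * 2^3) = log 32, and the loop over 2^num_visibles states illustrates the exponential cost noted above.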
partition_function_factorize_h¶
-
estimator.
partition_function_factorize_h
(beta=None, batchsize_exponent='AUTO', status=False)¶ Computes the true partition function for the given model by factoring over the hidden units.
Info: Computation increases exponentially with the number of hidden units (16 hidden units usually take ~20 seconds). Parameters: Returns: Log partition function for the model.
Return type:
annealed_importance_sampling¶
-
estimator.
annealed_importance_sampling
(num_chains=100, k=1, betas=10000, status=False)¶ Approximates the partition function for the given model using annealed importance sampling.
See also
Accurate and Conservative Estimates of MRF Log-likelihood using Reverse Annealing http://arxiv.org/pdf/1412.8566.pdf
Parameters: Returns: Mean estimated log partition function, mean +3std estimated log partition function, mean -3std estimated log partition function. Return type:
reverse_annealed_importance_sampling¶
-
estimator.
reverse_annealed_importance_sampling
(num_chains=100, k=1, betas=10000, status=False, data=None)¶ Approximates the partition function for the given model using reverse annealed importance sampling.
See also
Accurate and Conservative Estimates of MRF Log-likelihood using Reverse Annealing http://arxiv.org/pdf/1412.8566.pdf
Parameters: - model (Valid RBM model.) – The model.
- num_chains (int) – Number of AIS runs.
- k (int) – Number of Gibbs sampling steps.
- betas (int, numpy array [num_betas]) – Number or a list of inverse temperatures to sample from.
- status (bool) – If true prints the progress on console.
- data (numpy array) – If data is given, initial sampling is started from data samples.
Returns: Mean estimated log partition function, mean +3std estimated log partition function, mean -3std estimated log partition function. Return type:
model¶
This module provides restricted Boltzmann machines (RBMs) with different types of units. The structure is very close to the mathematical derivations to simplify the understanding. In addition, the modularity helps to create other kind of RBMs without adapting the training algorithms.
Implemented: |
# Models without implementation of p(v),p(h),p(v,h) -> AIS, PT, true gradient, … cannot be used!
- centered BinaryBinaryLabel RBM (BBL-RBM)
- centered GaussianBinaryLabel RBM (GBL-RBM)
# Models with intractable p(v),p(h),p(v,h) -> AIS, PT, true gradient, … cannot be used!
- centered BinaryRect RBM (BR-RBM)
- centered RectBinary RBM (RB-RBM)
- centered RectRect RBM (RR-RBM)
- centered GaussianRect RBM (GR-RBM)
- centered GaussianRectVariance RBM (GRV-RBM) |
---|---|
Info: | For the derivations .. seealso:: https://www.ini.rub.de/PEOPLE/wiskott/Reprints/Melchior-2012-MasterThesis-RBMs.pdf A usual way to create a new unit is to inherit from a given RBM class and override the functions that changed, e.g. Gaussian-Binary RBM inherited from the Binary-Binary RBM. |
Version: | 1.1.0 |
Date: | 04.04.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
BinaryBinaryRBM¶
-
class
pydeep.rbm.model.
BinaryBinaryRBM
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ Implementation of a centered restricted Boltzmann machine with binary visible and binary hidden units.
-
__init__
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.
Parameters: - number_visibles (int) – Number of the visible variables.
- number_hiddens (int) – Number of hidden variables.
- data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
- initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
- initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visible mean. If a scalar is passed all values are initialized with it.
- initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
- initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
- initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5 If a scalar is passed all values are initialized with it.
- dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type i.e. numpy.float64
-
_add_visible_units
(num_new_visibles, position=0, initial_weights='AUTO', initial_bias='AUTO', initial_offsets='AUTO', data=None)[source]¶ This function adds new visible units at the given position to the model. .. Warning:: If the parameters are changed, the trainer needs to be reinitialized.
Parameters: - num_new_visibles (int) – The number of new visible units to add.
- position (int) – Position where the units should be added.
- initial_weights ('AUTO', scalar or numpy array [input num_new_visibles, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
- initial_bias ('AUTO' or scalar or numpy array [1, num_new_visibles]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visible mean. If a scalar is passed all values are initialized with it.
- initial_offsets ('AUTO' or scalar or numpy array [1, num_new_visibles]) – The initial visible offset values.
- data (numpy array [num datapoints, num_new_visibles]) – If data is given, the offset and bias are initialized accordingly when ‘AUTO’ is chosen.
-
_base_log_partition
(use_base_model=False)[source]¶ Returns the base partition function for a given visible bias. .. Note:: For AIS we need to be able to calculate the partition function of the base distribution exactly. Furthermore, it is beneficial if the base distribution is a good approximation of the target distribution. A good choice is therefore the maximum likelihood estimate of the visible bias, given the data.
Parameters: use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. Returns: Partition function for zero parameters. Return type: float
-
_calculate_hidden_bias_gradient
(h)[source]¶ This function calculates the gradient for the hidden biases.
Parameters: h (numpy arrays [batch size, output dim]) – Hidden activations. Returns: Hidden bias gradient. Return type: numpy arrays [1, output dim]
-
_calculate_visible_bias_gradient
(v)[source]¶ This function calculates the gradient for the visible biases.
Parameters: v (numpy arrays [batch_size, input dim]) – Visible activations. Returns: Visible bias gradient. Return type: numpy arrays [1, input dim]
-
_calculate_weight_gradient
(v, h)[source]¶ This function calculates the gradient for the weights from the visible and hidden activations.
Parameters: - v (numpy arrays [batchsize, input dim]) – Visible activations.
- h (numpy arrays [batchsize, output dim]) – Hidden activations.
Returns: Weight gradient.
Return type: numpy arrays [input dim, output dim]
-
_getbasebias
()[source]¶ Returns the maximum likelihood estimate of the visible bias, given the data. If no data is given, the RBM's bias value is returned, but it is highly recommended to pass the data.
Returns: Base bias. Return type: numpy array [1, input dim]
-
_remove_visible_units
(indices)[source]¶ This function removes the visible units whose indices are given.
Warning
If the parameters are changed, the trainer needs to be reinitialized.
Parameters: indices (int or list of int or numpy array of int) – Indices of units to be removed.
-
calculate_gradients
(v, h)[source]¶ This function calculates all gradients of this RBM and returns them as a list of arrays. This keeps the flexibility of adding parameters which will be updated by the training algorithms.
Parameters: - v (numpy arrays [batch size, input dim]) – Visible activations.
- h (numpy arrays [batch size, output dim]) – Hidden activations.
Returns: Gradients for all parameters.
Return type: list of numpy arrays (num parameters x [parameter.shape])
-
energy
(v, h, beta=None, use_base_model=False)[source]¶ Compute the energy of the RBM given observed variable states v and hidden variables state h.
Parameters: - v (numpy array [batch size, input dim]) – Visible states.
- h (numpy array [batch size, output dim]) – Hidden states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Energy of v and h.
Return type: numpy array [batch size,1]
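For the binary-binary case the energy can be sketched as below. The centered form with offsets is an assumption based on the module description; beta and use_base_model are omitted:

```python
import numpy as np

def bb_rbm_energy_sketch(v, h, weights, visible_bias, hidden_bias,
                         offset_v=0.0, offset_h=0.0):
    """Sketch: E(v, h) = -(v-ov).b - (h-oh).c - (v-ov) W (h-oh), per sample."""
    vc = v - offset_v
    hc = h - offset_h
    return (-vc @ visible_bias - hc @ hidden_bias
            - np.einsum('bi,ij,bj->b', vc, weights, hc))
```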
-
log_probability_h
(logz, h, beta=None, use_base_model=False)[source]¶ Computes the log-probability / LogLikelihood(LL) for the given hidden units for this model. To estimate the LL we need to know the logarithm of the partition function Z. For small models it is possible to calculate Z, however since this involves calculating all possible hidden states, it is intractable for bigger models. As an estimation method annealed importance sampling (AIS) can be used instead.
Parameters: - logz (float) – The logarithm of the partition function.
- h (numpy array [batch size, output dim]) – Hidden states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Log probability for hidden_states.
Return type: numpy array [batch size, 1]
-
log_probability_v
(logz, v, beta=None, use_base_model=False)[source]¶ Computes the log-probability / LogLikelihood(LL) for the given visible units for this model. To estimate the LL we need to know the logarithm of the partition function Z. For small models it is possible to calculate Z, however since this involves calculating all possible hidden states, it is intractable for bigger models. As an estimation method annealed importance sampling (AIS) can be used instead.
Parameters: - logz (float) – The logarithm of the partition function.
- v (numpy array [batch size, input dim]) – Visible states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Log probability for visible_states.
Return type: numpy array [batch size, 1]
-
log_probability_v_h
(logz, v, h, beta=None, use_base_model=False)[source]¶ Computes the joint log-probability / LogLikelihood(LL) for the given visible and hidden units for this model. To estimate the LL we need to know the logarithm of the partition function Z. For small models it is possible to calculate Z, however since this involves calculating all possible hidden states, it is intractable for bigger models. As an estimation method annealed importance sampling (AIS) can be used instead.
Parameters: - logz (float) – The logarithm of the partition function.
- v (numpy array [batch size, input dim]) – Visible states.
- h (numpy array [batch size, output dim]) – Hidden states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Joint log probability for v and h.
Return type: numpy array [batch size, 1]
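As the docstrings above note, the joint log-probability needs the log partition function. For a model small enough to enumerate, logz can be computed exactly by brute force. The following is a minimal, self-contained numpy sketch of that idea for a hypothetical binary-binary RBM; it mimics, but does not use, the pydeep API:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n_v, n_h = 3, 2                       # small enough to enumerate all states
W = rng.normal(0.0, 0.1, (n_v, n_h))  # weights [input dim, output dim]
b = np.zeros((1, n_v))                # visible bias
c = np.zeros((1, n_h))                # hidden bias

def energy(v, h):
    # E(v, h) = -v.b - h.c - v W h for a binary-binary RBM
    return -(v @ b.T).ravel() - (h @ c.T).ravel() - np.einsum('ij,jk,ik->i', v, W, h)

# Exact log partition function: log-sum-exp of -E over all 2^(n_v + n_h) states
states_v = np.array(list(itertools.product([0, 1], repeat=n_v)), dtype=float)
states_h = np.array(list(itertools.product([0, 1], repeat=n_h)), dtype=float)
energies = [energy(v[None, :], h[None, :])[0] for v in states_v for h in states_h]
logz = np.logaddexp.reduce([-e for e in energies])

# Analogous to log_probability_v_h(logz, v, h): log p(v, h) = -E(v, h) - log Z
v, h = states_v[:1], states_h[:1]
log_p_vh = -energy(v, h) - logz
```

For real models this enumeration is exactly what becomes intractable, which is why the docstring recommends AIS for estimating logz instead.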
-
probability_h_given_v
(v, beta=None, use_base_model=False)[source]¶ Calculates the conditional probabilities of h given v.
Parameters: - v (numpy array [batch size, input dim]) – Visible states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – DUMMY variable, since we do not use a base hidden bias.
Returns: Conditional probabilities h given v.
Return type: numpy array [batch size, output dim]
-
probability_v_given_h
(h, beta=None, use_base_model=False)[source]¶ Calculates the conditional probabilities of v given h.
Parameters: - h (numpy array [batch size, output dim]) – Hidden states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Conditional probabilities v given h.
Return type: numpy array [batch size, input dim]
-
sample_h
(h, beta=None, use_base_model=False)[source]¶ Samples the hidden variables from the conditional probabilities h given v.
Parameters: - h (numpy array [batch size, output dim]) – Conditional probabilities of h given v.
- beta (None) – DUMMY Variable. The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns: States for h.
Return type: numpy array [batch size, output dim]
-
sample_v
(v, beta=None, use_base_model=False)[source]¶ Samples the visible variables from the conditional probabilities v given h.
Parameters: - v (numpy array [batch size, input dim]) – Conditional probabilities of v given h.
- beta (None) – DUMMY Variable. The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns: States for v.
Return type: numpy array [batch size, input dim]
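probability_h_given_v, sample_h, probability_v_given_h and sample_v together form one step of block Gibbs sampling. A minimal numpy sketch for binary-binary units (a simplification that omits pydeep's centering offsets; all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, W, b, c, beta=1.0):
    # p(h = 1 | v) = sigmoid(beta * (v W + c)), then sample h ~ Bernoulli
    p_h = sigmoid(beta * (v @ W + c))
    h = (rng.random(p_h.shape) < p_h).astype(float)
    # p(v = 1 | h) = sigmoid(beta * (h W^T + b)), then sample v ~ Bernoulli
    p_v = sigmoid(beta * (h @ W.T + b))
    v_new = (rng.random(p_v.shape) < p_v).astype(float)
    return v_new, h, p_v, p_h

W = rng.normal(0.0, 0.01, (6, 4))     # weights [input dim, output dim]
b, c = np.zeros((1, 6)), np.zeros((1, 4))
v0 = (rng.random((5, 6)) < 0.5).astype(float)   # batch of 5 binary states
v1, h1, p_v, p_h = gibbs_step(v0, W, b, c)
```

Note how beta simply scales the pre-activations here, which is why it only affects the probabilities and is a DUMMY argument in the binary sampling functions themselves.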
-
unnormalized_log_probability_h
(h, beta=None, use_base_model=False)[source]¶ Computes the unnormalized log probabilities of h.
Parameters: - h (numpy array [batch size, output dim]) – Hidden states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Unnormalized log probability of h.
Return type: numpy array [batch size, 1]
-
unnormalized_log_probability_v
(v, beta=None, use_base_model=False)[source]¶ Computes the unnormalized log probabilities of v.
Parameters: - v (numpy array [batch size, input dim]) – Visible states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Unnormalized log probability of v.
Return type: numpy array [batch size, 1]
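For binary-binary units the hidden layer can be summed out analytically, which is what makes the unnormalized log probability of v cheap: ln p̃(v) = v·b + Σ_j softplus(c_j + (vW)_j). A hedged numpy sketch that cross-checks this closed form against explicit summation over all hidden states (illustrative, not the pydeep implementation):

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n_v, n_h = 4, 3
W = rng.normal(0.0, 0.2, (n_v, n_h))
b = rng.normal(0.0, 0.1, (1, n_v))    # visible bias
c = rng.normal(0.0, 0.1, (1, n_h))    # hidden bias

def unnormalized_log_probability_v(v):
    # ln p~(v) = v.b + sum_j softplus(c_j + (v W)_j); hidden units summed out.
    # np.logaddexp(0, x) is a numerically stable softplus.
    return (v @ b.T).ravel() + np.logaddexp(0.0, v @ W + c).sum(axis=1)

# Cross-check against explicit summation over all 2^n_h hidden states
v = (rng.random((1, n_v)) < 0.5).astype(float)
hs = np.array(list(itertools.product([0, 1], repeat=n_h)), dtype=float)
brute = np.logaddexp.reduce(
    [(v @ b.T + h @ c.T + v @ W @ h).item() for h in hs])
closed_form = unnormalized_log_probability_v(v)[0]
```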
-
GaussianBinaryRBM¶
-
class
pydeep.rbm.model.
GaussianBinaryRBM
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ Implementation of a centered Restricted Boltzmann machine with Gaussian visible and binary hidden units.
-
__init__
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.
Parameters: - number_visibles (int) – Number of the visible variables.
- number_hiddens (int) – Number of hidden variables.
- data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
- initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
- initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visible mean. If a scalar is passed all values are initialized with it.
- initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
- initial_sigma ('AUTO', scalar or numpy array [1, input_dim]) – Initial standard deviation for the model.
- initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
- initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5. If a scalar is passed all values are initialized with it.
- dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type, e.g. numpy.float64.
-
_add_hidden_units
(num_new_hiddens, position=0, initial_weights='AUTO', initial_bias='AUTO', initial_offsets='AUTO')[source]¶ - This function adds new hidden units at the given position to the model.
Warning
If the parameters are changed, the trainer needs to be reinitialized.
Parameters: - num_new_hiddens (int) – The number of new hidden units to add.
- position (int) – Position where the units should be added.
- initial_weights ('AUTO' or scalar or numpy array [input_dim, num_new_hiddens]) – The initial weight values for the hidden units.
- initial_bias ('AUTO' or scalar or numpy array [1, num_new_hiddens]) – The initial hidden bias values.
- initial_offsets ('AUTO' or scalar or numpy array [1, num_new_hiddens]) – The initial hidden mean values.
-
_add_visible_units
(num_new_visibles, position=0, initial_weights='AUTO', initial_bias='AUTO', initial_sigmas=1.0, initial_offsets='AUTO', data=None)[source]¶ - This function adds new visible units at the given position to the model.
Warning
If the parameters are changed, the trainer needs to be reinitialized.
Parameters: - num_new_visibles (int) – The number of new hidden units to add
- position (int) – Position where the units should be added.
- initial_weights ('AUTO', scalar or numpy array [num_new_visibles, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
- initial_bias ('AUTO' or scalar or numpy array [1, num_new_visibles]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visible mean. If a scalar is passed all values are initialized with it.
- initial_sigmas ('AUTO' or scalar or numpy array [1, num_new_visibles]) – The initial standard deviation for the model.
- initial_offsets ('AUTO' or scalar or numpy array [1, num_new_visibles]) – The initial visible offset values.
- data (numpy array [num datapoints, num_new_visibles]) – If data is given, the bias and offsets are initialized accordingly when ‘AUTO’ is chosen.
-
_base_log_partition
(use_base_model=False)[source]¶ Returns the base partition function, which needs to be tractable to compute.
Parameters: use_base_model (bool) – DUMMY, since the integral does not change if the mean is shifted. Returns: Partition function for zero parameters. Return type: float
-
_calculate_visible_bias_gradient
(v)[source]¶ This function calculates the gradient for the visible biases.
Parameters: v (numpy arrays [batch_size, input dim]) – Visible activations. Returns: Visible bias gradient. Return type: numpy arrays [1, input dim]
-
_calculate_weight_gradient
(v, h)[source]¶ This function calculates the gradient for the weights from the visible and hidden activations.
Parameters: - v (numpy arrays [batchsize, input dim]) – Visible activations.
- h (numpy arrays [batchsize, output dim]) – Hidden activations.
Returns: Weight gradient.
Return type: numpy arrays [input dim, output dim]
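The positive-phase weight statistic is the batch-averaged correlation of visible and hidden activations. A simplified sketch without the centering offsets pydeep subtracts (illustrative only):

```python
import numpy as np

def calculate_weight_gradient(v, h):
    # Batch-averaged correlation <v_i h_j>: v^T h / batch_size
    # (pydeep additionally subtracts offset means; omitted in this sketch)
    return (v.T @ h) / v.shape[0]

v = np.array([[1.0, 0.0],
              [1.0, 1.0]])            # visible activations [batch, input dim]
h = np.array([[0.5],
              [1.0]])                 # hidden activations [batch, output dim]
g = calculate_weight_gradient(v, h)   # shape [input dim, output dim]
```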
-
_remove_visible_units
(indices)[source]¶ - This function removes the visible units whose indices are given.
Warning
If the parameters are changed, the trainer needs to be reinitialized.
Parameters: indices (int or list of int or numpy array of int) – Indices of the units to be removed.
-
energy
(v, h, beta=None, use_base_model=False)[source]¶ Computes the energy of the RBM for the given visible states v and hidden states h.
Parameters: - v (numpy array [batch size, input dim]) – Visible states.
- h (numpy array [batch size, output dim]) – Hidden states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Energy of v and h.
Return type: numpy array [batch size,1]
-
probability_h_given_v
(v, beta=None, use_base_model=False)[source]¶ Calculates the conditional probabilities h given v.
Parameters: - v (numpy array [batch size, input dim]) – Visible states / data.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Conditional probabilities h given v.
Return type: numpy array [batch size, output dim]
-
probability_v_given_h
(h, beta=None, use_base_model=False)[source]¶ Calculates the conditional probabilities of v given h.
Parameters: - h (numpy array [batch size, output dim]) – Hidden states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Conditional probabilities v given h.
Return type: numpy array [batch size, input dim]
-
sample_v
(v, beta=None, use_base_model=False)[source]¶ Samples the visible variables from the conditional probabilities v given h.
Parameters: - v (numpy array [batch size, input dim]) – Conditional probabilities of v given h.
- beta (None) – DUMMY variable. The sampling in other types of units, like Gaussian-binary RBMs, will be affected by beta.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns: States for v.
Return type: numpy array [batch size, input dim]
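For Gaussian visible units, sampling v given h amounts to drawing from a normal distribution around the conditional means with the model's standard deviation. An illustrative numpy sketch (not the pydeep implementation; beta is ignored here, mirroring the DUMMY parameter above):

```python
import numpy as np

rng = np.random.default_rng(7)

def sample_v_gaussian(mu, sigma, beta=None):
    # Gaussian visible units: v ~ N(mu, sigma^2) elementwise.
    # beta is ignored in this sketch, like the DUMMY parameter in the docs.
    return mu + sigma * rng.standard_normal(mu.shape)

mu = np.zeros((5, 3))                 # conditional means [batch, input dim]
sigma = np.ones((1, 3))               # model standard deviation [1, input dim]
v = sample_v_gaussian(mu, sigma)
```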
-
unnormalized_log_probability_h
(h, beta=None, use_base_model=False)[source]¶ Computes the unnormalized log probabilities of h.
Parameters: - h (numpy array [batch size, output dim]) – Hidden states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Unnormalized log probability of h.
Return type: numpy array [batch size, 1]
-
unnormalized_log_probability_v
(v, beta=None, use_base_model=False)[source]¶ - Computes the unnormalized log probabilities of v, i.e. ln(Z·p(v)) = ln p̃(v), the log probability up to the additive constant ln(Z).
Parameters: - v (numpy array [batch size, input dim]) – Visible states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Unnormalized log probability of v.
Return type: numpy array [batch size, 1]
-
GaussianBinaryVarianceRBM¶
-
class
pydeep.rbm.model.
GaussianBinaryVarianceRBM
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets=0.0, initial_hidden_offsets=0.0, dtype=<type 'numpy.float64'>)[source]¶ Implementation of a Restricted Boltzmann machine with Gaussian visible units with trainable variances and binary hidden units.
-
__init__
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets=0.0, initial_hidden_offsets=0.0, dtype=<type 'numpy.float64'>)[source]¶ This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.
Parameters: - number_visibles (int) – Number of the visible variables.
- number_hiddens (int) – Number of hidden variables.
- data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
- initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
- initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visible mean. If a scalar is passed all values are initialized with it.
- initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
- initial_sigma ('AUTO', scalar or numpy array [1, input_dim]) – Initial standard deviation for the model.
- initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
- initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5. If a scalar is passed all values are initialized with it.
- dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type, e.g. numpy.float64.
-
_calculate_sigma_gradient
(v, h)[source]¶ This function calculates the gradient for the variance of the RBM.
Parameters: - v (numpy arrays [batchsize, input dim]) – States of the visible variables.
- h (numpy arrays [batchsize, output dim]) – Probs/States of the hidden variables.
Returns: Sigma gradient.
Return type: list of numpy arrays [input dim,1]
-
calculate_gradients
(v, h)[source]¶ This function calculates all gradients of this RBM and returns them as an ordered array. This keeps the flexibility of adding parameters which will be updated by the training algorithms.
Parameters: - v (numpy arrays [batchsize, input dim]) – States of the visible variables.
- h (numpy arrays [batchsize, output dim]) – Probabilities of the hidden variables.
Returns: Gradients for all parameters.
Return type: numpy arrays (num parameters x [parameter.shape])
-
BinaryBinaryLabelRBM¶
-
class
pydeep.rbm.model.
BinaryBinaryLabelRBM
(number_visibles, number_labels, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ Implementation of a centered Restricted Boltzmann machine with Binary visible plus Softmax label units and binary hidden units.
-
__init__
(number_visibles, number_labels, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.
Parameters: - number_visibles (int) – Number of the visible variables.
- number_labels (int) – Number of the label variables.
- number_hiddens (int) – Number of hidden variables.
- data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
- initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
- initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visible mean. If a scalar is passed all values are initialized with it.
- initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
- initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
- initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5. If a scalar is passed all values are initialized with it.
- dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type, e.g. numpy.float64.
-
sample_v
(v, beta=None, use_base_model=False)[source]¶ Samples the visible variables from the conditional probabilities v given h.
Parameters: - v (numpy array [batch size, input dim]) – Conditional probabilities of v given h.
- beta (None) – DUMMY Variable. The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns: States for v.
Return type: numpy array [batch size, input dim]
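Since the visible layer of BinaryBinaryLabelRBM concatenates binary data units with a Softmax label group, sampling has to treat the two blocks differently: independent Bernoulli draws for the data part and a single one-hot draw for the label part. A hypothetical numpy sketch of this idea (the actual pydeep implementation may differ in details such as block ordering):

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_v_label_rbm(p_v, num_labels):
    # Split the visible probabilities into the data block and the label block
    p_data, p_label = p_v[:, :-num_labels], p_v[:, -num_labels:]
    # Data units: independent Bernoulli draws
    data = (rng.random(p_data.shape) < p_data).astype(float)
    # Label units: one one-hot draw per row (softmax group)
    p_label = p_label / p_label.sum(axis=1, keepdims=True)
    idx = np.array([rng.choice(num_labels, p=row) for row in p_label])
    labels = np.eye(num_labels)[idx]
    return np.hstack([data, labels])

p_v = np.full((4, 7), 0.5)            # 4 samples, 4 data units + 3 label units
v = sample_v_label_rbm(p_v, num_labels=3)
```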
-
SoftMaxSigmoid¶
GaussianBinaryLabelRBM¶
-
class
pydeep.rbm.model.
GaussianBinaryLabelRBM
(number_visibles, number_labels, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ Implementation of a centered Restricted Boltzmann machine with Gaussian visible plus Softmax label units and binary hidden units.
-
__init__
(number_visibles, number_labels, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.
Parameters: - number_visibles (int) – Number of the visible variables.
- number_labels (int) – Number of the label variables.
- number_hiddens (int) – Number of hidden variables.
- data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
- initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
- initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visible mean. If a scalar is passed all values are initialized with it.
- initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
- initial_sigma ('AUTO', scalar or numpy array [1, input_dim]) – Initial standard deviation for the model.
- initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
- initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5. If a scalar is passed all values are initialized with it.
- dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type, e.g. numpy.float64.
-
sample_v
(v, beta=None, use_base_model=False)[source]¶ Samples the visible variables from the conditional probabilities v given h.
Parameters: - v (numpy array [batch size, input dim]) – Conditional probabilities of v given h.
- beta (None) – DUMMY Variable. The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns: States for v.
Return type: numpy array [batch size, input dim]
-
SoftMaxLinear¶
BinaryRectRBM¶
-
class
pydeep.rbm.model.
BinaryRectRBM
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ Implementation of a centered Restricted Boltzmann machine with Binary visible and Noisy linear rectified hidden units.
-
__init__
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.
Parameters: - number_visibles (int) – Number of the visible variables.
- number_hiddens (int) – Number of hidden variables.
- data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
- initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
- initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visible mean. If a scalar is passed all values are initialized with it.
- initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
- initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
- initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5. If a scalar is passed all values are initialized with it.
- dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type, e.g. numpy.float64.
-
probability_h_given_v
(v, beta=None)[source]¶ Calculates the conditional probabilities h given v.
Parameters: - v (numpy array [batch size, input dim]) – Visible states / data.
- beta (float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously.
Returns: Conditional probabilities h given v.
Return type: numpy array [batch size, output dim]
-
sample_h
(h, beta=None, use_base_model=False)[source]¶ Samples the hidden variables from the conditional probabilities h given v.
Parameters: - h (numpy array [batch size, output dim]) – Conditional probabilities of h given v.
- beta (None) – DUMMY Variable. The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns: States for h.
Return type: numpy array [batch size, output dim]
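"Noisy linear rectified" hidden units in the sense of Nair & Hinton (2010) are typically sampled as max(0, x + ε) with ε ~ N(0, sigmoid(x)), where x is the pre-activation. An illustrative numpy sketch under that assumption (not the pydeep implementation):

```python
import numpy as np

rng = np.random.default_rng(11)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_nrelu(x):
    # Noisy rectified linear unit: h = max(0, x + eps), eps ~ N(0, sigmoid(x))
    noise = np.sqrt(sigmoid(x)) * rng.standard_normal(x.shape)
    return np.maximum(0.0, x + noise)

x = np.linspace(-3.0, 3.0, 7)[None, :]   # pre-activations [batch, output dim]
h = sample_nrelu(x)                      # non-negative hidden states
```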
-
RectBinaryRBM¶
-
class
pydeep.rbm.model.
RectBinaryRBM
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ Implementation of a centered Restricted Boltzmann machine with Noisy linear rectified visible units and binary hidden units.
-
__init__
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.
Parameters: - number_visibles (int) – Number of the visible variables.
- number_hiddens (int) – Number of hidden variables.
- data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
- initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
- initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visible mean. If a scalar is passed all values are initialized with it.
- initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
- initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
- initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5. If a scalar is passed all values are initialized with it.
- dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type, e.g. numpy.float64.
-
probability_v_given_h
(h, beta=None, use_base_model=False)[source]¶ Calculates the conditional probabilities of v given h.
Parameters: - h (numpy array [batch size, output dim]) – Hidden states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Conditional probabilities v given h.
Return type: numpy array [batch size, input dim]
-
sample_v
(v, beta=None, use_base_model=False)[source]¶ Samples the visible variables from the conditional probabilities v given h.
Parameters: - v (numpy array [batch size, input dim]) – Conditional probabilities of v given h.
- beta (None) – DUMMY Variable. The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns: States for v.
Return type: numpy array [batch size, input dim]
-
RectRectRBM¶
-
class
pydeep.rbm.model.
RectRectRBM
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ Implementation of a centered Restricted Boltzmann machine with Noisy linear rectified visible and hidden units.
-
__init__
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.
Parameters: - number_visibles (int) – Number of the visible variables.
- number_hiddens (int) – Number of hidden variables.
- data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
- initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
- initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visible mean. If a scalar is passed all values are initialized with it.
- initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
- initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
- initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5. If a scalar is passed all values are initialized with it.
- dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type, e.g. numpy.float64.
-
probability_v_given_h
(h, beta=None, use_base_model=False)[source]¶ Calculates the conditional probabilities of v given h.
Parameters: - h (numpy array [batch size, output dim]) – Hidden states.
- beta (None, float or numpy array [batch size, 1]) – Allows sampling at a given inverse temperature beta or, if a vector is given, at different betas simultaneously. None is equivalent to passing the value 1.0.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns: Conditional probabilities v given h.
Return type: numpy array [batch size, input dim]
-
sample_v
(v, beta=None, use_base_model=False)[source]¶ Samples the visible variables from the conditional probabilities v given h.
Parameters: - v (numpy array [batch size, input dim]) – Conditional probabilities of v given h.
- beta (None) – DUMMY variable. The sampling in other types of units, like Gaussian-binary RBMs, will be affected by beta.
- use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns: States for v.
Return type: numpy array [batch size, input dim]
-
GaussianRectRBM¶
-
class
pydeep.rbm.model.
GaussianRectRBM
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ Implementation of a centered Restricted Boltzmann machine with Gaussian visible and Noisy linear rectified hidden units.
-
__init__
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶ This function initializes all necessary parameters and data structures. See comments for automatically chosen values.
Parameters: - number_visibles (int) – Number of the visible variables.
- number_hiddens (int) – Number of the hidden variables.
- data (None or numpy array [num samples, input dim] or List of numpy arrays [num samples, input dim]) – The training data for initializing the visible bias.
- initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights.
- initial_visible_bias ('AUTO', scalar or numpy array [1,input dim]) – Initial visible bias.
- initial_hidden_bias ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden bias.
- initial_sigma ('AUTO', scalar or numpy array [1, input_dim]) – Initial standard deviation for the model.
- initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible mean values.
- initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden mean values.
- dtype (numpy.float32, numpy.float64, or numpy.longdouble) – Used data type.
-
probability_h_given_v
(v, beta=None)[source]¶ Calculates the conditional probabilities h given v.
Parameters: - v (numpy array [batch size, input dim]) – Visible states / data.
- beta (float or numpy array [batch size, 1]) – Allows sampling from a given inverse temperature beta or, if a vector is given, from different betas simultaneously.
Returns: Conditional probabilities h given v.
Return type: numpy array [batch size, output dim]
-
sample_h
(h, beta=None, use_base_model=False)[source]¶ Samples the hidden variables from the conditional probabilities h given v.
Parameters: - h (numpy array [batch size, output dim]) – Conditional probabilities of h given v.
- beta (None) – Dummy variable; the sampling in other unit types, e.g. Gaussian-binary RBMs, is affected by beta.
- use_base_model (bool) – If True, uses the base model, i.e. the MLE of the bias values. (Dummy in this case.)
Returns: States for h.
Return type: numpy array [batch size, output dim]
-
GaussianRectVarianceRBM¶
-
class
pydeep.rbm.model.
GaussianRectVarianceRBM
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets=0.0, initial_hidden_offsets=0.0, dtype=<type 'numpy.float64'>)[source]¶ Implementation of a Restricted Boltzmann machine with Gaussian visible units having trainable variances and noisy rectified hidden units.
-
__init__
(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets=0.0, initial_hidden_offsets=0.0, dtype=<type 'numpy.float64'>)[source]¶ This function initializes all necessary parameters and data structures. See comments for automatically chosen values.
Parameters: - number_visibles (int) – Number of the visible variables.
- number_hiddens (int) – Number of the hidden variables.
- data (None or numpy array [num samples, input dim] or List of numpy arrays [num samples, input dim]) – The training data for initializing the visible bias.
- initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights.
- initial_visible_bias ('AUTO', scalar or numpy array [1,input dim]) – Initial visible bias.
- initial_hidden_bias ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden bias.
- initial_sigma ('AUTO', scalar or numpy array [1, input_dim]) – Initial standard deviation for the model.
- initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible mean values.
- initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden mean values.
- dtype (numpy.float32, numpy.float64, or numpy.longdouble) – Used data type.
-
_calculate_sigma_gradient
(v, h)[source]¶ This function calculates the gradient for the variance of the RBM.
Parameters: - v (numpy arrays [batchsize, input dim]) – States of the visible variables.
- h (numpy arrays [batchsize, output dim]) – Probabilities of the hidden variables.
Returns: Sigma gradient.
Return type: list of numpy arrays [input dim,1]
-
calculate_gradients
(v, h)[source]¶ This function calculates all gradients of this RBM and returns them as an ordered array. This keeps the flexibility of adding parameters which will be updated by the training algorithms.
Parameters: - v (numpy arrays [batchsize, input dim]) – States of the visible variables.
- h (numpy arrays [batchsize, output dim]) – Probabilities of the hidden variables.
Returns: Gradients for all parameters.
Return type: numpy arrays (num parameters x [parameter.shape])
-
sampler¶
This module provides different sampling algorithms for RBMs running on the CPU. The structure is kept modular to simplify the understanding of the code and the mathematics. In addition, the modularity helps to create other kinds of sampling algorithms by inheritance.
Implemented: |
|
---|---|
Info: | For the derivations see https://www.ini.rub.de/PEOPLE/wiskott/Reprints/Melchior-2012-MasterThesis-RBMs.pdf |
Version: | 1.1.0 |
Date: | 04.04.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
GibbsSampler¶
-
class
pydeep.rbm.sampler.
GibbsSampler
(model)[source]¶ Implementation of k-step Gibbs-sampling for bipartite graphs.
-
__init__
(model)[source]¶ Initializes the sampler with the model.
Parameters: model (Valid model class, e.g. BinaryBinaryRBM.) – The model to sample from.
-
sample
(vis_states, k=1, betas=None, ret_states=True)[source]¶ Performs k steps Gibbs-sampling starting from given visible data.
Parameters: - vis_states (numpy array [num samples, input dimension]) – The initial visible states to sample from.
- k (int) – The number of Gibbs sampling steps.
- betas (None, float, numpy array [num_betas, 1]) – Inverse temperature(s) to sample from (for energy-based models).
- ret_states (bool) – If False returns the visible probabilities instead of the states.
Returns: The visible samples of the Markov chains.
Return type: numpy array [num samples, input dimension]
-
sample_from_h
(hid_states, k=1, betas=None, ret_states=True)[source]¶ Performs k steps Gibbs-sampling starting from given hidden states.
Parameters: - hid_states (numpy array [num samples, output dimension]) – The initial hidden states to sample from.
- k (int) – The number of Gibbs sampling steps.
- betas (None, float, numpy array [num_betas, 1]) – Inverse temperature(s) to sample from (for energy-based models).
- ret_states (bool) – If False returns the visible probabilities instead of the states.
Returns: The visible samples of the Markov chains.
Return type: numpy array [num samples, input dimension]
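The k-step alternation between sampling h given v and v given h that the Gibbs sampler performs can be sketched for a plain binary-binary RBM as follows (names and shapes are illustrative; the library's sampler additionally handles offsets, betas, and base-model options):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_sample(W, bv, bh, v, k=1, rng=None):
    """k-step Gibbs sampling for a binary-binary RBM (illustrative sketch).
    W: [input dim, output dim], bv: [1, input dim], bh: [1, output dim]."""
    rng = np.random.default_rng(0) if rng is None else rng
    for _ in range(k):
        # sample h ~ P(h|v), then v ~ P(v|h)
        ph = sigmoid(v @ W + bh)
        h = (ph > rng.random(ph.shape)).astype(v.dtype)
        pv = sigmoid(h @ W.T + bv)
        v = (pv > rng.random(pv.shape)).astype(v.dtype)
    return v

rng = np.random.default_rng(42)
W = rng.normal(0.0, 0.01, (6, 4))
bv = np.zeros((1, 6))
bh = np.zeros((1, 4))
v0 = rng.integers(0, 2, (5, 6)).astype(np.float64)
vk = gibbs_sample(W, bv, bh, v0, k=3, rng=rng)
```

`sample_from_h` works analogously but starts the chain at the hidden layer.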
-
PersistentGibbsSampler¶
-
class
pydeep.rbm.sampler.
PersistentGibbsSampler
(model, num_chains)[source]¶ Implementation of k-step persistent Gibbs sampling.
-
__init__
(model, num_chains)[source]¶ Initializes the sampler with the model.
Parameters: - model (Valid model class.) – The model to sample from.
- num_chains (int) – The number of Markov chains. Note: Optimal performance is achieved if the number of samples and the number of chains equal the batch_size.
-
sample
(num_samples, k=1, betas=None, ret_states=True)[source]¶ Performs k steps persistent Gibbs-sampling.
Parameters: - num_samples (int, numpy array) – The number of samples to generate. .. Note:: Optimal performance is achieved if the number of samples and the number of chains equal the batch_size.
- k (int) – The number of Gibbs sampling steps.
- betas (None, float, numpy array [num_betas,1]) – Inverse temperature to sample from.(energy based models)
- ret_states (bool) – If False returns the visible probabilities instead of the states.
Returns: The visible samples of the Markov chains.
Return type: numpy array [num samples, input dimension]
-
ParallelTemperingSampler¶
-
class
pydeep.rbm.sampler.
ParallelTemperingSampler
(model, num_chains=3, betas=None)[source]¶ Implementation of k-step parallel tempering sampling.
-
__init__
(model, num_chains=3, betas=None)[source]¶ Initializes the sampler with the model.
Parameters: - model (Valid model Class.) – The model to sample from.
- num_chains (int) – The number of Markov chains.
- betas (numpy array [num_chains] or None) – Array of inverse temperatures to sample from; its dimensionality needs to equal the number of chains. If None, the inverse temperatures are initialized linearly from 0.0 to 1.0 in 'num_chains' steps.
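The linear initialization used when betas is None can be reproduced with numpy.linspace (illustrative sketch):

```python
import numpy as np

# With betas=None, inverse temperatures are spaced linearly over [0, 1].
num_chains = 3
betas = np.linspace(0.0, 1.0, num_chains)
# beta = 1.0 is the model distribution; beta = 0.0 is the "hottest",
# most freely mixing chain.
```

With three chains this gives the temperatures 0.0, 0.5, and 1.0.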
-
classmethod
_swap_chains
(chains, hid_states, model, betas)[source]¶ Swaps the samples between the Markov chains according to the Metropolis Hastings Ratio.
Parameters: - chains ([num samples, input dimension]) – Chains with visible data.
- hid_states ([num samples, output dimension]) – Hidden states.
- model (Valid RBM Class.) – The model to sample from.
- betas (numpy array [num_chains] or None) – Array of inverse temperatures to sample from; its dimensionality needs to equal the number of chains. If None, the inverse temperatures are initialized linearly from 0.0 to 1.0 in 'num_chains' steps.
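The swap decision between two neighbouring chains follows the usual Metropolis-Hastings ratio for tempered distributions. A minimal sketch with scalar energies (the helper name is illustrative; the library version operates on whole batches of states):

```python
import numpy as np

def swap_probability(energy_i, energy_j, beta_i, beta_j):
    """Metropolis-Hastings acceptance probability for swapping the states
    of two tempered chains with energies E_i, E_j at inverse temperatures
    beta_i, beta_j (illustrative sketch of the ratio used by _swap_chains)."""
    return min(1.0, float(np.exp((beta_i - beta_j) * (energy_i - energy_j))))
```

A swap is always accepted when it moves the lower-energy state to the colder (higher beta) chain; otherwise it is accepted with exponentially decaying probability.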
-
sample
(num_samples, k=1, ret_states=True)[source]¶ Performs k steps parallel tempering sampling.
Parameters: - num_samples (int, numpy array) – The number of samples to generate. .. Note:: Optimal performance is achieved if the number of samples and the number of chains equal the batch_size.
- k (int) – The number of Gibbs sampling steps.
- ret_states (bool) – If False returns the visible probabilities instead of the states.
Returns: The visible samples of the Markov chains.
Return type: numpy array [num samples, input dimension]
-
IndependentParallelTemperingSampler¶
-
class
pydeep.rbm.sampler.
IndependentParallelTemperingSampler
(model, num_samples, num_chains=3, betas=None)[source]¶ Implementation of k-step independent parallel tempering sampling. IPT runs a PT instance for each sample in parallel. This speeds up the sampling but also decreases the mixing rate.
-
__init__
(model, num_samples, num_chains=3, betas=None)[source]¶ Initializes the sampler with the model.
Parameters: - model (Valid model Class.) – The model to sample from.
- num_samples (int) – The number of samples to generate. Note: Optimal performance (ATLAS, MKL) is achieved if the number of samples equals the batch size.
- num_chains (int) – The number of Markov chains.
- betas (numpy array [num_chains] or None) – Array of inverse temperatures to sample from; its dimensionality needs to equal the number of chains. If None, the inverse temperatures are initialized linearly from 0.0 to 1.0 in 'num_chains' steps.
-
classmethod
_swap_chains
(chains, num_chains, hid_states, model, betas)[source]¶ Swaps the samples between the Markov chains according to the Metropolis Hastings Ratio.
Parameters: - chains ([num samples*num_chains, input dimension]) – Chains with visible data.
- hid_states ([num samples*num_chains, output dimension]) – Hidden states.
- model (Valid RBM Class.) – The model to sample from.
- betas (numpy array [num_chains] or None) – Array of inverse temperatures to sample from; its dimensionality needs to equal the number of chains. If None, the inverse temperatures are initialized linearly from 0.0 to 1.0 in 'num_chains' steps.
-
sample
(num_samples='AUTO', k=1, ret_states=True)[source]¶ Performs k steps independent parallel tempering sampling.
Parameters: - num_samples (int or 'AUTO') – The number of samples to generate. .. Note:: Optimal performance is achieved if the number of samples and the number of chains equal the batch_size. -> AUTO
- k (int) – The number of Gibbs sampling steps.
- ret_states (bool) – If False returns the visible probabilities instead of the states.
Returns: The visible samples of the Markov chains.
Return type: numpy array [num samples, input dimension]
-
trainer¶
This module provides different types of training algorithms for RBMs running on the CPU. The structure is kept modular to simplify the understanding of the code and the mathematics. In addition, the modularity helps to create other kinds of training algorithms by inheritance.
Implemented: |
|
---|---|
Info: | For the derivations see https://www.ini.rub.de/PEOPLE/wiskott/Reprints/Melchior-2012-MasterThesis-RBMs.pdf |
Version: | 1.1.0 |
Date: | 04.04.2017 |
Author: | Jan Melchior |
Contact: | |
License: | Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. |
CD¶
-
class
pydeep.rbm.trainer.
CD
(model, data=None)[source]¶ Implementation of the training algorithm Contrastive Divergence (CD).
Reference: A Fast Learning Algorithm for Deep Belief Nets, Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh, Department of Computer Science, University of Toronto / National University of Singapore.
-
__init__
(model, data=None)[source]¶ The constructor initializes the CD trainer with a given model and data.
Parameters: - model (Valid model class.) – The model to sample from.
- data (numpy array [num. samples x input dim]) – Data for initialization, only has effect if the centered gradient is used.
-
_adapt_gradient
(pos_gradients, neg_gradients, batch_size, epsilon, momentum, reg_l1norm, reg_l2norm, reg_sparseness, desired_sparseness, mean_hidden_activity, visible_offsets, hidden_offsets, use_centered_gradient, restrict_gradient, restriction_norm)[source]¶ This function updates the parameter gradients.
Parameters: - pos_gradients (numpy array[parameter index, parameter shape]) – Positive Gradients.
- neg_gradients (numpy array[parameter index, parameter shape]) – Negative Gradients.
- batch_size (float) – The batch_size of the data.
- epsilon (numpy array[num parameters]) – The learning rate.
- momentum (numpy array[num parameters]) – The momentum term.
- reg_l1norm (float) – The parameter for the L1 regularization.
- reg_l2norm (float) – The parameter for the L2 regularization, also known as weight decay.
- reg_sparseness (None or float) – The parameter for the desired_sparseness regularization.
- desired_sparseness (None or float) – Desired average hidden activation or None for no regularization.
- mean_hidden_activity (numpy array [num samples]) – Average hidden activation <P(h_i=1|x)>_h_i
- visible_offsets (float) – If not zero the gradient is centered around this value.
- hidden_offsets (float) – If not zero the gradient is centered around this value.
- use_centered_gradient (bool) – Uses the centered gradient instead of centering.
- restrict_gradient (None, float) – If a scalar is given the norm of the weight gradient (along the input dim) is restricted to stay below this value.
- restriction_norm (string, 'Cols','Rows', 'Mat') – Restricts the column norm, row norm or Matrix norm.
-
classmethod
_calculate_centered_gradient
(gradients, visible_offsets, hidden_offsets)[source]¶ Calculates the centered gradient from the normal CD gradient for the parameters W, bv, bh and the corresponding offset values.
Parameters: - gradients (List of 2D numpy arrays) – Original gradients.
- visible_offsets (numpy array[1,input dim]) – Visible offsets to be used.
- hidden_offsets (numpy array[1,output dim]) – Hidden offsets to be used.
Returns: Enhanced gradients for all parameters.
Return type: numpy arrays (num parameters x [parameter.shape])
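A minimal NumPy sketch of this transformation, assuming the standard centering identities from the thesis referenced above (variable names are illustrative, and this covers only the W, bv, bh parameters rather than the method's exact implementation):

```python
import numpy as np

def centered_gradients(grad_w, grad_bv, grad_bh, mu, lam):
    """Turn plain CD gradients into centered gradients given visible
    offsets mu [1, input dim] and hidden offsets lam [1, output dim].
    grad_w: [input dim, output dim], grad_bv: [1, input dim],
    grad_bh: [1, output dim]."""
    # Weight gradient in the centered parameterization:
    # <(v - mu)^T (h - lam)>_data - <...>_model expands to the line below.
    grad_w_c = grad_w - mu.T @ grad_bh - grad_bv.T @ lam
    # Bias gradients compensate for the shifted weight gradient.
    grad_bv_c = grad_bv - lam @ grad_w_c.T
    grad_bh_c = grad_bh - mu @ grad_w_c
    return grad_w_c, grad_bv_c, grad_bh_c

gw = np.eye(2)
gbv = np.array([[1.0, 2.0]])
gbh = np.array([[3.0, 4.0]])
mu = np.array([[0.5, 0.5]])
lam = np.array([[0.25, 0.25]])
gw_c, gbv_c, gbh_c = centered_gradients(gw, gbv, gbh, mu, lam)
```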
-
_train
(data, epsilon, k, momentum, reg_l1norm, reg_l2norm, reg_sparseness, desired_sparseness, update_visible_offsets, update_hidden_offsets, offset_typ, use_centered_gradient, restrict_gradient, restriction_norm, use_hidden_states)[source]¶ The training for one batch is performed using Contrastive Divergence (CD) for k sampling steps.
Parameters: - data (numpy array [batch_size, input dimension]) – The data used for training.
- epsilon (scalar or numpy array[num parameters] or numpy array[num parameters, parameter shape]) – The learning rate.
- k (int) – Number of sampling steps.
- momentum (scalar or numpy array[num parameters] or numpy array[num parameters, parameter shape]) – The momentum term.
- reg_l1norm (float) – The parameter for the L1 regularization.
- reg_l2norm (float) – The parameter for the L2 regularization, also known as weight decay.
- reg_sparseness (None or float) – The parameter for the desired_sparseness regularization.
- desired_sparseness (None or float) – Desired average hidden activation or None for no regularization.
- update_visible_offsets (float) – The update step size for the models visible offsets.
- update_hidden_offsets (float) – The update step size for the models hidden offsets.
- offset_typ (string) – Different offsets can be used to center the gradient. Example: 'DM' uses the positive phase visible mean and the negative phase hidden mean; 'A0' uses the average of positive and negative phase means for the visible units and zero for the hidden units. Possible values are from {A,D,M,0}x{A,D,M,0}.
- use_centered_gradient (bool) – Uses the centered gradient instead of centering.
- restrict_gradient (None, float) – If a scalar is given the norm of the weight gradient (along the input dim) is restricted to stay below this value.
- restriction_norm (string, 'Cols','Rows', 'Mat') – Restricts the column norm, row norm or Matrix norm.
- use_hidden_states (bool) – If True, the hidden states are used for the gradient calculations, the hidden probabilities otherwise.
-
train
(data, num_epochs=1, epsilon=0.01, k=1, momentum=0.0, reg_l1norm=0.0, reg_l2norm=0.0, reg_sparseness=0.0, desired_sparseness=None, update_visible_offsets=0.01, update_hidden_offsets=0.01, offset_typ='DD', use_centered_gradient=False, restrict_gradient=False, restriction_norm='Mat', use_hidden_states=False)[source]¶ Train the models with all batches using Contrastive Divergence (CD) for k sampling steps.
Parameters: - data (numpy array [batch_size, input dimension]) – The data used for training.
- num_epochs (int) – Number of epochs (loops through the data).
- epsilon (scalar or numpy array[num parameters] or numpy array[num parameters, parameter shape]) – The learning rate.
- k (int) – Number of sampling steps.
- momentum (scalar or numpy array[num parameters] or numpy array[num parameters, parameter shape]) – The momentum term.
- reg_l1norm (float) – The parameter for the L1 regularization.
- reg_l2norm (float) – The parameter for the L2 regularization, also known as weight decay.
- reg_sparseness (None or float) – The parameter for the desired_sparseness regularization.
- desired_sparseness (None or float) – Desired average hidden activation or None for no regularization.
- update_visible_offsets (float) – The update step size for the models visible offsets.
- update_hidden_offsets (float) – The update step size for the models hidden offsets.
- offset_typ (string) – Different offsets can be used to center the gradient. Example: 'DM' uses the positive phase visible mean and the negative phase hidden mean; 'A0' uses the average of positive and negative phase means for the visible units and zero for the hidden units. Possible values are from {A,D,M,0}x{A,D,M,0}.
- use_centered_gradient (bool) – Uses the centered gradient instead of centering.
- restrict_gradient (None, float) – If a scalar is given the norm of the weight gradient (along the input dim) is restricted to stay below this value.
- restriction_norm (string, 'Cols','Rows', 'Mat') – Restricts the column norm, row norm or Matrix norm.
- use_hidden_states (bool) – If True, the hidden states are used for the gradient calculations, the hidden probabilities otherwise.
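For a plain binary-binary RBM, the per-batch CD update with k=1 can be sketched as follows (an illustrative sketch without momentum, regularization, offsets, or gradient restriction; `cd1_update` is not a PyDeep function):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, bv, bh, batch, epsilon=0.01, rng=None):
    """One CD-1 parameter update for an uncentered binary-binary RBM.
    W: [input dim, output dim], bv: [1, input dim], bh: [1, output dim]."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Positive phase: hidden probabilities and sampled states for the data.
    ph0 = sigmoid(batch @ W + bh)
    h0 = (ph0 > rng.random(ph0.shape)).astype(batch.dtype)
    # Negative phase: one Gibbs step back to the visibles and up again.
    pv1 = sigmoid(h0 @ W.T + bv)
    v1 = (pv1 > rng.random(pv1.shape)).astype(batch.dtype)
    ph1 = sigmoid(v1 @ W + bh)
    # Gradient = positive statistics minus negative statistics.
    n = batch.shape[0]
    W += epsilon * (batch.T @ ph0 - v1.T @ ph1) / n
    bv += epsilon * (batch - v1).mean(axis=0, keepdims=True)
    bh += epsilon * (ph0 - ph1).mean(axis=0, keepdims=True)
    return W, bv, bh

rng = np.random.default_rng(7)
batch = rng.integers(0, 2, (8, 4)).astype(np.float64)
W = rng.normal(0.0, 0.01, (4, 3))
bv = np.zeros((1, 4))
bh = np.zeros((1, 3))
W, bv, bh = cd1_update(W, bv, bh, batch, rng=rng)
```

The train method above additionally loops this over all batches and epochs and applies the momentum, regularization, centering, and gradient-restriction options listed in its parameters.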
-
PCD¶
-
class
pydeep.rbm.trainer.
PCD
(model, num_chains, data=None)[source]¶ Implementation of the training algorithm Persistent Contrastive Divergence (PCD).
Reference: Training Restricted Boltzmann Machines using Approximations to the Likelihood Gradient, Tijmen Tieleman, Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3G4, Canada.
-
__init__
(model, num_chains, data=None)[source]¶ The constructor initializes the PCD trainer with a given model and data.
Parameters: - model (Valid model class.) – The model to sample from.
- num_chains (int) – The number of chains that should be used. Note: You should use the data's batch size.
- data (numpy array [num. samples x input dim]) – Data for initialization, only has effect if the centered gradient is used.
-
PT¶
-
class
pydeep.rbm.trainer.
PT
(model, betas=3, data=None)[source]¶ Implementation of the training algorithm Parallel Tempering Contrastive Divergence (PT).
Reference: Parallel Tempering for Training of Restricted Boltzmann Machines, Guillaume Desjardins, Aaron Courville, Yoshua Bengio, Pascal Vincent, Olivier Delalleau, Dept. IRO, Universite de Montreal, P.O. Box 6128, Succ. Centre-Ville, Montreal, H3C 3J7, Qc, Canada.
-
__init__
(model, betas=3, data=None)[source]¶ The constructor initializes the PT trainer with a given model and data.
Parameters: - model (Valid model class.) – The model to sample from.
- betas (int, numpy array [num betas]) – List of inverse temperatures to sample from. If a scalar is given, the temperatures will be set linearly from 0.0 to 1.0 in ‘betas’ steps.
- data (numpy array [num. samples x input dim]) – Data for initialization, only has effect if the centered gradient is used.
-
IPT¶
-
class
pydeep.rbm.trainer.
IPT
(model, num_samples, betas=3, data=None)[source]¶ Implementation of the training algorithm Independent Parallel Tempering Contrastive Divergence (IPT). As in normal PT, but the chain swaps are performed only from one batch to the next instead of from one sample to the next.
Reference: Parallel Tempering for Training of Restricted Boltzmann Machines, Guillaume Desjardins, Aaron Courville, Yoshua Bengio, Pascal Vincent, Olivier Delalleau, Dept. IRO, Universite de Montreal, P.O. Box 6128, Succ. Centre-Ville, Montreal, H3C 3J7, Qc, Canada.
-
__init__
(model, num_samples, betas=3, data=None)[source]¶ The constructor initializes the IPT trainer with a given model and data.
Parameters: - model (Valid model class.) – The model to sample from.
- num_samples (int) – The number of samples to produce. Note: You should use the batch size.
- betas (int, numpy array [num betas]) – List of inverse temperatures to sample from. If a scalar is given, the temperatures will be set linearly from 0.0 to 1.0 in ‘betas’ steps.
- data (numpy array [num. samples x input dim]) – Data for initialization, only has effect if the centered gradient is used.
-
GD¶
-
class
pydeep.rbm.trainer.
GD
(model, data=None)[source]¶ Implementation of the training algorithm Gradient descent. Since it involves the calculation of the partition function for each update, it is only possible for small BBRBMs.
-
__init__
(model, data=None)[source]¶ The constructor initializes the Gradient trainer with a given model.
Parameters: - model (Valid model class.) – The model to sample from.
- data (numpy array [num. samples x input dim]) – Data for initialization, only has effect if the centered gradient is used.
-
_train
(data, epsilon, k, momentum, reg_l1norm, reg_l2norm, reg_sparseness, desired_sparseness, update_visible_offsets, update_hidden_offsets, offset_typ, use_centered_gradient, restrict_gradient, restriction_norm, use_hidden_states)[source]¶ The training for one batch is performed using True Gradient (GD) for k Gibbs-sampling steps.
Parameters: - data (numpy array [batch_size, input dimension]) – The data used for training.
- epsilon (scalar or numpy array[num parameters] or numpy array[num parameters, parameter shape]) – The learning rate.
- k (int) – Number of sampling steps.
- momentum (scalar or numpy array[num parameters] or numpy array[num parameters, parameter shape]) – The momentum term.
- reg_l1norm (float) – The parameter for the L1 regularization.
- reg_l2norm (float) – The parameter for the L2 regularization, also known as weight decay.
- reg_sparseness (None or float) – The parameter for the desired_sparseness regularization.
- desired_sparseness (None or float) – Desired average hidden activation or None for no regularization.
- update_visible_offsets (float) – The update step size for the models visible offsets.
- update_hidden_offsets (float) – The update step size for the models hidden offsets.
- offset_typ (string) – Different offsets can be used to center the gradient. Example: 'DM' uses the positive phase visible mean and the negative phase hidden mean; 'A0' uses the average of positive and negative phase means for the visible units and zero for the hidden units. Possible values are from {A,D,M,0}x{A,D,M,0}.
- use_centered_gradient (bool) – Uses the centered gradient instead of centering.
- restrict_gradient (None, float) – If a scalar is given the norm of the weight gradient (along the input dim) is restricted to stay below this value.
- restriction_norm (string, 'Cols','Rows', 'Mat') – Restricts the column norm, row norm or Matrix norm.
- use_hidden_states (bool) – If True, the hidden states are used for the gradient calculations, the hidden probabilities otherwise.
-