Documentation¶

API documentation for PyDeep.

pydeep¶

Root package directory containing all subpackages og the library.

Version:	1.1.0
Date:	19.03.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

ae¶

Module initializer includes all sub-modules for the autoencoder module.

Version:	1.0
Date:	21.01.2018
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2018 Jan Melchior This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

model¶

This module provides a general implementation of a 3 layer tied weights Auto-encoder (x-h-y). The code is focused on readability and clearness, while keeping the efficiency and flexibility high. Several activation functions are available for visible and hidden units which can be mixed arbitrarily. The code can easily be adapted to AEs without tied weights. For deep AEs the FFN code can be adapted.

Implemented:	AE - Auto-encoder (centered) DAE - Denoising Auto-encoder (centered) SAE - Sparse Auto-encoder (centered) CAE - Contractive Auto-encoder (centered) SLAE - Slow Auto-encoder (centered)
Info:	http://ufldl.stanford.edu/wiki/index.php/Sparse_Coding:_Autoencoder_Interpretation
Version:	1.0
Date:	08.02.2016
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2016 Jan Melchior This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

AutoEncoder¶

class pydeep.ae.model.AutoEncoder(number_visibles, number_hiddens, data=None, visible_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, hidden_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, cost_function=<class 'pydeep.base.costfunction.CrossEntropyError'>, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

Class for a 3 Layer Auto-encoder (x-h-y) with tied weights.

_AutoEncoder__get_sparse_penalty_gradient_part(h, desired_sparseness)¶

This function computes the desired part of the gradient for the sparse penalty term. Only used for efficiency.

Parameters:

h: hidden activations: -type: numpy array [num samples, input dim]
desired_sparseness: Desired average hidden activation.: -type: float

Returs:

The computed gradient part is returned

-type: numpy array [1, hidden dim]

__init__(number_visibles, number_hiddens, data=None, visible_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, hidden_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, cost_function=<class 'pydeep.base.costfunction.CrossEntropyError'>, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.

Parameters:

number_visibles: Number of the visible variables.

-type: int

number_hiddens Number of hidden variables.

-type: int

data: The training data for parameter

initialization if ‘AUTO’ is chosen.

-type: None or: numpy array [num samples, input dim] or List of numpy arrays [num samples, input dim]

visible_activation_function: A non linear transformation function

for the visible units (default: Sigmoid)

-type: Subclass of ActivationFunction()

hidden_activation_function: A non linear transformation function

for the hidden units (default: Sigmoid)

-type: Subclass of ActivationFunction

cost_function A cost function (default: CrossEntropyError())

-type: subclass of FNNCostFunction()

initial_weights: Initial weights.’AUTO’ is random

-type: ‘AUTO’, scalar or: numpy array [input dim, output_dim]

initial_visible_bias: Initial visible bias.

‘AUTO’ is random ‘INVERSE_SIGMOID’ is the inverse Sigmoid of

the visilbe mean

-type: ‘AUTO’,’INVERSE_SIGMOID’, scalar or: numpy array [1, input dim]

initial_hidden_bias: Initial hidden bias.

‘AUTO’ is random ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean

-type: ‘AUTO’,’INVERSE_SIGMOID’, scalar or: numpy array [1, output_dim]

initial_visible_offsets: Initial visible mean values.

AUTO=data mean or 0.5 if not data is given.

-type: ‘AUTO’, scalar or: numpy array [1, input dim]

initial_hidden_offsets: Initial hidden mean values.

AUTO = 0.5

-type: ‘AUTO’, scalar or: numpy array [1, output_dim]

dtype: Used data type i.e. numpy.float64

-type: numpy.float32 or numpy.float64 or: numpy.longdouble

_decode(h)[source]¶

The function propagates the activation of the hidden: layer reverse through the network to the input layer.

Parameters:

h: Output of the network: -type: numpy array [num samples, hidden dim]

Returns:

Input of the network.

-type: array [num samples, input dim]

_encode(x)[source]¶

The function propagates the activation of the input: layer through the network to the hidden/output layer.

Parameters:

x: Input of the network.: -type: numpy array [num samples, input dim]

Returns:

Pre and Post synaptic output.

-type: List of arrays [num samples, hidden dim]

_get_contractive_penalty(a_h, factor)[source]¶

Calculates contractive penalty cost for a data point x.

Parameters:

a_h: Pre-synaptic activation of h: a_h = (Wx+c).: -type: numpy array [num samples, hidden dim]
factor: Influence factor (lambda) for the penalty.: -type: float

Returns:

Contractive penalty costs for x.

-type: numpy array [num samples]

_get_contractive_penalty_gradient(x, a_h, df_a_h)[source]¶

This function computes the gradient for the contractive penalty term.

Parameters:

x: Training data.: -type: numpy array [num samples, input dim]
a_h: Untransformed hidden activations: -type: numpy array [num samples, input dim]
df_a_h: Derivative of untransformed hidden activations: -type: numpy array [num samples, input dim]

Returs:

The computed gradient is returned

-type: numpy array [input dim, hidden dim]

_get_gradients(x, a_h, h, a_y, y, reg_contractive, reg_sparseness, desired_sparseness, reg_slowness, x_next, a_h_next, h_next)[source]¶

Computes the gradients of weights, visible and the hidden bias. Depending on whether contractive penalty and or sparse penalty is used the gradient changes.

Parameters:

x: Training data.: -type: numpy array [num samples, input dim]
a_h: Pre-synaptic activation of h: a_h = (Wx+c).: -type: numpy array [num samples, output dim]
h Post-synaptic activation of h: h = f(a_h).: -type: numpy array [num samples, output dim]
a_y: Pre-synaptic activation of y: a_y = (Wh+b).: -type: numpy array [num samples, input dim]
y Post-synaptic activation of y: y = f(a_y).: -type: numpy array [num samples, input dim]
reg_contractive: Contractive influence factor (lambda).: -type: float
reg_sparseness: Sparseness influence factor (lambda).: -type: float
desired_sparseness: Desired average hidden activation.: -type: float
reg_slowness: Slowness influence factor.: -type: float
x_next: Next Training data in Sequence.: -type: numpy array [num samples, input dim]
a_h_next: Next pre-synaptic activation of h: a_h = (Wx+c).: -type: numpy array [num samples, output dim]
h_next Next post-synaptic activation of h: h = f(a_h).: -type: numpy array [num samples, input dim]

_get_slowness_penalty(h, h_next, factor)[source]¶

Calculates slowness penalty cost for a data point x.: Warning

Different penalties are used depending on the hidden activation function.

Parameters:

h: hidden activation.: -type: numpy array [num samples, hidden dim]
h_next: hidden activation of the next data point in a sequence.: -type: numpy array [num samples, hidden dim]
factor: Influence factor (beta) for the penalty.: -type: float

Returns:

Sparseness penalty costs for x.

-type: numpy array [num samples]

_get_slowness_penalty_gradient(x, x_next, h, h_next, df_a_h, df_a_h_next)[source]¶

This function computes the gradient for the slowness penalty term.

Parameters:

x: Training data.: -type: numpy array [num samples, input dim]
x_next: Next training data points in Sequence.: -type: numpy array [num samples, input dim]
h: Corresponding hidden activations.: -type: numpy array [num samples, output dim]
h_next: Corresponding next hidden activations.: -type: numpy array [num samples, output dim]
df_a_h: Derivative of untransformed hidden activations.: -type: numpy array [num samples, input dim]
df_a_h_next: Derivative of untransformed next hidden activations.: -type: numpy array [num samples, input dim]

Returs:

The computed gradient is returned

-type: numpy array [input dim, hidden dim]

_get_sparse_penalty(h, factor, desired_sparseness)[source]¶

Calculates sparseness penalty cost for a data point x.: Warning

Different penalties are used depending on the hidden activation function.

Parameters:

h: hidden activation.: -type: numpy array [num samples, hidden dim]
factor: Influence factor (beta) for the penalty.: -type: float
desired_sparseness: Desired average hidden activation.: -type: float

Returns:

Sparseness penalty costs for x.

-type: numpy array [num samples]

_get_sparse_penalty_gradient(h, df_a_h, desired_sparseness)[source]¶

This function computes the gradient for the sparse penalty term.

Parameters:

h: hidden activations: -type: numpy array [num samples, input dim]
df_a_h: Derivative of untransformed hidden activations: -type: numpy array [num samples, input dim]
desired_sparseness: Desired average hidden activation.: -type: float

Returs:

The computed gradient part is returned

-type: numpy array [1, hidden dim]

decode(h)[source]¶

The function propagates the activation of the hidden: layer reverse through the network to the input layer.

Parameters:

h: Output of the network: -type: numpy array [num samples, hidden dim]

Returns:

Pre and Post synaptic input.

-type: List of arrays [num samples, input dim]

encode(x)[source]¶

The function propagates the activation of the input: layer through the network to the hidden/output layer.

Parameters:

x: Input of the network.: -type: numpy array [num samples, input dim]

Returns:

Output of the network.

-type: array [num samples, hidden dim]

energy(x, contractive_penalty=0.0, sparse_penalty=0.0, desired_sparseness=0.01, x_next=None, slowness_penalty=0.0)[source]¶

Calculates the energy/cost for a data point x.

Parameters:

x: Data points.: -type: numpy array [num samples, input dim]
contractive_penalty: If a value > 0.0 is given the cost is also: calculated on the contractive penalty.

-type: float
sparse_penalty: If a value > 0.0 is given the cost is also: calculated on the sparseness penalty.

-type: float
desired_sparseness: Desired average hidden activation.: -type: float
x_next: Next data points.: -type: None or numpy array [num samples, input dim]
slowness_penalty: If a value > 0.0 is given the cost is also: calculated on the slowness penalty.

-type: float

Returns:

Costs for x.

-type: numpy array [num samples]

finit_differences(data, delta, reg_sparseness, desired_sparseness, reg_contractive, reg_slowness, data_next)[source]¶

Finite differences test for AEs. The finite differences test involves all functions of the model except init and reconstruction_error

data: The training data: -type: numpy array [num samples, input dim]
delta: The learning rate.: -type: numpy array[num parameters]
reg_sparseness: The parameter (epsilon) for the sparseness regularization.: -type: float
desired_sparseness: Desired average hidden activation.: -type: float
reg_contractive: The parameter (epsilon) for the contractive regularization.: -type: float
reg_slowness: The parameter (epsilon) for the slowness regularization.: -type: float
data_next: The next training data in the sequence.: -type: numpy array [num samples, input dim]

reconstruction_error(x, absolut=False)[source]¶

Calculates the reconstruction error for given training data.

Parameters:

x: Datapoints: -type: numpy array [num samples, input dim]
absolut: If true the absolute error is caluclated.: -type: bool

Returns:

Reconstruction error.

-type: List of arrays [num samples, 1]

sae¶

Helper class for stacked auto encoder networks.

Version:	1.1.0
Date:	21.01.2018
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2018 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

SAE¶

class pydeep.ae.sae.SAE(list_of_autoencoders)[source]¶

Stack of auto encoders.

__init__(list_of_autoencoders)[source]¶

Initializes the network with auto encoders.

Parameters:	list_of_autoencoders (list) – List of auto-encoders

backward_propagate(output_data)[source]¶

Propagates the output back through the input.

Parameters:	output_data (numpy array [batchsize x output dim]) – Output data.
Returns:	Input of the network.
Return type:	numpy array [batchsize x input dim]

forward_propagate(input_data)[source]¶

Propagates the data through the network.

Parameters:	input_data (numpy array [batchsize x input dim]) – Input data.
Returns:	Output of the network.
Return type:	numpy array [batchsize x output dim]

trainer¶

This module provides implementations for training different variants of Auto-encoders, modifications on standard gradient decent are provided (centering, denoising, dropout, sparseness, contractiveness, slowness L1-decay, L2-decay, momentum, gradient restriction)

Implemented:	GDTrainer
Info:	http://ufldl.stanford.edu/wiki/index.php/Sparse_Coding:_Autoencoder_Interpretation
Version:	1.0
Date:	21.01.2018
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2018 Jan Melchior This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

GDTrainer¶

class pydeep.ae.trainer.GDTrainer(model)[source]¶

Auto encoder trainer using gradient descent.

__init__(model)[source]¶

The constructor takes the model as input

Parameters:	model: An auto-encoder object which should be trained. -type: AutoEncoder

_train(data, epsilon, momentum, update_visible_offsets, update_hidden_offsets, corruptor, reg_L1Norm, reg_L2Norm, reg_sparseness, desired_sparseness, reg_contractive, reg_slowness, data_next, restrict_gradient, restriction_norm)[source]¶

The training for one batch is performed using gradient descent.

Parameters:

data: The training data: -type: numpy array [num samples, input dim]
epsilon: The learning rate.: -type: numpy array[num parameters]
momentum: The momentum term.: -type: numpy array[num parameters]
update_visible_offsets: The update step size for the models: visible offsets. Good value if functionality is used: 0.001

-type: float
update_hidden_offsets: The update step size for the models hidden: offsets. Good value if functionality is used: 0.001

-type: float
corruptor: Defines if and how the data gets corrupted.: (e.g. Gauss noise, dropout, Max out)

-type: corruptor
reg_L1Norm: The parameter for the L1 regularization: -type: float
reg_L2Norm: The parameter for the L2 regularization,: also know as weight decay.

-type: float
reg_sparseness: The parameter (epsilon) for the sparseness regularization.: -type: float
desired_sparseness: Desired average hidden activation.: -type: float
reg_contractive: The parameter (epsilon) for the contractive regularization.: -type: float
reg_slowness: The parameter (epsilon) for the slowness regularization.: -type: float
data_next: The next training data in the sequence.: -type: numpy array [num samples, input dim]
restrict_gradient: If a scalar is given the norm of the: weight gradient is restricted to stay below this value.

-type: None, float
restriction_norm: restricts the column norm, row norm or: Matrix norm.

-type: string: ‘Cols’,’Rows’, ‘Mat’

train(data, num_epochs=1, epsilon=0.1, momentum=0.0, update_visible_offsets=0.0, update_hidden_offsets=0.0, corruptor=None, reg_L1Norm=0.0, reg_L2Norm=0.0, reg_sparseness=0.0, desired_sparseness=0.01, reg_contractive=0.0, reg_slowness=0.0, data_next=None, restrict_gradient=False, restriction_norm='Mat')[source]¶

The training for one batch is performed using gradient descent.

Parameters:

data: The data used for training.

-type: list of numpy arrays: [num samples input dimension]

num_epochs: Number of epochs to train.

-type: int

epsilon: The learning rate.

-type: numpy array[num parameters]

momentum: The momentum term.

-type: numpy array[num parameters]

update_visible_offsets: The update step size for the models

visible offsets. Good value if functionality is used: 0.001

-type: float

update_hidden_offsets: The update step size for the models hidden

offsets. Good value if functionality is used: 0.001

-type: float

corruptor: Defines if and how the data gets corrupted.

-type: corruptor

reg_L1Norm: The parameter for the L1 regularization

-type: float

reg_L2Norm: The parameter for the L2 regularization,

also know as weight decay. -type: float

reg_sparseness: The parameter (epsilon) for the sparseness regularization.

-type: float

desired_sparseness: Desired average hidden activation.

-type: float

reg_contractive: The parameter (epsilon) for the contractive regularization.

-type: float

reg_slowness: The parameter (epsilon) for the slowness regularization.

-type: float

data_next: The next training data in the sequence.

-type: numpy array [num samples, input dim]

restrict_gradient: If a scalar is given the norm of the

weight gradient is restricted to stay below this value.

-type: None, float

restriction_norm: restricts the column norm, row norm or

Matrix norm.

-type: string: ‘Cols’,’Rows’, ‘Mat’

base¶

Package providing basic/fundamental functions/structures such as cost-functions, activation-functions, preprocessing …

Version:	1.1.0
Date:	13.03.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

activationfunction¶

Different kind of non linear activation functions and their derivatives.

Implemented:

# Unbounded

# Linear

Identity

# Piecewise-linear

Rectifier
RestrictedRectifier (hard bounded)
LeakyRectifier

# Soft-linear

ExponentialLinear
SigmoidWeightedLinear
SoftPlus

# Bounded

# Step

Step

# Soft-Step

Sigmoid
SoftSign
HyperbolicTangent
SoftMax
K-Winner takes all

# Symmetric, periodic

Radial Basis function
Sinus

Info:	http://en.wikipedia.org/wiki/Activation_function
Version:	1.1.1
Date:	16.01.2018
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2018 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

Identity¶

class pydeep.base.activationfunction.Identity[source]¶

Identity function.

Info:	http://www.wolframalpha.com/input/?i=line

classmethod ddf(x)[source]¶

Calculates the second derivative of the identity function value for a given input x.

Parameters:	x (scalar or numpy array.) – Inout data.
Returns:	Value of the second derivative of the identity function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod df(x)[source]¶

Calculates the derivative of the identity function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the derivative of the identity function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod dg(y)[source]¶

Calculates the derivative of the inverse identity function value for a given input y.

Parameters:	y (scalar or numpy array.) – Input data.
Returns:	Value of the derivative of the inverse identity function for y.
Return type:	scalar or numpy array with the same shape as y.

classmethod f(x)[source]¶

Calculates the identity function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the identity function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod g(y)[source]¶

Calculates the inverse identity function value for a given input y.

Parameters:	y (scalar or numpy array.) – Input data.
Returns:	Value of the inverse identity function for y.
Return type:	scalar or numpy array with the same shape as y.

Rectifier¶

class pydeep.base.activationfunction.Rectifier[source]¶

Rectifier activation function function.

Info:	http://www.wolframalpha.com/input/?i=max%280%2Cx%29&dataset=&asynchronous=false&equal=Submit

classmethod ddf(x)[source]¶

Calculates the second derivative of the Rectifier function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the 2nd derivative of the Rectifier function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod df(x)[source]¶

Calculates the derivative of the Rectifier function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the derivative of the Rectifier function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod f(x)[source]¶

Calculates the Rectifier function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the Rectifier function for x.
Return type:	scalar or numpy array with the same shape as x.

RestrictedRectifier¶

class pydeep.base.activationfunction.RestrictedRectifier(restriction=1.0)[source]¶

Restricted Rectifier activation function function.

Info:	http://www.wolframalpha.com/input/?i=max%280%2Cx%29&dataset=&asynchronous=false&equal=Submit

__init__(restriction=1.0)[source]¶

Constructor.

Parameters:	restriction (float.) – Restriction value / upper limit value.

df(x)[source]¶

Calculates the derivative of the Restricted Rectifier function value for a given input x.

Parameters:	x (scalar or numpy array) – Input data.
Returns:	Value of the derivative of the Restricted Rectifier function for x.
Return type:	scalar or numpy array with the same shape as x.

f(x)[source]¶

Calculates the Restricted Rectifier function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the Restricted Rectifier function for x.
Return type:	scalar or numpy array with the same shape as x.

LeakyRectifier¶

class pydeep.base.activationfunction.LeakyRectifier(negativeSlope=0.01, positiveSlope=1.0)[source]¶

Leaky Rectifier activation function function.

Info:	https://en.wikipedia.org/wiki/Activation_function

__init__(negativeSlope=0.01, positiveSlope=1.0)[source]¶

Constructor.

Parameters:	negativeSlope (scalar) – Slope when x < 0 positiveSlope (scalar) – Slope when x >= 0

df(x)[source]¶

Calculates the derivative of the Leaky Rectifier function value for a given input x.

Parameters:	x (scalar or numpy array) – Input data.
Returns:	Value of the derivative of the Leaky Rectifier function for x.
Return type:	scalar or numpy array with the same shape as x.

f(x)[source]¶

Calculates the Leaky Rectifier function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the Leaky Rectifier function for x.
Return type:	scalar or numpy array with the same shape as x.

ExponentialLinear¶

class pydeep.base.activationfunction.ExponentialLinear(alpha=1.0)[source]¶

Exponential Linear activation function function.

Info:	https://en.wikipedia.org/wiki/Activation_function

__init__(alpha=1.0)[source]¶

Constructor.

Parameters:	alpha (scalar) – scaling factor

df(x)[source]¶

Calculates the derivative of the Exponential Linear function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the derivative of the Exponential Linear function for x.
Return type:	scalar or numpy array with the same shape as x.

f(x)[source]¶

Calculates the Exponential Linear function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the Exponential Linear function for x.
Return type:	scalar or numpy array with the same shape as x.

SigmoidWeightedLinear¶

class pydeep.base.activationfunction.SigmoidWeightedLinear(beta=1.0)[source]¶

Sigmoid weighted linear units (also named Swish)

Info:	https://arxiv.org/pdf/1702.03118v1.pdf and for Swish: https://arxiv.org/pdf/1710.05941.pdf

__init__(beta=1.0)[source]¶

Constructor.

Parameters:	beta (scalar) – scaling factor

df(x)[source]¶

Calculates the derivative of the Sigmoid weighted linear function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the derivative of the Sigmoid weighted linear function for x.
Return type:	scalar or numpy array with the same shape as x.

f(x)[source]¶

Calculates the Sigmoid weighted linear function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the Sigmoid weighted linear function for x.
Return type:	scalar or numpy array with the same shape as x.

SoftPlus¶

class pydeep.base.activationfunction.SoftPlus[source]¶

Soft Plus function.

Info:	http://www.wolframalpha.com/input/?i=log%28exp%28x%29%2B1%29

classmethod ddf(x)[source]¶

Calculates the second derivative of the SoftPlus function value for a given input x.

Parameters:	x (scalar or numpy array) – Input data.
Returns:	Value of the 2nd derivative of the SoftPlus function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod df(x)[source]¶

Calculates the derivative of the SoftPlus function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the derivative of the SoftPlus function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod dg(y)[source]¶

Calculates the derivative of the inverse SoftPlus function value for a given input y.

Parameters:	y (scalar or numpy array.) – Input data.
Returns:	Value of the derivative of the inverse SoftPlus function for x.
Return type:	scalar or numpy array with the same shape as y.

classmethod f(x)[source]¶

Calculates the SoftPlus function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the SoftPlus function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod g(y)[source]¶

Calculates the inverse SoftPlus function value for a given input y.

Parameters:	y (scalar or numpy array.) – Input data.
Returns:	Value of the inverse SoftPlus function for y.
Return type:	scalar or numpy array with the same shape as y.

Step¶

class pydeep.base.activationfunction.Step[source]¶

Step activation function function.

classmethod ddf(x)[source]¶

Calculates the second derivative of the step function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the derivative of the Step function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod df(x)[source]¶

Calculates the derivative of the step function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the derivative of the step function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod f(x)[source]¶

Calculates the step function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the step function for x.
Return type:	scalar or numpy array with the same shape as x.

Sigmoid¶

class pydeep.base.activationfunction.Sigmoid[source]¶

Sigmoid function.

Info:	http://www.wolframalpha.com/input/?i=sigmoid

classmethod ddf(x)[source]¶

Calculates the second derivative of the Sigmoid function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the second derivative of the Sigmoid function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod df(x)[source]¶

Calculates the derivative of the Sigmoid function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the derivative of the Sigmoid function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod dg(y)[source]¶

Calculates the derivative of the inverse Sigmoid function value for a given input y.

Parameters:	y (scalar or numpy array.) – Input data.
Returns:	Value of the derivative of the inverse Sigmoid function for y.
Return type:	scalar or numpy array with the same shape as y.

classmethod f(x)[source]¶

Calculates the Sigmoid function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the Sigmoid function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod g(y)[source]¶

Calculates the inverse Sigmoid function value for a given input y.

Parameters:	y (scalar or numpy array.) – Input data.
Returns:	Value of the inverse Sigmoid function for y.
Return type:	scalar or numpy array with the same shape as y.

SoftSign¶

class pydeep.base.activationfunction.SoftSign[source]¶

SoftSign function.

Info:	http://www.wolframalpha.com/input/?i=x%2F%281%2Babs%28x%29%29

classmethod ddf(x)[source]¶

Calculates the second derivative of the SoftSign function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the 2nd derivative of the SoftSign function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod df(x)[source]¶

Calculates the derivative of the SoftSign function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the SoftSign function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod f(x)[source]¶

Calculates the SoftSign function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the SoftSign function for x.
Return type:	scalar or numpy array with the same shape as x.

HyperbolicTangent¶

class pydeep.base.activationfunction.HyperbolicTangent[source]¶

HyperbolicTangent function.

Info:	http://www.wolframalpha.com/input/?i=tanh

classmethod ddf(x)[source]¶

Calculates the second derivative of the Hyperbolic Tangent function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the second derivative of the Hyperbolic Tangent function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod df(x)[source]¶

Calculates the derivative of the Hyperbolic Tangent function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the derivative of the Hyperbolic Tangent function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod dg(y)[source]¶

Calculates the derivative of the inverse Hyperbolic Tangent function value for a given input y.

Parameters:	y (scalar or numpy array.) – Input data.
Returns:	Value the derivative of the inverse Hyperbolic Tangent function for x.
Return type:	scalar or numpy array with the same shape as y.

classmethod f(x)[source]¶

Calculates the Hyperbolic Tangent function value for a given input x.

Parameters:	x (scalar or numpy array.) – Input data.
Returns:	Value of the Hyperbolic Tangent function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod g(y)[source]¶

Calculates the inverse Hyperbolic Tangent function value for a given input y.

Parameters:	y (scalar or numpy array.) – Input data.
Returns:	alue of the inverse Hyperbolic Tangent function for y.
Return type:	scalar or numpy array with the same shape as x.

SoftMax¶

class pydeep.base.activationfunction.SoftMax[source]¶

Soft Max function.

Info:	https://en.wikipedia.org/wiki/Activation_function

classmethod df(x)[source]¶

Calculates the derivative of the SoftMax function value for a given input x.

Parameters:	x (scalar or numpy array) – Input data.
Returns:	Value of the derivative of the SoftMax function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod f(x)[source]¶

Calculates the function value of the SoftMax function value for a given input x.

Parameters:	x (scalar or numpy array) – Input data.
Returns:	Value of the SoftMax function for x.
Return type:	scalar or numpy array with the same shape as x.

RadialBasis¶

class pydeep.base.activationfunction.RadialBasis(mean=0.0, variance=1.0)[source]¶

Radial Basis function.

Info:	http://www.wolframalpha.com/input/?i=Gaussian

__init__(mean=0.0, variance=1.0)[source]¶

Constructor.

Parameters:	mean (scalar or numpy array) – Mean of the function. variance (scalar or numpy array) – Variance of the function.

ddf(x)[source]¶

Calculates the second derivative of the Radial Basis function value for a given input x.

Parameters:	x (scalar or numpy array) – Input data.
Returns:	Value of the second derivative of the Radial Basis function for x.
Return type:	scalar or numpy array with the same shape as x.

df(x)[source]¶

Calculates the derivative of the Radial Basis function value for a given input x.

Parameters:	x (scalar or numpy array) – Input data.
Returns:	Value of the derivative of the Radial Basis function for x.
Return type:	scalar or numpy array with the same shape as x.

f(x)[source]¶

Calculates the Radial Basis function value for a given input x.

Parameters:	x (scalar or numpy array) – Input data.
Returns:	Value of the Radial Basis function for x.
Return type:	scalar or numpy array with the same shape as x.

Sinus¶

class pydeep.base.activationfunction.Sinus[source]¶

Sinus function.

Info:	http://www.wolframalpha.com/input/?i=sin(x)

classmethod ddf(x)[source]¶

Calculates the second derivative of the Sinus function value for a given input x.

Parameters:	x (scalar or numpy array) – Input data.
Returns:	Value of the second derivative of the Sinus function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod df(x)[source]¶

Calculates the derivative of the Sinus function value for a given input x.

Parameters:	x (scalar or numpy array) – Input data.
Returns:	Value of the derivative of the Sinus function for x.
Return type:	scalar or numpy array with the same shape as x.

classmethod f(x)[source]¶

Calculates the function value of the Sinus function value for a given input x.

Parameters:	x (scalar or numpy array) – Input data.
Returns:	Value of the Sinus function for x.
Return type:	scalar or numpy array with the same shape as x.

KWinnerTakeAll¶

class pydeep.base.activationfunction.KWinnerTakeAll(k, axis=1, activation_function=<pydeep.base.activationfunction.Identity object>)[source]¶

K Winner take all activation function.

WARNING:	The derivative gets already calcluated in the forward pass. Thus, for the same data-point the order should always be forward_pass, backward_pass!

__init__(k, axis=1, activation_function=<pydeep.base.activationfunction.Identity object>)[source]¶

Constructor.

Parameters:	k (Instance of an activation function) – Number of active units. axis (int) – Axis to compute the maximum. k – activation_function

df(x)[source]¶

Calculates the derivative of the KWTA function.

Parameters:	x (scalar or numpy array) – Input data.
Returns:	Derivative of the KWTA function
Return type:	scalar or numpy array with the same shape as x.

f(x)[source]¶

Calculates the K-max function value for a given input x.

Parameters:	x (scalar or numpy array) – Input data.
Returns:	Value of the Kmax function for x.
Return type:	scalar or numpy array with the same shape as x.

basicstructure¶

This module provides basic structural elements, which different models have in common.

Implemented:	BipartiteGraph StackOfBipartiteGraphs
Version:	1.1.0
Date:	06.04.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

BipartiteGraph¶

class pydeep.base.basicstructure.BipartiteGraph(number_visibles, number_hiddens, data=None, visible_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, hidden_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

Implementation of a bipartite graph structure.

__init__(number_visibles, number_hiddens, data=None, visible_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, hidden_activation_function=<class 'pydeep.base.activationfunction.Sigmoid'>, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.

Parameters:

number_visibles (int) – Number of the visible variables.
number_hiddens (int) – Number of the hidden variables.
data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
visible_activation_function (pydeep.base.activationFunction) – Activation function for the visible units.
hidden_activation_function (pydeep.base.activationFunction) – Activation function for the hidden units.
initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visible mean. If a scalar is passed all values are initialized with it.
initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it
initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5 If a scalar is passed all values are initialized with it.
dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type i.e. numpy.float64.

_add_hidden_units(num_new_hiddens, position=0, initial_weights='AUTO', initial_bias='AUTO', initial_offsets='AUTO')[source]¶

This function adds new hidden units at the given position to the model. .. Warning:: If the parameters are changed. the trainer needs to be reinitialized.

Parameters:

num_new_hiddens (int) – The number of new hidden units to add.
position (int) – Position where the units should be added.
initial_weights ('AUTO' or scalar or numpy array [input_dim, num_new_hiddens]) – The initial weight values for the hidden units.
initial_bias ('AUTO' or scalar or numpy array [1, num_new_hiddens]) – The initial hidden bias values.
initial_offsets ('AUTO' or scalar or numpy array [1, num_new_hiddens]) – The initial hidden mean values.

_add_visible_units(num_new_visibles, position=0, initial_weights='AUTO', initial_bias='AUTO', initial_offsets='AUTO', data=None)[source]¶

This function adds new visible units at the given position to the model.: Warning

If the parameters are changed. the trainer needs to be reinitialized.

Parameters:

num_new_visibles (int) – The number of new hidden units to add
position (int) – Position where the units should be added.
initial_weights ('AUTO' or scalar or numpy array [num_new_visibles, output_dim]) – The initial weight values for the hidden units.
initial_bias (numpy array [1, num_new_visibles]) – The initial hidden bias values.
initial_offsets (numpy array [1, num_new_visibles]) – The initial visible offset values.
data (numpy array [num datapoints, num_new_visibles]) – Data for AUTO initialization.

_hidden_post_activation(pre_act_h)[source]¶

Computes the Hidden (post) activations from hidden pre-activations.

Parameters:	pre_act_h (numpy array [num data points, output_dim]) – Hidden pre-activations.
Returns:	Hidden activations.
Return type:	numpy array [num data points, output_dim]

_hidden_pre_activation(v)[source]¶

Computes the Hidden pre-activations from visible activations.

Parameters:	v (numpy array [num data points, input_dim]) – Visible activations.
Returns:	Hidden pre-synaptic activations.
Return type:	numpy array [num data points, output_dim]

_remove_hidden_units(indices)[source]¶

This function removes the hidden units whose indices are given. .. Warning:: If the parameters are changed. the trainer needs to be reinitialized.

Parameters:	indices (int or list of int or numpy array of int) – Indices to remove.

_remove_visible_units(indices)[source]¶

This function removes the visible units whose indices are given.: Warning

If the parameters are changed. the trainer needs to be reinitialized.

Parameters:	indices (int or list of int or numpy array of int) – Indices of units to be remove.

_visible_post_activation(pre_act_v)[source]¶

Computes the visible (post) activations from visible pre-activations.

Parameters:	pre_act_v (numpy array [num data points, input_dim]) – Visible pre-activations.
Returns:	Visible activations.
Return type:	numpy array [num data points, input_dim]

_visible_pre_activation(h)[source]¶

Computes the visible pre-activations from hidden activations.

Parameters:	h (numpy array [num data points, output_dim]) – Hidden activations.
Returns:	Visible pre-synaptic activations.
Return type:	numpy array [num data points, input_dim]

get_parameters()[source]¶

This function returns all model parameters in a list.

Returns:	The parameter references in a list.
Return type:	list

hidden_activation(v)[source]¶

Computes the Hidden (post) activations from visible activations.

Parameters:	v (numpy array [num data points, input_dim]) – Visible activations.
Returns:	Hidden activations.
Return type:	numpy array [num data points, output_dim]

update_offsets(new_visible_offsets=0.0, new_hidden_offsets=0.0, update_visible_offsets=1.0, update_hidden_offsets=1.0)[source]¶

This function updates the visible and hidden offsets. | –> update_offsets(0,0,1,1) reparameterizes to the normal binary RBM.

Parameters:	new_visible_offsets (numpy arrays [1, input dim]) – New visible means. new_hidden_offsets (numpy arrays [1, output dim]) – New hidden means. update_visible_offsets (float) – Update/Shifting factor for the visible means. update_hidden_offsets (float) – Update/Shifting factor for the hidden means.

update_parameters(updates)[source]¶

This function updates all parameters given the updates derived by the training methods.

Parameters:	updates (list of numpy arrays (num para. x [para.shape])) – Parameter gradients.

visible_activation(h)[source]¶

Computes the visible (post) activations from hidden activations.

Parameters:	h (numpy array [num data points, output_dim]) – Hidden activations.
Returns:	Visible activations.
Return type:	numpy array [num data points, input_dim]

StackOfBipartiteGraphs¶

class pydeep.base.basicstructure.StackOfBipartiteGraphs(list_of_layers)[source]¶

Stacked network layers

__init__(list_of_layers)[source]¶

Initializes the network with auto encoders.

Parameters:	list_of_layers (list) – List of Layers i.e. BipartiteGraph.

_check_network()[source]¶: Check whether the network is consistent and raise an exception if it is not the case.

append_layer(layer)[source]¶

Appends the model to the network.

Parameters:	layer (Layer object i.e. BipartiteGraph.) – Layer object.

backward_propagate(output_data)[source]¶

Propagates the output back through the input.

Parameters:	output_data (numpy array [batchsize x output dim]) – Output data.
Returns:	Input of the network.
Return type:	numpy array [batchsize x input dim]

depth¶: Networks depth/ number of layers.

forward_propagate(input_data)[source]¶

Propagates the data through the network.

Parameters:	input_data (numpy array [batchsize x input dim]) – Input data.
Returns:	Output of the network.
Return type:	numpy array [batchsize x output dim]

num_layers¶: Networks depth/ number of layers.

pop_last_layer()[source]¶: Removes/pops the last layer in the network.

reconstruct(input_data)[source]¶

Reconstructs the data by propagating the data to the output and back to the input.

Parameters:	input_data (numpy array [batchsize x input dim]) – Input data.
Returns:	Output of the network.
Return type:	numpy array [batchsize x output dim]

save(path, save_states=False)[source]¶

Saves the network.

Parameters:	path (string.) – Filename+path. save_states (bool) – If true the current states are saved.

corruptor¶

This module provides implementations for corrupting the training data.

Implemented:	Identity Sampling Binary BinaryNoise Additive Gauss Noise Multiplicative Gauss Noise Dropout Random Permutation KeepKWinner KWinnerTakesAll
Info:	http://ufldl.stanford.edu/wiki/index.php/Sparse_Coding:_Autoencoder_Interpretation
Version:	1.1.0
Date:	13.03.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

Identity¶

class pydeep.base.corruptor.Identity[source]¶

Dummy corruptor object.

classmethod corrupt(data)[source]¶

The function corrupts the data.

Parameters:	data (numpy array [num samples, layer dim]) – Input of the layer.
Returns:	Corrupted data.
Return type:	numpy array [num samples, layer dim]

AdditiveGaussNoise¶

class pydeep.base.corruptor.AdditiveGaussNoise(mean, std)[source]¶

An object that corrupts data by adding Gauss noise.

__init__(mean, std)[source]¶

The function corrupts the data.

Parameters:	mean (float) – Constant the data is shifted std (float) – Standard deviation Added to the data.

corrupt(data)[source]¶

The function corrupts the data.

Parameters:	data (numpy array [num samples, layer dim]) – Input of the layer.
Returns:	Corrupted data.
Return type:	numpy array [num samples, layer dim]

MultiGaussNoise¶

class pydeep.base.corruptor.MultiGaussNoise(mean, std)[source]¶

An object that corrupts data by multiplying Gauss noise.

__init__(mean, std)[source]¶

Corruptor contructor.

Parameters:	mean (float) – Constant the data is shifted std (float) – Standard deviation Added to the data.

corrupt(data)[source]¶

The function corrupts the data.

Parameters:	data (numpy array [num samples, layer dim]) – Input of the layer.
Returns:	Corrupted data.
Return type:	numpy array [num samples, layer dim]

SamplingBinary¶

class pydeep.base.corruptor.SamplingBinary[source]¶

Sample binary states (zero out) corruption.

classmethod corrupt(data)[source]¶

The function corrupts the data.

Parameters:	data (numpy array [num samples, layer dim]) – Input of the layer.
Returns:	Corrupted data.
Return type:	numpy array [num samples, layer dim]

Dropout¶

class pydeep.base.corruptor.Dropout(dropout_percentage=0.2)[source]¶

Dropout (zero out) corruption.

__init__(dropout_percentage=0.2)[source]¶

Corruptor contructor.

Parameters:	dropout_percentage (float) – Dropout percentage

corrupt(data)[source]¶

The function corrupts the data.

Parameters:	data (numpy array [num samples, layer dim]) – Input of the layer.
Returns:	Corrupted data.
Return type:	numpy array [num samples, layer dim]

RandomPermutation¶

class pydeep.base.corruptor.RandomPermutation(permutation_percentage=0.2)[source]¶

RandomPermutation corruption, a fix number of units change their activation values.

__init__(permutation_percentage=0.2)[source]¶

Corruptor contructor.

Parameters:	permutation_percentage (float) – permutation_percentage: Percentage of states to permute

corrupt(data)[source]¶

The function corrupts the data.

Parameters:	data (numpy array [num samples, layer dim]) – Input of the layer.
Returns:	Corrupted data.
Return type:	numpy array [num samples, layer dim]

KeepKWinner¶

class pydeep.base.corruptor.KeepKWinner(k=10, axis=0)[source]¶

Implements K Winner stay. Keep the k max values and set the rest to 0.

__init__(k=10, axis=0)[source]¶

Corruptor contructor.

Parameters:	k (int) – Keep the k max values and set the rest to 0. axis (int) – Axis =0 across min batch, axis = 1 across hidden units

corrupt(data)[source]¶

The function corrupts the data.

Parameters:	data (numpy array [num samples, layer dim]) – Input of the layer.
Returns:	Corrupted data.
Return type:	numpy array [num samples, layer dim]

KWinnerTakesAll¶

class pydeep.base.corruptor.KWinnerTakesAll(k=10, axis=0)[source]¶

Implements K Winner takes all. Keep the k max values and set the rest to 0.

__init__(k=10, axis=0)[source]¶

Corruptor constructor.

Parameters:	k (int) – Keep the k max values and set the rest to 0. axis (int) – Axis =0 across min batch, axis = 1 across hidden units

corrupt(data)[source]¶

The function corrupts the data.

Parameters:	data (numpy array [num samples, layer dim]) – Input of the layer.
Returns:	Corrupted data.
Return type:	numpy array [num samples, layer dim]

costfunction¶

Different kind of cost functions and their derivatives.

Implemented:	Squared error Absolute error Cross entropy Negative Log-likelihood
Version:	1.1.0
Date:	13.03.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

SquaredError¶

class pydeep.base.costfunction.SquaredError[source]¶

Mean Squared error.

classmethod df(x, t)[source]¶

Calculates the derivative of the Squared Error value for a given input x and target t.

Parameters:	x (scalar or numpy array) – Input data. t (scalar or numpy array) – Target vales.
Returns:	Value of the derivative of the cost function for x and t.
Return type:	scalar or numpy array with the same shape as x and t.

classmethod f(x, t)[source]¶

Calculates the Squared Error value for a given input x and target t.

Parameters:	x (scalar or numpy array) – Input data. t (scalar or numpy array) – Target vales
Returns:	Value of the cost function for x and t.
Return type:	scalar or numpy array with the same shape as x and t.

AbsoluteError¶

class pydeep.base.costfunction.AbsoluteError[source]¶

Absolute error.

classmethod df(x, t)[source]¶

Calculates the derivative of the absolute error value for a given input x and target t.

Parameters:	x (scalar or numpy array) – Input data. t (scalar or numpy array) – Target vales.
Returns:	Value of the derivative of the cost function for x and t.
Return type:	scalar or numpy array with the same shape as x and t.

classmethod f(x, t)[source]¶

Calculates the absolute error value for a given input x and target t.

Parameters:	x (scalar or numpy array) – Input data. t (scalar or numpy array) – Target vales
Returns:	Value of the cost function for x and t.
Return type:	scalar or numpy array with the same shape as x and t.

CrossEntropyError¶

class pydeep.base.costfunction.CrossEntropyError[source]¶

Cross entropy functions.

classmethod df(x, t)[source]¶

Calculates the derivative of the cross entropy value for a given input x and target t.

Parameters:	x (scalar or numpy array) – Input data. t (scalar or numpy array) – Target vales.
Returns:	Value of the derivative of the cost function for x and t.
Return type:	scalar or numpy array with the same shape as x and t.

classmethod f(x, t)[source]¶

Calculates the cross entropy value for a given input x and target t.

Parameters:	x (scalar or numpy array) – Input data. t (scalar or numpy array) – Target vales
Returns:	Value of the cost function for x and t.
Return type:	scalar or numpy array with the same shape as x and t.

NegLogLikelihood¶

class pydeep.base.costfunction.NegLogLikelihood[source]¶

Negative log likelihood function.

classmethod df(x, t)[source]¶

Calculates the derivative of the negative log-likelihood value for a given input x and target t.

Parameters:	x (scalar or numpy array) – Input data. t (scalar or numpy array) – Target vales.
Returns:	Value of the derivative of the cost function for x and t.
Return type:	scalar or numpy array with the same shape as x and t.

classmethod f(x, t)[source]¶

Calculates the negative log-likelihood value for a given input x and target t.

Parameters:	x (scalar or numpy array) – Input data. t (scalar or numpy array) – Target vales
Returns:	Value of the cost function for x and t.
Return type:	scalar or numpy array with the same shape as x and t.

numpyextension¶

This module provides different math functions that extend the numpy library.

Implemented:	log_sum_exp log_diff_exp get_norms multinominal_batch_sampling restrict_norms resize_norms angle_between_vectors get_2D_gauss_kernel generate_binary_code get_binary_label compare_index_of_max shuffle_dataset rotationSequence generate_2D_connection_matrix
Version:	1.1.0
Date:	13.03.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

log_sum_exp¶

numpyextension.log_sum_exp(axis=0)¶

Calculates the logarithm of the sum of e to the power of input ‘x’. The method tries to avoid overflows by using the relationship: log(sum(exp(x))) = alpha + log(sum(exp(x-alpha))).

Parameters:	x (float or numpy array) – data. axis (int) – Sums along the given axis.
Returns:	Logarithm of the sum of exp of x.
Return type:	float or numpy array.

log_diff_exp¶

numpyextension.log_diff_exp(axis=0)¶

Calculates the logarithm of the diffs of e to the power of input ‘x’. The method tries to avoid overflows by using the relationship: log(diff(exp(x))) = alpha + log(diff(exp(x-alpha))).

Parameters:	x (float or numpy array) – data. axis (int) – Diffs along the given axis.
Returns:	Logarithm of the diff of exp of x.
Return type:	float or numpy array.

multinominal_batch_sampling¶

numpyextension.multinominal_batch_sampling(isnormalized=True)¶

Sample states where only one entry is one and the rest is zero according to the given probablities.

Parameters:	probabilties (numpy array [batchsize, number of states]) – Matrix containing probabilities the rows have to sum to one, otherwise chosen normalized=False. isnormalized (bool) – If True the probabilities are assumed to be normalized. If False the probabilities are normalized.
Returns:	Sampled multinominal states.
Return type:	numpy array [batchsize, number of states]

get_norms¶

numpyextension.get_norms(axis=0)¶

Computes the norms of the matrix along a given axis.

Parameters:	matrix (numpy array [num rows, num columns]) – Matrix to get the norm of. axis (int, None) – Axis along the norm should be calculated. 0 = rows, 1 = cols, None = Matrix norm
Returns:	Norms along the given axis.
Return type:	numpy array or float

restrict_norms¶

numpyextension.restrict_norms(max_norm, axis=0)¶

This function restricts a matrix, its columns or rows to a given norm.

Parameters:	matrix (numpy array [num rows, num columns]) – Matrix that should be restricted. max_norm (float) – The maximal data norm. axis (int, None) – Restriction of the matrix along the given axis or the full matrix.
Returns:	Restricted matrix
Return type:	numpy array [num rows, num columns]

resize_norms¶

numpyextension.resize_norms(norm, axis=0)¶

This function resizes a matrix, its columns or rows to a given norm.

Parameters:	matrix (numpy array [num rows, num columns]) – Matrix that should be resized. norm (float) – The norm to restrict the matrix to. axis (int, None) – Resize of the matrix along the given axis.
Returns:	Resized matrix, however it is inplace
Return type:	numpy array [num rows, num columns]

angle_between_vectors¶

numpyextension.angle_between_vectors(v2, degree=True)¶

Computes the angle between two vectors.

Parameters:	v1 (numpy array) – Vector 1. v2 (numpy array) – Vector 2. degree (bool) – If true degrees is return, rad otherwise.
Returns:	Angle
Return type:	float

get_2d_gauss_kernel¶

numpyextension.get_2d_gauss_kernel(height, shift=0, var=[1.0, 1.0])¶

Creates a 2D Gauss kernel of size NxM with variance 1.

Parameters:	width (int) – Number of pixels first dimension. height (int) – Number of pixels second dimension. shift (int, 1D numpy array) – The Gaussian is shifted by this amount from the center of the image. Passing a scalar -> x,y shifted by the same value Passing a vector -> x,y shifted accordingly var (int, 1D numpy array or 2D numpy array) – Variances or Covariance matrix. Passing a scalar -> Isotropic Gaussian Passing a vector -> Spherical covariance with vector values on the diagonals. Passing a matrix -> Full Gaussian
Returns:	Bit array containing the states.
Return type:	numpy array [num samples, bit_length]

generate_binary_code¶

numpyextension.generate_binary_code(batch_size_exp=None, batch_number=0)¶

This function can be used to generate all possible binary vectors of length ‘bit_length’. It is possible to generate only a particular batch of the data, where ‘batch_size_exp’ controls the size of the batch (batch_size = 2**batch_size_exp) and ‘batch_number’ is the index of the batch that should be generated.

Example:

bit_length = 2, batchSize = 2

-> All combination = 2^bit_length = 2^2 = 4

-> All_combinations / batchSize = 4 / 2 = 2 batches

-> _generate_bit_array(2, 2, 0) = [0,0],[0,1]

-> _generate_bit_array(2, 2, 1) = [1,0],[1,1]

Parameters:	bit_length (int) – Length of the bit vectors. batch_size_exp (int) – Size of the batch of data. Here: batch_size = 2batch_size_exp batch_number** (int) – Index of the batch.
Returns:	Bit array containing the states .
Return type:	numpy array [num samples, bit_length]

get_binary_label¶

numpyextension.get_binary_label()¶

This function converts a 1D-array with integers labels into a 2D-array containing binary labels.

Example:

-> [3,1,0]|

-> [[1,0,0,0],[0,0,1,0],[0,0,0,1]]

Parameters:	int_array (int) – 1D array containing integers
Returns:	2D array with binary labels.
Return type:	numpy array [num samples, num labels]

compare_index_of_max¶

numpyextension.compare_index_of_max(target)¶

Compares data rows by comparing the index of the maximal value e.g. Classifier output and true labels.

Example:

[0.3,0.5,0.2],[0.2,0.6,0.2] -> 0

[0.3,0.5,0.2],[0.6,0.2,0.2] -> 1

Parameters:	output (numpy array [batchsize, output_dim]) – vectors usually containing label probabilties. target (numpy array [batchsize, output_dim]) – vectors usually containing true labels.
Returns:	Int array containging 0 is the two rows hat the maximum at the same index, 1 otherwise.
Return type:	numpy array [num samples, num labels]

shuffle_dataset¶

numpyextension.shuffle_dataset(label)¶

Shuffles the data points and the labels correspondingly.

Parameters:	data (numpy array [num_datapoints, dim_datapoints]) – Datapoints. label (numpy array [num_datapoints]) – Labels.
Returns:	Shuffled datapoints and labels.
Return type:	List of numpy arrays

rotation_sequence¶

numpyextension.rotation_sequence(width, height, steps)¶

Rotates a 2D image given as a 1D vector with shape[width*height] in ‘steps’ number of steps.

Parameters:	image (int) – Image as 1D vector. width (int) – Width of the image such that image.shape[0] = widthheight. height* (int) – Height of the image such that image.shape[0] = widthheight. steps* (int) – Number of rotation steps e.g. 360 each steps is 1 degree.
Returns:	Bool array containging True is the two rows hat the maximum at the same index, False otherwise.
Return type:	numpy array [num samples, num labels]

generate_2d_connection_matrix¶

numpyextension.generate_2d_connection_matrix(input_y_dim, field_x_dim, field_y_dim, overlap_x_dim, overlap_y_dim, wrap_around=True)¶

This function constructs a connection matrix, which can be used to force the weights to have local receptive fields.

Example:

input_x_dim = 3,

input_y_dim = 3,

field_x_dim = 2,

field_y_dim = 2,

overlap_x_dim = 1,

overlap_y_dim = 1,

wrap_around=False)

leads to numx.array([[1,1,0,1,1,0,0,0,0],

[0,1,1,0,1,1,0,0,0],

[0,0,0,1,1,0,1,1,0],

[0,0,0,0,1,1,0,1,1]]).T

Parameters:	input_x_dim (int) – Input dimension. input_y_dim (int) – Output dimension. field_x_dim (int) – Size of the receptive field in dimension x. field_y_dim (int) – Size of the receptive field in dimension y. overlap_x_dim (int) – Overlap of the receptive fields in dimension x. overlap_y_dim (int) – Overlap of the receptive fields in dimension y. wrap_around (bool) – If true teh overlap has warp around in both dimensions.
Returns:	Connection matrix.
Return type:	numpy arrays [input dim, output dim]

misc¶

Package providing miscellaneous functionalities such as datsets, input-output, visualization, profiling methods …

Version:	1.1.0
Date:	19.03.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

io¶

This class contains methods to read and write data.

Implemented:	Save/Load arbitrary objects. Save/Load images. Load MNIST. Load CIFAR. Load Caltech. Load olivietti face dataset Load nactural image patches Load UCI binary dataset Adult dataset Connect4 dataset Nips dataset Web dataset RCV1 dataset Mushrooms dataset DNA dataset OCR_letters dataset
Version:	1.1.0
Date:	29.03.2018
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2018 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

save_object¶

io.save_object(path, info=True, compressed=True)¶

Saves an object to file.

Parameters:	obj (object) – object to be saved. path (string) – Path and name of the file info (bool) – Prints statements if True compressed (bool) – Object will be compressed before storage.
Returns:
Return type:

save_image¶

io.save_image(path, ext='bmp')¶

Saves a numpy array to an image file.

Parameters:	array (numpy array [width, height]) – Data to save path (string) – Path and name of the directory to save the image at. ext (string) – Extension for the image.

load_object¶

io.load_object(info=True, compressed=True)¶

Loads an object from file.

Parameters:	path (string) – Path and name of the file info (bool) – If True, prints status information. compressed (bool) –
Returns:	Loaded object
Return type:	object

load_image¶

io.load_image(grayscale=False)¶

Loads an image to numpy array.

Parameters:	path (string) – Path and name of the directory to save the image at. grayscale (bool) – If true image is converted to gray scale.
Returns:	Loaded image.
Return type:	numpy array [width, height]

download_file¶

io.download_file(path, buffer_size=1048576)¶

Downloads an saves a dataset from a given url.

Parameters:	url (string) – URL including filename (e.g. www.testpage.com/file1.zip) path (string, None) – Path the dataset should be stored including filename (e.g. /home/file1.zip). buffer_size (int) – Size of the streaming buffer in bytes.

load_mnist¶

io.load_mnist(binary=False)¶

Loads the MNIST digit data in binary [0,1] or real values [0,1].

Parameters:	path (string) – Path and name of the file to load. binary (bool) – If True returns binary images, real valued between [0,1] if False.
Returns:	MNIST dataset [train_set, train_lab, valid_set, valid_lab, test_set, test_lab]
Return type:	list of numpy arrays

load_caltech¶

io.load_caltech()¶

Loads the Caltech dataset.

Parameters:	path (string) – Path and name of the file to load.
Returns:	CAltech dataset [train_set, train_lab, valid_set, valid_lab, test_set, test_lab]
Return type:	list of numpy arrays

load_cifar¶

io.load_cifar(grayscale=True)¶

Loads the CIFAR dataset in real values [0,1]

Parameters:	path (string) – Path and name of the file to load. grayscale (bool) – If true converts the data to grayscale.
Returns:	CIFAR data and labels.
Return type:	list of numpy arrays ([# samples, 1024],[# samples])

load_natural_image_patches¶

io.load_natural_image_patches()¶

Loads the natural image patches used in the publication ‘Gaussian-binary restricted Boltzmann machines for modeling natural image statistics’.: See also

http://journals.plos.org/plosone/article/authors?id=10.1371/journal.pone.0171015

Parameters:	path (string) – Path and name of the file to load.
Returns:	Natural image dataset
Return type:	numpy array

load_olivetti_faces¶

io.load_olivetti_faces(correct_orientation=True)¶

Loads the Olivetti face dataset 400 images, size 64x64

Parameters:	path (string) – Path and name of the file to load. correct_orientation (bool) – Corrects the orientation of the images.
Returns:	Olivetti face dataset
Return type:	numpy array

measuring¶

This module provides functions for measuring like time measuring for executed code.

Version:	1.1.0
Date:	19.03.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

print_progress¶

measuring.print_progress(num_steps, gauge=False, length=50, decimal_place=1)¶

Prints the progress of a system at state ‘step’.

Parameters:	step (int) – Current step between 0 and num_steps-1. num_steps (int) – Total number of steps. gauge (bool) – If true prints a gauge length (int) – Length of the gauge (in number of chars) decimal_place (int) – Number of decimal places to display.

Stopwatch¶

class pydeep.misc.measuring.Stopwatch[source]¶

This class provides a stop watch for measuring the execution time of code.

__init__()[source]¶

Constructor sets the starting time to the current time.

Info:	Will be overwritten by calling start()!

end()[source]¶: Stops/ends the time measuring.

get_end_time()[source]¶

Returns the end time.

Returns:	End time:
Return type:	datetime

get_expected_end_time(iteration, num_iterations)[source]¶

Returns the expected end time.

Parameters:	iteration (int) – Current iteration num_iterations (int) – Total number of iterations.
Returns:	Expected end time.
Return type:	datetime

get_expected_interval(iteration, num_iterations)[source]¶

Returns the expected interval/Time needed till ending.

Parameters:	iteration (int) – Current iteration num_iterations (int) – Total number of iterations.
Returns:	Expected interval.
Return type:	timedelta

get_interval()[source]¶

Returns the current interval.

Returns:	Current interval:
Return type:	timedelta

get_start_time()[source]¶

Returns the starting time.

Returns:	Starting time:
Return type:	datetime

pause()[source]¶: Pauses the time measuring.

resume()[source]¶: Resumes the time measuring.

start()[source]¶: Sets the starting time to the current time.

update(factor=1.0)[source]¶

Updates the internal variables. | Factor can be used to sum up not regular events in a loop: | Lets assume you have a loop over 100 sets and only every 10th | step you execute a function, then use update(factor=0.1) to | measure it.

Parameters:	factor (float) – Sums up factor*current interval

sshthreadpool¶

Provides a thread/script pooling mechanism based on ssh + screen.

Version:	1.1.0
Date:	19.03.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

SSHConnection¶

class pydeep.misc.sshthreadpool.SSHConnection(hostname, username, password, max_cpus_usage=2)[source]¶

Handles a SSH connection.

__init__(hostname, username, password, max_cpus_usage=2)[source]¶

Constructor takes hostname, username, password.

Parameters:	hostname (string) – Hostname or address of host. username (string) – SSH username. password (string) – SSH password. max_cpus_usage (int) – Maximal number of cores to be used

connect()[source]¶

Connects to the server.

Returns:	turns True is the connection was sucessful
Return type:	bool

classmethod decrypt(connection, password)[source]¶

Decrypts a connection object and returns it

Parameters:	connection (string) – SSHConnection to be decrypted password (string) – Encryption password
Returns:	Decrypted object
Return type:	SSHConnection

disconnect()[source]¶: Disconnects from the server.

encrypt(password)[source]¶

Encrypts the connection object.

Parameters:	password (string) – Encryption password
Returns:	Encrypted object
Return type:	object

execute_command(command)[source]¶

Executes a command on the server and returns stdin, stdout, and stderr

Parameters:	command (string) – Command to be executed.
Returns:	stdin, stdout, and stderr
Return type:	list

execute_command_in_screen(command)[source]¶

Executes a command in a screen on the server which is automatically detached and returns stdin, stdout, and stderr. Screen closes automatically when the job is: done.

Parameters:	command (string) – Command to be executed.
Returns:	stdin, stdout, and stderr
Return type:	list

get_number_users_processes()[source]¶

Gets number of processes of the user on the server.

Returns:	number of processes
Return type:	int or None

get_number_users_screens()[source]¶

Gets number of users screens on the server.

Returns:	number of users screens on the server.
Return type:	int or None

get_server_info()[source]¶

Get the server info like number of cpus, meomory size and stores it in the corresponding variables.

Returns:	online or offline FLAG
Return type:	string

get_server_load()[source]¶

Get the current cpu and memory of the server.

Returns:	Average CPU(s) usage last 1 min, Average CPU(s) usage last 5 min, Average CPU(s) usage last 15 min, Average memory usage,
Return type:	list

kill_all_processes()[source]¶

Kills all processes.

Returns:	stdin, stdout, and stderr
Return type:	list

kill_all_screen_processes()[source]¶

Kills all acreen processes.

Returns:	stdin, stdout, and stderr
Return type:	list

renice_processes(value)[source]¶

Renices all processes.

Parameters:	value (int or string) – The New nice value -40 … 20
Returns:	stdin, stdout, and stderr
Return type:	list

SSHJob¶

class pydeep.misc.sshthreadpool.SSHJob(command, num_threads=1, nice=19)[source]¶

Handles a SSH JOB.

__init__(command, num_threads=1, nice=19)[source]¶

Saves the encrypted serverlist to path.

Parameters:	command (string) – Command to be extecuted. num_threads (int) – Number of threads the job needs. nice (int) – Nice value for this job.

SSHPool¶

class pydeep.misc.sshthreadpool.SSHPool(servers)[source]¶

Handles a pool of servers and allows to distribute jobs over the pool.

__init__(servers)[source]¶

Constructor takes a list of SSHConnections.

Parameters:	servers (list) – List of SSHConnections.

broadcast_command(command)[source]¶

Executes a command an all servers.

Parameters:	command (string) – Command to be executed
Returns:	list of all stdin, stdout, and stderr
Return type:	list

broadcast_kill_all()[source]¶

Kills all processes on the server of the corresponding user.

Returns:	list of all stdin, stdout, and stderr
Return type:	list

broadcast_kill_all_screens()[source]¶

Kills all screens on the server of the corresponding user.

Returns:	list of all stdin, stdout, and stderr
Return type:	list

distribute_jobs(jobs, status=False, ignore_load=False, sort_server=True)[source]¶

Distributes the jobs over the servers.

Parameters:	jobs (string or SSHConnection) – List of SSHJobs to be executeed on the servers. status (bool) – If true prints info about which job was started on which server. ignore_load (bool) – If true starts the job without caring about the current load. sort_server (bool) – If True Servers will be sorted by load.
Returns:	List of all started jobs and list of all remaining jobs
Return type:	list, list

execute_command(host, command)[source]¶

Executes a command on a given server servers.

Parameters:	host (string or SSHConnection) – Hostname or connection object command (string) – Command to be executed
Returns:
Return type:

execute_command_in_screen(host, command)[source]¶

Executes a command in a screen on a given server servers.

Parameters:	host (string or SSHConnection) – Hostname or connection object command (string) – Command to be executed
Returns:	list of all stdin, stdout, and stderr
Return type:	list

get_servers_info(status=True)[source]¶

Reads the status of all servers, the information is stored: in the SSHConnection objects. Additionally print to the console if status == True.

Parameters:	status (bool) – If true prints info.

get_servers_status()[source]¶

Reads the status of all servers and returns it a list. Additionally print to the console if status == True.

Returns:	list of header and list corresponding status information
Return type:	list, list

load_server(path, password, append=True)[source]¶

Parameters:	path (string) – Path and filename. password (string) – Encrption password. append (bool) – If true, servers get append to list, if false server list gets replaced.

save_server(path, password)[source]¶

Saves the encrypted serverlist to path.

Parameters:	path (string) – Path and filename password (string) – Encrption password

toyproblems¶

This class contains some example toy problems for RBMs.

Implemented:	Bars and Stripes dataset Shifting bars dataset 2D mixture of Laplacians
Version:	1.1.0
Date:	19.03.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

generate_2d_mixtures¶

toyproblems.generate_2d_mixtures(mean=0.0, scale=0.7071067811865476)¶

Creates a dataset containing 2D data points from a random mixtures of two independent Laplacian distributions.

Info:	Every sample is a 2-dimensional mixture of two sources. The sources can either be super_gauss or sub_gauss.

If x is one sample generated by mixing s, i.e. x = A*s, then the mixing_matrix is A.

Parameters:	num_samples (int) – The number of training samples. mean (float) – The mean of the two independent sources. scale (float) – The scale of the two independent sources.
Returns:	Data and mixing matrix.
Return type:	list of numpy arrays ([num samples, 2], [2,2])

generate_bars_and_stripes¶

toyproblems.generate_bars_and_stripes(num_samples)¶

Creates a dataset containing samples showing bars or stripes.

Parameters:	length (int) – Length of the bars/stripes. num_samples (int) – Number of samples
Returns:	Samples.
Return type:	numpy array [num_samples, length*length]

generate_bars_and_stripes_complete¶

toyproblems.generate_bars_and_stripes_complete()¶

Creates a dataset containing all possible samples showing bars or stripes.

Parameters:	length (int) – Length of the bars/stripes.
Returns:	Samples.
Return type:	numpy array [num_samples, length*length]

generate_shifting_bars¶

toyproblems.generate_shifting_bars(bar_length, num_samples, random=False, flipped=False)¶

Creates a dataset containing random positions of a bar of length “bar_length” in a strip of “length” dimensions.

Parameters:	length (int) – Number of dimensions bar_length (int) – Length of the bar num_samples (int) – Number of samples to generate random (bool) – If true dataset gets shuffled flipped (bool) – If true dataset gets flipped 0–>1 and 1–>0
Returns:	Samples of the shifting bars dataset.
Return type:	numpy array [samples, dimensions]

generate_shifting_bars_complete¶

toyproblems.generate_shifting_bars_complete(bar_length, random=False, flipped=False)¶

Creates a dataset containing all possible positions of a bar of length “bar_length” can take in a strip of “length” dimensions.

Parameters:	length (int) – Number of dimensions bar_length (int) – Length of the bar random (bool) – If true dataset gets shuffled flipped (bool) – If true dataset gets flipped 0–>1 and 1–>0
Returns:	Complete shifting bars dataset.
Return type:	numpy array [samples, dimensions]

visualization¶

This module provides functions for displaying and visualize data. It extends the matplotlib.pyplot.

Implemented:	Tile a matrix rows Tile a matrix columns Show a matrix Show plot Show a histogram Plot data Plot 2D weights Plot PDF-contours Show RBM parameters hidden_activation reorder_filter_by_hidden_activation generate_samples filter_frequency_and_angle filter_angle_response calculate_amari_distance Show the tuning curves Show the optimal gratings Show the frequency angle histogram
Version:	1.1.0
Date:	19.03.2017
Author:	Jan Melchior, Nan Wang
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

tile_matrix_columns¶

visualization.tile_matrix_columns(tile_width, tile_height, num_tiles_x, num_tiles_y, border_size=1, normalized=True)¶

Creates a matrix with tiles from columns.

Parameters:	matrix (numpy array 2D) – Matrix to display. tile_width (int) – Tile width dimension. tile_height (int) – Tile height dimension. num_tiles_x (int) – Number of tiles horizontal. num_tiles_y (int) – Number of tiles vertical. border_size (int) – Size of the border. normalized (bool) – If true each image gets normalized to be between 0..1.
Returns:	Matrix showing the 2D patches.
Return type:	2D numpy array

tile_matrix_rows¶

visualization.tile_matrix_rows(tile_width, tile_height, num_tiles_x, num_tiles_y, border_size=1, normalized=True)¶

Creates a matrix with tiles from rows.

Parameters:	matrix (numpy array 2D) – Matrix to display. tile_width (int) – Tile width dimension. tile_height (int) – Tile height dimension. num_tiles_x (int) – Number of tiles horizontal. num_tiles_y (int) – Number of tiles vertical. border_size (int) – Size of the border. normalized (bool) – If true each image gets normalized to be between 0..1.
Returns:	Matrix showing the 2D patches.
Return type:	2D numpy array

imshow_matrix¶

visualization.imshow_matrix(windowtitle, interpolation='nearest')¶

Displays a matrix in gray-scale.

Parameters:	matrix (numpy array) – Data to display windowtitle (string) – Figure title interpolation (string) – Interpolation style

imshow_plot¶

visualization.imshow_plot(windowtitle)¶

Plots the colums of a matrix.

Parameters:	matrix (numpy array) – Data to plot windowtitle (string) – Figure title

imshow_histogram¶

visualization.imshow_histogram(windowtitle, num_bins=10, normed=False, cumulative=False, log_scale=False)¶

Shows a image of the histogram.

Parameters:	matrix (numpy array 2D) – Data to display windowtitle (string) – Figure title num_bins (int) – Number of bins normed (bool) – If true histogram is being normed to 0..1 cumulative (bool) – Show cumulative histogram log_scale (bool) – Use logarithm Y-scaling

plot_2d_weights¶

visualization.plot_2d_weights(bias=array([[0., 0.]]), scaling_factor=1.0, color='random', bias_color='random')¶

Parameters:	weights (numpy array [2,2]) – Weight matrix (weights per column). bias (numpy array [1,2]) – Bias value. scaling_factor (float) – If not 1.0 the weights will be scaled by this factor. color (string) – Color for the weights. bias_color (string) – Color for the bias.

plot_2d_data¶

visualization.plot_2d_data(alpha=0.1, color='navy', point_size=5)¶

Plots the data into the current figure.

Parameters:	data (numpy array) – Data matrix (Datapoint x dimensions). alpha (float) – ranspary value 0.0 = invisible, 1.0 = solid. color (string (color name)) – Color for the data points. point_size (int) – Size of the data points.

plot_2d_contour¶

visualization.plot_2d_contour(value_range=[-5.0, 5.0, -5.0, 5.0], step_size=0.01, levels=20, stylev=None, colormap='jet')¶

Plots the data into the current figure.

Parameters:

probability_function (python method) – Probability function must take 2D array [number of datapoint x 2]
value_range (list with four float entries) – Min x, max x , min y, max y.
step_size (float) – Step size for evaluating the pdf.
levels (int) – Number of contour lines or array of contour height.
stylev (string or None) – None as normal contour, ‘filled’ as filled contour, ‘image’ as contour image
colormap (string) – Selected colormap .. seealso:: http://www.scipy.org/Cookbook/Matplotlib/…/Show_colormaps

imshow_standard_rbm_parameters¶

visualization.imshow_standard_rbm_parameters(v1, v2, h1, h2, whitening=None, window_title='')¶

Saves the weights and biases of a given RBM at the given location.

Parameters:

rbm (RBM object) – RBM which weights and biases should be saved.
v1 (int) – Visible bias and the single weights will be saved as an image with size
v2 (int) – Visible bias and the single weights will be saved as an image with size
h1 (int) – Hidden bias and the image containing all weights will be saved as an image with size h1 x h2.
h2 (int) – Hidden bias and the image containing all weights will be saved as an image with size h1 x h2.
whitening (preprocessing object or None) – If the data is PCA whitened it is useful to dewhiten the filters to wee the structure!
window_title (string) – Title for this rbm.

hidden_activation¶

visualization.hidden_activation(data, states=False)¶

Calculates the hidden activation.

Parameters:	rbm (RBM model object) – RBM model object. data (numpy array [num samples, dimensions]) – Data for the activation calculation. states (bool) – If True uses states rather then probabilities by rounding to 0 or 1.
Returns:	hidden activation and the mean and standard deviation over the data.
Return type:	numpy array, float, floa

reorder_filter_by_hidden_activation¶

visualization.reorder_filter_by_hidden_activation(data)¶

Reorders the weights by its activation over the data set in decreasing order.

Parameters:	rbm (RBM model object) – RBM model object. data (numpy array [num samples, dimensions]) – Data for the activation calculation.
Returns:	RBM with reordered weights.
Return type:	RBM object.

generate_samples¶

visualization.generate_samples(data, iterations, stepsize, v1, v2, sample_states=False, whitening=None)¶

Generates samples from the given RBM model.

Parameters:	rbm (RBM model object.) – RBM model. data (numpy array [num samples, dimensions]) – Data to start sampling from. iterations (int) – Number of Gibbs sampling steps. stepsize (int) – After how many steps a sample should be plotted. v1 (int) – X-Axis of the reorder image patch. v2 (int) – Y-Axis of the reorder image patch. sample_states (bool) – If true returns the sates , probabilities otherwise. whitening (preprocessing object or None) – If the data has been preprocessed it needs to be undone.
Returns:	Matrix with image patches order along X-Axis and it’s evolution in Y-Axis.
Return type:	numpy array

imshow_filter_tuning_curve¶

visualization.imshow_filter_tuning_curve(num_of_ang=40)¶

Plot the tuning curves of the filter’s changes in frequency and angles.

Parameters:	filters (numpy array) – Filters to analyze. num_of_ang (int) – Number of orientations to check.

imshow_filter_optimal_gratings¶

visualization.imshow_filter_optimal_gratings(opt_frq, opt_ang)¶

Plot the filters and corresponding optimal gating pattern.

Parameters:	filters (numpy array) – Filters to analyze. opt_frq (int) – Optimal frequencies. opt_ang (int) – Optimal frequencies.

imshow_filter_frequency_angle_histogram¶

visualization.imshow_filter_frequency_angle_histogram(opt_ang, max_wavelength=14)¶

lots the histograms of the optimal frequencies and angles.

Parameters:	opt_frq (int) – Optimal frequencies. opt_ang (int) – Optimal angle. max_wavelength (int) – Maximal wavelength.

filter_frequency_and_angle¶

visualization.filter_frequency_and_angle(num_of_angles=40)¶

Analyze the filters by calculating the responses when gratings, i.e. sinusoidal functions, are input to them.

Info:	Hyv/”arinen, A. et al. (2009) Natural image statistics, Page 144-146
Parameters:	filters (numpy array) – Filters to analyze num_of_angles (int) – Number of angles steps to check
Returns:	The optimal frequency (pixels/cycle) of the filters, the optimal orientation angle (rad) of the filters
Return type:	numpy array, numpy array

filter_frequency_response¶

visualization.filter_frequency_response(num_of_angles=40)¶

Compute the response of filters w.r.t. different frequency.

Parameters:	filters (numpy array) – Filters to analyze num_of_angles (int) – Number of angles steps to check
Returns:	Frequency response as output_dim x max_wavelength-1 index of the
Return type:	numpy array, numpy array

filter_angle_response¶

visualization.filter_angle_response(num_of_angles=40)¶

Compute the angle response of the given filter.

Parameters:	filters (numpy array) – Filters to analyze num_of_angles (int) – Number of angles steps to check
Returns:	Angle response as output_dim x num_of_ang, index of angles
Return type:	numpy array, numpy array

calculate_amari_distance¶

visualization.calculate_amari_distance(matrix_two, version=1)¶

Calculate the Amari distance between two input matrices.

Parameters:	matrix_one (numpy array) – the first matrix matrix_two (numpy array) – the second matrix version (int) – Variant to use.
Returns:	The amari distance between two input matrices.
Return type:	float

preprocessing¶

This module contains several classes for data preprocessing.

Implemented:	Standarizer Principal Component Analysis (PCA) Zero Phase Component Analysis (ZCA) Independent Component Analysis (ICA) Binarize data Rescale data Remove row means Remove column means
Version:	1.1.0
Date:	04.04.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

binarize_data¶

preprocessing.binarize_data()¶

Converts data to binary values. For data out of [a,b] a data point p will become zero if p < 0.5*(b-a) one otherwise.

Parameters:	data (numpy array [num data point, data dimension]) – Data to be binarized.
Returns:	Binarized data.
Return type:	numpy array [num data point, data dimension]

rescale_data¶

preprocessing.rescale_data(new_min=0.0, new_max=1.0)¶

Normalize the values of a matrix. e.g. [min,max] -> [new_min,new_max]

Parameters:	data (numpy array [num data point, data dimension]) – Data to be normalized. new_min (float) – New min value. new_max (float) – Rescaled data
Returns:
Return type:	numpy array [num data point, data dimension]

remove_rows_means¶

preprocessing.remove_rows_means(return_means=False)¶

Remove the individual mean of each row.

Parameters:	data (numpy array [num data point, data dimension]) – Data to be normalized return_means (bool) – If True returns also the means
Returns:	Data without row means, row means (optional).
Return type:	numpy array [num data point, data dimension], Means of the data (optional)

remove_cols_means¶

preprocessing.remove_cols_means(return_means=False)¶

Remove the individual mean of each column.

Parameters:	data (numpy array [num data point, data dimension]) – Data to be normalized return_means (bool) – If True returns also the means
Returns:	Data without column means, column means (optional).
Return type:	numpy array [num data point, data dimension], Means of the data (optional)

STANDARIZER¶

class pydeep.preprocessing.STANDARIZER(input_dim)[source]¶

Shifts the data having zero mean and scales it having unit variances along the axis.

__init__(input_dim)[source]¶

Constructor.

Parameters:	input_dim (int) – Data dimensionality.

project(data)[source]¶

Projects the data to normalized space.

Parameters:	data (numpy array [num data point, data dimension]) – Data to project.
Returns:	Projected data.
Return type:	numpy array [num data point, data dimension]

train(data)[source]¶

Training the model (full batch).

Parameters:	data (numpy array [num data point, data dimension]) – Data for training.

unproject(data)[source]¶

Projects the data back to the input space.

Parameters:	data (numpy array [num data point, data dimension]) – Data to unproject.
Returns:	Projected data.
Return type:	numpy array [num data point, data dimension]

PCA¶

class pydeep.preprocessing.PCA(input_dim, whiten=False)[source]¶

Principle component analysis (PCA) using Singular Value Decomposition (SVD)

__init__(input_dim, whiten=False)[source]¶

Constructor.

Parameters:	input_dim (int) – Data dimensionality. whiten (bool) – If true the projected data will be de-correlated in all directions.

project(data, num_components=None)[source]¶

Projects the data to Eigenspace.

Info:	projection_matrix has its projected vectors as its columns. i.e. if we project x by W into y where W is the projection_matrix, then y = W.T * x
Parameters:	data (numpy array [num data point, data dimension]) – Data to project. num_components (int or None) –
Returns:	Projected data.
Return type:	numpy array [num data point, data dimension]

train(data)[source]¶

Training the model (full batch).

Parameters:	data (numpy array [num data point, data dimension]) – data for training.

unproject(data, num_components=None)[source]¶

Projects the data from Eigenspace to normal space.

Parameters:	data (numpy array [num data point, data dimension]) – Data to be unprojected. num_components (int) – Number of components to project.
Returns:	Unprojected data.
Return type:	numpy array [num data point, num_components]

ZCA¶

class pydeep.preprocessing.ZCA(input_dim)[source]¶

Principle component analysis (PCA) using Singular Value Decomposition (SVD).

__init__(input_dim)[source]¶

Constructor.

Parameters:	input_dim (int) – Data dimensionality.

train(data)[source]¶

Training the model (full batch).

Parameters:	data (numpy array [num data point, data dimension]) – data for training.

ICA¶

class pydeep.preprocessing.ICA(input_dim)[source]¶

Independent Component Analysis using FastICA.

__init__(input_dim)[source]¶

Constructor.

Parameters:	input_dim (int) – Data dimensionality.

log_likelihood(data)[source]¶

Calculates the Log-Likelihood (LL) for the given data.

Parameters:	data (numpy array [num data point, data dimension]) – data to calculate the Log-Likelihood for.
Returns:	log-likelihood.
Return type:	numpy array [num data point]

train(data, iterations=1000, convergence=0.0, status=False)[source]¶

Training the model (full batch).

Parameters:	data (numpy array [num data point, data dimension]) – data for training. iterations (int) – Number of iterations convergence (double) – If the angle (in degrees) between filters of two updates is less than the given value, training is terminated. status (bool) – If true the progress is printed to the console.

rbm¶

Package providing rbm models and corresponding sampler, trainer and estimator.

Version:	1.1.0
Date:	04.04.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

dbn¶

Helper class for deep believe networks.

Version:	1.1.0
Date:	06.04.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

DBN¶

class pydeep.rbm.dbn.DBN(list_of_rbms)[source]¶

Deep believe network.

__init__(list_of_rbms)[source]¶

Initializes the network with rbms.

Parameters:	list_of_rbms (list) – List of rbms.

backward_propagate(output_data, sample=False)[source]¶

Propagates the output back through the input.

Parameters:	output_data (numpy array [batchsize x output dim]) – Output data. sample (bool) – If true the states are sampled, otherwise the probabilities are used.
Returns:	Input of the network.
Return type:	numpy array [batchsize x input dim]

forward_propagate(input_data, sample=False)[source]¶

Propagates the data through the network.

Parameters:	input_data (numpy array [batchsize x input dim]) – Input data sample (bool) – If true the states are sampled, otherwise the probabilities are used.
Returns:	Output of the network.
Return type:	numpy array [batchsize x output dim]

reconstruct(input_data, sample=False)[source]¶

Reconstructs the data by propagating the data to the output and back to the input.

Parameters:	input_data (numpy array [batchsize x input dim]) – Input data. sample (bool) – If true the states are sampled, otherwise the probabilities are used.
Returns:	Output of the network.
Return type:	numpy array [batchsize x output dim]

reconstruct_sample_top_layer(input_data, sampling_steps=100, sample_forward_backward=False)[source]¶

Reconstructs data by propagating the data forward, sampling the top most layer and propagating the result backward.

Parameters:	input_data (numpy array [batchsize x input dim]) – Input data sampling_steps (int) – Number of Sampling steps. sample_forward_backward (bool) – If true the states for the forward and backward phase are sampled.
Returns:	reconstruction of the network.
Return type:	numpy array [batchsize x output dim]

sample_top_layer(sampling_steps=100, initial_state=None, sample=True)[source]¶

Samples the top most layer, if initial_state is None the current state is used otherwise sampling is started from the given initial state

Parameters:	sampling_steps (int) – Number of Sampling steps. initial_state (numpy array [batchsize x output dim]) – Output data sample (bool) – If true the states are sampled, otherwise the probabilities are used (Mean field estimate).
Returns:	Output of the network.
Return type:	numpy array [batchsize x output dim]

estimator¶

This module provides methods for estimating the model performance (running on the CPU). Provided performance measures are for example the reconstruction error (RE) and the log-likelihood (LL). For estimating the LL we need to know the value of the partition function Z. If at least one layer is binary it is possible to calculate the value by factorizing over the binary values. Since it involves calculating all possible binary states, it is only possible for small models i.e. less than 25 (e.g. ~2^25 = 33554432 states). For bigger models we can estimate the partition function using annealed importance sampling (AIS).

Implemented:	kth order reconstruction error Log likelihood for visible data. Log likelihood for hidden data. True partition by factorization over the visible units. True partition by factorization over the hidden units. Annealed importance sampling to approximated the partition function. Reverse annealed importance sampling to approximated the partition function.
Info:	For the derivations .. seealso:: https://www.ini.rub.de/PEOPLE/wiskott/Reprints/Melchior-2012-MasterThesis-RBMs.pdf
Version:	1.1.0
Date:	04.04.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

reconstruction_error¶

estimator.reconstruction_error(data, k=1, beta=None, use_states=False, absolut_error=False)¶

This function calculates the reconstruction errors for a given model and data.

Parameters:	model (Valid RBM model) – The model. data (numpy array [num samples, num dimensions] or numpy array [num batches, num samples in batch, num dimensions]) – The data as 2D array or 3D array. k (int) – Number of Gibbs sampling steps. beta (None, float or numpy array [batchsize,1]) – Temperature(s) for the models energy. use_states (bool) – If false (default) the probabilities are used as reconstruction, if true states are sampled. absolut_error (boll) – If false (default) the squared error is used, the absolute error otherwise.
Returns:	Reconstruction errors of the data.
Return type:	nump array [num samples]

log_likelihood_v¶

estimator.log_likelihood_v(logz, data, beta=None)¶

Computes the log-likelihood (LL) for a given model and visible data given its log partition function.

Info: logz needs to be the partition function for the same beta (i.e. beta = 1.0)!

Parameters:	model (Valid RBM model.) – The model. logz (float) – The logarithm of the partition function. data (2D array [num samples, num input dim] or 3D type numpy array [num batches, num samples in batch, num input dim]) – The visible data. beta (None, float, numpy array [batchsize,1]) – Inverse temperature(s) for the models energy.
Returns:	The log-likelihood for each sample.
Return type:	numpy array [num samples]

log_likelihood_h¶

estimator.log_likelihood_h(logz, data, beta=None)¶

Computes the log-likelihood (LL) for a given model and hidden data given its log partition function.

Info: logz needs to be the partition function for the same beta (i.e. beta = 1.0)!

Parameters:	model (Valid RBM model.) – The model. logz (float) – The logarithm of the partition function. data (2D array [num samples, num output dim] or 3D type numpy array [num batches, num samples in batch, num output dim]) – The hidden data. beta (None, float, numpy array [batchsize,1]) – Inverse temperature(s) for the models energy.
Returns:	The log-likelihood for each sample.
Return type:	numpy array [num samples]

partition_function_factorize_v¶

estimator.partition_function_factorize_v(beta=None, batchsize_exponent='AUTO', status=False)¶

Computes the true partition function for the given model by factoring over the visible units.

Info: Exponential increase of computations by the number of visible units. (16 usually ~ 20 seconds)

Parameters:	model (Valid RBM model.) – The model. beta (None, float, numpy array [batchsize,1]) – Inverse temperature(s) for the models energy. batchsize_exponent (int) – 2^batchsize_exponent will be the batch size. status (bool) – If true prints the progress to the console.
Returns:	Log Partition function for the model.
Return type:	float

partition_function_factorize_h¶

estimator.partition_function_factorize_h(beta=None, batchsize_exponent='AUTO', status=False)¶

Computes the true partition function for the given model by factoring over the hidden units.

Info: Exponential increase of computations by the number of visible units. (16 usually ~ 20 seconds)

Parameters:	model (Valid RBM model.) – The model. beta (None, float, numpy array [batchsize,1]) – Inverse temperature(s) for the models energy. batchsize_exponent (int) – 2^batchsize_exponent will be the batch size. status (bool) – If true prints the progress to the console.
Returns:	Log Partition function for the model.
Return type:	float

annealed_importance_sampling¶

estimator.annealed_importance_sampling(num_chains=100, k=1, betas=10000, status=False)¶

Approximates the partition function for the given model using annealed importance sampling.

reverse_annealed_importance_sampling¶

estimator.reverse_annealed_importance_sampling(num_chains=100, k=1, betas=10000, status=False, data=None)¶

Approximates the partition function for the given model using reverse annealed importance sampling.

model¶

This module provides restricted Boltzmann machines (RBMs) with different types of units. The structure is very close to the mathematical derivations to simplify the understanding. In addition, the modularity helps to create other kind of RBMs without adapting the training algorithms.

Implemented:	centered BinaryBinary RBM (BB-RBM) centered GaussianBinary RBM (GB-RBM) with fixed variance centered GaussianBinaryVariance RBM (GB-RBM) with trainable variance # Models without implementation of p(v),p(h),p(v,h) -> AIS, PT, true gradient, … cannot be used! - centered BinaryBinaryLabel RBM (BBL-RBM) - centered GaussianBinaryLabel RBM (GBL-RBM) # Models with intractable p(v),p(h),p(v,h) -> AIS, PT, true gradient, … cannot be used! - centered BinaryRect RBM (BR-RBM) - centered RectBinary RBM (RB-RBM) - centered RectRect RBM (RR-RBM) - centered GaussianRect RBM (GR-RBM) - centered GaussianRectVariance RBM (GRV-RBM)
Info:	For the derivations .. seealso:: https://www.ini.rub.de/PEOPLE/wiskott/Reprints/Melchior-2012-MasterThesis-RBMs.pdf A usual way to create a new unit is to inherit from a given RBM class and override the functions that changed, e.g. Gaussian-Binary RBM inherited from the Binary-Binary RBM.
Version:	1.1.0
Date:	04.04.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

BinaryBinaryRBM¶

class pydeep.rbm.model.BinaryBinaryRBM(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

Implementation of a centered restricted Boltzmann machine with binary visible and binary hidden units.

__init__(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.

Parameters:

number_visibles (int) – Number of the visible variables.
number_hiddens (int) – Number of hidden variables.
data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visilbe mean. If a scalar is passed all values are initialized with it.
initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5 If a scalar is passed all values are initialized with it.
dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type i.e. numpy.float64

_add_visible_units(num_new_visibles, position=0, initial_weights='AUTO', initial_bias='AUTO', initial_offsets='AUTO', data=None)[source]¶

This function adds new visible units at the given position to the model. .. Warning:: If the parameters are changed. the trainer needs to be: reinitialized.

Parameters:

num_new_visibles (int) – The number of new hidden units to add
position (int) – Position where the units should be added.
initial_weights ('AUTO', scalar or numpy array [input num_new_visibles, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
initial_bias ('AUTO' or scalar or numpy array [1, num_new_visibles]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visilbe mean. If a scalar is passed all values are initialized with it.
initial_offsets ('AUTO' or scalar or numpy array [1, num_new_visibles]) – The initial visible offset values.
data (numpy array [num datapoints, num_new_visibles]) – If data is given and the offset and bias is initzialized accordingly, if ‘AUTO’ is chosen.

_base_log_partition(use_base_model=False)[source]¶

Returns the base partition function for a given visible bias. .. Note:: that for AIS we need to be able to calculate the partition function of the base distribution exactly. Furthermore it is beneficial if the base distribution is a good approximation of the target distribution. A good choice is therefore the maximum likelihood estimate of the visible bias, given the data.

Parameters:	use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Partition function for zero parameters.
Return type:	float

_calculate_hidden_bias_gradient(h)[source]¶

This function calculates the gradient for the hidden biases.

Parameters:	h (numpy arrays [batch size, output dim]) – Hidden activations.
Returns:	Hidden bias gradient.
Return type:	numpy arrays [1, output dim]

_calculate_visible_bias_gradient(v)[source]¶

This function calculates the gradient for the visible biases.

Parameters:	v (numpy arrays [batch_size, input dim]) – Visible activations.
Returns:	Visible bias gradient.
Return type:	numpy arrays [1, input dim]

_calculate_weight_gradient(v, h)[source]¶

This function calculates the gradient for the weights from the visible and hidden activations.

Parameters:	v (numpy arrays [batchsize, input dim]) – Visible activations. h (numpy arrays [batchsize, output dim]) – Hidden activations.
Returns:	Weight gradient.
Return type:	numpy arrays [input dim, output dim]

_getbasebias()[source]¶

Returns the maximum likelihood estimate of the visible bias, given the data. If no data is given the RBMs bias value is return, but is highly recommended to pass the data.

Returns:	Base bias.
Return type:	numpy array [1, input dim]

_remove_visible_units(indices)[source]¶

This function removes the visible units whose indices are given.: Warning

If the parameters are changed. the trainer needs to be reinitialized.

Parameters:	indices (int or list of int or numpy array of int) – Indices of units to be remove.

calculate_gradients(v, h)[source]¶

This function calculates all gradients of this RBM and returns them as a list of arrays. This keeps the flexibility of adding parameters which will be updated by the training algorithms.

Parameters:	v (numpy arrays [batch size, output dim]) – Visible activations. h (numpy arrays [batch size, output dim]) – Hidden activations.
Returns:	Gradients for all parameters.
Return type:	list of numpy arrays (num parameters x [parameter.shape])

energy(v, h, beta=None, use_base_model=False)[source]¶

Compute the energy of the RBM given observed variable states v and hidden variables state h.

Parameters:	v (numpy array [batch size, input dim]) – Visible states. h (numpy array [batch size, output dim]) – Hidden states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously. None is equivalent to pass the value 1.0 use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Energy of v and h.
Return type:	numpy array [batch size,1]

log_probability_h(logz, h, beta=None, use_base_model=False)[source]¶

Computes the log-probability / LogLikelihood(LL) for the given hidden units for this model. To estimate the LL we need to know the logarithm of the partition function Z. For small models it is possible to calculate Z, however since this involves calculating all possible hidden states, it is intractable for bigger models. As an estimation method annealed importance sampling (AIS) can be used instead.

Parameters:	logz (float) – The logarithm of the partition function. h (numpy array [batch size, output dim]) – Hidden states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously. None is equivalent to pass the value 1.0 use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Log probability for hidden_states.
Return type:	numpy array [batch size, 1]

log_probability_v(logz, v, beta=None, use_base_model=False)[source]¶

Computes the log-probability / LogLikelihood(LL) for the given visible units for this model. To estimate the LL we need to know the logarithm of the partition function Z. For small models it is possible to calculate Z, however since this involves calculating all possible hidden states, it is intractable for bigger models. As an estimation method annealed importance sampling (AIS) can be used instead.

Parameters:	logz (float) – The logarithm of the partition function. v (numpy array [batch size, input dim]) – Visible states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously.None is equivalent to pass the value 1.0. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Log probability for visible_states.
Return type:	numpy array [batch size, 1]

log_probability_v_h(logz, v, h, beta=None, use_base_model=False)[source]¶

Computes the joint log-probability / LogLikelihood(LL) for the given visible and hidden units for this model. To estimate the LL we need to know the logarithm of the partition function Z. For small models it is possible to calculate Z, however since this involves calculating all possible hidden states, it is intractable for bigger models. As an estimation method annealed importance sampling (AIS) can be used instead.

Parameters:	logz (float) – The logarithm of the partition function. v (numpy array [batch size, input dim]) – Visible states. h (numpy array [batch size, output dim]) – Hidden states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously. None is equivalent to pass the value 1.0 use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Joint log probability for v and h.
Return type:	numpy array [batch size, 1]

probability_h_given_v(v, beta=None, use_base_model=False)[source]¶

Calculates the conditional probabilities of h given v.

Parameters:	v (numpy array [batch size, input dim]) – Visible states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously. None is equivalent to pass the value 1.0 use_base_model (bool) – DUMMY variable, since we do not use a base hidden bias.
Returns:	Conditional probabilities h given v.
Return type:	numpy array [batch size, output dim]

probability_v_given_h(h, beta=None, use_base_model=False)[source]¶

Calculates the conditional probabilities of v given h.

Parameters:	h (numpy array [batch size, output dim]) – Hidden states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously. None is equivalent to pass the value 1.0 use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Conditional probabilities v given h.
Return type:	numpy array [batch size, input d

sample_h(h, beta=None, use_base_model=False)[source]¶

Samples the hidden variables from the conditional probabilities h given v.

Parameters:	h (numpy array [batch size, output dim]) – Conditional probabilities of h given v. beta (None) – DUMMY Variable. The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns:	States for h.
Return type:	numpy array [batch size, output dim]

sample_v(v, beta=None, use_base_model=False)[source]¶

Samples the visible variables from the conditional probabilities v given h.

Parameters:	v (numpy array [batch size, input dim]) – Conditional probabilities of v given h. beta (None) – DUMMY Variable. The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns:	States for v.
Return type:	numpy array [batch size, input dim]

unnormalized_log_probability_h(h, beta=None, use_base_model=False)[source]¶

Computes the unnormalized log probabilities of h.

Parameters:	h (numpy array [batch size, output dim]) – Hidden states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously.None is equivalent to pass the value 1.0. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Unnormalized log probability of h.
Return type:	numpy array [batch size, 1]

unnormalized_log_probability_v(v, beta=None, use_base_model=False)[source]¶

Computes the unnormalized log probabilities of v.

Parameters:	v (numpy array [batch size, input dim]) – Visible states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously.None is equivalent to pass the value 1.0. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Unnormalized log probability of v.
Return type:	numpy array [batch size, 1]

GaussianBinaryRBM¶

class pydeep.rbm.model.GaussianBinaryRBM(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

Implementation of a centered Restricted Boltzmann machine with Gaussian visible and binary hidden units.

__init__(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.

Parameters:

number_visibles (int) – Number of the visible variables.
number_hiddens (int) – Number of hidden variables.
data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visilbe mean. If a scalar is passed all values are initialized with it.
initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
initial_sigma ('AUTO', scalar or numpy array [1, input_dim]) – Initial standard deviation for the model.
initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5 If a scalar is passed all values are initialized with it.
dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type i.e. numpy.float64

_add_hidden_units(num_new_hiddens, position=0, initial_weights='AUTO', initial_bias='AUTO', initial_offsets='AUTO')[source]¶

This function adds new hidden units at the given position to the model.: Warning

If the parameters are changed. the trainer needs to be reinitialized.

Parameters:

num_new_hiddens (int) – The number of new hidden units to add.
position (int) – Position where the units should be added.
initial_weights ('AUTO' or scalar or numpy array [input_dim, num_new_hiddens]) – The initial weight values for the hidden units.
initial_bias ('AUTO' or scalar or numpy array [1, num_new_hiddens]) – The initial hidden bias values.
initial_offsets ('AUTO' or scalar or numpy array [1, num_new_hidden) – he initial hidden mean values.

_add_visible_units(num_new_visibles, position=0, initial_weights='AUTO', initial_bias='AUTO', initial_sigmas=1.0, initial_offsets='AUTO', data=None)[source]¶

This function adds new visible units at the given position to the model.: Warning

If the parameters are changed. the trainer needs to be reinitialized.

Parameters:

num_new_visibles (int) – The number of new hidden units to add
position (int) – Position where the units should be added.
initial_weights ('AUTO', scalar or numpy array [input num_new_visibles, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
initial_bias ('AUTO' or scalar or numpy array [1, num_new_visibles]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visilbe mean. If a scalar is passed all values are initialized with it.
initial_sigmas ('AUTO' or scalar or numpy array [1, num_new_visibles]) – The initial standard deviation for the model.
initial_offsets ('AUTO' or scalar or numpy array [1, num_new_visibles]) – The initial visible offset values.
data (numpy array [num datapoints, num_new_visibles]) – If data is given and the offset and bias is initzialized accordingly, if ‘AUTO’ is chosen.

_base_log_partition(use_base_model=False)[source]¶

Returns the base partition function which needs to be calculateable.

Parameters:	use_base_model (bool) – DUMMY sicne the integral does not change if the mean is shifted.
Returns:	Partition function for zero parameters.
Return type:	float

_calculate_visible_bias_gradient(v)[source]¶

This function calculates the gradient for the visible biases.

Parameters:	v (numpy arrays [batch_size, input dim]) – Visible activations.
Returns:	Visible bias gradient.
Return type:	numpy arrays [1, input dim]

_calculate_weight_gradient(v, h)[source]¶

This function calculates the gradient for the weights from the visible and hidden activations.

Parameters:	v (numpy arrays [batchsize, input dim]) – Visible activations. h (numpy arrays [batchsize, output dim]) – Hidden activations.
Returns:	Weight gradient.
Return type:	numpy arrays [input dim, output dim]

_remove_visible_units(indices)[source]¶

This function removes the visible units whose indices are given.: Warning

If the parameters are changed. the trainer needs to be reinitialized.

Parameters:	indices (int or list of int or numpy array of int) – Indices of units to be remove.

energy(v, h, beta=None, use_base_model=False)[source]¶

Compute the energy of the RBM given observed variable states v and hidden variables state h.

Parameters:	v (numpy array [batch size, input dim]) – Visible states. h (numpy array [batch size, output dim]) – Hidden states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously. None is equivalent to pass the value 1.0 use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Energy of v and h.
Return type:	numpy array [batch size,1]

probability_h_given_v(v, beta=None, use_base_model=False)[source]¶

Calculates the conditional probabilities h given v.

Parameters:	v (numpy array [batch size, input dim]) – Visible states / data. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously. None is equivalent to pass the value 1.0 use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Conditional probabilities h given v.
Return type:	numpy array [batch size, output dim]

probability_v_given_h(h, beta=None, use_base_model=False)[source]¶

Calculates the conditional probabilities of v given h.

Parameters:	h (numpy array [batch size, output dim]) – Hidden states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously. None is equivalent to pass the value 1.0 use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Conditional probabilities v given h.
Return type:	numpy array [batch size, input dim]

sample_v(v, beta=None, use_base_model=False)[source]¶

Samples the visible variables from the conditional probabilities v given h.

Parameters:	v (numpy array [batch size, input dim]) – Conditional probabilities of v given h. beta (None) – DUMMY Variable The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns:	States for v.
Return type:	numpy array [batch size, input dim]

unnormalized_log_probability_h(h, beta=None, use_base_model=False)[source]¶

Computes the unnormalized log probabilities of h.

Parameters:	h (numpy array [batch size, output dim]) – Hidden states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously.None is equivalent to pass the value 1.0. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Unnormalized log probability of h.
Return type:	numpy array [batch size, 1]

unnormalized_log_probability_v(v, beta=None, use_base_model=False)[source]¶

Computes the unnormalized log probabilities of v.: ln(z*p(v)) = ln(p(v))-ln(z)+ln(z) = ln(p(v)).

Parameters:	v (numpy array [batch size, input dim]) – Visible states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously.None is equivalent to pass the value 1.0. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Unnormalized log probability of v.
Return type:	numpy array [batch size, 1]

GaussianBinaryVarianceRBM¶

class pydeep.rbm.model.GaussianBinaryVarianceRBM(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets=0.0, initial_hidden_offsets=0.0, dtype=<type 'numpy.float64'>)[source]¶

Implementation of a Restricted Boltzmann machine with Gaussian visible having trainable variances and binary hidden units.

__init__(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets=0.0, initial_hidden_offsets=0.0, dtype=<type 'numpy.float64'>)[source]¶

This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.

Parameters:

number_visibles (int) – Number of the visible variables.
number_hiddens (int) – Number of hidden variables.
data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visilbe mean. If a scalar is passed all values are initialized with it.
initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
initial_sigma ('AUTO', scalar or numpy array [1, input_dim]) – Initial standard deviation for the model.
initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5 If a scalar is passed all values are initialized with it.
dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type i.e. numpy.float64

_calculate_sigma_gradient(v, h)[source]¶

This function calculates the gradient for the variance of the RBM.

Parameters:	v (numpy arrays [batchsize, input dim]) – States of the visible variables. h (numpy arrays [batchsize, output dim]) – Probs/States of the hidden variables.
Returns:	Sigma gradient.
Return type:	list of numpy arrays [input dim,1]

calculate_gradients(v, h)[source]¶

his function calculates all gradients of this RBM and returns them as an ordered array. This keeps the flexibility of adding parameters which will be updated by the training algorithms.

Parameters:	v (numpy arrays [batchsize, input dim]) – States of the visible variables. h (numpy arrays [batchsize, output dim]) – Probabilities of the hidden variables.
Returns:	Gradients for all parameters.
Return type:	numpy arrays (num parameters x [parameter.shape])

get_parameters()[source]¶

This function returns all mordel parameters in a list.

Returns:	The parameter references in a list.
Return type:	list

BinaryBinaryLabelRBM¶

class pydeep.rbm.model.BinaryBinaryLabelRBM(number_visibles, number_labels, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

Implementation of a centered Restricted Boltzmann machine with Binary visible plus Softmax label units and binary hidden units.

__init__(number_visibles, number_labels, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.

Parameters:

number_visibles (int) – Number of the visible variables.
number_labels (int) – Number of the label variables.
number_hiddens (int) – Number of hidden variables.
data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visilbe mean. If a scalar is passed all values are initialized with it.
initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5 If a scalar is passed all values are initialized with it.
dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type i.e. numpy.float64

_add_visible_units()[source]¶: Not available!

_base_log_partition()[source]¶: Not available!

_remove_visible_units()[source]¶: Not available!

energy()[source]¶: Not available!

log_probability_h()[source]¶: Not available!

log_probability_v()[source]¶: Not available!

log_probability_v_h()[source]¶: Not available!

sample_v(v, beta=None, use_base_model=False)[source]¶

Samples the visible variables from the conditional probabilities v given h.

Parameters:	v (numpy array [batch size, input dim]) – Conditional probabilities of v given h. beta (None) – DUMMY Variable. The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns:	States for v.
Return type:	numpy array [batch size, input dim]

unnormalized_log_probability_h()[source]¶: Not available!

unnormalized_log_probability_v()[source]¶: Not available!

SoftMaxSigmoid¶

GaussianBinaryLabelRBM¶

class pydeep.rbm.model.GaussianBinaryLabelRBM(number_visibles, number_labels, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

Implementation of a centered Restricted Boltzmann machine with Gaussian visible plus Softmax label units and binary hidden units.

__init__(number_visibles, number_labels, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.

Parameters:

number_visibles (int) – Number of the visible variables.
number_labels (int) – Number of the label variables.
number_hiddens (int) – Number of hidden variables.
data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visilbe mean. If a scalar is passed all values are initialized with it.
initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
initial_sigma ('AUTO', scalar or numpy array [1, input_dim]) – Initial standard deviation for the model.
initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5 If a scalar is passed all values are initialized with it.
dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type i.e. numpy.float64

_add_visible_units()[source]¶: Not available!

_base_log_partition()[source]¶: Not available!

_remove_visible_units()[source]¶: Not available!

energy()[source]¶: Not available!

log_probability_h()[source]¶: Not available!

log_probability_v()[source]¶: Not available!

log_probability_v_h()[source]¶: Not available!

sample_v(v, beta=None, use_base_model=False)[source]¶

Samples the visible variables from the conditional probabilities v given h.

Parameters:	v (numpy array [batch size, input dim]) – Conditional probabilities of v given h. beta (None) – DUMMY Variable. The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns:	States for v.
Return type:	numpy array [batch size, input dim]

unnormalized_log_probability_h()[source]¶: Not available!

unnormalized_log_probability_v()[source]¶: Not available!

SoftMaxLinear¶

BinaryRectRBM¶

class pydeep.rbm.model.BinaryRectRBM(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

Implementation of a centered Restricted Boltzmann machine with Binary visible and Noisy linear rectified hidden units.

__init__(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.

Parameters:

number_visibles (int) – Number of the visible variables.
number_hiddens (int) – Number of hidden variables.
data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visilbe mean. If a scalar is passed all values are initialized with it.
initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5 If a scalar is passed all values are initialized with it.
dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type i.e. numpy.float64

_add_visible_units()[source]¶: Not available!

_base_log_partition()[source]¶: Not available!

_remove_visible_units()[source]¶: Not available!

energy()[source]¶: Not available!

log_probability_h()[source]¶: Not available!

log_probability_v()[source]¶: Not available!

log_probability_v_h()[source]¶: Not available!

probability_h_given_v(v, beta=None)[source]¶

Calculates the conditional probabilities h given v.

Parameters:	v (numpy array [batch size, input dim]) – Visible states / data. beta (float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously.
Returns:	Conditional probabilities h given v.
Return type:	numpy array [batch size, output dim]

sample_h(h, beta=None, use_base_model=False)[source]¶

Samples the hidden variables from the conditional probabilities h given v.

Parameters:	h (numpy array [batch size, output dim]) – Conditional probabilities of h given v. beta (None) – DUMMY Variable. The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns:	States for h.
Return type:	numpy array [batch size, output dim]

unnormalized_log_probability_h()[source]¶: Not available!

unnormalized_log_probability_v()[source]¶: Not available!

RectBinaryRBM¶

class pydeep.rbm.model.RectBinaryRBM(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

Implementation of a centered Restricted Boltzmann machine with Noisy linear rectified visible units and binary hidden units.

__init__(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.

Parameters:

number_visibles (int) – Number of the visible variables.
number_hiddens (int) – Number of hidden variables.
data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visilbe mean. If a scalar is passed all values are initialized with it.
initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5 If a scalar is passed all values are initialized with it.
dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type i.e. numpy.float64

_add_visible_units()[source]¶: Not available!

_base_log_partition()[source]¶: Not available!

_getbasebias()[source]¶: Not available!

_remove_visible_units()[source]¶: Not available!

energy()[source]¶: Not available!

log_probability_h()[source]¶: Not available!

log_probability_v()[source]¶: Not available!

log_probability_v_h()[source]¶: Not available!

probability_v_given_h(h, beta=None, use_base_model=False)[source]¶

Calculates the conditional probabilities of v given h.

Parameters:	h (numpy array [batch size, output dim]) – Hidden states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously. None is equivalent to pass the value 1.0 use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Conditional probabilities v given h.
Return type:	numpy array [batch size, input d

sample_v(v, beta=None, use_base_model=False)[source]¶

Samples the visible variables from the conditional probabilities v given h.

Parameters:	v (numpy array [batch size, input dim]) – Conditional probabilities of v given h. beta (None) – DUMMY Variable. The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns:	States for v.
Return type:	numpy array [batch size, input dim]

unnormalized_log_probability_h()[source]¶: Not available!

unnormalized_log_probability_v()[source]¶: Not available!

RectRectRBM¶

class pydeep.rbm.model.RectRectRBM(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

Implementation of a centered Restricted Boltzmann machine with Noisy linear rectified visible and hidden units.

__init__(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

This function initializes all necessary parameters and data structures. It is recommended to pass the training data to initialize the network automatically.

Parameters:

number_visibles (int) – Number of the visible variables.
number_hiddens (int) – Number of hidden variables.
data (None or numpy array [num samples, input dim]) – The training data for parameter initialization if ‘AUTO’ is chosen for the corresponding parameter.
initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights. ‘AUTO’ and a scalar are random init.
initial_visible_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, input dim]) – Initial visible bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the visilbe mean. If a scalar is passed all values are initialized with it.
initial_hidden_bias ('AUTO','INVERSE_SIGMOID', scalar or numpy array [1, output_dim]) – Initial hidden bias. ‘AUTO’ is random, ‘INVERSE_SIGMOID’ is the inverse Sigmoid of the hidden mean. If a scalar is passed all values are initialized with it.
initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible offset values. AUTO=data mean or 0.5 if no data is given. If a scalar is passed all values are initialized with it.
initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden offset values. AUTO = 0.5 If a scalar is passed all values are initialized with it.
dtype (numpy.float32 or numpy.float64 or numpy.longdouble) – Used data type i.e. numpy.float64

probability_v_given_h(h, beta=None, use_base_model=False)[source]¶

Calculates the conditional probabilities of v given h.

Parameters:	h (numpy array [batch size, output dim]) – Hidden states. beta (None, float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously. None is equivalent to pass the value 1.0 use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values.
Returns:	Conditional probabilities v given h.
Return type:	numpy array [batch size, input d

sample_v(v, beta=None, use_base_model=False)[source]¶

Samples the visible variables from the conditional probabilities v given h.

Parameters:	v (numpy array [batch size, input dim]) – Conditional probabilities of v given h. beta (None) – DUMMY Variable The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns:	States for v.
Return type:	numpy array [batch size, input dim]

GaussianRectRBM¶

class pydeep.rbm.model.GaussianRectRBM(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

Implementation of a centered Restricted Boltzmann machine with Gaussian visible and Noisy linear rectified hidden units.

__init__(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets='AUTO', initial_hidden_offsets='AUTO', dtype=<type 'numpy.float64'>)[source]¶

This function initializes all necessary parameters and data structures. See comments for automatically chosen values.

Parameters:

number_visibles (int) – Number of the visible variables.
number_hiddens (int) – Number of the hidden variables.
data (None or numpy array [num samples, input dim] or List of numpy arrays [num samples, input dim]) – The training data for initializing the visible bias.
initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights.
initial_visible_bias ('AUTO', scalar or numpy array [1,input dim]) – Initial visible bias.
initial_hidden_bias ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden bias.
initial_sigma ('AUTO', scalar or numpy array [1, input_dim]) – Initial standard deviation for the model.
initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible mean values.
initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden mean values.
dtype (numpy.float32, numpy.float64 and, numpy.longdouble) – Used data type.

_add_visible_units()[source]¶: Not available!

_base_log_partition()[source]¶: Not available!

_remove_visible_units()[source]¶: Not available!

energy()[source]¶: Not available!

log_probability_h()[source]¶: Not available!

log_probability_v()[source]¶: Not available!

log_probability_v_h()[source]¶: Not available!

probability_h_given_v(v, beta=None)[source]¶

Calculates the conditional probabilities h given v.

Parameters:	v (numpy array [batch size, input dim]) – Visible states / data. beta (float or numpy array [batch size, 1]) – Allows to sample from a given inverse temperature beta, or if a vector is given to sample from different betas simultaneously.
Returns:	Conditional probabilities h given v.
Return type:	numpy array [batch size, output dim]

sample_h(h, beta=None, use_base_model=False)[source]¶

Samples the hidden variables from the conditional probabilities h given v.

Parameters:	h (numpy array [batch size, output dim]) – Conditional probabilities of h given v. beta (None) – DUMMY Variable The sampling in other types of units like Gaussian-Binary RBMs will be affected by beta. use_base_model (bool) – If true uses the base model, i.e. the MLE of the bias values. (DUMMY in this case)
Returns:	States for h.
Return type:	numpy array [batch size, output dim]

unnormalized_log_probability_h()[source]¶: Not available!

unnormalized_log_probability_v()[source]¶: Not available!

GaussianRectVarianceRBM¶

class pydeep.rbm.model.GaussianRectVarianceRBM(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets=0.0, initial_hidden_offsets=0.0, dtype=<type 'numpy.float64'>)[source]¶

Implementation of a Restricted Boltzmann machine with Gaussian visible having trainable variances and noisy rectified hidden units.

__init__(number_visibles, number_hiddens, data=None, initial_weights='AUTO', initial_visible_bias='AUTO', initial_hidden_bias='AUTO', initial_sigma='AUTO', initial_visible_offsets=0.0, initial_hidden_offsets=0.0, dtype=<type 'numpy.float64'>)[source]¶

This function initializes all necessary parameters and data structures. See comments for automatically chosen values.

Parameters:

number_visibles (int) – Number of the visible variables.
number_hiddens (int) – Number of the hidden variables.
data (None or numpy array [num samples, input dim] or List of numpy arrays [num samples, input dim]) – The training data for initializing the visible bias.
initial_weights ('AUTO', scalar or numpy array [input dim, output_dim]) – Initial weights.
initial_visible_bias ('AUTO', scalar or numpy array [1,input dim]) – Initial visible bias.
initial_hidden_bias ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden bias.
initial_sigma ('AUTO', scalar or numpy array [1, input_dim]) – Initial standard deviation for the model.
initial_visible_offsets ('AUTO', scalar or numpy array [1, input dim]) – Initial visible mean values.
initial_hidden_offsets ('AUTO', scalar or numpy array [1, output_dim]) – Initial hidden mean values.
dtype (numpy.float32, numpy.float64 and, numpy.longdouble) – Used data type.

_calculate_sigma_gradient(v, h)[source]¶

This function calculates the gradient for the variance of the RBM.

Parameters:	v (numpy arrays [batchsize, input dim]) – States of the visible variables. h (numpy arrays [batchsize, output dim]) – Probabilities of the hidden variables.
Returns:	Sigma gradient.
Return type:	list of numpy arrays [input dim,1]

calculate_gradients(v, h)[source]¶

This function calculates all gradients of this RBM and returns them as an ordered array. This keeps the flexibility of adding parameters which will be updated by the training algorithms.

Parameters:	v (numpy arrays [batchsize, input dim]) – States of the visible variables. h (numpy arrays [batchsize, output dim]) – Probabilities of the hidden variables.
Returns:	Gradients for all parameters.
Return type:	numpy arrays (num parameters x [parameter.shape])

get_parameters()[source]¶

This function returns all model parameters in a list.

Returns:	The parameter references in a list.
Return type:	list

sampler¶

This module provides different sampling algorithms for RBMs running on CPU. The structure is kept modular to simplify the understanding of the code and the mathematics. In addition the modularity helps to create other kind of sampling algorithms by inheritance.

Implemented:	Gibbs Sampling Persistent Gibbs Sampling Parallel Tempering Sampling Independent Parallel Tempering Sampling
Info:	For the derivations .. seealso:: https://www.ini.rub.de/PEOPLE/wiskott/Reprints/Melchior-2012-MasterThesis-RBMs.pdf
Version:	1.1.0
Date:	04.04.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

GibbsSampler¶

class pydeep.rbm.sampler.GibbsSampler(model)[source]¶

Implementation of k-step Gibbs-sampling for bipartite graphs.

__init__(model)[source]¶

Initializes the sampler with the model.

Parameters:	model (Valid model class like BinaryBinary-RBM.) – The model to sample from.

sample(vis_states, k=1, betas=None, ret_states=True)[source]¶

Performs k steps Gibbs-sampling starting from given visible data.

Parameters:	vis_states (numpy array [num samples, input dimension]) – The initial visible states to sample from. k (int) – The number of Gibbs sampling steps. betas (None, float, numpy array [num_betas,1]) – Inverse temperature to sample from.(energy based models) ret_states (bool) – If False returns the visible probabilities instead of the states.
Returns:	The visible samples of the Markov chains.
Return type:	numpy array [num samples, input dimension]

sample_from_h(hid_states, k=1, betas=None, ret_states=True)[source]¶

Performs k steps Gibbs-sampling starting from given hidden states.

Parameters:	hid_states (numpy array [num samples, output dimension]) – The initial hidden states to sample from. k (int) – The number of Gibbs sampling steps. betas ((energy based models)) – Inverse temperature to sample from. ret_states (bool) – If False returns the visible probabilities instead of the states.
Returns:	The visible samples of the Markov chains.
Return type:	numpy array [num samples, input dimension]

PersistentGibbsSampler¶

class pydeep.rbm.sampler.PersistentGibbsSampler(model, num_chains)[source]¶

Implementation of k-step persistent Gibbs sampling.

__init__(model, num_chains)[source]¶

Initializes the sampler with the model.

Parameters:	model (Valid model class.) – The model to sample from. num_chains (int) – The number of Markov chains. .. Note:: Optimal performance is achieved if the number of samples and the number of chains equal the batch_size.

sample(num_samples, k=1, betas=None, ret_states=True)[source]¶

Performs k steps persistent Gibbs-sampling.

Parameters:	num_samples (int, numpy array) – The number of samples to generate. .. Note:: Optimal performance is achieved if the number of samples and the number of chains equal the batch_size. k (int) – The number of Gibbs sampling steps. betas (None, float, numpy array [num_betas,1]) – Inverse temperature to sample from.(energy based models) ret_states (bool) – If False returns the visible probabilities instead of the states.
Returns:	The visible samples of the Markov chains.
Return type:	numpy array [num samples, input dimension]

ParallelTemperingSampler¶

class pydeep.rbm.sampler.ParallelTemperingSampler(model, num_chains=3, betas=None)[source]¶

Implementation of k-step parallel tempering sampling.

__init__(model, num_chains=3, betas=None)[source]¶

Initializes the sampler with the model.

Parameters:	model (Valid model Class.) – The model to sample from. num_chains (int) – The number of Markov chains. betas (int, None) – Array of inverse temperatures to sample from, its dimensionality needs to equal the number of chains or if None is given the inverse temperatures are initialized linearly from 0.0 to 1.0 in ‘num_chains’ steps.

classmethod _swap_chains(chains, hid_states, model, betas)[source]¶

Swaps the samples between the Markov chains according to the Metropolis Hastings Ratio.

Parameters:

chains ([num samples, input dimension]) – Chains with visible data.
hid_states ([num samples, output dimension]) – Hidden states.
model (Valid RBM Class.) – The model to sample from.
betas (int, None) – Array of inverse temperatures to sample from, its dimensionality needs to equal the number of chains or if None is given the inverse temperatures are initialized linearly from 0.0 to 1.0 in ‘num_chains’ steps.

sample(num_samples, k=1, ret_states=True)[source]¶

Performs k steps parallel tempering sampling.

Parameters:	num_samples (int, numpy array) – The number of samples to generate. .. Note:: Optimal performance is achieved if the number of samples and the number of chains equal the batch_size. k (int) – The number of Gibbs sampling steps. ret_states (bool) – If False returns the visible probabilities instead of the states.
Returns:	The visible samples of the Markov chains.
Return type:	numpy array [num samples, input dimension]

IndependentParallelTemperingSampler¶

class pydeep.rbm.sampler.IndependentParallelTemperingSampler(model, num_samples, num_chains=3, betas=None)[source]¶

Implementation of k-step independent parallel tempering sampling. IPT runs an PT instance for each sample in parallel. This speeds up the sampling but also decreases the mixing rate.

__init__(model, num_samples, num_chains=3, betas=None)[source]¶

Initializes the sampler with the model.

Parameters:

model (Valid model Class.) – The model to sample from.
num_samples – The number of samples to generate. .. Note:: Optimal performance (ATLAS,MKL) is achieved if the number of samples equals the batchsize.
num_chains (int) – The number of Markov chains.
betas (int, None) – Array of inverse temperatures to sample from, its dimensionality needs to equal the number of chains or if None is given the inverse temperatures are initialized linearly from 0.0 to 1.0 in ‘num_chains’ steps.

classmethod _swap_chains(chains, num_chains, hid_states, model, betas)[source]¶

Swaps the samples between the Markov chains according to the Metropolis Hastings Ratio.

Parameters:

chains ([num samples*num_chains, input dimension]) – Chains with visible data.
hid_states ([num samples*num_chains, output dimension]) – Hidden states.
model (Valid RBM Class.) – The model to sample from.
betas (int, None) – Array of inverse temperatures to sample from, its dimensionality needs to equal the number of chains or if None is given the inverse temperatures are initialized linearly from 0.0 to 1.0 in ‘num_chains’ steps.

sample(num_samples='AUTO', k=1, ret_states=True)[source]¶

Performs k steps independent parallel tempering sampling.

Parameters:	num_samples (int or 'AUTO') – The number of samples to generate. .. Note:: Optimal performance is achieved if the number of samples and the number of chains equal the batch_size. -> AUTO k (int) – The number of Gibbs sampling steps. ret_states (bool) – If False returns the visible probabilities instead of the states.
Returns:	The visible samples of the Markov chains.
Return type:	numpy array [num samples, input dimension]

trainer¶

This module provides different types of training algorithms for RBMs running on CPU. The structure is kept modular to simplify the understanding of the code and the mathematics. In addition the modularity helps to create other kind of training algorithms by inheritance.

Implemented:	CD (Contrastive Divergence) PCD (Persistent Contrastive Divergence) PT (Parallel Tempering) IPT (Independent Parallel Tempering) GD (Exact Gradient descent (only for small binary models))
Info:	For the derivations .. seealso:: https://www.ini.rub.de/PEOPLE/wiskott/Reprints/Melchior-2012-MasterThesis-RBMs.pdf
Version:	1.1.0
Date:	04.04.2017
Author:	Jan Melchior
Contact:	JanMelchior@gmx.de
License:	Copyright (C) 2017 Jan Melchior This file is part of the Python library PyDeep. PyDeep is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

CD¶

class pydeep.rbm.trainer.CD(model, data=None)[source]¶

Implementation of the training algorithm Contrastive Divergence (CD).

INFO:	A fast learning algorithm for deep belief nets, Geoffrey E. Hinton and Simon Osindero Yee-Whye Teh Department of Computer Science University of Toronto Yee-Whye Teh 10 Kings College Road National University of Singapore.

__init__(model, data=None)[source]¶

The constructor initializes the CD trainer with a given model and data.

Parameters:	model (Valid model class.) – The model to sample from. data (numpy array [num. samples x input dim]) – Data for initialization, only has effect if the centered gradient is used.

_adapt_gradient(pos_gradients, neg_gradients, batch_size, epsilon, momentum, reg_l1norm, reg_l2norm, reg_sparseness, desired_sparseness, mean_hidden_activity, visible_offsets, hidden_offsets, use_centered_gradient, restrict_gradient, restriction_norm)[source]¶

This function updates the parameter gradients.

Parameters:

pos_gradients (numpy array[parameter index, parameter shape]) – Positive Gradients.
neg_gradients (numpy array[parameter index, parameter shape]) – Negative Gradients.
batch_size (float) – The batch_size of the data.
epsilon (numpy array[num parameters]) – The learning rate.
momentum (numpy array[num parameters]) – The momentum term.
reg_l1norm (float) – The parameter for the L1 regularization
reg_l2norm (float) – The parameter for the L2 regularization also know as weight decay.
reg_sparseness (None or float) – The parameter for the desired_sparseness regularization.
desired_sparseness (None or float) – Desired average hidden activation or None for no regularization.
mean_hidden_activity (numpy array [num samples]) – Average hidden activation <P(h_i=1|x)>_h_i
visible_offsets (float) – If not zero the gradient is centered around this value.
hidden_offsets (float) – If not zero the gradient is centered around this value.
use_centered_gradient (bool) – Uses the centered gradient instead of centering.
restrict_gradient (None, float) – If a scalar is given the norm of the weight gradient (along the input dim) is restricted to stay below this value.
restriction_norm (string, 'Cols','Rows', 'Mat') – Restricts the column norm, row norm or Matrix norm.

classmethod _calculate_centered_gradient(gradients, visible_offsets, hidden_offsets)[source]¶

Calculates the centered gradient from the normal CD gradient for the parameters W, bv, bh and the corresponding offset values.

Parameters:	gradients (List of 2D numpy arrays) – Original gradients. visible_offsets (numpy array[1,input dim]) – Visible offsets to be used. hidden_offsets (numpy array[1,output dim]) – Hidden offsets to be used.
Returns:	Enhanced gradients for all parameters.
Return type:	numpy arrays (num parameters x [parameter.shape])

_train(data, epsilon, k, momentum, reg_l1norm, reg_l2norm, reg_sparseness, desired_sparseness, update_visible_offsets, update_hidden_offsets, offset_typ, use_centered_gradient, restrict_gradient, restriction_norm, use_hidden_states)[source]¶

The training for one batch is performed using Contrastive Divergence (CD) for k sampling steps.

Parameters:

data (numpy array [batch_size, input dimension]) – The data used for training.
epsilon (scalar or numpy array[num parameters] or numpy array[num parameters, parameter shape]) – The learning rate.
k (int) – NUmber of sampling steps.
momentum (scalar or numpy array[num parameters] or numpy array[num parameters, parameter shape]) – The momentum term.
reg_l1norm (float) – The parameter for the L1 regularization
reg_l2norm (float) – The parameter for the L2 regularization also know as weight decay.
reg_sparseness (None or float) – The parameter for the desired_sparseness regularization.
desired_sparseness (None or float) – Desired average hidden activation or None for no regularization.
update_visible_offsets (float) – The update step size for the models visible offsets.
update_hidden_offsets (float) – The update step size for the models hidden offsets.
offset_typ (string) –

Different offsets can be used to center the gradient.

:Example: ‘DM’ uses the positive phase visible mean and the negative phase hidden mean. ‘A0’ uses the average of positive and negative phase mean for visible, zero for the hiddens. Possible values are out of {A,D,M,0}x{A,D,M,0}
use_centered_gradient (bool) – Uses the centered gradient instead of centering.
restrict_gradient (None, float) – If a scalar is given the norm of the weight gradient (along the input dim) is restricted to stay below this value.
restriction_norm (string, 'Cols','Rows', 'Mat') – Restricts the column norm, row norm or Matrix norm.
use_hidden_states (bool) – If True, the hidden states are used for the gradient calculations, the hiddens probabilities otherwise.

train(data, num_epochs=1, epsilon=0.01, k=1, momentum=0.0, reg_l1norm=0.0, reg_l2norm=0.0, reg_sparseness=0.0, desired_sparseness=None, update_visible_offsets=0.01, update_hidden_offsets=0.01, offset_typ='DD', use_centered_gradient=False, restrict_gradient=False, restriction_norm='Mat', use_hidden_states=False)[source]¶

Train the models with all batches using Contrastive Divergence (CD) for k sampling steps.

Parameters:

data (numpy array [batch_size, input dimension]) – The data used for training.
num_epochs (int) – NUmber of epochs (loop through the data).
epsilon (scalar or numpy array[num parameters] or numpy array[num parameters, parameter shape]) – The learning rate.
k (int) – NUmber of sampling steps.
momentum (scalar or numpy array[num parameters] or numpy array[num parameters, parameter shape]) – The momentum term.
reg_l1norm (float) – The parameter for the L1 regularization
reg_l2norm (float) – The parameter for the L2 regularization also know as weight decay.
reg_sparseness (None or float) – The parameter for the desired_sparseness regularization.
desired_sparseness (None or float) – Desired average hidden activation or None for no regularization.
update_visible_offsets (float) – The update step size for the models visible offsets.
update_hidden_offsets (float) – The update step size for the models hidden offsets.
offset_typ (string) –

Different offsets can be used to center the gradient.

Example:’DM’ uses the positive phase visible mean and the negative phase hidden mean. ‘A0’ uses the average of positive and negative phase mean for visible, zero for the hiddens. Possible values are out of {A,D,M,0}x{A,D,M,0}
use_centered_gradient (bool) – Uses the centered gradient instead of centering.
restrict_gradient (None, float) – If a scalar is given the norm of the weight gradient (along the input dim) is restricted to stay below this value.
restriction_norm (string, 'Cols','Rows', 'Mat') – Restricts the column norm, row norm or Matrix norm.
use_hidden_states (bool) – If True, the hidden states are used for the gradient calculations, the hiddens probabilities otherwise.

PCD¶

class pydeep.rbm.trainer.PCD(model, num_chains, data=None)[source]¶

Implementation of the training algorithm Persistent Contrastive Divergence (PCD).

Reference:	Training Restricted Boltzmann Machines using Approximations to the Likelihood Gradient, Tijmen Tieleman, Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3G4, Canada

__init__(model, num_chains, data=None)[source]¶

The constructor initializes the PCD trainer with a given model and data.

Parameters:	model (Valid model class.) – The model to sample from. num_chains (int) – The number of chains that should be used. .. Note:: You should use the data’s batch size! data (numpy array [num. samples x input dim]) – Data for initialization, only has effect if the centered gradient is used.

PT¶

class pydeep.rbm.trainer.PT(model, betas=3, data=None)[source]¶

Implementation of the training algorithm Parallel Tempering Contrastive Divergence (PT).

Reference:	Parallel Tempering for Training of Restricted Boltzmann Machines, Guillaume Desjardins, Aaron Courville, Yoshua Bengio, Pascal Vincent, Olivier Delalleau, Dept. IRO, Universite de Montreal P.O. Box 6128, Succ. Centre-Ville, Montreal, H3C 3J7, Qc, Canada.

__init__(model, betas=3, data=None)[source]¶

The constructor initializes the IPT trainer with a given model anddata.

Parameters:	model (Valid model class.) – The model to sample from. betas (int, numpy array [num betas]) – List of inverse temperatures to sample from. If a scalar is given, the temperatures will be set linearly from 0.0 to 1.0 in ‘betas’ steps. data (numpy array [num. samples x input dim]) – Data for initialization, only has effect if the centered gradient is used.

IPT¶

class pydeep.rbm.trainer.IPT(model, num_samples, betas=3, data=None)[source]¶

Implementation of the training algorithm Independent Parallel Tempering Contrastive Divergence (IPT). As normal PT but the chain’s switches are done only from one batch to the next instead of from one sample to the next.

Reference:	Parallel Tempering for Training of Restricted Boltzmann Machines, Guillaume Desjardins, Aaron Courville, Yoshua Bengio, Pascal Vincent, Olivier Delalleau, Dept. IRO, Universite de Montreal P.O. Box 6128, Succ. Centre-Ville, Montreal, H3C 3J7, Qc, Canada.

__init__(model, num_samples, betas=3, data=None)[source]¶

The constructor initializes the IPT trainer with a given model and: data.

Parameters:

model (Valid model class.) – The model to sample from.
num_samples (int) – The number of Samples to produce. .. Note:: you should use the batchsize.
betas (int, numpy array [num betas]) – List of inverse temperatures to sample from. If a scalar is given, the temperatures will be set linearly from 0.0 to 1.0 in ‘betas’ steps.
data (numpy array [num. samples x input dim]) – Data for initialization, only has effect if the centered gradient is used.

GD¶

class pydeep.rbm.trainer.GD(model, data=None)[source]¶

Implementation of the training algorithm Gradient descent. Since it involves the calculation of the partition function for each update, it is only possible for small BBRBMs.

__init__(model, data=None)[source]¶

The constructor initializes the Gradient trainer with a given model.

Parameters:	model (Valid model class.) – The model to sample from. data (numpy array [num. samples x input dim]) – Data for initialization, only has effect if the centered gradient is used.

_train(data, epsilon, k, momentum, reg_l1norm, reg_l2norm, reg_sparseness, desired_sparseness, update_visible_offsets, update_hidden_offsets, offset_typ, use_centered_gradient, restrict_gradient, restriction_norm, use_hidden_states)[source]¶

The training for one batch is performed using True Gradient (GD) for k Gibbs-sampling steps.

Parameters:

data (numpy array [batch_size, input dimension]) – The data used for training.
epsilon (scalar or numpy array[num parameters] or numpy array[num parameters, parameter shape]) – The learning rate.
k (int) – NUmber of sampling steps.
momentum (scalar or numpy array[num parameters] or numpy array[num parameters, parameter shape]) – The momentum term.
reg_l1norm (float) – The parameter for the L1 regularization
reg_l2norm (float) – The parameter for the L2 regularization also know as weight decay.
reg_sparseness (None or float) – The parameter for the desired_sparseness regularization.
desired_sparseness (None or float) – Desired average hidden activation or None for no regularization.
update_visible_offsets (float) – The update step size for the models visible offsets.
update_hidden_offsets (float) – The update step size for the models hidden offsets.
offset_typ (string) –

Different offsets can be used to center the gradient.<br />

Example: ‘DM’ uses the positive phase visible mean and the negative phase hidden mean.

’A0’ uses the average of positive and negative phase mean for visible, zero for the

hiddens. Possible values are out of {A,D,M,0}x{A,D,M,0}
use_centered_gradient (bool) – Uses the centered gradient instead of centering.
restrict_gradient (None, float) – If a scalar is given the norm of the weight gradient (along the input dim) is restricted to stay below this value.
restriction_norm (string, 'Cols','Rows', 'Mat') – Restricts the column norm, row norm or Matrix norm.
use_hidden_states (bool) – If True, the hidden states are used for the gradient calculations, the hiddens probabilities otherwise.