iteraa.utils

Functions

findFurthestPoint(xSearch, xRef)

This function finds a data point in xSearch which has the furthest distance from all data points in xRef.

ecdf(X, x)

Empirical Cumulative Distribution Function

calcSSE(Xact, Xapx)

This function returns the Sum of Square Errors.

calcSST(Xact)

This function returns the Total Sum of Squares.

explainedVariance(Xact, Xapx[, method])

solveConstrainedNNLS(u, t, C)

This function minimises ||U - TW||^2, where U and T are given and W is to be determined.

furthestSum(K, noc, i[, exclude])

Note by Benyamin Motevalli:

Module Contents

iteraa.utils.findFurthestPoint(xSearch, xRef)[source]

This function finds the data point in xSearch which has the furthest distance from all data points in xRef. In the case of archetypes, xRef holds the archetypes and xSearch holds the dataset.

Note

In both xSearch and xRef, the columns of the arrays should be the dimensions and the rows should be the data points.
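The page does not say how "furthest distance from all data points" is measured. The sketch below assumes one plausible reading: the row of xSearch whose summed Euclidean distance to all rows of xRef is largest. The function name find_furthest_point_sketch and the return value (index plus point) are hypothetical and may differ from what iteraa.utils.findFurthestPoint actually does.

```python
import numpy as np


def find_furthest_point_sketch(x_search, x_ref):
    """Return (index, point) of the row of x_search with the largest
    summed Euclidean distance to the rows of x_ref (assumed criterion)."""
    x_search = np.asarray(x_search, dtype=float)  # shape (n_search, n_dim)
    x_ref = np.asarray(x_ref, dtype=float)        # shape (n_ref, n_dim)
    # Pairwise differences via broadcasting: (n_search, n_ref, n_dim)
    diffs = x_search[:, None, :] - x_ref[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)         # (n_search, n_ref)
    total = dists.sum(axis=1)                     # one score per candidate
    idx = int(np.argmax(total))
    return idx, x_search[idx]


# Rows are data points, columns are dimensions, as the note above requires.
data = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
archetypes = np.array([[0.0, 0.0], [1.0, 0.0]])
idx, point = find_furthest_point_sketch(data, archetypes)
```

Other criteria, such as the largest minimum distance to any reference point, would fit the same description and can select a different point.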

iteraa.utils.ecdf(X, x)[source]

Empirical Cumulative Distribution Function

X:

1-D array. The data values for one feature (dimension), defining the distribution of the data along that dimension.

x:

Scalar. The value of an archetype in the corresponding dimension.

P(X <= x):

The cumulative distribution of data points with respect to the archetype, i.e. the probability that a data point lies at or below the archetype's value in that dimension (how much of the data in that dimension is covered by the archetype).
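A minimal sketch of an empirical CDF evaluated at a single value, assuming the standard definition (the fraction of samples less than or equal to x). The name ecdf_sketch is hypothetical; the library's own implementation may differ in detail.

```python
import numpy as np


def ecdf_sketch(X, x):
    """Fraction of the values in the 1-D array X that are <= x."""
    X = np.asarray(X, dtype=float)
    return float(np.mean(X <= x))


# Values of one feature across the dataset, and an archetype's value
# in that same feature.
feature_values = [1.0, 2.0, 3.0, 4.0]
coverage = ecdf_sketch(feature_values, 2.5)  # two of the four values are <= 2.5
```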

iteraa.utils.calcSSE(Xact, Xapx)[source]

This function returns the Sum of Square Errors.
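For reference, a sum of squared errors between actual and approximated data can be sketched as follows. This assumes Xact and Xapx are arrays of the same shape and that the errors are summed over all entries; sse_sketch is a hypothetical name.

```python
import numpy as np


def sse_sketch(x_act, x_apx):
    """Sum of squared element-wise differences between two same-shaped arrays."""
    residual = np.asarray(x_act, dtype=float) - np.asarray(x_apx, dtype=float)
    return float(np.sum(residual ** 2))


x_act = np.array([[1.0, 2.0], [3.0, 4.0]])
x_apx = np.array([[1.0, 1.0], [3.0, 2.0]])
sse = sse_sketch(x_act, x_apx)  # residuals are 0, 1, 0, 2
```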

iteraa.utils.calcSST(Xact)[source]

This function returns the Total Sum of Squares.
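Going by its name, calcSST presumably computes a total sum of squares for Xact. The page does not say whether the data are centred first, so the sketch below exposes both variants through a hypothetical center flag; sst_sketch is likewise a hypothetical name.

```python
import numpy as np


def sst_sketch(x_act, center=True):
    """Total sum of squares of x_act (rows = data points, columns = dimensions).

    With center=True the per-column mean is subtracted first (an assumption);
    with center=False the raw values are squared and summed.
    """
    x_act = np.asarray(x_act, dtype=float)
    if center:
        x_act = x_act - x_act.mean(axis=0)
    return float(np.sum(x_act ** 2))


X = np.array([[1.0, 2.0], [3.0, 4.0]])
sst_centered = sst_sketch(X)            # deviations from the column means [2, 3]
sst_raw = sst_sketch(X, center=False)   # 1 + 4 + 9 + 16
```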

iteraa.utils.explainedVariance(Xact, Xapx, method='sklearn')[source]

iteraa.utils.solveConstrainedNNLS(u, t, C)[source]

This function minimises ||U - TW||^2, where U and T are given and W is the unknown. The solution W is subject to the following constraints:

Constraint 1: W >= 0

Constraint 2: sum(W) = 1

This is the form of the problem that arises when solving for the alfa’s and beta’s.

Solving for ALFA’s:

When solving for alfa’s, the following expression should be minimised:

||Xi - sum_k(alfa_ik * Zk)||^2

This expression is minimised separately for each data point, giving nData equations and therefore nData rows of alfa’s. In each equation, U, T, and W have the following dimensions:

Equation (i):

U (Xi): It is a 1D-array of nDim x 1 dimension.

T (Z): It is a 2D-array of nDim x k dimension.

W (alfa): It is a 1D-array of k x 1 dimension.
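As a rough illustration of the alfa problem for a single data point, the sketch below minimises ||Xi - Z w||^2 subject to w >= 0 and sum(w) = 1 using projected gradient descent. This is a swapped-in technique chosen to keep the example NumPy-only. It is not necessarily how solveConstrainedNNLS works, and it does not use the C argument, whose role this page does not describe. The names project_to_simplex and solve_simplex_ls_sketch are hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    n = v.size
    u = np.sort(v)[::-1]                  # values in descending order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, n + 1)
    cond = u - css / idx > 0              # always true for the first entry
    rho = idx[cond][-1]
    theta = css[cond][-1] / rho           # shift that makes the result sum to 1
    return np.maximum(v - theta, 0.0)


def solve_simplex_ls_sketch(u, t, n_iter=500):
    """Minimise ||u - t @ w||^2 over the simplex by projected gradient descent."""
    u = np.asarray(u, dtype=float).ravel()    # shape (nDim,)
    t = np.asarray(t, dtype=float)            # shape (nDim, k)
    k = t.shape[1]
    w = np.full(k, 1.0 / k)                   # start at the simplex centre
    # Lipschitz constant of the gradient: 2 * (largest singular value of t)^2
    lipschitz = 2.0 * np.linalg.norm(t, 2) ** 2
    step = 1.0 / max(lipschitz, 1e-12)
    for _ in range(n_iter):
        grad = 2.0 * t.T @ (t @ w - u)
        w = project_to_simplex(w - step * grad)
    return w


# Three archetypes in 2-D stored as the columns of Z (nDim x k), one data point Xi.
Z = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
x_i = np.array([0.2, 0.3])
alfa_i = solve_simplex_ls_sketch(x_i, Z)  # one row of alfa’s, length k
```

Repeating the call for every data point would give the nData rows of alfa’s described above. Because x_i lies inside the triangle spanned by the three archetypes, the weights reproduce it exactly.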

Solving for BETA’s:

iteraa.utils.furthestSum(K, noc, i, exclude=[])[source]

Note by Benyamin Motevalli:

This function was taken from the following address:

https://github.com/ulfaslak/py_pcha

and the original author is: Ulf Aslak Jensen.