A Python module for regression and classification using Partial Least Squares
Project description
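Consult the module API page at
https://engineering.purdue.edu/kak/distPLS/PartialLeastSquares-1.0.4.html
for all information related to this module, including the latest changes to the code. The page at the URL shown above lists all of the module functionality you can invoke in your own code.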
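You may need this module if (1) you are trying to make multidimensional predictions from multidimensional observations; (2) the dimensionality of the observation space is large; and (3) the data available to you for constructing a prediction model is rather limited. Under these conditions, the more traditional Multiple Linear Regression (MLR) algorithms can become numerically unstable.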
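The instability mentioned above can be seen with a few lines of NumPy. The following is a minimal sketch, not part of this module: it assumes a hypothetical data set with 5 observations and 20 predictor variables, and checks the rank of the matrix that ordinary least squares would have to invert. All names and sizes are illustrative.

```python
import numpy as np

# Hypothetical sizes: far fewer observations than predictor variables.
rng = np.random.default_rng(0)
n_samples, n_predictors = 5, 20
X = rng.normal(size=(n_samples, n_predictors))

# Center the columns, as is usual before regression.
Xc = X - X.mean(axis=0)

# Ordinary least squares needs the inverse of Xc^T Xc (a 20 x 20 matrix).
gram = Xc.T @ Xc
rank_of_gram = np.linalg.matrix_rank(gram)

# The rank cannot exceed the number of observations, so the matrix is
# singular and the normal equations have no unique solution.
print("rank of X^T X:", rank_of_gram, "out of", n_predictors)
```

Because the rank of the matrix is bounded by the number of observations, it cannot be inverted reliably when the observation space is large and the data is limited. That is the situation in which PLS is typically preferred.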
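In addition to an implementation of the main Partial Least Squares (PLS) algorithm, which makes multidimensional predictions from multidimensional observations, this module also includes what is known as the PLS1 algorithm for the case when the predicted entity is just one-dimensional (as in, say, face recognition in computer vision).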
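To make the one-dimensional case concrete, here is a textbook NIPALS-style PLS1 sketch in plain NumPy. It is not this module's implementation and does not use its API; the function name `pls1_sketch` and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np

def pls1_sketch(X, y, n_components):
    """Textbook NIPALS-style PLS1 (illustrative only, not this module's code).

    Returns the coefficient vector b for column-centered X and centered y.
    """
    Xk = X - X.mean(axis=0)          # centered predictors, deflated in place
    yk = y - y.mean()                # centered one-dimensional response
    n_predictors = X.shape[1]
    W = np.zeros((n_predictors, n_components))   # weight vectors
    P = np.zeros((n_predictors, n_components))   # X loadings
    q = np.zeros(n_components)                   # y loadings
    for k in range(n_components):
        w = Xk.T @ yk                # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                   # score vector for this component
        tt = t @ t
        p = Xk.T @ t / tt
        q[k] = yk @ t / tt
        Xk = Xk - np.outer(t, p)     # remove the explained part of X
        yk = yk - q[k] * t           # remove the explained part of y
        W[:, k], P[:, k] = w, p
    # Map the latent-space regression back to the original predictors.
    return W @ np.linalg.solve(P.T @ W, q)

# Synthetic data (hypothetical): 50 observations, 4 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * rng.normal(size=50)

b_pls = pls1_sketch(X, y, n_components=4)
b_ols, *_ = np.linalg.lstsq(X - X.mean(axis=0), y - y.mean(), rcond=None)
```

When all components are kept and X has full column rank, `b_pls` should agree with the ordinary least-squares solution `b_ols`. The practical value of PLS comes from keeping fewer components than there are predictors.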
Typical usage syntax
In the notation that is typically used for describing PLS, X
denotes the matrix formed by multidimensional observations, with
each row of X standing for the values taken by all the predictor
variables. And Y denotes the matrix formed by the values for the
predicted variables. Each row of Y corresponds to the prediction
that can be made on the basis of the corresponding row of X. Let's
say that you have some previously collected data for the X and Y
matrices in the form of CSV records in disk files. Given these X
and Y, you would want to calculate the matrix B of regression
coefficients with this module. Toward that end, you can make the
following calls in your script:
    import PartialLeastSquares as PLS
    XMatrix_file = "X_data.csv"
    YMatrix_file = "Y_data.csv"
    pls = PLS.PartialLeastSquares(
            XMatrix_file = XMatrix_file,
            YMatrix_file = YMatrix_file,
            epsilon = 0.0001,
          )
    pls.get_XMatrix_from_csv()
    pls.get_YMatrix_from_csv()
    B = pls.PLS()
The object B returned by the last call will be a numpy matrix
consisting of the calculated regression coefficients. Let's say
that you now have a matrix Xtest of new data for the predictor
variables. All you have to do to calculate the values for the
predicted variables is
    Ytest = (Xtest - pls.mean0X) * B + pls.mean0Y
where pls.mean0X is the column-wise mean of the X matrix and
pls.mean0Y the same for the Y matrix.
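
The prediction formula above can be checked by hand on small arrays. The sketch below does not call this module. The values used for B, mean0X and mean0Y are made-up stand-ins for what a fitted model would provide, and the helper name `predict` is hypothetical. Note that with numpy matrix objects `*` is the matrix product, whereas with plain ndarrays, as used here, the matrix product is written `@`.

```python
import numpy as np

def predict(Xtest, B, mean0X, mean0Y):
    """Center Xtest with the training mean, apply B, add back the Y mean."""
    return (Xtest - mean0X) @ B + mean0Y

# Made-up stand-ins for pls.mean0X, pls.mean0Y and B (3 predictors, 2 outputs).
mean0X = np.array([[1.0, 2.0, 3.0]])
mean0Y = np.array([[10.0, 20.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# Two new observations; the second equals the training mean of X.
Xtest = np.array([[2.0, 2.0, 4.0],
                  [1.0, 2.0, 3.0]])

Ytest = predict(Xtest, B, mean0X, mean0Y)
print(Ytest)
```

An observation equal to the column-wise mean of X is mapped to the column-wise mean of Y, which is a quick sanity check on any fitted model of this form.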
Project details
Hashes for PartialLeastSquares-1.0.4.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | 963545bf5b5a8d3def958ed017718e97c1387b1a649293da395108c07f328e8a |
| MD5 | 5f96a040e4341d1c310efbc6b514f61c |
| BLAKE2b-256 | 405bf482386cd6e0ff0ea7e20e8793ab5aff40ef7608482ec854f249c9839876 |