A Python module for regression and classification with the Partial Least Squares algorithm
Project description
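Consult the module API page at
https://engineering.purdue.edu/kak/distPLS/PartialLeastSquares-1.0.4.html
for all information related to this module, including the latest changes to the code. The page at that URL lists all of the module functionality you can invoke in your own code.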
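If you need this module, it is likely for the following reasons: (1) you are trying to make multidimensional predictions from multidimensional observations; (2) the dimensionality of the observation space is large; and (3) the data available to you for constructing a prediction model is rather limited. Under these conditions, the more traditional Multiple Linear Regression (MLR) algorithm is likely to become numerically unstable.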
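The instability described above can be seen in a tiny case. MLR solves the normal equations, which requires inverting X^T X. When there are fewer observations than predictor variables, that matrix is singular. The sketch below is an illustration only: the 2x3 matrix X and the helper names `gram` and `det3` are made up for this example and are not part of the module.

```python
def gram(X):
    """Return X^T X for a matrix given as a list of rows."""
    cols = list(zip(*X))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Two observations (rows) of three predictor variables (columns):
# fewer samples than dimensions, as in condition (3) above.
X = [[1, 2, 3],
     [4, 5, 6]]

G = gram(X)  # the 3x3 matrix X^T X that MLR would need to invert
normal_equations_singular = det3(G) == 0
```

Because X has only two rows, X^T X has rank at most 2, so its determinant is zero and the MLR solution is not unique. With slightly more data the matrix is often invertible but badly conditioned, which is the situation PLS is meant to handle.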
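In addition to presenting an implementation of the main Partial Least Squares (PLS) algorithm, which can be used to make a multidimensional prediction from a multidimensional observation, this module also includes what is known as the PLS1 algorithm for the case when the predicted entity is just one-dimensional (as in, say, face recognition in computer vision).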
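To give a feel for the one-dimensional case, here is a minimal sketch of a textbook NIPALS-style PLS1 in plain Python. It is not the module's code and may differ from it in details such as stopping criteria. The function names `pls1` and `predict_one`, the `n_components` parameter, and the toy data are all assumptions made for this illustration.

```python
import math


def pls1(X, y, n_components):
    """Textbook-style PLS1 sketch: returns (b, mean_x, mean_y)."""
    n, m = len(X), len(X[0])
    mean_x = [sum(row[j] for row in X) / n for j in range(m)]
    mean_y = sum(y) / n
    E = [[row[j] - mean_x[j] for j in range(m)] for row in X]  # centered X
    f = [v - mean_y for v in y]                                # centered y
    b = [0.0] * m
    R, P = [], []  # rotation vectors and loadings from earlier components
    for _ in range(n_components):
        # Weight vector: direction in X space most covariant with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left in y to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # r expresses the scores in terms of the original centered X.
        r = w[:]
        for r_prev, p_prev in zip(R, P):
            c = sum(p_prev[j] * w[j] for j in range(m))
            r = [r[j] - c * r_prev[j] for j in range(m)]
        R.append(r)
        P.append(p)
        b = [b[j] + q * r[j] for j in range(m)]
        # Deflate: remove what this component explained.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
    return b, mean_x, mean_y


def predict_one(x, b, mean_x, mean_y):
    """Predict a scalar from one observation x."""
    return mean_y + sum((x[j] - mean_x[j]) * b[j] for j in range(len(x)))


# Toy data generated exactly as y = 1 + 2*x1 - 3*x2.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
y = [-3.0, 2.0, -8.0, 0.0]
b, mean_x, mean_y = pls1(X, y, n_components=2)
```

With as many components as X has independent columns, PLS1 reproduces the ordinary least squares fit, so `b` here comes out close to `[2, -3]`. The benefit of PLS shows up when fewer components are kept than there are predictor variables.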
Typical usage syntax
In the notation that is typically used for describing PLS, X denotes the matrix formed by multidimensional observations, with each row of X standing for the values taken by all the predictor variables. Y denotes the matrix formed by the values for the predicted variables. Each row of Y corresponds to the prediction that can be made on the basis of the corresponding row of X.

Let's say that you have some previously collected data for the X and Y matrices in the form of CSV records in disk files. Given these X and Y, you would want to calculate the matrix B of regression coefficients with this module. Toward that end, you can make the following calls in your script:

    import PartialLeastSquares as PLS

    XMatrix_file = "X_data.csv"
    YMatrix_file = "Y_data.csv"

    pls = PLS.PartialLeastSquares(
        XMatrix_file = XMatrix_file,
        YMatrix_file = YMatrix_file,
        epsilon = 0.0001,
    )
    pls.get_XMatrix_from_csv()
    pls.get_YMatrix_from_csv()
    B = pls.PLS()

The object B returned by the last call will be a numpy matrix consisting of the calculated regression coefficients. Let's say that you now have a matrix Xtest of new data for the predictor variables. All you have to do to calculate the values for the predicted variables is

    Ytest = (Xtest - pls.mean0X) * B + pls.mean0Y

where pls.mean0X is the column-wise mean of the X matrix and pls.mean0Y the same for the Y matrix.
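The prediction formula above is a centering step, a matrix product, and a shift back by the mean of Y. The sketch below spells out that arithmetic with plain Python lists so each step is visible. The function name `predict` and all numeric values are stand-ins invented for this example; they are not output of the module.

```python
def predict(X_test, mean_x, B, mean_y):
    """Compute (X_test - mean_x) * B + mean_y, row by row.

    X_test: list of observation rows, mean_x: column means of X,
    B: regression coefficients as a list of rows (one row per predictor),
    mean_y: column means of Y.
    """
    predictions = []
    for row in X_test:
        # Center the observation with the training means of X.
        centered = [x - m for x, m in zip(row, mean_x)]
        # Matrix product of the centered row with B, then add the Y means.
        out = [
            sum(centered[j] * B[j][k] for j in range(len(B))) + mean_y[k]
            for k in range(len(mean_y))
        ]
        predictions.append(out)
    return predictions


# Hypothetical stand-in values (two predictors, two predicted variables).
mean_x = [1.0, 2.0]
mean_y = [10.0, 20.0]
B = [[2.0, 0.0],
     [0.0, -1.0]]
X_test = [[2.0, 4.0]]

Y_test = predict(X_test, mean_x, B, mean_y)
```

Note that in the one-line formula `*` acts as a matrix product because B is a numpy matrix. If your own data is held in ordinary numpy arrays, `*` multiplies element by element, and `@` is the operator for the matrix product.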
Project details
Hashes for PartialLeastSquares-1.0.4.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 963545bf5b5a8d3def958ed017718e97c1387b1a649293da395108c07f328e8a
MD5 | 5f96a040e4341d1c310efbc6b514f61c
BLAKE2b-256 | 405bf482386cd6e0ff0ea7e20e8793ab5aff40ef7608482ec854f249c9839876