
A Python module for regression and classification with the Partial Least Squares algorithm

Project description

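Consult the module API page at

https://engineering.purdue.edu/kak/distPLS/PartialLeastSquares-1.0.4.html

for all information related to this module, including the latest changes to the code. The page at that URL lists all of the module functionality you can invoke in your own code.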

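You may need this module if (1) you are trying to make multidimensional predictions from multidimensional observations; (2) the dimensionality of the observation space is large; and (3) the data available to you for constructing a prediction model is rather limited. Under these conditions, the more traditional Multiple Linear Regression (MLR) algorithms are likely to become numerically unstable.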
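The instability mentioned above can be illustrated with a small NumPy sketch. It is not part of this module: the sizes and variable names are hypothetical and chosen only for illustration. With fewer observations than predictor variables, the matrix X^T X that ordinary least squares must invert cannot have full rank.

```python
import numpy as np

# Hypothetical sizes: far fewer observations than predictor variables.
n_samples, n_features = 5, 20

rng = np.random.default_rng(0)
X = rng.normal(size=(n_samples, n_features))

# Ordinary least squares needs to invert X^T X (n_features x n_features).
gram = X.T @ X

# The rank of X^T X can never exceed the number of observations,
# so here the matrix is singular and has no ordinary inverse.
rank = np.linalg.matrix_rank(gram)
print("rank of X^T X:", rank, "out of", n_features)
```

PLS sidesteps this inversion by regressing on a small number of latent components extracted from X.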

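In addition to an implementation of the main Partial Least Squares (PLS) algorithm, which makes a multidimensional prediction from a multidimensional observation, this module also includes what is known as the PLS1 algorithm for the case when the predicted entity is one-dimensional (as in, say, face recognition in computer vision).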
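For readers unfamiliar with the single-output case, the following is a minimal NumPy sketch of a textbook NIPALS-style PLS1 iteration. It is not the module's own implementation; the function name `pls1_coefficients` and the `n_components` parameter are hypothetical and introduced only for this illustration.

```python
import numpy as np

def pls1_coefficients(X, y, n_components):
    """Textbook PLS1 sketch: returns (b, intercept) so that y ~ X @ b + intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk = X - x_mean              # centered X, deflated at each step
    yc = y - y_mean              # centered y
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yc            # direction of maximal covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:         # nothing left to explain
            break
        w = w / norm
        t = Xk @ w               # score vector
        tt = t @ t
        p = Xk.T @ t / tt        # X loading
        q.append(yc @ t / tt)    # y loading (a scalar for PLS1)
        Xk = Xk - np.outer(t, p) # deflate X
        W.append(w)
        P.append(p)
    W, P, q = np.column_stack(W), np.column_stack(P), np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)
    return b, y_mean - x_mean @ b

# Noise-free toy data with a known linear relationship.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
b_true = np.array([1.5, -2.0, 0.5])
y = X @ b_true + 4.0
b, intercept = pls1_coefficients(X, y, n_components=3)
```

When as many components are used as X has independent columns, this sketch reproduces the ordinary least-squares coefficients; using fewer components is what gives PLS its regularizing effect.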

Typical usage syntax

In the notation that is typically used for describing PLS, X
denotes the matrix formed by multidimensional observations, with
each row of X standing for the values taken by all the predictor
variables.  And Y denotes the matrix formed by the values for the
predicted variables. Each row of Y corresponds to the prediction
that can be made on the basis of the corresponding row of X.  Let's
say that you have some previously collected data for the X and Y
matrices in the form of CSV records in disk files. Given these X
and Y, you would want to calculate the matrix B of regression
coefficients with this module.  Toward that end, you can make the
following calls in your script:

    import PartialLeastSquares as PLS

    XMatrix_file = "X_data.csv"
    YMatrix_file = "Y_data.csv"

    # Construct the PLS object from the two CSV files
    pls = PLS.PartialLeastSquares(
        XMatrix_file=XMatrix_file,
        YMatrix_file=YMatrix_file,
        epsilon=0.0001,
    )
    # Load the X and Y matrices from disk
    pls.get_XMatrix_from_csv()
    pls.get_YMatrix_from_csv()
    # Compute the matrix B of regression coefficients
    B = pls.PLS()

The object B returned by the last call will be a numpy matrix
consisting of the calculated regression coefficients.  Let's say
that you now have a matrix Xtest of new data for the predictor
variables.  All you have to do to calculate the values for the
predicted variables is

    Ytest = (Xtest - pls.mean0X) * B + pls.mean0Y

where pls.mean0X is the column-wise mean of the X matrix and
pls.mean0Y the same for the Y matrix.
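The prediction step can be checked on toy numbers without the module installed. The sketch below is an illustration only: the helper name `predict` and all values are hypothetical. It uses plain NumPy arrays, for which matrix multiplication is written `@`, whereas for the numpy matrix returned by the module `*` already means matrix multiplication.

```python
import numpy as np

def predict(X_new, B, mean_x, mean_y):
    """Center X_new with the training mean, apply B, then add back the Y mean."""
    X_new = np.asarray(X_new, dtype=float)
    return (X_new - np.asarray(mean_x)) @ np.asarray(B) + np.asarray(mean_y)

# Hypothetical stand-ins for pls.mean0X, pls.mean0Y and B.
mean_x = np.array([1.0, 2.0])    # column-wise mean of the training X
mean_y = np.array([10.0])        # column-wise mean of the training Y
B = np.array([[2.0], [3.0]])     # 2 predictors -> 1 predicted variable

X_test = np.array([[2.0, 4.0],
                   [1.0, 2.0]])
Y_test = predict(X_test, B, mean_x, mean_y)
print(Y_test)
```

The second test row equals the training mean of X, so its prediction is exactly the training mean of Y.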



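Download files

Download the file for your platform. If you are not sure which to choose, learn more about installing packages.

Source distribution

PartialLeastSquares-1.0.4.tar.gz (1.3 MB, view hashes)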
