PySolr · PyPI · The Python Package Index

Lightweight Python client for Apache Solr

Project description

pysolr is a lightweight Python client for Apache Solr. It provides an interface that queries the server and returns results based on the query.

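Status

Changelog

Features

Basic operations such as selecting, updating and deleting.
Index optimization.
"More Like This" support (if set up in Solr).
Spelling correction (if set up in Solr).
Timeout support.
SolrCloud awareness.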

Requirements

Python 2.7 - 3.7
Requests 2.9.1+
Optional - simplejson
Optional - kazoo for SolrCloud mode
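
An optional JSON dependency usually means the code falls back to the standard library when the faster package is missing. The following is a minimal sketch of that general import idiom. It is an illustration, not an excerpt from pysolr's source.

```python
# Prefer simplejson when it is installed; otherwise use the standard library.
try:
    import simplejson as json
except ImportError:
    import json

# Both modules expose the same dumps/loads interface for this round trip.
payload = json.dumps({"id": "doc_1", "title": "A test document"})
doc = json.loads(payload)
print(doc["title"])
```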

Installation

pysolr is on PyPI:

$ pip install pysolr

Or if you want to install directly from the repository:

$ python setup.py install

Usage

Basic usage looks like:

# If on Python 2.X
from __future__ import print_function

import pysolr

# Create a client instance. The timeout and authentication options are not required.
solr = pysolr.Solr('http://localhost:8983/solr/', always_commit=True, timeout=10)
# To authenticate, also pass auth=<a Requests-compatible authentication object>.

# Note that always_commit defaults to False for performance. You can set
# `always_commit=True` to have commands always update the index immediately, make
# an update call with `commit=True`, or use Solr's `autoCommit` / `commitWithin`
# to have your data be committed following a particular policy.

# Do a health check.
solr.ping()

# How you'd index data.
solr.add([
    {
        "id": "doc_1",
        "title": "A test document",
    },
    {
        "id": "doc_2",
        "title": "The Banana: Tasty or Dangerous?",
        "_doc": [
            { "id": "child_doc_1", "title": "peel" },
            { "id": "child_doc_2", "title": "seed" },
        ]
    },
])

# You can index a parent/child document relationship by
# associating a list of child documents with the special key '_doc'. This
# is helpful for queries that join together conditions on children and parent
# documents.

# Later, searching is easy. In the simple case, just a plain Lucene-style
# query is fine.
results = solr.search('bananas')

# The ``Results`` object stores total results found, by default the top
# ten most relevant results and any additional data like
# facets/highlighting/spelling/etc.
print("Saw {0} result(s).".format(len(results)))

# Just loop over it to access the results.
for result in results:
    print("The title is '{0}'.".format(result['title']))

# For a more advanced query, say involving highlighting, you can pass
# additional options to Solr.
results = solr.search('bananas', **{
    'hl': 'true',
    'hl.fragsize': 10,
})

# Traverse a cursor using its iterator:
for doc in solr.search('*:*', fl='id', sort='id ASC', cursorMark='*'):
    print(doc['id'])

# You can also perform More Like This searches, if your Solr is configured
# correctly.
similar = solr.more_like_this(q='id:doc_2', mltfl='text')

# Finally, you can delete either individual documents,
solr.delete(id='doc_1')

# also in batches...
solr.delete(id=['doc_1', 'doc_2'])

# ...or all documents.
solr.delete(q='*:*')

# For SolrCloud mode, initialize your Solr like this:

zookeeper = pysolr.ZooKeeper("zkhost1:2181,zkhost2:2181,zkhost3:2181")
solr = pysolr.SolrCloud(zookeeper, "collection1")
# Authentication is optional: pass auth=<a Requests-compatible authentication object> if needed.
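
The parent/child relationship described above relies on the special `_doc` key. As a rough sketch, a small helper can attach children to a parent before indexing. The helper name `with_children` is hypothetical and not part of pysolr.

```python
def with_children(parent, children):
    """Return a copy of `parent` with `children` attached under '_doc'."""
    doc = dict(parent)            # shallow copy so the caller's dict is untouched
    doc["_doc"] = list(children)  # '_doc' is the special key described above
    return doc


parent = {"id": "doc_2", "title": "The Banana: Tasty or Dangerous?"}
banana = with_children(
    parent,
    [
        {"id": "child_doc_1", "title": "peel"},
        {"id": "child_doc_2", "title": "seed"},
    ],
)
# On a configured client, the result could be indexed with solr.add([banana]).
```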

Multicore index

Simply point the URL to the index core:

# Set up a Solr instance. The timeout is optional.
solr = pysolr.Solr('http://localhost:8983/solr/core_0/', timeout=10)
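
When several cores are involved, it can help to build the per-core URLs in one place. The sketch below assumes a local Solr host and the core names `core_0` and `core_1`, all chosen for illustration. `core_url` is a hypothetical helper, not a pysolr function.

```python
def core_url(base_url, core):
    """Join a Solr base URL and a core name into a single core URL."""
    return "{0}/{1}/".format(base_url.rstrip("/"), core.strip("/"))


# Hypothetical host and core names, for illustration only.
urls = [core_url("http://localhost:8983/solr/", name) for name in ("core_0", "core_1")]
print(urls)
```

Each resulting URL could then be handed to its own `pysolr.Solr(...)` instance.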

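Custom request handlers

# Set up a Solr instance. The trailing slash is optional.
solr = pysolr.Solr('http://localhost:8983/solr/core_0/', search_handler='/autocomplete', use_qt_param=False)

If use_qt_param is True, the handler name must match the name configured in solrconfig.xml exactly, including any leading slash. If use_qt_param is False (the default), leading and trailing slashes can be omitted.

If search_handler is not specified, pysolr defaults to /select.

The handlers for MoreLikeThis, Update, Terms, etc. all default to the values set in the solrconfig.xml that ships with Solr: mlt, update, terms, etc. The specific methods of pysolr's Solr class (such as more_like_this, suggest_terms, etc.) allow that value to be overridden through a handler kwarg. This includes the search method. Setting a handler explicitly in search overrides the search_handler setting (if any).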
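
The per-call override could look roughly like the sketch below. The handler names `mlt_titles` and `terms_titles` are hypothetical and would need to exist in your solrconfig.xml. `RecordingClient` is a stand-in that only records calls so the example runs without a Solr server; its method signatures mirror what pysolr 3.x is understood to expose, which is worth checking against your installed version.

```python
def similar_to(solr, doc_id, field="text", handler="mlt"):
    """Run a More Like This query against an explicitly named handler."""
    return solr.more_like_this(q="id:{0}".format(doc_id), mltfl=field, handler=handler)


def suggest(solr, field, prefix, handler="terms"):
    """Ask a named Terms handler for completions of `prefix` in `field`."""
    return solr.suggest_terms(field, prefix, handler=handler)


class RecordingClient(object):
    """Stand-in for pysolr.Solr that records calls instead of contacting Solr."""

    def __init__(self):
        self.calls = []

    def more_like_this(self, q, mltfl, handler="mlt", **kwargs):
        self.calls.append(("more_like_this", q, mltfl, handler))
        return []

    def suggest_terms(self, fields, prefix, handler="terms", **kwargs):
        self.calls.append(("suggest_terms", fields, prefix, handler))
        return {}


client = RecordingClient()
similar_to(client, "doc_2", handler="mlt_titles")   # hypothetical handler name
suggest(client, "title", "ban", handler="terms_titles")  # hypothetical handler name
print(client.calls)
```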

Custom authentication

# Set up a Solr instance in a Kerberized environment
from requests_kerberos import HTTPKerberosAuth, OPTIONAL
kerberos_auth = HTTPKerberosAuth(mutual_authentication=OPTIONAL, sanitize_mutual_error_response=False)

solr = pysolr.Solr('http://localhost:8983/solr/', auth=kerberos_auth)

# Set up a SolrCloud instance in a Kerberized environment
from requests_kerberos import HTTPKerberosAuth, OPTIONAL
kerberos_auth = HTTPKerberosAuth(mutual_authentication=OPTIONAL, sanitize_mutual_error_response=False)

zookeeper = pysolr.ZooKeeper("zkhost1:2181/solr, zkhost2:2181,...,zkhostN:2181")
solr = pysolr.SolrCloud(zookeeper, "collection", auth=kerberos_auth)

If your Solr servers run off https

# Set up a Solr instance in an https environment
solr = pysolr.Solr('https://localhost:8983/solr/', verify='path/to/cert.pem')

# Set up a SolrCloud instance in an https environment

zookeeper = pysolr.ZooKeeper("zkhost1:2181/solr, zkhost2:2181,...,zkhostN:2181")
solr = pysolr.SolrCloud(zookeeper, "collection", verify='path/to/cert.pem')
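
Connection options such as `timeout`, `auth` and `verify` are often kept out of the code. The sketch below collects them from environment variables. The variable names (`SOLR_TIMEOUT`, `SOLR_CA_BUNDLE`, `SOLR_USER`, `SOLR_PASSWORD`) are made up for this example, and the `(user, password)` tuple assumes that pysolr passes `auth` through to Requests unchanged, where a tuple means HTTP basic authentication.

```python
import os


def solr_options_from_env(environ=None):
    """Collect keyword arguments for a Solr client from environment variables."""
    environ = os.environ if environ is None else environ
    options = {"timeout": int(environ.get("SOLR_TIMEOUT", "10"))}
    ca_bundle = environ.get("SOLR_CA_BUNDLE")
    if ca_bundle:
        options["verify"] = ca_bundle  # path to a CA bundle or certificate
    user = environ.get("SOLR_USER")
    if user:
        options["auth"] = (user, environ.get("SOLR_PASSWORD", ""))
    return options


# A plain dict stands in for os.environ here so the example is repeatable.
options = solr_options_from_env({"SOLR_CA_BUNDLE": "path/to/cert.pem", "SOLR_USER": "solr"})
print(options)
```

With pysolr installed, the result could be expanded into the constructor, for example `pysolr.Solr(url, **options)`.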

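Custom commit policy

# Set up a Solr instance. The trailing slash is optional.
# All requests to Solr will be immediately committed because `always_commit=True`:
solr = pysolr.Solr('http://localhost:8983/solr/core_0/', search_handler='/autocomplete', always_commit=True)

always_commit tells the Solr object whether to commit by default on any Solr request. Be sure to set it to True if you are upgrading from a version where the default policy was to always commit.

Functions such as add and delete also let you override the default by passing the commit kwarg.

It is generally good practice to limit the number of commits sent to Solr, because excessive commits risk opening too many searchers or consuming too many system resources. See the Solr documentation for more information on the autoCommit and commitWithin options: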

https://lucene.apache.org/solr/guide/7_7/updatehandlers-in-solrconfig.html#UpdateHandlersinSolrConfig-autoCommit
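
One way to limit commits is to send documents in batches with `commit=False` and commit once at the end. The sketch below assumes `solr` behaves like a pysolr client with `add` and `commit` methods. `FakeSolr` is a stand-in so the example runs without a server, and the batch size is arbitrary.

```python
def index_in_batches(solr, docs, batch_size=500):
    """Send `docs` in fixed-size batches, then commit once at the end."""
    sent = 0
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        solr.add(batch, commit=False)  # defer the commit for this batch
        sent += len(batch)
    solr.commit()  # single explicit commit for the whole run
    return sent


class FakeSolr(object):
    """Stand-in client that records batch sizes and commit calls."""

    def __init__(self):
        self.batch_sizes = []
        self.commits = 0

    def add(self, docs, commit=None):
        self.batch_sizes.append(len(docs))

    def commit(self):
        self.commits += 1


fake = FakeSolr()
documents = [{"id": "doc_{0}".format(n)} for n in range(5)]
total = index_in_batches(fake, documents, batch_size=2)
print(total, fake.batch_sizes, fake.commits)
```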

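License

pysolr is licensed under the New BSD license.

Contributing to pysolr

For consistency, this project uses pre-commit to manage Git commit hooks:

Install the pre-commit package: e.g. brew install pre-commit, pip install pre-commit, etc.
Run pre-commit install each time you check out a new copy of this Git repository, so that every subsequent commit is processed by running pre-commit run, which you can also run on demand. To test the entire repository, or in a CI scenario, use pre-commit run --all to check every file rather than just the staged ones.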

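Running tests

The run-tests.py script will automatically perform the steps below and is recommended as the default test option unless you need more control.

Running a test Solr instance

Downloading, configuring and running Solr 4 looks like this:

./start-solr-test-server.sh

Running the tests

$ python -m unittest tests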

Hashes for pysolr-3.10.0.tar.gz

Algorithm	Hash digest
SHA256	`127b4a2dd169234acb1586643a6cd1e3e94b917921e69bf569d7b2a2aa0ef409`
MD5	`a96dca3a3ea8e52b988876b2394f56c1`
BLAKE2b-256	`6df559f3375b12172651c02615ed54b9ca8e5ca7cbf6d89a8506a5daec4ed813`
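
A downloaded archive can be checked against the SHA256 digest listed above using only the standard library. This is a minimal sketch. The local file name passed to `matches_published_hash` is an assumption about where the archive was saved.

```python
import hashlib

# SHA256 digest published above for pysolr-3.10.0.tar.gz.
EXPECTED_SHA256 = "127b4a2dd169234acb1586643a6cd1e3e94b917921e69bf569d7b2a2aa0ef409"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Compare the file's digest with the published one, ignoring case."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_published_hash("pysolr-3.10.0.tar.gz")` would be expected to return True for an intact download saved under that name.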

PySolr 3.10.0
