
Fuzzy string matching in Python

Project description

https://travis-ci.org/graingert/fuzzywuzzymit.svg?branch=master

fuzzywuzzymit

Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.
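For orientation, Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The following is a minimal pure-Python sketch of that idea. The function name `levenshtein` is hypothetical and not part of fuzzywuzzymit; since difflib is listed as a requirement, the package itself may compute similarity differently.

```python
def levenshtein(a, b):
    """Return the edit distance between strings a and b (illustrative helper)."""
    # previous[j] is the distance between the prefix of a processed so far and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```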

Requirements

  • Python 2.4 or higher

  • difflib

For testing

  • pycodestyle

  • hypothesis

  • pytest

Installation

Using PIP via PyPI

pip install fuzzywuzzymit

Using PIP via GitHub

pip install git+git://github.com/graingert/fuzzywuzzymit.git@0.16.0#egg=fuzzywuzzymit

Adding to your requirements.txt file (run pip install -r requirements.txt afterwards)

git+ssh://git@github.com/graingert/fuzzywuzzymit.git@0.16.0#egg=fuzzywuzzymit

Manually via GIT

git clone git://github.com/graingert/fuzzywuzzymit.git fuzzywuzzymit
cd fuzzywuzzymit
python setup.py install

Usage

>>> from fuzzywuzzymit import fuzz
>>> from fuzzywuzzymit import process

Simple Ratio

>>> fuzz.ratio("this is a test", "this is a test!")
    97
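To show where a score like 97 can come from, here is a minimal sketch built on the standard library's difflib, which the requirements list names. The helper `simple_ratio` is a hypothetical stand-in, not the package's own code, and the rounding step is an assumption.

```python
from difflib import SequenceMatcher


def simple_ratio(s1, s2):
    """Similarity of two strings as an integer from 0 to 100 (illustrative)."""
    # SequenceMatcher.ratio() returns 2*M/T, where M is the number of
    # matched characters and T is the combined length of both strings.
    return int(round(100 * SequenceMatcher(None, s1, s2).ratio()))


print(simple_ratio("this is a test", "this is a test!"))  # 97
```

Here M is 14 and T is 29, so the ratio is 28/29, which rounds to 97.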

Partial Ratio

>>> fuzz.partial_ratio("this is a test", "this is a test!")
    100
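One way to picture a partial ratio is to compare the shorter string against every same-length window of the longer string and keep the best score. The sketch below takes that brute-force approach with difflib; `partial_ratio_sketch` is a hypothetical name, returning 0 for empty input is an assumption, and the package may select candidate windows differently.

```python
from difflib import SequenceMatcher


def partial_ratio_sketch(s1, s2):
    """Best score of the shorter string against any equal-length window of the longer."""
    shorter, longer = sorted((s1, s2), key=len)
    if not shorter:
        return 0  # assumption: empty input scores 0
    best = 0.0
    for start in range(len(longer) - len(shorter) + 1):
        window = longer[start:start + len(shorter)]
        best = max(best, SequenceMatcher(None, shorter, window).ratio())
    return int(round(100 * best))


print(partial_ratio_sketch("this is a test", "this is a test!"))  # 100
```

The first window of the longer string equals the shorter string exactly, so the trailing "!" no longer lowers the score.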

Token Sort Ratio

>>> fuzz.ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
    91
>>> fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
    100
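The jump from 91 to 100 is consistent with sorting the words of each string before comparing them, so word order stops mattering. A minimal sketch of that idea follows; `token_sort_ratio_sketch` is a hypothetical helper, and lowercasing plus whitespace splitting are assumed preprocessing steps.

```python
from difflib import SequenceMatcher


def token_sort_ratio_sketch(s1, s2):
    """Compare two strings after sorting their whitespace-separated tokens."""
    def normalize(text):
        # Lowercase, split on whitespace, sort the tokens, and rejoin them.
        return " ".join(sorted(text.lower().split()))

    ratio = SequenceMatcher(None, normalize(s1), normalize(s2)).ratio()
    return int(round(100 * ratio))


print(token_sort_ratio_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))  # 100
```

Both inputs normalize to "a bear fuzzy was wuzzy", so the comparison is between identical strings.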

Token Set Ratio

>>> fuzz.token_sort_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
    84
>>> fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
    100
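A token set comparison can ignore repeated words by working on sets of tokens. The sketch below follows a commonly described approach: build the sorted intersection and the two sorted remainders, then take the best of three pairwise comparisons. The names are hypothetical and the details are assumptions, not a copy of the package's implementation.

```python
from difflib import SequenceMatcher


def _ratio(a, b):
    """0-100 similarity of two strings via difflib."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def token_set_ratio_sketch(s1, s2):
    """Compare two strings by their sets of tokens, ignoring duplicates and order."""
    tokens1 = set(s1.lower().split())
    tokens2 = set(s2.lower().split())
    common = " ".join(sorted(tokens1 & tokens2))
    only1 = " ".join(sorted(tokens1 - tokens2))
    only2 = " ".join(sorted(tokens2 - tokens1))
    # Shared tokens followed by the tokens unique to each side.
    combined1 = (common + " " + only1).strip()
    combined2 = (common + " " + only2).strip()
    return max(
        _ratio(common, combined1),
        _ratio(common, combined2),
        _ratio(combined1, combined2),
    )


print(token_set_ratio_sketch("fuzzy was a bear", "fuzzy fuzzy was a bear"))  # 100
```

In this example both strings have the same token set, so the duplicate "fuzzy" has no effect and every comparison is between identical strings.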

Process

>>> choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
>>> process.extract("new york jets", choices, limit=2)
    [('New York Jets', 100), ('New York Giants', 78)]
>>> process.extractOne("cowboys", choices)
    ("Dallas Cowboys", 90)
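Conceptually, extraction scores every choice against the query and returns the highest-scoring ones. Here is a minimal sketch of that ranking step under stated assumptions: `extract_sketch`, `extract_one_sketch`, and `default_scorer` are hypothetical names, and the scorer is a plain case-insensitive difflib ratio, so its scores will generally differ from those shown above.

```python
import heapq
from difflib import SequenceMatcher


def default_scorer(query, choice):
    """Case-insensitive 0-100 similarity (an assumed stand-in scorer)."""
    ratio = SequenceMatcher(None, query.lower(), choice.lower()).ratio()
    return int(round(100 * ratio))


def extract_sketch(query, choices, scorer=default_scorer, limit=5):
    """Return up to `limit` (choice, score) pairs, best score first."""
    scored = [(choice, scorer(query, choice)) for choice in choices]
    return heapq.nlargest(limit, scored, key=lambda pair: pair[1])


def extract_one_sketch(query, choices, scorer=default_scorer):
    """Return the single best (choice, score) pair, or None if there are no choices."""
    results = extract_sketch(query, choices, scorer=scorer, limit=1)
    return results[0] if results else None


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
print(extract_sketch("new york jets", choices, limit=2))
print(extract_one_sketch("cowboys", choices))
```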

You can also pass additional parameters to the extractOne method to make it use a specific scorer. A typical use case is to match file paths:

>>> process.extractOne("System of a down - Hypnotize - Heroin", songs)
    ('/music/library/good/System of a Down/2005 - Hypnotize/01 - Attack.mp3', 86)
>>> process.extractOne("System of a down - Hypnotize - Heroin", songs, scorer=fuzz.token_sort_ratio)
    ("/music/library/good/System of a Down/2005 - Hypnotize/10 - She's Like Heroin.mp3", 61)
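The example above suggests that a scorer is simply a callable taking two strings and returning a score. To illustrate why the choice of scorer matters for file paths, the sketch below defines a hypothetical scorer that looks only at the file name. `basename_scorer` and `best_match` are illustrative names, the `songs` list contains just the two paths shown in the outputs above, and whether such a scorer can be passed unchanged to `process.extractOne` depends on the library's preprocessing, which is not verified here.

```python
import posixpath
from difflib import SequenceMatcher


def basename_scorer(query, path):
    """Score the query against the file name only, ignoring directories and extension."""
    name = posixpath.splitext(posixpath.basename(path))[0]
    ratio = SequenceMatcher(None, query.lower(), name.lower()).ratio()
    return int(round(100 * ratio))


def best_match(query, choices, scorer):
    """Return the (choice, score) pair with the highest score under the given scorer."""
    return max(((c, scorer(query, c)) for c in choices), key=lambda pair: pair[1])


# Hypothetical library contents, taken from the two results shown above.
songs = [
    "/music/library/good/System of a Down/2005 - Hypnotize/01 - Attack.mp3",
    "/music/library/good/System of a Down/2005 - Hypnotize/10 - She's Like Heroin.mp3",
]
print(best_match("She's Like Heroin", songs, basename_scorer))
```

Because the long shared directory prefix is ignored, the score reflects only how well the query matches the track name.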

Known Ports

fuzzywuzzymit is being ported to other languages too! Here are a few ports we know about.
