number-parser

parse numbers written in natural language

Development Status
- 4 - Beta
Intended Audience
- Developers
License
- OSI Approved :: BSD License
Natural Language
- English
Operating System
- OS Independent

Project description

number-parser is a simple library that converts numbers written in natural language to their equivalent numeric forms. It currently supports cardinal numbers in English, Hindi, Spanish, Ukrainian and Russian, and ordinal numbers in English.

Installation

pip install number-parser

number-parser requires Python 3.6+.

Usage

The library covers the following common use cases.

Converting numbers in-place

Identifying the numbers in a text string and converting them to their numeric values, while leaving non-numeric words unchanged. Ordinal numbers are converted as well (English only).

>>> from number_parser import parse
>>> parse("I have two hats and thirty seven coats")
'I have 2 hats and 37 coats'
>>> parse("One, Two, Three go")
'1, 2, 3 go'
>>> parse("First day of year two thousand")
'1 day of year 2000'
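
Because parse returns a string, the converted values still have to be pulled out if they are needed as integers. The following is a minimal sketch using only the standard library, applied to the output strings shown above. extract_integers is a hypothetical helper for illustration and is not part of number-parser.

```python
import re


def extract_integers(parsed_text):
    """Return every run of digits in parsed_text as an int, in order."""
    return [int(match) for match in re.findall(r"\d+", parsed_text)]


# Strings copied from the parse() outputs above.
print(extract_integers("I have 2 hats and 37 coats"))  # [2, 37]
print(extract_integers("1, 2, 3 go"))                  # [1, 2, 3]
```

This simple pattern only looks for unsigned runs of digits, so it would need adjusting for text that already contains decimals or negative values.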

Parsing a number

Converting a single number written in words to its corresponding integer.

>>> from number_parser import parse_number
>>> parse_number("two thousand and twenty")
2020
>>> parse_number("not_a_number")
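
The last call above prints nothing, which in an interactive session is consistent with a return value of None. Assuming that behaviour, a caller may want to substitute a fallback value. In this sketch parse_or_default is a hypothetical helper, and the parser is passed in as an argument so the snippet runs without the library installed. In real use one would presumably pass number_parser.parse_number instead of the stand-in.

```python
def parse_or_default(text, parser, default=None):
    """Call parser(text) and return default when the parser yields None."""
    value = parser(text)
    return default if value is None else value


# Stand-in for parse_number, for illustration only: it knows a single phrase.
def fake_parse_number(text):
    return {"two thousand and twenty": 2020}.get(text)


print(parse_or_default("two thousand and twenty", fake_parse_number, default=0))  # 2020
print(parse_or_default("not_a_number", fake_parse_number, default=0))             # 0
```

Checking for None explicitly, rather than relying on truthiness, keeps a legitimate result of 0 from being replaced by the default.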

Parsing an ordinal

Converting a single ordinal number written in words to its corresponding integer. (Support for English only)

>>> from number_parser import parse_ordinal
>>> parse_ordinal("twenty third")
23
>>> parse_ordinal("seventy fifth")
75

Parsing a fraction

Converting a fractional number written in words to its corresponding integral fraction, returned as a string. (Support for English only)

>>> from number_parser import parse_fraction
>>> parse_fraction("forty two divided by five hundred and six")
'42/506'
>>> parse_fraction("one over two")
'1/2'
>>> parse_fraction("forty two / one million")
'42/1000000'
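
The results above are strings of the form 'numerator/denominator'. If an exact numeric value is needed, one option is the standard library's fractions.Fraction, which accepts that string form directly. This is a sketch applied to the outputs shown above; to_fraction is a hypothetical helper name, not part of number-parser.

```python
from fractions import Fraction


def to_fraction(parsed):
    """Turn a 'numerator/denominator' string into a reduced Fraction."""
    return Fraction(parsed)


# Strings copied from the parse_fraction() outputs above.
print(to_fraction("42/506"))      # 21/253
print(to_fraction("1/2"))         # 1/2
print(to_fraction("42/1000000"))  # 21/500000
```

Note that Fraction reduces to lowest terms, so the original numerator and denominator are not preserved.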

Language Support

The default language is English. To parse other languages, pass the language parameter with the corresponding locale code. Cardinal numbers are currently supported in English, Hindi, Spanish, Ukrainian and Russian, and ordinal numbers in English.

>>> from number_parser import parse, parse_number
>>> parse("Hay tres gallinas y veintitrés patos", language='es')
'Hay 3 gallinas y 23 patos'
>>> parse_number("चौदह लाख बत्तीस हज़ार पाँच सौ चौबीस", language='hi')
1432524

Supported cases

The library has extensive tests. Some of the supported cases are described below.

Accurately handling conjunctions used while forming a number.

>>> parse("doscientos cincuenta y doscientos treinta y uno y doce", language='es')
'250 y 231 y 12'

Handling ambiguous cases without proper separators.

>>> parse("two thousand thousand")
'2000 1000'
>>> parse_number("two thousand two million")
2002000000

Handling nuances in the language with different forms of the same number.

>>> parse_number("пятисот девяноста шести", language='ru')
596
>>> parse_number("пятистам девяноста шести", language='ru')
596
>>> parse_number("пятьсот девяносто шесть", language='ru')
596

Contributing

Source code: https://github.com/scrapinghub/number-parser
Issue tracker: https://github.com/scrapinghub/number-parser/issues

Changes

0.3.0 (2022-10-20)

Improvements:
- Added support for bigger numbers in Spanish (#43)
- Added pytest flake8 (#44)
- Refactored the code (#45)
- Improved testing (#46)
- Improved scripts (#47)
- Added tests (#50, #72)
- Added GitHub actions (#54, #55, #56, #57)
- Added support for simple fractions (#60)

New features:
- Added feature to parse numbers in Ukrainian (#79)

0.2.1 (2020-08-25)

Fix tokenization bug - Hindi

0.2.0 (2020-08-18)

Ordinal Number Support

0.1.0 (2020-07-30)

Initial release.

Algorithm	Hash digest
SHA256	`c7a98542a6e412ccf126f5f0a08bfe4098504808155692df59af3ecc8b2a1314`
MD5	`93b5f3d33ec95621dd15f54a80a58319`
BLAKE2b-256	`9a332b76c5ce7d40a70bcd1986367e205f2b1bdcee11126b3824ecef76d25f4e`

Algorithm	Hash digest
SHA256	`0ea8250a51cf3176f6d59e04e3a4180446d1cafb57b44211ca726f8566bc3630`
MD5	`99086206368a107c280224aa55f892e7`
BLAKE2b-256	`52cea215fd999c19d172d0d91e4afbb78fb410590ca44b77e41ad8bb33da8893`

number-parser 0.3.0
