smart-importer · PyPI · The Python Package Index

Augment Beancount importers with machine learning functionality.

https://github.com/beancount/smart_importer

Status

Working prototype, development status: beta.

Installation

smart_importer can be installed from PyPI:

pip install smart_importer

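Quickstart

This package provides import hooks that can modify the imported entries. When the importer runs, the existing entries are used as training data for a machine learning model, which then predicts entry attributes.

The following example shows how to apply the PredictPostings hook to an existing CSV importer: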

from beancount.ingest.importers import csv
from beancount.ingest.importers.csv import Col

from smart_importer import apply_hooks, PredictPostings


class MyBankImporter(csv.Importer):
    '''Conventional importer for MyBank'''

    def __init__(self, *, account):
        super().__init__(
            {Col.DATE: 'Date',
             Col.PAYEE: 'Transaction Details',
             Col.AMOUNT_DEBIT: 'Funds Out',
             Col.AMOUNT_CREDIT: 'Funds In'},
            account,
            'EUR',
            (
                'Date, Transaction Details, Funds Out, Funds In'
            )
        )


CONFIG = [
    apply_hooks(MyBankImporter(account='Assets:MyBank:MyAccount'), [PredictPostings()])
]

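Documentation

This section explains in detail the concepts and artifacts needed to enhance Beancount importers with machine learning.

Beancount Importers

Assume you have created an importer for "MyBank" called MyBankImporter: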

from beancount.ingest import importer


class MyBankImporter(importer.ImporterProtocol):
    """My existing importer"""
    # the actual importer logic would be here...

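Note: This documentation assumes you already know how to create Beancount importers. Relevant documentation can be found in the Beancount import documentation. With the functionality of beancount.ingest, users can write their own importers and use them to convert downloaded bank statements into lists of Beancount entries. Examples are provided as part of the beancount v2 source code under examples/ingest/office.

smart_importer only works by appending to incomplete single-legged transactions; it does not work by modifying postings with placeholder accounts such as "Expenses:TODO". The extract method of the importer should follow the latest interface and include an existing_entries argument.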
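
To make "single-legged" concrete, the sketch below models a transaction with plain Python data classes instead of Beancount's own types. The names Posting, Transaction, and append_predicted_posting are hypothetical stand-ins introduced for illustration only; they are not part of beancount or smart_importer, and the logic only mirrors the behaviour described above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Posting:
    """Hypothetical stand-in for one leg of a transaction."""
    account: str


@dataclass
class Transaction:
    """Hypothetical stand-in for an imported transaction."""
    payee: str
    postings: List[Posting] = field(default_factory=list)


def append_predicted_posting(txn: Transaction, predicted_account: str) -> Transaction:
    """Append a second leg only when the transaction has exactly one posting."""
    if len(txn.postings) != 1:
        # Not single-legged (for example, a placeholder leg such as
        # Expenses:TODO is already present): leave the transaction untouched.
        return txn
    return Transaction(txn.payee, txn.postings + [Posting(predicted_account)])


incomplete = Transaction("Coffee Shop", [Posting("Assets:MyBank:MyAccount")])
completed = append_predicted_posting(incomplete, "Expenses:Coffee")
print([p.account for p in completed.postings])
```

In this model, an importer that emits only the bank-account leg leaves room for a predicted second leg, while one that already fills in a placeholder account does not.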

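Applying smart_importer hooks

Any Beancount importer can be converted into a smart importer by applying one of the following hooks:

PredictPostings - predict the list of postings.
PredictPayees - predict the payee of the transaction.
DuplicateDetector - detect duplicates.

For example, to convert an existing MyBankImporter into a smart importer: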

from your_custom_importer import MyBankImporter
from smart_importer import apply_hooks, PredictPayees, PredictPostings

my_bank_importer = MyBankImporter('whatever', 'config', 'is', 'needed')
apply_hooks(my_bank_importer, [PredictPostings(), PredictPayees()])

CONFIG = [
    my_bank_importer,
]

Note that the importer hooks need to be applied to an importer instance, as shown above.
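
One way to picture why hooks attach to an instance rather than a class is the following minimal sketch, which wraps the extract method of a single object. It is an assumption about how such a mechanism could look, not smart_importer's actual implementation; DemoImporter, UppercaseHook, and apply_demo_hooks are hypothetical names, and the "entries" are plain strings.

```python
import types


class DemoImporter:
    """Hypothetical importer whose extract method accepts existing_entries."""

    def extract(self, file, existing_entries=None):
        return ["entry from " + file]


class UppercaseHook:
    """Hypothetical hook: a callable that post-processes imported entries."""

    def __call__(self, imported_entries, existing_entries):
        return [entry.upper() for entry in imported_entries]


def apply_demo_hooks(importer, hooks):
    """Wrap extract on this one instance so each hook runs on its result."""
    original_extract = importer.extract  # bound method of this instance

    def patched_extract(self, file, existing_entries=None):
        entries = original_extract(file, existing_entries=existing_entries)
        for hook in hooks:
            entries = hook(entries, existing_entries)
        return entries

    # Rebind on the instance only; the class and other instances are unchanged.
    importer.extract = types.MethodType(patched_extract, importer)
    return importer


hooked = apply_demo_hooks(DemoImporter(), [UppercaseHook()])
plain = DemoImporter()
print(hooked.extract("statement.csv"))
print(plain.extract("statement.csv"))
```

Only the wrapped instance runs the hook; a second, unwrapped instance of the same class behaves as before.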

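Specifying training data

For the smart_importer hooks to be effective, they need training data, i.e. a list of existing transactions. Training data can be specified by calling bean-extract with an argument that references existing Beancount transactions, e.g. bean-extract -f existing_transactions.beancount. When using the importer in Fava, the existing entries are used as training data automatically.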
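
The idea of learning from existing transactions can be illustrated with a deliberately simple word-vote classifier. This is a toy technique swapped in for illustration, not the model smart_importer uses; the functions train and predict and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def train(existing):
    """Count how often each lowercase payee word co-occurs with each account."""
    word_counts = defaultdict(Counter)
    for payee, account in existing:
        for word in payee.lower().split():
            word_counts[word][account] += 1
    return word_counts


def predict(word_counts, payee, default=None):
    """Return the account with the most word votes, or default if no word is known."""
    votes = Counter()
    for word in payee.lower().split():
        votes.update(word_counts.get(word, {}))
    return votes.most_common(1)[0][0] if votes else default


# Hypothetical (payee, account) pairs standing in for existing transactions.
existing_transactions = [
    ("Corner Coffee Shop", "Expenses:Coffee"),
    ("City Coffee Roasters", "Expenses:Coffee"),
    ("Metro Transit Ticket", "Expenses:Transport"),
]
model = train(existing_transactions)
print(predict(model, "Coffee Shop Downtown"))
print(predict(model, "Unknown Merchant"))
```

With no existing transactions the model has nothing to vote with and returns the default, which is why the hooks need training data to be effective.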

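Usage with Fava

Smart importers are compatible with Fava. This means you can use smart importers with Fava in exactly the same way as conventional importers. See Fava's help on importers for more information.

Development

Pull requests welcome!

Executing the unit tests

Simply run (requires tox):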

make test

Configuring logging

The smart_importer module uses Python's logging module. The corresponding log level can be changed as follows:

import logging
logging.getLogger('smart_importer').setLevel(logging.DEBUG)
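
Setting a level alone produces no output unless a handler is configured. The sketch below, using only the standard logging module, adds a basic handler and raises verbosity for the smart_importer logger hierarchy only. The child logger name is a hypothetical example chosen to show level inheritance.

```python
import logging

# Send log records to stderr with timestamps; other libraries stay at WARNING.
logging.basicConfig(
    level=logging.WARNING,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Raise verbosity only for the smart_importer logger hierarchy.
smart_logger = logging.getLogger("smart_importer")
smart_logger.setLevel(logging.DEBUG)

# Child loggers (this name is a hypothetical example) inherit the DEBUG level.
child = logging.getLogger("smart_importer.example_child")
print(child.getEffectiveLevel() == logging.DEBUG)
```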

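Using tokenizers

Custom tokenizers let smart_importer support more languages, e.g. Chinese.

If you are looking for a Chinese tokenizer, you can follow this example:

First make sure that jieba is installed in your Python environment:

pip install jieba

Then pass jieba as the tokenizer argument in your importer code: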

from smart_importer import PredictPostings
import jieba

jieba.initialize()
tokenizer = lambda s: list(jieba.cut(s))

predictor = PredictPostings(string_tokenizer=tokenizer)
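
A tokenizer here is simply a function that takes a string and returns a list of tokens. As a rough sketch without third-party dependencies, the hypothetical simple_tokenizer below splits Latin words and digits into lowercase tokens and treats each CJK ideograph as its own token. It is much cruder than jieba's word segmentation and is shown only to illustrate the expected shape of such a function.

```python
import re

# One token per run of Latin letters/digits, or per single CJK ideograph.
_TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+|[\u4e00-\u9fff]")


def simple_tokenizer(text):
    """Split text into lowercase Latin words and individual CJK characters."""
    return [token.lower() for token in _TOKEN_PATTERN.findall(text)]


print(simple_tokenizer("星巴克 Coffee 12"))
```

A function like this could be passed as the string_tokenizer argument in the same way as the jieba lambda above.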

Release history

0.5 - January 21, 2024 (this version)
0.4 - December 16, 2022
0.3 - February 20, 2021
0.2 - February 20, 2021
0.1 - December 25, 2018

Download files

Download the file for your platform.

Source distribution

smart_importer-0.5.tar.gz (16.8 kB) - uploaded January 21, 2024, Source

Built distribution

smart_importer-0.5-py3-none-any.whl (10.6 kB) - uploaded January 21, 2024, Python 3

Hashes for smart_importer-0.5.tar.gz

Algorithm	Hash digest
SHA256	`9f49816b2837372ff9787072a270e7aa90de12bbf7b43869e7bedc0a833a9752`
MD5	`8f9fbb9090765de180ef5661772d29c5`
BLAKE2b-256	`b9bec7096f5a569e10456338a1db9ef126da8e06aa1c79a9ae61319fb202b210`

Hashes for smart_importer-0.5-py3-none-any.whl

Algorithm	Hash digest
SHA256	`8f99a4f444a485477aec6a3b70ed566c6de415ea1cb1552ac4d5524023c2883b`
MD5	`66f1e69f91340774f30003eb5c9195cc`
BLAKE2b-256	`e604aafd1307007c2133c859f18699c28be2a68c067d77879b1e5451673d4765`
