Microsoft Azure AI Document Intelligence client library for Python

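Azure AI Document Intelligence (previously known as Form Recognizer) is a cloud service that uses machine learning to analyze text and structured data from your documents. It includes the following main features:

Layout - Extract content and structure (e.g. words, selection marks, tables) from documents.
Document - Analyze key-value pairs in addition to general layout from documents.
Read - Read page information from documents.
Prebuilt - Extract common field values from select document types (e.g. receipts, invoices, business cards, ID documents, U.S. W-2 tax documents, among others) using prebuilt models.
Custom - Build custom models from your own data to extract tailored field values in addition to general layout from documents.
Classifiers - Build custom classification models that combine layout and language features to accurately detect and identify the documents you process within your application.
Add-on capabilities - Extract barcodes/QR codes, formulas, font/style, etc., or enable high-resolution mode for large documents with optional parameters.

Source code | Package (PyPI) | API reference documentation | Product documentation | Samples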

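Disclaimer

The latest service API is currently available only in some Azure regions; see the service's regional availability documentation for the list of regions.

Getting started

Install the package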

python -m pip install azure-ai-documentintelligence

This table shows the relationship between SDK versions and the API service versions they support:

SDK version	Supported API service version
1.0.0b1	2023-10-31-preview
1.0.0b2	2024-02-29-preview

Older API versions are supported in azure-ai-formrecognizer; see the Migration Guide for detailed instructions on how to update your application.

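Prerequisites

Python 3.8 or later is required to use this package.
You need an Azure subscription to use this package.
An existing Azure AI Document Intelligence instance.

Create a Cognitive Services or Document Intelligence resource

Document Intelligence supports both multi-service and single-service access. Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Document Intelligence access only, create a Document Intelligence resource. Note that you will need a single-service resource if you intend to use Azure Active Directory authentication.

You can create either resource using:

Option 1: Azure Portal.
Option 2: Azure CLI.

Below is an example of how you can create a Document Intelligence resource using the CLI: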

# Create a new resource group to hold the Document Intelligence resource
# if using an existing resource group, skip this step
az group create --name <your-resource-group-name> --location <location>

# Create the Document Intelligence resource
az cognitiveservices account create \
    --name <your-resource-name> \
    --resource-group <your-resource-group-name> \
    --kind FormRecognizer \
    --sku <sku> \
    --location <location> \
    --yes

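For more information about creating the resource or how to get the location and SKU information, see the Azure Cognitive Services resource documentation.

Authenticate the client

In order to interact with the Document Intelligence service, you will need to create an instance of a client. An endpoint and a credential are required to instantiate the client object.

Get the endpoint

You can find the endpoint for your Document Intelligence resource using the Azure Portal or the Azure CLI: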

# Get the endpoint for the Document Intelligence resource
az cognitiveservices account show --name "<resource-name>" --resource-group "<resource-group-name>" --query "properties.endpoint"

Either a regional endpoint or a custom subdomain can be used for authentication. They are formatted as follows:

Regional endpoint: https://<region>.api.cognitive.microsoft.com/
Custom subdomain: https://<resource-name>.cognitiveservices.azure.com/

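A regional endpoint is the same for every resource in a region. A complete list of supported regional endpoints is available in the Azure documentation. Note that regional endpoints do not support AAD authentication.

A custom subdomain, on the other hand, is a name that is unique to the Document Intelligence resource. Custom subdomains can only be used by single-service resources.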
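
As a rough illustration of the two endpoint formats above, the following sketch classifies an endpoint URL by its host suffix. The helper names (endpoint_kind, supports_aad) are hypothetical and not part of the SDK, and the suffixes are taken only from the two formats shown here; other Azure clouds may use different hosts.

```python
from urllib.parse import urlparse

# Host suffixes copied from the two endpoint formats shown above.
REGIONAL_SUFFIX = ".api.cognitive.microsoft.com"
CUSTOM_SUBDOMAIN_SUFFIX = ".cognitiveservices.azure.com"


def endpoint_kind(endpoint: str) -> str:
    """Classify an endpoint URL as 'regional', 'custom-subdomain', or 'unknown'."""
    host = (urlparse(endpoint).hostname or "").lower()
    if host.endswith(REGIONAL_SUFFIX):
        return "regional"
    if host.endswith(CUSTOM_SUBDOMAIN_SUFFIX):
        return "custom-subdomain"
    return "unknown"


def supports_aad(endpoint: str) -> bool:
    """Per the text above, regional endpoints do not support AAD authentication."""
    return endpoint_kind(endpoint) == "custom-subdomain"


print(endpoint_kind("https://westus2.api.cognitive.microsoft.com/"))
print(supports_aad("https://my-resource.cognitiveservices.azure.com/"))
```

A check like this can run before a client is created so that an AAD credential is never paired with a regional endpoint by mistake.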

Get the API key

The API key can be found in the Azure Portal or by running the following Azure CLI command:

az cognitiveservices account keys list --name "<resource-name>" --resource-group "<resource-group-name>"

Create the client with AzureKeyCredential

To use an API key as the credential parameter, pass the key as a string into an instance of AzureKeyCredential.

from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient

endpoint = "https://<my-custom-subdomain>.cognitiveservices.azure.com/"
credential = AzureKeyCredential("<api_key>")
document_intelligence_client = DocumentIntelligenceClient(endpoint, credential)

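Create the client with an Azure Active Directory credential

AzureKeyCredential authentication is used in the examples in this getting started guide, but you can also authenticate with Azure Active Directory using the azure-identity library. Note that regional endpoints do not support AAD authentication. Create a custom subdomain name for your resource in order to use this type of authentication.

To use the DefaultAzureCredential type shown below, or other credential types provided with the Azure SDK, install the azure-identity package: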

pip install azure-identity

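You will also need to register a new AAD application and grant access to Document Intelligence by assigning the "Cognitive Services User" role to your service principal.

Once completed, set the values of the client ID, tenant ID, and client secret of the AAD application as environment variables: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET.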

"""DefaultAzureCredential will use the values from these environment
variables: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET
"""
import os

from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.identity import DefaultAzureCredential

endpoint = os.environ["DOCUMENTINTELLIGENCE_ENDPOINT"]
credential = DefaultAzureCredential()

document_intelligence_client = DocumentIntelligenceClient(endpoint, credential)
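
Before constructing the credential, it can help to confirm that the three environment variables named above are actually set. The sketch below uses only the standard library; the helper name missing_credential_vars is hypothetical. DefaultAzureCredential can also pick up other credential sources, so a missing variable is not necessarily an error.

```python
import os

# Variable names taken from the text above.
REQUIRED_VARS = ("AZURE_CLIENT_ID", "AZURE_TENANT_ID", "AZURE_CLIENT_SECRET")


def missing_credential_vars(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_credential_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All service principal variables are set.")
```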

Key concepts

DocumentIntelligenceClient

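DocumentIntelligenceClient provides operations for analyzing input documents using prebuilt and custom models through the begin_analyze_document API. Use the model_id parameter to select the type of model for analysis. See the service documentation for a full list of supported models. DocumentIntelligenceClient also provides operations for classifying documents through the begin_classify_document API. Custom classification models can classify each page in an input file to identify the documents within it, and can also identify multiple documents or multiple instances of a single document within an input file.

Sample code snippets illustrating the use of DocumentIntelligenceClient are provided in the Examples section. More information about analyzing documents, including supported features, locales, and document types, can be found in the service documentation.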

DocumentIntelligenceAdministrationClient

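DocumentIntelligenceAdministrationClient provides operations for:

Building custom models to analyze the specific fields you specify. A DocumentModelDetails is returned, indicating the document types the model can analyze as well as the estimated confidence for each field. See the service documentation for a more detailed explanation.
Creating a composed model from a collection of existing models.
Managing models created in your account.
Listing operations or getting a specific model operation created within the last 24 hours.
Copying a custom model from one Document Intelligence resource to another.
Building and managing custom classification models to classify the documents you process within your application.

Note that models can also be built using a graphical user interface such as Document Intelligence Studio.

Sample code snippets illustrating the use of DocumentIntelligenceAdministrationClient are provided in the Examples section.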

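Long-running operations

A long-running operation consists of an initial request sent to the service to start the operation, followed by polling the service at intervals to determine whether the operation has completed or failed and, if it has succeeded, to get the result.

Methods that analyze documents, build models, or copy/compose models are modeled as long-running operations. The client exposes these as methods whose names start with begin_ and that return an LROPoller or AsyncLROPoller. Callers should wait for the operation to complete by calling result() on the poller object returned from the begin_ method. The sample code snippets in the Examples section illustrate long-running operations.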
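
To make the start-then-poll pattern concrete, here is a minimal pure-Python sketch. ToyPoller and begin_fake_analysis are hypothetical stand-ins written for this illustration; they are not the azure-core LROPoller and make no network calls.

```python
import time


class ToyPoller:
    """Illustrative stand-in for a poller; NOT the azure-core LROPoller."""

    def __init__(self, check_status, interval=0.01):
        self._check_status = check_status  # callable returning (status, value)
        self._interval = interval

    def result(self):
        """Poll until the operation succeeds or fails, then return or raise."""
        while True:
            status, value = self._check_status()
            if status == "succeeded":
                return value
            if status == "failed":
                raise RuntimeError(f"operation failed: {value}")
            time.sleep(self._interval)  # wait before asking again


def begin_fake_analysis(polls_needed=3):
    """Start a pretend operation that completes after a few status checks."""
    state = {"polls": 0}

    def check_status():
        state["polls"] += 1
        if state["polls"] < polls_needed:
            return "running", None
        return "succeeded", {"polls": state["polls"]}

    return ToyPoller(check_status)


poller = begin_fake_analysis()
print(poller.result())
```

The caller only ever sees the begin call and result(); the polling loop stays hidden inside the poller, which is the same division of labor the client's begin_ methods offer.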

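Examples

The following sections provide several code snippets covering some of the most common Document Intelligence tasks, including:

Extract layout
Extract figures from documents
Analyze document results in PDF
Using the general document model
Using prebuilt models
Build a custom model
Analyze documents using a custom model
Manage your models
Add-on capabilities

Extract layout

Extract text, selection marks, text styles, and table structures, along with their bounding region coordinates, from documents.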

import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeResult

def _in_span(word, spans):
    for span in spans:
        if word.span.offset >= span.offset and (word.span.offset + word.span.length) <= (span.offset + span.length):
            return True
    return False

def _format_polygon(polygon):
    if not polygon:
        return "N/A"
    return ", ".join([f"[{polygon[i]}, {polygon[i + 1]}]" for i in range(0, len(polygon), 2)])

endpoint = os.environ["DOCUMENTINTELLIGENCE_ENDPOINT"]
key = os.environ["DOCUMENTINTELLIGENCE_API_KEY"]

document_intelligence_client = DocumentIntelligenceClient(endpoint=endpoint, credential=AzureKeyCredential(key))
with open(path_to_sample_documents, "rb") as f:
    poller = document_intelligence_client.begin_analyze_document(
        "prebuilt-layout", analyze_request=f, content_type="application/octet-stream"
    )
result: AnalyzeResult = poller.result()

if result.styles and any([style.is_handwritten for style in result.styles]):
    print("Document contains handwritten content")
else:
    print("Document does not contain handwritten content")

for page in result.pages:
    print(f"----Analyzing layout from page #{page.page_number}----")
    print(f"Page has width: {page.width} and height: {page.height}, measured with unit: {page.unit}")

    if page.lines:
        for line_idx, line in enumerate(page.lines):
            words = []
            if page.words:
                for word in page.words:
                    print(f"......Word '{word.content}' has a confidence of {word.confidence}")
                    if _in_span(word, line.spans):
                        words.append(word)
            print(
                f"...Line # {line_idx} has word count {len(words)} and text '{line.content}' "
                f"within bounding polygon '{_format_polygon(line.polygon)}'"
            )

    if page.selection_marks:
        for selection_mark in page.selection_marks:
            print(
                f"Selection mark is '{selection_mark.state}' within bounding polygon "
                f"'{_format_polygon(selection_mark.polygon)}' and has a confidence of {selection_mark.confidence}"
            )

if result.paragraphs:
    print(f"----Detected #{len(result.paragraphs)} paragraphs in the document----")
    # Sort each paragraph's spans, then sort the paragraphs by their first span's offset to read in the right order.
    for paragraph in result.paragraphs:
        paragraph.spans.sort(key=lambda s: s.offset)
    result.paragraphs.sort(key=lambda p: p.spans[0].offset)
    print("-----Print sorted paragraphs-----")
    for paragraph in result.paragraphs:
        if not paragraph.bounding_regions:
            print(f"Found paragraph with role: '{paragraph.role}' within N/A bounding region")
        else:
            print(f"Found paragraph with role: '{paragraph.role}' within")
            print(
                ", ".join(
                    f" Page #{region.page_number}: {_format_polygon(region.polygon)} bounding region"
                    for region in paragraph.bounding_regions
                )
            )
        print(f"...with content: '{paragraph.content}'")
        print(f"...with offset: {paragraph.spans[0].offset} and length: {paragraph.spans[0].length}")

if result.tables:
    for table_idx, table in enumerate(result.tables):
        print(f"Table # {table_idx} has {table.row_count} rows and " f"{table.column_count} columns")
        if table.bounding_regions:
            for region in table.bounding_regions:
                print(
                    f"Table # {table_idx} location on page: {region.page_number} is {_format_polygon(region.polygon)}"
                )
        for cell in table.cells:
            print(f"...Cell[{cell.row_index}][{cell.column_index}] has text '{cell.content}'")
            if cell.bounding_regions:
                for region in cell.bounding_regions:
                    print(
                        f"...content on page {region.page_number} is within bounding polygon '{_format_polygon(region.polygon)}'"
                    )

print("----------------------------------------")

Extract figures from documents

Extract figures from documents as cropped images.

import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeOutputOption, AnalyzeResult

endpoint = os.environ["DOCUMENTINTELLIGENCE_ENDPOINT"]
key = os.environ["DOCUMENTINTELLIGENCE_API_KEY"]

document_intelligence_client = DocumentIntelligenceClient(endpoint=endpoint, credential=AzureKeyCredential(key))

with open(path_to_sample_documents, "rb") as f:
    poller = document_intelligence_client.begin_analyze_document(
        "prebuilt-layout",
        analyze_request=f,
        output=[AnalyzeOutputOption.FIGURES],
        content_type="application/octet-stream",
    )
result: AnalyzeResult = poller.result()
operation_id = poller.details["operation_id"]

if result.figures:
    for figure in result.figures:
        if figure.id:
            response = document_intelligence_client.get_analyze_result_figure(
                model_id=result.model_id, result_id=operation_id, figure_id=figure.id
            )
            with open(f"{figure.id}.png", "wb") as writer:
                writer.writelines(response)
else:
    print("No figures found.")

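Analyze document results in PDF

Convert an analog PDF (for example, a scanned document) into a PDF with embedded text. The embedded text enables text search within the PDF and lets the PDF be used in LLM chat scenarios.

Note: For now, this feature is only supported by prebuilt-read. All other models will return an error.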

import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeOutputOption, AnalyzeResult

endpoint = os.environ["DOCUMENTINTELLIGENCE_ENDPOINT"]
key = os.environ["DOCUMENTINTELLIGENCE_API_KEY"]

document_intelligence_client = DocumentIntelligenceClient(endpoint=endpoint, credential=AzureKeyCredential(key))

with open(path_to_sample_documents, "rb") as f:
    poller = document_intelligence_client.begin_analyze_document(
        "prebuilt-read",
        analyze_request=f,
        output=[AnalyzeOutputOption.PDF],
        content_type="application/octet-stream",
    )
result: AnalyzeResult = poller.result()
operation_id = poller.details["operation_id"]

response = document_intelligence_client.get_analyze_result_pdf(model_id=result.model_id, result_id=operation_id)
with open("analyze_result.pdf", "wb") as writer:
    writer.writelines(response)
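
Both the figure sample and the PDF sample hand the response to writer.writelines, which accepts any iterable of bytes objects. The sketch below shows that saving step in isolation; fake_response_chunks is a hypothetical stand-in for the streamed response, and save_chunks is an illustrative helper, not an SDK function.

```python
import os
import tempfile


def fake_response_chunks():
    """Hypothetical stand-in for the byte chunks a streamed response yields."""
    yield b"%PDF-1.7\n"
    yield b"...body...\n"
    yield b"%%EOF\n"


def save_chunks(chunks, path):
    """Write an iterable of bytes chunks to path and return the file size in bytes."""
    with open(path, "wb") as writer:
        writer.writelines(chunks)
    return os.path.getsize(path)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "analyze_result.pdf")
    print(save_chunks(fake_response_chunks(), target))
```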

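Using the general document model

Analyze key-value pairs, tables, styles, and selection marks from documents using the general document model provided by the Document Intelligence service. The general document model is selected by passing model_id="prebuilt-document" into the begin_analyze_document method; the snippet below instead uses prebuilt-layout with the KEY_VALUE_PAIRS feature enabled to obtain key-value pairs: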

import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import DocumentAnalysisFeature, AnalyzeResult

def _in_span(word, spans):
    for span in spans:
        if word.span.offset >= span.offset and (word.span.offset + word.span.length) <= (span.offset + span.length):
            return True
    return False

def _format_bounding_region(bounding_regions):
    if not bounding_regions:
        return "N/A"
    return ", ".join(
        f"Page #{region.page_number}: {_format_polygon(region.polygon)}" for region in bounding_regions
    )

def _format_polygon(polygon):
    if not polygon:
        return "N/A"
    return ", ".join([f"[{polygon[i]}, {polygon[i + 1]}]" for i in range(0, len(polygon), 2)])

endpoint = os.environ["DOCUMENTINTELLIGENCE_ENDPOINT"]
key = os.environ["DOCUMENTINTELLIGENCE_API_KEY"]

document_intelligence_client = DocumentIntelligenceClient(endpoint=endpoint, credential=AzureKeyCredential(key))
with open(path_to_sample_documents, "rb") as f:
    poller = document_intelligence_client.begin_analyze_document(
        "prebuilt-layout",
        analyze_request=f,
        features=[DocumentAnalysisFeature.KEY_VALUE_PAIRS],
        content_type="application/octet-stream",
    )
result: AnalyzeResult = poller.result()

if result.styles:
    for style in result.styles:
        if style.is_handwritten:
            print("Document contains handwritten content: ")
            print(",".join([result.content[span.offset : span.offset + span.length] for span in style.spans]))

print("----Key-value pairs found in document----")
if result.key_value_pairs:
    for kv_pair in result.key_value_pairs:
        if kv_pair.key:
            print(
                f"Key '{kv_pair.key.content}' found within "
                f"'{_format_bounding_region(kv_pair.key.bounding_regions)}' bounding regions"
            )
        if kv_pair.value:
            print(
                f"Value '{kv_pair.value.content}' found within "
                f"'{_format_bounding_region(kv_pair.value.bounding_regions)}' bounding regions\n"
            )

for page in result.pages:
    print(f"----Analyzing document from page #{page.page_number}----")
    print(f"Page has width: {page.width} and height: {page.height}, measured with unit: {page.unit}")

    if page.lines:
        for line_idx, line in enumerate(page.lines):
            words = []
            if page.words:
                for word in page.words:
                    print(f"......Word '{word.content}' has a confidence of {word.confidence}")
                    if _in_span(word, line.spans):
                        words.append(word)
            print(
                f"...Line #{line_idx} has {len(words)} words and text '{line.content}' within "
                f"bounding polygon '{_format_polygon(line.polygon)}'"
            )

    if page.selection_marks:
        for selection_mark in page.selection_marks:
            print(
                f"Selection mark is '{selection_mark.state}' within bounding polygon "
                f"'{_format_polygon(selection_mark.polygon)}' and has a confidence of "
                f"{selection_mark.confidence}"
            )

if result.tables:
    for table_idx, table in enumerate(result.tables):
        print(f"Table # {table_idx} has {table.row_count} rows and {table.column_count} columns")
        if table.bounding_regions:
            for region in table.bounding_regions:
                print(
                    f"Table # {table_idx} location on page: {region.page_number} is {_format_polygon(region.polygon)}"
                )
        for cell in table.cells:
            print(f"...Cell[{cell.row_index}][{cell.column_index}] has text '{cell.content}'")
            if cell.bounding_regions:
                for region in cell.bounding_regions:
                    print(
                        f"...content on page {region.page_number} is within bounding polygon '{_format_polygon(region.polygon)}'\n"
                    )
print("----------------------------------------")

For more information about the features provided by the prebuilt-document model, see the service documentation.
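
The handwritten-content loop in the sample above slices result.content using each span's offset and length. The following sketch shows that slicing on its own; the content string and spans are invented for illustration, and text_for_spans is a hypothetical helper.

```python
from types import SimpleNamespace


def text_for_spans(content, spans):
    """Return the substring of content that each span (offset, length) points at."""
    return [content[span.offset : span.offset + span.length] for span in spans]


# Hypothetical document content and two spans into it.
content = "Total: 42 USD signed J. Doe"
spans = [SimpleNamespace(offset=7, length=2), SimpleNamespace(offset=21, length=6)]

print(text_for_spans(content, spans))  # ['42', 'J. Doe']
```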

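Using prebuilt models

Extract fields from select document types such as receipts, invoices, business cards, identity documents, and U.S. W-2 tax documents using prebuilt models provided by the Document Intelligence service.

For example, to analyze fields from a sales receipt, use the prebuilt receipt model by passing model_id="prebuilt-receipt" into the begin_analyze_document method: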

import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeResult

def _format_price(price_dict):
    return "".join([f"{p}" for p in price_dict.values()])

endpoint = os.environ["DOCUMENTINTELLIGENCE_ENDPOINT"]
key = os.environ["DOCUMENTINTELLIGENCE_API_KEY"]

document_intelligence_client = DocumentIntelligenceClient(endpoint=endpoint, credential=AzureKeyCredential(key))
with open(path_to_sample_documents, "rb") as f:
    poller = document_intelligence_client.begin_analyze_document(
        "prebuilt-receipt", analyze_request=f, locale="en-US", content_type="application/octet-stream"
    )
receipts: AnalyzeResult = poller.result()

if receipts.documents:
    for idx, receipt in enumerate(receipts.documents):
        print(f"--------Analysis of receipt #{idx + 1}--------")
        print(f"Receipt type: {receipt.doc_type if receipt.doc_type else 'N/A'}")
        if receipt.fields:
            merchant_name = receipt.fields.get("MerchantName")
            if merchant_name:
                print(
                    f"Merchant Name: {merchant_name.get('valueString')} has confidence: "
                    f"{merchant_name.confidence}"
                )
            transaction_date = receipt.fields.get("TransactionDate")
            if transaction_date:
                print(
                    f"Transaction Date: {transaction_date.get('valueDate')} has confidence: "
                    f"{transaction_date.confidence}"
                )
            items = receipt.fields.get("Items")
            if items:
                print("Receipt items:")
                for idx, item in enumerate(items.get("valueArray")):
                    print(f"...Item #{idx + 1}")
                    item_description = item.get("valueObject").get("Description")
                    if item_description:
                        print(
                            f"......Item Description: {item_description.get('valueString')} has confidence: "
                            f"{item_description.confidence}"
                        )
                    item_quantity = item.get("valueObject").get("Quantity")
                    if item_quantity:
                        print(
                            f"......Item Quantity: {item_quantity.get('valueString')} has confidence: "
                            f"{item_quantity.confidence}"
                        )
                    item_total_price = item.get("valueObject").get("TotalPrice")
                    if item_total_price:
                        print(
                            f"......Total Item Price: {_format_price(item_total_price.get('valueCurrency'))} has confidence: "
                            f"{item_total_price.confidence}"
                        )
            subtotal = receipt.fields.get("Subtotal")
            if subtotal:
                print(
                    f"Subtotal: {_format_price(subtotal.get('valueCurrency'))} has confidence: {subtotal.confidence}"
                )
            tax = receipt.fields.get("TotalTax")
            if tax:
                print(f"Total tax: {_format_price(tax.get('valueCurrency'))} has confidence: {tax.confidence}")
            tip = receipt.fields.get("Tip")
            if tip:
                print(f"Tip: {_format_price(tip.get('valueCurrency'))} has confidence: {tip.confidence}")
            total = receipt.fields.get("Total")
            if total:
                print(f"Total: {_format_price(total.get('valueCurrency'))} has confidence: {total.confidence}")
        print("--------------------------------------")

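You are not limited to receipts! There are a few prebuilt models to choose from, each of which has its own set of supported fields. See the service documentation for the other supported prebuilt models.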

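Build a custom model

Build a custom model on your own document type. The resulting model can be used to analyze values from the types of documents it was trained on. Provide a container SAS URL to the Azure Storage Blob container where you store the training documents.

More details on setting up a container and the required file structure can be found in the service documentation.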

# Let's build a model to use for this sample
import os
import uuid
from azure.ai.documentintelligence import DocumentIntelligenceAdministrationClient
from azure.ai.documentintelligence.models import (
    DocumentBuildMode,
    BuildDocumentModelRequest,
    AzureBlobContentSource,
    DocumentModelDetails,
)
from azure.core.credentials import AzureKeyCredential

endpoint = os.environ["DOCUMENTINTELLIGENCE_ENDPOINT"]
key = os.environ["DOCUMENTINTELLIGENCE_API_KEY"]
container_sas_url = os.environ["DOCUMENTINTELLIGENCE_STORAGE_CONTAINER_SAS_URL"]

document_intelligence_admin_client = DocumentIntelligenceAdministrationClient(endpoint, AzureKeyCredential(key))
poller = document_intelligence_admin_client.begin_build_document_model(
    BuildDocumentModelRequest(
        model_id=str(uuid.uuid4()),
        build_mode=DocumentBuildMode.TEMPLATE,
        azure_blob_source=AzureBlobContentSource(container_url=container_sas_url),
        description="my model description",
    )
)
model: DocumentModelDetails = poller.result()

print(f"Model ID: {model.model_id}")
print(f"Description: {model.description}")
print(f"Model created on: {model.created_date_time}")
print(f"Model expires on: {model.expiration_date_time}")
if model.doc_types:
    print("Doc types the model can recognize:")
    for name, doc_type in model.doc_types.items():
        print(f"Doc Type: '{name}' built with '{doc_type.build_mode}' mode which has the following fields:")
        if doc_type.field_schema:
            for field_name, field in doc_type.field_schema.items():
                if doc_type.field_confidence:
                    print(
                        f"Field: '{field_name}' has type '{field['type']}' and confidence score "
                        f"{doc_type.field_confidence[field_name]}"
                    )

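Analyze documents using a custom model

Analyze document fields, tables, selection marks, and more. These models are trained with your own data, so they are tailored to your documents. For best results, you should only analyze documents of the same document type that the custom model was built with.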

import os
from collections import Counter

from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeResult

def _print_table(header_names, table_data):
    # Print a two-dimensional array like a table.
    max_len_list = []
    for i in range(len(header_names)):
        col_values = list(map(lambda row: len(str(row[i])), table_data))
        col_values.append(len(str(header_names[i])))
        max_len_list.append(max(col_values))

    row_format_str = "".join(map(lambda width: f"{{:<{width + 4}}}", max_len_list))

    print(row_format_str.format(*header_names))
    for row in table_data:
        print(row_format_str.format(*row))

endpoint = os.environ["DOCUMENTINTELLIGENCE_ENDPOINT"]
key = os.environ["DOCUMENTINTELLIGENCE_API_KEY"]
model_id = os.getenv("CUSTOM_BUILT_MODEL_ID", custom_model_id)

document_intelligence_client = DocumentIntelligenceClient(endpoint=endpoint, credential=AzureKeyCredential(key))

# Make sure your document's type is included in the list of document types the custom model can analyze
with open(path_to_sample_documents, "rb") as f:
    poller = document_intelligence_client.begin_analyze_document(
        model_id=model_id, analyze_request=f, content_type="application/octet-stream"
    )
result: AnalyzeResult = poller.result()

if result.documents:
    for idx, document in enumerate(result.documents):
        print(f"--------Analyzing document #{idx + 1}--------")
        print(f"Document has type {document.doc_type}")
        print(f"Document has document type confidence {document.confidence}")
        print(f"Document was analyzed with model with ID {result.model_id}")
        if document.fields:
            for name, field in document.fields.items():
                field_value = field.get("valueString") if field.get("valueString") else field.content
                print(
                    f"......found field of type '{field.type}' with value '{field_value}' and with confidence {field.confidence}"
                )

    # Extract table cell values
    SYMBOL_OF_TABLE_TYPE = "array"
    SYMBOL_OF_OBJECT_TYPE = "object"
    KEY_OF_VALUE_OBJECT = "valueObject"
    KEY_OF_CELL_CONTENT = "content"

    for doc in result.documents:
        if doc.fields is not None:
            for field_name, field_value in doc.fields.items():
                # Dynamic Table cell information store as array in document field.
                if field_value.type == SYMBOL_OF_TABLE_TYPE and field_value.value_array:
                    col_names = []
                    sample_obj = field_value.value_array[0]
                    if KEY_OF_VALUE_OBJECT in sample_obj:
                        col_names = list(sample_obj[KEY_OF_VALUE_OBJECT].keys())
                    print("----Extracting Dynamic Table Cell Values----")
                    table_rows = []
                    for obj in field_value.value_array:
                        if KEY_OF_VALUE_OBJECT in obj:
                            value_obj = obj[KEY_OF_VALUE_OBJECT]
                            extract_value_by_col_name = lambda key: (
                                value_obj[key].get(KEY_OF_CELL_CONTENT)
                                if key in value_obj and KEY_OF_CELL_CONTENT in value_obj[key]
                                else "None"
                            )
                            row_data = list(map(extract_value_by_col_name, col_names))
                            table_rows.append(row_data)
                    _print_table(col_names, table_rows)

                elif (
                    field_value.type == SYMBOL_OF_OBJECT_TYPE
                    and KEY_OF_VALUE_OBJECT in field_value
                    and field_value[KEY_OF_VALUE_OBJECT] is not None
                ):
                    rows_by_columns = list(field_value[KEY_OF_VALUE_OBJECT].values())
                    is_fixed_table = all(
                        (
                            rows_of_column["type"] == SYMBOL_OF_OBJECT_TYPE
                            and Counter(list(rows_by_columns[0][KEY_OF_VALUE_OBJECT].keys()))
                            == Counter(list(rows_of_column[KEY_OF_VALUE_OBJECT].keys()))
                        )
                        for rows_of_column in rows_by_columns
                    )

                    # Fixed Table cell information store as object in document field.
                    if is_fixed_table:
                        print("----Extracting Fixed Table Cell Values----")
                        col_names = list(field_value[KEY_OF_VALUE_OBJECT].keys())
                        row_dict: dict = {}
                        for rows_of_column in rows_by_columns:
                            rows = rows_of_column[KEY_OF_VALUE_OBJECT]
                            for row_key in list(rows.keys()):
                                if row_key in row_dict:
                                    row_dict[row_key].append(rows[row_key].get(KEY_OF_CELL_CONTENT))
                                else:
                                    row_dict[row_key] = [
                                        row_key,
                                        rows[row_key].get(KEY_OF_CELL_CONTENT),
                                    ]

                        col_names.insert(0, "")
                        _print_table(col_names, list(row_dict.values()))

print("------------------------------------")

Alternatively, a document URL can also be used to analyze documents using the begin_analyze_document method.

import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest, AnalyzeResult

endpoint = os.environ["DOCUMENTINTELLIGENCE_ENDPOINT"]
key = os.environ["DOCUMENTINTELLIGENCE_API_KEY"]

document_intelligence_client = DocumentIntelligenceClient(endpoint=endpoint, credential=AzureKeyCredential(key))
url = "https://raw.githubusercontent.com/Azure/azure-sdk-for-python/main/sdk/documentintelligence/azure-ai-documentintelligence/samples/sample_forms/receipt/contoso-receipt.png"
poller = document_intelligence_client.begin_analyze_document(
    "prebuilt-receipt", AnalyzeDocumentRequest(url_source=url)
)
receipts: AnalyzeResult = poller.result()
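
The dynamic-table branch of the custom model sample above walks a list of entries keyed by "valueObject" and prints the cell contents as an aligned table. The sketch below reproduces that logic on plain dictionaries so it can be run offline; the sample data is invented, and rows_from_value_array and format_table are hypothetical helpers rather than SDK functions.

```python
def rows_from_value_array(value_array):
    """Collect column names and cell contents from a list of {'valueObject': {...}} entries."""
    if not value_array:
        return [], []
    col_names = list(value_array[0].get("valueObject", {}).keys())
    rows = []
    for obj in value_array:
        if "valueObject" not in obj:
            continue  # the sample also skips entries without a value object
        value_obj = obj["valueObject"]
        # Missing cells or cells without content become the string "None".
        rows.append([value_obj.get(name, {}).get("content", "None") for name in col_names])
    return col_names, rows


def format_table(header_names, table_data):
    """Return left-aligned table lines; each column is padded to its widest cell plus 4."""
    widths = []
    for i, header in enumerate(header_names):
        cells = [len(str(row[i])) for row in table_data] + [len(str(header))]
        widths.append(max(cells))
    row_format = "".join(f"{{:<{width + 4}}}" for width in widths)
    return [row_format.format(*header_names)] + [row_format.format(*row) for row in table_data]


# Hypothetical dynamic-table field with one missing cell.
value_array = [
    {"valueObject": {"Item": {"content": "Apple"}, "Qty": {"content": "3"}}},
    {"valueObject": {"Item": {"content": "Pear"}, "Qty": {}}},
]

col_names, table_rows = rows_from_value_array(value_array)
for line in format_table(col_names, table_rows):
    print(line)
```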

Manage your models

Manage the custom models attached to your account.

# Let's build a model to use for this sample
import os
import uuid
from azure.ai.documentintelligence import DocumentIntelligenceAdministrationClient
from azure.ai.documentintelligence.models import (
    DocumentBuildMode,
    BuildDocumentModelRequest,
    AzureBlobContentSource,
    DocumentModelDetails,
)
from azure.core.credentials import AzureKeyCredential

endpoint = os.environ["DOCUMENTINTELLIGENCE_ENDPOINT"]
key = os.environ["DOCUMENTINTELLIGENCE_API_KEY"]
container_sas_url = os.environ["DOCUMENTINTELLIGENCE_STORAGE_CONTAINER_SAS_URL"]

document_intelligence_admin_client = DocumentIntelligenceAdministrationClient(endpoint, AzureKeyCredential(key))
poller = document_intelligence_admin_client.begin_build_document_model(
    BuildDocumentModelRequest(
        model_id=str(uuid.uuid4()),
        build_mode=DocumentBuildMode.TEMPLATE,
        azure_blob_source=AzureBlobContentSource(container_url=container_sas_url),
        description="my model description",
    )
)
model: DocumentModelDetails = poller.result()

print(f"Model ID: {model.model_id}")
print(f"Description: {model.description}")
print(f"Model created on: {model.created_date_time}")
print(f"Model expires on: {model.expiration_date_time}")
if model.doc_types:
    print("Doc types the model can recognize:")
    for name, doc_type in model.doc_types.items():
        print(f"Doc Type: '{name}' built with '{doc_type.build_mode}' mode which has the following fields:")
        if doc_type.field_schema:
            for field_name, field in doc_type.field_schema.items():
                if doc_type.field_confidence:
                    print(
                        f"Field: '{field_name}' has type '{field['type']}' and confidence score "
                        f"{doc_type.field_confidence[field_name]}"
                    )

account_details = document_intelligence_admin_client.get_resource_info()
print(
    f"Our resource has {account_details.custom_document_models.count} custom models, "
    f"and we can have at most {account_details.custom_document_models.limit} custom models"
)

# Next, we get a paged list of all of our custom models
models = document_intelligence_admin_client.list_models()

print("We have the following 'ready' models with IDs and descriptions:")
for model in models:
    print(f"{model.model_id} | {model.description}")

my_model = document_intelligence_admin_client.get_model(model_id=model.model_id)
print(f"\nModel ID: {my_model.model_id}")
print(f"Description: {my_model.description}")
print(f"Model created on: {my_model.created_date_time}")
print(f"Model expires on: {my_model.expiration_date_time}")
if my_model.warnings:
    print("Warnings encountered while building the model:")
    for warning in my_model.warnings:
        print(f"warning code: {warning.code}, message: {warning.message}, target of the error: {warning.target}")

# Finally, we will delete this model by ID
document_intelligence_admin_client.delete_model(model_id=my_model.model_id)

from azure.core.exceptions import ResourceNotFoundError

try:
    document_intelligence_admin_client.get_model(model_id=my_model.model_id)
except ResourceNotFoundError:
    print(f"Successfully deleted model with ID {my_model.model_id}")

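Add-on capabilities

Document Intelligence supports more sophisticated analysis capabilities. These optional features can be enabled and disabled depending on the document extraction scenario.

The add-on capabilities available in this SDK include barcode/QR code extraction, formula extraction, font/style extraction, and a high-resolution mode for large documents.

Note that some add-on capabilities will incur additional charges. See pricing: https://azure.microsoft.com/pricing/details/ai-document-intelligence/.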

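Troubleshooting

General

The Document Intelligence client library will raise exceptions defined in Azure Core. Error codes and messages raised by the Document Intelligence service can be found in the service documentation.

Logging

This library uses the standard logging library for logging.

Basic information about HTTP sessions (URLs, headers, etc.) is logged at INFO level.

Detailed DEBUG level logging, including request/response bodies and unredacted headers, can be enabled on the client or per operation with the logging_enable keyword argument.

See the full SDK logging documentation for examples.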
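
A minimal sketch of the logging setup described above, using only the standard library. It assumes that the SDK's loggers live under the "azure" logger name, as is the convention for Azure SDK client libraries; the format string is an arbitrary choice.

```python
import logging
import sys

# Assumption: Azure SDK client libraries log under the "azure" logger hierarchy.
logger = logging.getLogger("azure")
logger.setLevel(logging.DEBUG)

# Send log records to stdout with a simple, arbitrary format.
handler = logging.StreamHandler(stream=sys.stdout)
handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
logger.addHandler(handler)

# With a client from the earlier examples, DEBUG-level HTTP logging would then be
# switched on through the logging_enable keyword argument mentioned above, e.g.:
#   DocumentIntelligenceClient(endpoint, credential, logging_enable=True)
```

Because DEBUG output can include unredacted headers and request bodies, this level is better suited to local troubleshooting than to production logs.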

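Optional configuration

Optional keyword arguments can be passed in at the client and per-operation level. The azure-core reference documentation describes available configurations for retries, logging, transport protocols, and more.

Next steps

Additional documentation

For more extensive documentation on Azure AI Document Intelligence, see the Document Intelligence documentation on docs.microsoft.com.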

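Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.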

Release history

Version	Release date
1.0.0b4 (pre-release, this version)	September 6, 2024
1.0.0b3 (pre-release)	April 9, 2024
1.0.0b2 (pre-release)	March 8, 2024
1.0.0b1 (pre-release)	November 18, 2023

