Dataset Viber is your chill repo for data collection, annotation and vibe checks.
Dataset Viber
Avoid the hype, check the vibe!
I've created Dataset Viber, a set of tools that makes it easier to work with data for your AI models. Dataset Viber is here to make your data journey smoother and more fun. It is not meant for team collaboration or production use, and it doesn't try to be complex or formal - it is just a collection of **cool tools** to help you, as an AI engineer or hobbyist, collect feedback and run vibe checks. Want to see it in action? Just plug it in and start vibing with your data. It's that simple!
- CollectorInterface: lazily collect data from model interactions without human annotation.
- AnnotatorInterFace: annotate your way through your data with models in the loop.
- Synthesizer: synthesize data with distilabel in the loop.
- BulkInterface: explore your data distribution and annotate in bulk.
Need any tweaks or want to know more about a specific tool? Just open an issue or reach out to me with a suggestion!
[!NOTE]
- Data is logged to a local CSV file or directly to the Hugging Face Hub.
- All tools also run in .ipynb notebooks.
- Use models in the loop through fn_model.
- Feed inputs with a custom data streamer or a pre-built Synthesizer class through the fn_next_input argument.
- It supports tasks across multiple modalities, such as text, chat and image.
- Import from and export to the Hugging Face Hub or CSV files.
Installation

You can install the package via pip:

pip install dataset-viber

Or install the synthesizer extra. Note that the synthesizer extra depends on distilabel[hf-inference-endpoints], but you can also use any of the other LLMs that distilabel offers, e.g. distilabel[ollama]:

pip install dataset-viber[synthesizer]

Or install the bulk extra for the BulkInterface:

pip install dataset-viber[bulk]
How are we vibing?
CollectorInterface
Built on top of gr.Interface and gr.ChatInterface to lazily and automatically collect data from model interactions.
https://github.com/user-attachments/assets/4ddac8a1-62ab-4b3b-9254-f924f5898075
CollectorInterface
import gradio as gr
from dataset_viber import CollectorInterface
def calculator(num1, operation, num2):
if operation == "add":
return num1 + num2
elif operation == "subtract":
return num1 - num2
elif operation == "multiply":
return num1 * num2
elif operation == "divide":
return num1 / num2
inputs = ["number", gr.Radio(["add", "subtract", "multiply", "divide"]), "number"]
outputs = "number"
interface = CollectorInterface(
fn=calculator,
inputs=inputs,
outputs=outputs,
csv_logger=False, # True if you want to log to a CSV
dataset_name="<my_hf_org>/<my_dataset>"
)
interface.launch()
CollectorInterface.from_interface
interface = gr.Interface(
fn=calculator,
inputs=inputs,
outputs=outputs
)
interface = CollectorInterface.from_interface(
interface=interface,
csv_logger=False, # True if you want to log to a CSV
dataset_name="<my_hf_org>/<my_dataset>"
)
interface.launch()
CollectorInterface.from_pipeline
from transformers import pipeline
from dataset_viber import CollectorInterface
pipeline = pipeline("text-classification", model="mrm8488/bert-tiny-finetuned-sms-spam-detection")
interface = CollectorInterface.from_pipeline(
pipeline=pipeline,
csv_logger=False, # True if you want to log to a CSV
dataset_name="<my_hf_org>/<my_dataset>"
)
interface.launch()
AnnotatorInterFace
Built on top of the CollectorInterface to collect and annotate data and log it to the Hugging Face Hub.
Text
https://github.com/user-attachments/assets/d1abda66-9972-4c60-89d2-7626f5654f15
text-classification / multi-label-text-classification
from dataset_viber import AnnotatorInterFace
texts = [
"Anthony Bourdain was an amazing chef!",
"Anthony Bourdain was a terrible tv persona!"
]
labels = ["positive", "negative"]
interface = AnnotatorInterFace.for_text_classification(
texts=texts,
labels=labels,
multi_label=False, # True if you have multi-label data
fn_model=None, # a callable e.g. (function or transformers pipelines) that returns `str`
fn_next_input=None, # a function that feeds gradio components actively with the next input
csv_logger=False, # True if you want to log to a CSV
dataset_name=None # "<my_hf_org>/<my_dataset>" if you want to log to the hub
)
interface.launch()
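To actually put a model in the loop, fn_model can be any callable that returns a str, as the comment above notes. Below is a minimal sketch, assuming a small sentiment pipeline whose output is reduced to a label string; the model choice and the wrapper function are illustrative, not part of the library.

from transformers import pipeline
from dataset_viber import AnnotatorInterFace

# example sentiment model; it returns [{"label": "POSITIVE"/"NEGATIVE", "score": ...}]
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

def sentiment_model(text: str) -> str:
    # reduce the pipeline output to a single label string
    return classifier(text)[0]["label"].lower()

interface = AnnotatorInterFace.for_text_classification(
    texts=texts,
    labels=labels,
    fn_model=sentiment_model,  # model in the loop, as described above
)
interface.launch()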
token-classification
from dataset_viber import AnnotatorInterFace
texts = ["Anthony Bourdain was an amazing chef in New York."]
labels = ["NAME", "LOC"]
interface = AnnotatorInterFace.for_token_classification(
texts=texts,
labels=labels,
fn_model=None, # a callable e.g. (function or transformers pipelines) that returns `str`
fn_next_input=None, # a function that feeds gradio components actively with the next input
csv_logger=False, # True if you want to log to a CSV
dataset_name=None # "<my_hf_org>/<my_dataset>" if you want to log to the hub
)
interface.launch()
extractive-question-answering
from dataset_viber import AnnotatorInterFace
questions = ["Where was Anthony Bourdain located?"]
contexts = ["Anthony Bourdain was an amazing chef in New York."]
interface = AnnotatorInterFace.for_question_answering(
questions=questions,
contexts=contexts,
fn_model=None, # a callable e.g. (function or transformers pipelines) that returns `str`
fn_next_input=None, # a function that feeds gradio components actively with the next input
csv_logger=False, # True if you want to log to a CSV
dataset_name=None # "<my_hf_org>/<my_dataset>" if you want to log to the hub
)
interface.launch()
text-generation / translation / completion
from dataset_viber import AnnotatorInterFace
prompts = ["Tell me something about Anthony Bourdain."]
completions = ["Anthony Michael Bourdain was an American celebrity chef, author, and travel documentarian."]
interface = AnnotatorInterFace.for_text_generation(
prompts=prompts, # source
completions=completions, # optional to show initial completion / target
fn_model=None, # a callable e.g. (function or transformers pipelines) that returns `str`
fn_next_input=None, # a function that feeds gradio components actively with the next input
csv_logger=False, # True if you want to log to a CSV
dataset_name=None # "<my_hf_org>/<my_dataset>" if you want to log to the hub
)
interface.launch()
text-generation-preference
from dataset_viber import AnnotatorInterFace
prompts = ["Tell me something about Anthony Bourdain."]
completions_a = ["Anthony Michael Bourdain was an American celebrity chef, author, and travel documentarian."]
completions_b = ["Anthony Michael Bourdain was an cool guy that knew how to cook."]
interface = AnnotatorInterFace.for_text_generation_preference(
prompts=prompts,
completions_a=completions_a,
completions_b=completions_b,
fn_model=None, # a callable e.g. (function or transformers pipelines) that returns `str`
fn_next_input=None, # a function that feeds gradio components actively with the next input
csv_logger=False, # True if you want to log to a CSV
dataset_name=None # "<my_hf_org>/<my_dataset>" if you want to log to the hub
)
interface.launch()
Chat and multi-modal chat
https://github.com/user-attachments/assets/fe7f0139-95a3-40e8-bc03-e37667d4f7a9
[!TIP] I recommend uploading the files to cloud storage and using remote URLs to avoid any issues. This can be done with Hugging Face Datasets, as shown in the utils. Additionally, the Gradio Chatbot shows how to use the chatbot interface with multi-modal content.
chat-classification
from dataset_viber import AnnotatorInterFace
prompts = [
[
{
"role": "user",
"content": "Tell me something about Anthony Bourdain."
},
{
"role": "assistant",
"content": "Anthony Michael Bourdain was an American celebrity chef, author, and travel documentarian."
}
]
]
interface = AnnotatorInterFace.for_chat_classification(
prompts=prompts,
labels=["toxic", "non-toxic"],
multi_label=False, # True if you have multi-label data
fn_model=None, # a callable e.g. (function or transformers pipelines) that returns `str`
fn_next_input=None, # a function that feeds gradio components actively with the next input
csv_logger=False, # True if you want to log to a CSV
dataset_name=None # "<my_hf_org>/<my_dataset>" if you want to log to the hub
)
interface.launch()
chat-generation
from dataset_viber import AnnotatorInterFace
prompts = [
[
{
"role": "user",
"content": "Tell me something about Anthony Bourdain."
}
]
]
completions = [
"Anthony Michael Bourdain was an American celebrity chef, author, and travel documentarian.",
]
interface = AnnotatorInterFace.for_chat_generation(
prompts=prompts,
completions=completions,
fn_model=None, # a callable e.g. (function or transformers pipelines) that returns `str`
fn_next_input=None, # a function that feeds gradio components actively with the next input
csv_logger=False, # True if you want to log to a CSV
dataset_name=None # "<my_hf_org>/<my_dataset>" if you want to log to the hub
)
interface.launch()
chat-generation-preference
from dataset_viber import AnnotatorInterFace
prompts = [
[
{
"role": "user",
"content": "Tell me something about Anthony Bourdain."
}
]
]
completions_a = [
"Anthony Michael Bourdain was an American celebrity chef, author, and travel documentarian.",
]
completions_b = [
"Anthony Michael Bourdain was an cool guy that knew how to cook."
]
interface = AnnotatorInterFace.for_chat_generation_preference(
prompts=prompts,
completions_a=completions_a,
completions_b=completions_b,
fn_model=None, # a callable e.g. (function or transformers pipelines) that returns `str`
fn_next_input=None, # a function that feeds gradio components actively with the next input
csv_logger=False, # True if you want to log to a CSV
dataset_name=None # "<my_hf_org>/<my_dataset>" if you want to log to the hub
)
interface.launch()
Image and multi-modal
https://github.com/user-attachments/assets/57d89edf-ae40-4942-a20a-bf8443100b66
[!TIP] I recommend uploading the files to cloud storage and using remote URLs to avoid any issues. This can be done with Hugging Face Datasets.
image-classification / multi-label-image-classification
from dataset_viber import AnnotatorInterFace
images = [
"https://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Anthony_Bourdain_Peabody_2014b.jpg/440px-Anthony_Bourdain_Peabody_2014b.jpg",
"https://upload.wikimedia.org/wikipedia/commons/8/85/David_Chang_David_Shankbone_2010.jpg"
]
labels = ["anthony-bourdain", "not-anthony-bourdain"]
interface = AnnotatorInterFace.for_image_classification(
images=images,
labels=labels,
multi_label=False, # True if you have multi-label data
fn_model=None, # a callable e.g. (function or transformers pipelines) that returns `str`
fn_next_input=None, # a function that feeds gradio components actively with the next input
csv_logger=False, # True if you want to log to a CSV
dataset_name=None # "<my_hf_org>/<my_dataset>" if you want to log to the hub
)
interface.launch()
image-generation
from dataset_viber import AnnotatorInterFace
prompts = [
"Anthony Bourdain laughing",
"David Chang wearing a suit"
]
images = [
"https://upload.wikimedia.org/wikipedia/commons/8/85/David_Chang_David_Shankbone_2010.jpg",
"https://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Anthony_Bourdain_Peabody_2014b.jpg/440px-Anthony_Bourdain_Peabody_2014b.jpg",
]
interface = AnnotatorInterFace.for_image_generation(
prompts=prompts,
completions=images,
fn_model=None, # a callable e.g. (function or transformers pipelines) that returns `str`
fn_next_input=None, # a function that feeds gradio components actively with the next input
csv_logger=False, # True if you want to log to a CSV
dataset_name=None # "<my_hf_org>/<my_dataset>" if you want to log to the hub
)
interface.launch()
image-description
from dataset_viber import AnnotatorInterFace
images = [
"https://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Anthony_Bourdain_Peabody_2014b.jpg/440px-Anthony_Bourdain_Peabody_2014b.jpg",
"https://upload.wikimedia.org/wikipedia/commons/8/85/David_Chang_David_Shankbone_2010.jpg"
]
descriptions = ["Anthony Bourdain laughing", "David Chang wearing a suit"]
interface = AnnotatorInterFace.for_image_description(
images=images,
descriptions=descriptions, # optional to show initial descriptions
fn_model=None, # a callable e.g. (function or transformers pipelines) that returns `str`
fn_next_input=None, # a function that feeds gradio components actively with the next input
csv_logger=False, # True if you want to log to a CSV
dataset_name=None # "<my_hf_org>/<my_dataset>" if you want to log to the hub
)
interface.launch()
image-question-answering / visual-question-answering
from dataset_viber import AnnotatorInterFace
images = [
"https://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Anthony_Bourdain_Peabody_2014b.jpg/440px-Anthony_Bourdain_Peabody_2014b.jpg",
"https://upload.wikimedia.org/wikipedia/commons/8/85/David_Chang_David_Shankbone_2010.jpg"
]
questions = ["Who is this?", "What is he wearing?"]
answers = ["Anthony Bourdain", "a suit"]
interface = AnnotatorInterFace.for_image_question_answering(
images=images,
questions=questions, # optional to show initial questions
answers=answers, # optional to show initial answers
fn_model=None, # a callable e.g. (function or transformers pipelines) that returns `str`
fn_next_input=None, # a function that feeds gradio components actively with the next input
csv_logger=False, # True if you want to log to a CSV
dataset_name=None # "<my_hf_org>/<my_dataset>" if you want to log to the hub
)
interface.launch()
image-generation-preference
from dataset_viber import AnnotatorInterFace
prompts = [
"Anthony Bourdain laughing",
"David Chang wearing a suit"
]
images_a = [
"https://upload.wikimedia.org/wikipedia/commons/8/85/David_Chang_David_Shankbone_2010.jpg",
"https://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Anthony_Bourdain_Peabody_2014b.jpg/440px-Anthony_Bourdain_Peabody_2014b.jpg",
]
images_b = [
"https://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Anthony_Bourdain_Peabody_2014b.jpg/440px-Anthony_Bourdain_Peabody_2014b.jpg",
"https://upload.wikimedia.org/wikipedia/commons/8/85/David_Chang_David_Shankbone_2010.jpg"
]
interface = AnnotatorInterFace.for_image_generation_preference(
prompts=prompts,
completions_a=images_a,
completions_b=images_b,
fn_model=None, # a callable e.g. (function or transformers pipelines) that returns `str`
fn_next_input=None, # a function that feeds gradio components actively with the next input
csv_logger=False, # True if you want to log to a CSV
dataset_name=None # "<my_hf_org>/<my_dataset>" if you want to log to the hub
)
interface.launch()
Synthesizer
Built on top of distilabel to synthesize data with models in the loop.
[!TIP] You can also call the synthesizer directly to generate data. Use synthesizer() -> Tuple or Synthesizer.batch_synthesize(n) -> List[Tuple] to get inputs for the various tasks.
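For example, a minimal sketch of calling a synthesizer directly, based only on the signatures above; the prompt_context value is just the one used in the example below.

from dataset_viber.synthesizer import Synthesizer

synthesizer = Synthesizer.for_text_classification(prompt_context="IMDB movie reviews")

next_input = synthesizer()               # Tuple with the next synthetic input
batch = synthesizer.batch_synthesize(5)  # List of 5 such Tuples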
text-classification
from dataset_viber import AnnotatorInterFace
from dataset_viber.synthesizer import Synthesizer
synthesizer = Synthesizer.for_text_classification(prompt_context="IMDB movie reviews")
interface = AnnotatorInterFace.for_text_classification(
fn_next_input=synthesizer,
labels=["positive", "negative"]
)
interface.launch()
text-generation
from dataset_viber import AnnotatorInterFace
from dataset_viber.synthesizer import Synthesizer
synthesizer = Synthesizer.for_text_generation(
prompt_context="A phone company customer support expert"
)
interface = AnnotatorInterFace.for_text_generation(
fn_next_input=synthesizer
)
interface.launch()
chat-classification
from dataset_viber import AnnotatorInterFace
from dataset_viber.synthesizer import Synthesizer
synthesizer = Synthesizer.for_chat_classification(
prompt_context="A phone company customer support expert"
)
interface = AnnotatorInterFace.for_chat_classification(
fn_next_input=synthesizer,
labels=["positive", "negative"]
)
interface.launch()
chat-generation
from dataset_viber import AnnotatorInterFace
from dataset_viber.synthesizer import Synthesizer
synthesizer = Synthesizer.for_chat_generation(
prompt_context="A phone company customer support expert"
)
interface = AnnotatorInterFace.for_chat_generation(
fn_next_input=synthesizer
)
interface.launch()
chat-generation-preference
from dataset_viber import AnnotatorInterFace
from dataset_viber.synthesizer import Synthesizer
synthesizer = Synthesizer.for_chat_generation_preference(prompt_context="A phone company customer support expert")
interface = AnnotatorInterFace.for_chat_generation_preference(
fn_next_input=synthesizer
)
interface.launch()
image-classification
from dataset_viber import AnnotatorInterFace
from dataset_viber.synthesizer import Synthesizer
synthesizer = Synthesizer.for_image_classification(prompt_context="A phone company customer support expert")
interface = AnnotatorInterFace.for_image_classification(
fn_next_input=synthesizer,
labels=["positive", "negative"]
)
interface.launch()
image-generation
from dataset_viber import AnnotatorInterFace
from dataset_viber.synthesizer import Synthesizer
synthesizer = Synthesizer.for_image_generation(prompt_context="A phone company customer support expert")
interface = AnnotatorInterFace.for_image_generation(
fn_next_input=synthesizer
)
interface.launch()
image-description
from dataset_viber import AnnotatorInterFace
from dataset_viber.synthesizer import Synthesizer
synthesizer = Synthesizer.for_image_description(prompt_context="A phone company customer support expert")
interface = AnnotatorInterFace.for_image_description(
fn_next_input=synthesizer
)
interface.launch()
image-question-answering
from dataset_viber import AnnotatorInterFace
from dataset_viber.synthesizer import Synthesizer
synthesizer = Synthesizer.for_image_question_answering(prompt_context="A phone company customer support expert")
interface = AnnotatorInterFace.for_image_question_answering(
fn_next_input=synthesizer
)
interface.launch()
image-generation-preference
from dataset_viber import AnnotatorInterFace
from dataset_viber.synthesizer import Synthesizer
synthesizer = Synthesizer.for_image_generation_preference(prompt_context="A phone company customer support expert")
interface = AnnotatorInterFace.for_image_generation_preference(
fn_next_input=synthesizer
)
interface.launch()
BulkInterface
Built on top of Dash, plotly-express, umap-learn and fast-sentence-transformers to embed your data, explore its distribution and annotate it in bulk.
https://github.com/user-attachments/assets/5e96c06d-e37f-45a0-9633-1a8e714d71ed
text-visualization
from dataset_viber import BulkInterface
from datasets import load_dataset
ds = load_dataset("SetFit/ag_news", split="train[:2000]")
interface: BulkInterface = BulkInterface.for_text_visualization(
ds.to_pandas()[["text", "label_text"]],
content_column='text',
label_column='label_text',
)
interface.launch()
text-classification
from dataset_viber import BulkInterface
from datasets import load_dataset
ds = load_dataset("SetFit/ag_news", split="train[:2000]")
df = ds.to_pandas()[["text", "label_text"]]
interface = BulkInterface.for_text_classification(
dataframe=df,
content_column='text',
label_column='label_text',
labels=df['label_text'].unique().tolist()
)
interface.launch()
chat-visualization
from dataset_viber.bulk import BulkInterface
from datasets import load_dataset
ds = load_dataset("argilla/distilabel-capybara-dpo-7k-binarized", split="train[:1000]")
df = ds.to_pandas()[["chosen"]]
interface = BulkInterface.for_chat_visualization(
dataframe=df,
chat_column='chosen',
)
interface.launch()
chat-classification
from dataset_viber.bulk import BulkInterface
from datasets import load_dataset
ds = load_dataset("argilla/distilabel-capybara-dpo-7k-binarized", split="train[:1000]")
df = ds.to_pandas()[["chosen"]]
interface = BulkInterface.for_chat_classification(
dataframe=df,
chat_column='chosen',
labels=["math", "science", "history", "question seeking"],
)
interface.launch()
Utils
Shuffle inputs in the same order
When working with multiple input lists, you might want to shuffle them in the same order so that items at the same index stay paired.
import random

def shuffle_lists(*lists):
if not lists:
return []
# Get the length of the first list
length = len(lists[0])
# Check if all lists have the same length
if not all(len(lst) == length for lst in lists):
raise ValueError("All input lists must have the same length")
# Create a list of indices and shuffle it
indices = list(range(length))
random.shuffle(indices)
# Reorder each list based on the shuffled indices
return [
[lst[i] for i in indices]
for lst in lists
]
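For example, a small hypothetical sketch; every list comes back reordered with the same permutation:

texts = ["a", "b", "c"]
labels = ["x", "y", "z"]
texts, labels = shuffle_lists(texts, labels)
# e.g. ["c", "a", "b"] and ["z", "x", "y"]: texts[i] still matches labels[i]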
Randomly swap completions
When working with multiple completion lists, you might want to swap completions across the lists at each index, i.e. each completion at index x is randomly exchanged with a completion at the same index from another list. This is useful for preference learning.
import random

def swap_completions(*lists):
# Assuming all lists are of the same length
length = len(lists[0])
# Check if all lists have the same length
if not all(len(lst) == length for lst in lists):
raise ValueError("All input lists must have the same length")
# Convert the input lists (which are tuples) to a list of lists
lists = [list(lst) for lst in lists]
# Iterate over each index
for i in range(length):
# Get the elements at index i from all lists
elements = [lst[i] for lst in lists]
# Randomly shuffle the elements
random.shuffle(elements)
# Assign the shuffled elements back to the lists
for j, lst in enumerate(lists):
lst[i] = elements[j]
return lists
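For example, a small hypothetical sketch with two completion lists like the ones in the preference examples above; after the call, which list holds which completion at a given index is random:

completions_a = ["A1", "A2", "A3"]
completions_b = ["B1", "B2", "B3"]
completions_a, completions_b = swap_completions(completions_a, completions_b)
# per index, the original A/B assignment is shuffled, so there is no fixed A/B ordering during annotation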
Load remote image URLs from the Hugging Face Hub
When working with images, you might want to load remote URLs from the Hugging Face Hub instead of handling local files.
from datasets import Image, load_dataset

dataset = load_dataset(
    "my_hf_org/my_image_dataset",
    split="train",  # load a concrete split so records can be indexed directly
).cast_column("my_image_column", Image(decode=False))
dataset[0]["my_image_column"]
# {'bytes': None, 'path': 'path_to_image.jpg'}
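A hedged follow-up sketch: assuming my_image_column stores remote paths/URLs (as the tips above suggest), you can collect them and pass them straight to an annotator; the repo id, column name and labels are placeholders.

from dataset_viber import AnnotatorInterFace

# gather the undecoded paths/URLs exposed by Image(decode=False)
image_urls = [record["my_image_column"]["path"] for record in dataset]

interface = AnnotatorInterFace.for_image_classification(
    images=image_urls,
    labels=["label-a", "label-b"],  # placeholder labels
)
interface.launch()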
Contributing and development setup

First, install PDM.

Then, install the environment; this automatically creates the .venv virtual environment and installs the development dependencies.

pdm install

Lastly, install the pre-commit hooks so formatting runs on every commit.

pre-commit install

Follow this guide to make your first contribution.
References
Logo
Inspiration
- https://hugging-face.cn/spaces/davidberenstein1957/llm-human-feedback-collector-chat-interface-dpo
- https://hugging-face.cn/spaces/davidberenstein1957/llm-human-feedback-collector-chat-interface-kto
- https://medium.com/@oxenai/collecting-data-from-human-feedback-for-generative-ai-ec9e20bf01b9
- https://hamel.dev/notes/llm/finetuning/04_data_cleaning.html