Python接口到LLM。

项目描述

LlamaBot: 一个Python风格的LLM接口

LlamaBot实现了对LLM的Python接口，使得在Jupyter笔记本中实验LLM以及构建利用LLM的Python应用变得更加容易。LlamaBot支持LiteLLM支持的所有的模型。

安装LlamaBot

要安装LlamaBot

pip install llamabot

获取访问LLM的权限

选项1：使用Ollama本地模型

LlamaBot支持通过Ollama使用本地模型。为此，请访问Ollama网站并安装Ollama。然后按照以下说明操作。

选项2：使用API提供商

OpenAI

如果您有OpenAI API密钥，则可以通过运行以下命令配置LlamaBot使用该API密钥：

export OPENAI_API_KEY="sk-your1api2key3goes4here"

Mistral

如果您有Mistral API密钥，则可以通过运行以下命令配置LlamaBot使用该API密钥：

export MISTRAL_API_KEY="your-api-key-goes-here"

其他API提供商

其他API提供商通常指定一个环境变量来设置。如果您有API密钥，请相应地设置环境变量。

使用方法

SimpleBot

LlamaBot最简单的用法是创建一个SimpleBot，不记录聊天历史。这实际上等同于一个无状态的函数，您使用自然语言指令而不是代码来编程。这对于提示实验或创建简单的机器人非常有用，这些机器人根据指令预先配置以处理文本，然后可以重复使用不同的文本进行调用。

使用API提供商的`SimpleBot`

例如，创建一个像Richard Feynman那样解释给定文本的机器人

from llamabot import SimpleBot

system_prompt = "You are Richard Feynman. You will be given a difficult concept, and your task is to explain it back."
feynman = SimpleBot(
  system_prompt,
  model_name="gpt-3.5-turbo"
)

使用GPT，您需要配置OPENAI_API_KEY环境变量。如果您想使用本地Ollama模型的SimpleBot，请查看此示例

现在，feynman 可以在任意文本块上调用，并将以理查德·费曼的风格（或更准确地说，根据 system_prompt 指定的风格）重写该文本。例如

prompt = """
Enzyme function annotation is a fundamental challenge, and numerous computational tools have been developed.
However, most of these tools cannot accurately predict functional annotations,
such as enzyme commission (EC) number,
for less-studied proteins or those with previously uncharacterized functions or multiple activities.
We present a machine learning algorithm named CLEAN (contrastive learning–enabled enzyme annotation)
to assign EC numbers to enzymes with better accuracy, reliability,
and sensitivity compared with the state-of-the-art tool BLASTp.
The contrastive learning framework empowers CLEAN to confidently (i) annotate understudied enzymes,
(ii) correct mislabeled enzymes, and (iii) identify promiscuous enzymes with two or more EC numbers—functions
that we demonstrate by systematic in silico and in vitro experiments.
We anticipate that this tool will be widely used for predicting the functions of uncharacterized enzymes,
thereby advancing many fields, such as genomics, synthetic biology, and biocatalysis.
"""
feynman(prompt)

这将返回类似以下内容

Alright, let's break this down.

Enzymes are like little biological machines that help speed up chemical reactions in our
bodies. Each enzyme has a specific job, or function, and we use something called an
Enzyme Commission (EC) number to categorize these functions.

Now, the problem is that we don't always know what function an enzyme has, especially if
it's a less-studied or new enzyme. This is where computational tools come in. They try
to predict the function of these enzymes, but they often struggle to do so accurately.

So, the folks here have developed a new tool called CLEAN, which stands for contrastive
learning–enabled enzyme annotation. This tool uses a machine learning algorithm, which
is a type of artificial intelligence that learns from data to make predictions or
decisions.

CLEAN uses a method called contrastive learning. Imagine you have a bunch of pictures of
cats and dogs, and you want to teach a machine to tell the difference. You'd show it
pairs of pictures, some of the same animal (two cats or two dogs) and some of different
animals (a cat and a dog). The machine would learn to tell the difference by contrasting
the features of the two pictures. That's the basic idea behind contrastive learning.

CLEAN uses this method to predict the EC numbers of enzymes more accurately than
previous tools. It can confidently annotate understudied enzymes, correct mislabeled
enzymes, and even identify enzymes that have more than one function.

The creators of CLEAN have tested it with both computer simulations and lab experiments,
and they believe it will be a valuable tool for predicting the functions of unknown
enzymes. This could have big implications for fields like genomics, synthetic biology,
and biocatalysis, which all rely on understanding how enzymes work.

使用 `SimpleBot` 和本地 Ollama 模型

如果您想使用本地托管的 Ollama 模型，则可以使用以下语法

from llamabot import SimpleBot

system_prompt = "You are Richard Feynman. You will be given a difficult concept, and your task is to explain it back."
bot = SimpleBot(
    system_prompt,
    model_name="ollama/llama2:13b"
)

只需指定 model_name 关键字参数，格式为 <provider>/<model name>。例如

前缀为 ollama/，以及
来自 Ollama 模型库的模型名称

您需要确保 Ollama 在本地运行；有关更多详细信息，请参阅 Ollama 文档。（同样，也可以对下面的 ChatBot 和 QueryBot 类进行操作！）

model_name 参数是可选的。如果您不提供它，Llamabot 将尝试使用默认模型。您可以在 DEFAULT_LANGUAGE_MODEL 环境变量中配置它。

聊天机器人

为了在 Jupyter Notebook 中进行聊天机器人的实验，我们还提供了 ChatBot 接口。此接口会自动跟踪 Jupyter 会话生命周期内的聊天历史。这样做可以使您将本地的 Jupyter Notebook 用作聊天界面。

例如

from llamabot import ChatBot

system_prompt="You are Richard Feynman. You will be given a difficult concept, and your task is to explain it back."
feynman = ChatBot(
  system_prompt,
  session_name="feynman_chat",
  # Optional:
  # model_name="gpt-3.5-turbo"
  # or
  # model_name="ollama/mistral"
)

有关 model_name 的更多解释，请参阅使用 SimpleBot 的示例。

现在，您已经有一个 ChatBot 实例，您可以开始与之进行对话。

prompt = """
Enzyme function annotation is a fundamental challenge, and numerous computational tools have been developed.
However, most of these tools cannot accurately predict functional annotations,
such as enzyme commission (EC) number,
for less-studied proteins or those with previously uncharacterized functions or multiple activities.
We present a machine learning algorithm named CLEAN (contrastive learning–enabled enzyme annotation)
to assign EC numbers to enzymes with better accuracy, reliability,
and sensitivity compared with the state-of-the-art tool BLASTp.
The contrastive learning framework empowers CLEAN to confidently (i) annotate understudied enzymes,
(ii) correct mislabeled enzymes, and (iii) identify promiscuous enzymes with two or more EC numbers—functions
that we demonstrate by systematic in silico and in vitro experiments.
We anticipate that this tool will be widely used for predicting the functions of uncharacterized enzymes,
thereby advancing many fields, such as genomics, synthetic biology, and biocatalysis.
"""
feynman(prompt)

在可用的聊天历史中，您可以提出后续问题

feynman("Is there a simpler way to rephrase the text such that a high schooler would understand it?")

并且您的机器人将利用聊天历史进行响应。

查询机器人

提供的最后一个机器人是查询机器人。此机器人允许您查询文档集合。要使用它，您有两种选择

传递一个包含文本文件路径的列表，让 Llamabot 为它们创建一个新的集合，或者
传递先前实例化的 QueryBot 模型的 collection_name。（这将加载先前计算好的文本索引到内存中。）

创建新集合的示例

from llamabot import QueryBot
from pathlib import Path

bot = QueryBot(
  system_prompt="You are an expert on Eric Ma's blog.",
  collection_name="eric_ma_blog",
  document_paths=[
    Path("/path/to/blog/post1.txt"),
    Path("/path/to/blog/post2.txt"),
    ...,
  ],
  # Optional:
  # model_name="gpt-3.5-turbo"
  # or
  # model_name="ollama/mistral"
) # This creates a new embedding for my blog text.
result = bot("Do you have any advice for me on career development?")

使用现有集合的示例

from llamabot import QueryBot

bot = QueryBot(
  system_prompt="You are an expert on Eric Ma's blog",
  collection_name="eric_ma_blog",
  # Optional:
  # model_name="gpt-3.5-turbo"
  # or
  # model_name="ollama/mistral"
)  # This loads my previously-embedded blog text.
result = bot("Do you have any advice for me on career development?")

有关 model_name 的更多解释，请参阅使用 SimpleBot 的示例。

图像机器人

随着 OpenAI API 更新的发布，只要您有 OpenAI API 密钥，您就可以使用 LlamaBot 生成图像

from llamabot import ImageBot

bot = ImageBot()
# Within a Jupyter notebook:
url = bot("A painting of a dog.")

# Or within a Python script
filepath = bot("A painting of a dog.")

# Now, you can do whatever you need with the url or file path.

如果您在 Jupyter Notebook 中，您还会看到图像神奇地作为输出单元格的一部分出现。

CLI 示例

Llamabot 包含 CLI 示例，展示了可以使用它构建的内容，以及一些辅助代码。

这里有一个示例，我在命令行中直接使用 llamabot chat 暴露聊天机器人

还有另一个示例，其中 llamabot 被用作 CLI 应用程序的后端，用于使用 llamabot zotero chat 与 Zotero 库聊天

最后，这里有一个示例，我使用 llamabot 的 SimpleBot 创建了一个机器人，该机器人可以自动为我编写提交消息。

缓存

LlamaBot 使用缓存机制来提高性能并减少不必要的 API 调用。默认情况下，所有缓存条目在 1 天后（86400 秒）过期。此行为是通过使用 diskcache 库实现的。

缓存配置

在您使用任何机器人类（SimpleBot、ChatBot 或 QueryBot）时，缓存会自动配置。您无需手动设置缓存。

缓存位置

默认缓存目录位于

~/.llamabot/cache

缓存超时

缓存超时可以通过使用环境变量 LLAMABOT_CACHE_TIMEOUT 进行配置。默认情况下，缓存超时设置为1天（86400秒）。要自定义缓存超时，将环境变量 LLAMABOT_CACHE_TIMEOUT 设置为所需的秒数。例如

export LLAMABOT_CACHE_TIMEOUT=3600

这将设置缓存超时为1小时（3600秒）。

贡献

新功能

欢迎提出新功能！对于大型语言模型的用户来说，这是一个早期且令人兴奋的日子。我们的开发目标是尽可能保持项目简单。带有pull request的功能请求将被优先考虑；功能的实现越简单（就维护负担而言），越有可能被批准。

错误报告

请使用问题跟踪器提交错误报告。

问题/讨论

请使用GitHub上的问题跟踪器。

贡献者

_陆瑞娜 💻	_{andrew giessel} 🤔 🎨 💻	_{艾登·布鲁伊斯} 💻	_{马修·艾瑞克} 🤔 🎨 💻	_{马克·哈里森} 🤔	_reka 📖 💻	_anujsinha3 💻 📖
_{埃利奥特·萨尔兹伯里} 📖	_{Ethan Fricker, PhD} 📖	_{Ikko Eltociear Ashimine} 📖

项目详情

发布历史发布通知 | RSS源

此版本

0.8.1

2024年10月3日

0.8.0

2024年9月27日

0.7.0

2024年9月22日

0.6.3

2024年9月13日

0.6.2

2024年9月10日

0.6.1

2024年9月6日

0.6.0

2024年8月30日

0.5.5

2024年8月10日

0.5.4

2024年8月7日

0.5.3

2024年8月5日

0.5.2

2024年8月1日

0.5.1

2024年7月29日

0.5.0

2024年7月21日

0.4.8

2024年7月16日

0.4.7

2024年7月13日

0.4.6

2024年6月10日

0.4.5

2024年6月3日

0.4.4

2024年4月13日

0.4.3

2024年4月12日

0.4.2

2024年4月8日

0.4.1

2024年3月31日

0.4.0

2024年3月24日

0.3.1

2024年3月16日

0.3.0 已撤回

2024年3月16日

此版本被撤回的原因

发布与仓库中的竞争条件同时发生，因此我们没有正确标记。

0.2.5

2024年3月5日

0.2.4

2024年2月19日

0.2.3

2024年2月18日

0.2.2

2024年2月15日

0.2.1

2024年2月14日

0.2.0

2024年2月9日

0.1.2

2024年1月22日

0.1.1

2024年1月21日

0.1.0

2024年1月11日

0.0.89

2023年11月11日

0.0.88

2023年11月7日

0.0.87

2023年10月31日

0.0.86

2023年10月31日

0.0.85

2023年10月20日

0.0.84

2023年10月7日

0.0.83

2023年10月5日

0.0.82

2023年10月2日

0.0.81

2023年10月2日

0.0.80

2023年9月30日

0.0.79

2023年9月26日

0.0.78

2023年9月25日

0.0.77

2023年9月25日

0.0.76

2023年9月23日

0.0.75

2023年9月21日

0.0.74

2023年9月19日

0.0.73

2023年9月10日

0.0.72

2023年9月10日

0.0.71

2023年8月27日

0.0.70

2023年8月27日

0.0.69

2023年8月24日

0.0.68

2023年8月21日

0.0.67

2023年7月23日

0.0.66

2023年7月23日

0.0.65

2023年7月23日

0.0.64

2023年7月14日

0.0.63

2023年7月13日

0.0.62

2023年7月6日

0.0.61

2023年7月6日

0.0.60

2023年7月6日

0.0.59

2023年7月6日

0.0.58

2023年7月6日

0.0.57

2023年7月6日

0.0.56

2023年7月3日

0.0.55

2023年7月2日

0.0.54

2023年7月2日

0.0.53

2023年7月2日

0.0.52

2023年7月1日

0.0.51

2023年7月1日

0.0.50

2023年6月29日

0.0.49

2023年6月28日

0.0.48

2023年6月28日

0.0.47

2023年6月28日

0.0.46

2023年6月28日

0.0.45

2023年6月28日

0.0.44

2023年6月28日

0.0.43

2023年6月21日

0.0.42

2023年6月21日

0.0.41

2023年6月20日

0.0.40

2023年6月20日

0.0.39

2023年6月20日

0.0.38

2023年6月20日

0.0.37

2023年6月20日

0.0.36

2023年6月20日

0.0.35

2023年6月20日

0.0.34

2023年6月20日

0.0.33

2023年6月20日

0.0.32

2023年6月20日

0.0.31

2023年6月20日

0.0.30

2023年6月20日

0.0.29

2023年6月18日

0.0.28

2023年6月17日

0.0.27

2023年6月16日

0.0.26

2023年6月16日

0.0.25

2023年6月16日

0.0.24

2023年6月16日

0.0.23

2023年6月16日

0.0.22

2023年6月15日

0.0.21

2023年5月12日

0.0.19

2023年5月12日

0.0.18

2023年5月12日

0.0.17

2023年5月9日

0.0.16

2023年5月9日

0.0.15

2023年5月9日

0.0.14

2023年5月7日

0.0.13

2023年5月4日

0.0.12

2023年5月4日

0.0.11

2023年5月1日

0.0.10

2023年4月22日

0.0.9

2023年4月20日

0.0.8

2023年4月16日

0.0.6

2023年4月11日

0.0.5

2023年4月10日

0.0.4

2023年4月3日

0.0.3

2023年4月3日

0.0.2

2023年4月3日

下载文件

下载适用于您平台的文件。如果您不确定选择哪个，请了解更多关于安装包的信息。

源代码分发

llamabot-0.8.1.tar.gz (63.5 kB 查看哈希值)

上传时间 2024年10月3日 源代码

构建分发

llamabot-0.8.1-py3-none-any.whl (69.5 kB 查看哈希值)

上传时间 2024年10月3日 Python 3

哈希值 for llamabot-0.8.1.tar.gz

llamabot-0.8.1.tar.gz的哈希值
算法	哈希摘要
SHA256	`5fbac83d76201563269bf2f468ff7803c10b1e00a8f60fd11bae7f76f8b8bb61`
MD5	`be2ffb1bb0842dc453d18f1ad0a54712`
BLAKE2b-256	`a29a849bac05340350b37a3900df864b91a60bc00eee67c9654ce29dc657c705`

哈希值 for llamabot-0.8.1-py3-none-any.whl

llamabot-0.8.1-py3-none-any.whl的哈希值
算法	哈希摘要
SHA256	`2022266bed177f0c04b09500388d4793d62119b85a019afc60a18845485e7511`
MD5	`d0dce90dacb9b21d182525bd03b39e25`
BLAKE2b-256	`19d1ef7c326b0082edfca7b4c5c6ff7c45708d7ad81c27037dacedb00fa9fd9a`

llamabot 0.8.1

导航

验证详情

维护者

未验证详情

项目链接

元数据

项目描述

LlamaBot: 一个Python风格的LLM接口

安装LlamaBot

获取访问LLM的权限

选项1：使用Ollama本地模型

选项2：使用API提供商

OpenAI

Mistral

其他API提供商

使用方法

SimpleBot

使用API提供商的SimpleBot

使用 SimpleBot 和本地 Ollama 模型

聊天机器人

查询机器人

图像机器人

CLI 示例

缓存

缓存配置

缓存位置

缓存超时

贡献

新功能

错误报告

问题/讨论

贡献者

项目详情

验证详情

维护者

未验证详情

项目链接

元数据

发布历史 发布通知 | RSS源

下载文件

源代码分发

构建分发

使用API提供商的`SimpleBot`

使用 `SimpleBot` 和本地 Ollama 模型

发布历史发布通知 | RSS源