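Efficient Assembly Code Generation in High-Level Python
Project description
PEACH-Py is a Python framework for writing high-performance assembly kernels. It aims to simplify writing optimized assembly kernels while preserving all the optimization opportunities of traditional assembly. Some PEACH-Py features: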
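Automatic register allocation
Stack frame management, including re-aligning the stack frame as needed
Generating versions of a function for different calling conventions from the same source (e.g. functions for the Microsoft x64 ABI and the System V x86-64 ABI can be generated from one source)
Constants can be defined at the place where they are used (just like in high-level languages)
Tracking of the instruction extensions used in a function
Multiplexing of multiple instruction streams (helpful for software pipelining)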
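To make the calling-convention feature concrete: the two ABIs named above pass the first integer or pointer arguments in different registers, so a kernel written against fixed registers cannot be shared between them. The sketch below is plain Python and does not use PEACH-Py. The table `INTEGER_ARGUMENT_REGISTERS` and the helper `argument_register` are illustrative names, not part of the library, and the sketch assumes a function with integer or pointer arguments only.

```python
# Registers used for the first integer/pointer arguments, in order.
# Keys reuse the ABI names from the PEACH-Py example ('x64-sysv', 'x64-ms').
INTEGER_ARGUMENT_REGISTERS = {
    "x64-sysv": ("rdi", "rsi", "rdx", "rcx", "r8", "r9"),
    "x64-ms": ("rcx", "rdx", "r8", "r9"),
}


def argument_register(abi, index):
    """Return the register holding the index-th integer argument,
    or None when that argument is passed on the stack.
    Hypothetical helper for illustration only."""
    registers = INTEGER_ARGUMENT_REGISTERS[abi]
    return registers[index] if index < len(registers) else None


# Show where the three arguments of a function such as
# add_1(src, dst, length) arrive under each ABI.
for abi in ("x64-sysv", "x64-ms"):
    print(abi, [argument_register(abi, i) for i in range(3)])
```

A framework that loads parameters by name rather than by register can hide this difference, and that is the role `LOAD.PARAMETER` appears to play in the example.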
Example
from peachpy.x64 import *

# Use 'x64-ms' for Microsoft x64 ABI
abi = peachpy.c.ABI('x64-sysv')
assembler = Assembler(abi)

# Implement function void add_1(const uint32_t *src, uint32_t *dst, size_t length)
src_argument = peachpy.c.Parameter("src", peachpy.c.Type("const uint32_t*"))
dst_argument = peachpy.c.Parameter("dst", peachpy.c.Type("uint32_t*"))
len_argument = peachpy.c.Parameter("length", peachpy.c.Type("size_t"))

# This optimized kernel will target Intel Nehalem processors. Any instructions which are not
# supported on Intel Nehalem (e.g. AVX instructions) will generate an error. If you don't have
# a particular target in mind, use "Unknown"
with Function(assembler, "add_1", (src_argument, dst_argument, len_argument), "Nehalem"):
    # Load arguments into registers
    srcPointer = GeneralPurposeRegister64()
    LOAD.PARAMETER( srcPointer, src_argument )

    dstPointer = GeneralPurposeRegister64()
    LOAD.PARAMETER( dstPointer, dst_argument )

    length = GeneralPurposeRegister64()
    LOAD.PARAMETER( length, len_argument )

    # Main processing loop. Length must be a non-zero multiple of 4.
    LABEL( 'loop' )

    x = SSERegister()
    MOVDQU( x, [srcPointer] )
    ADD( srcPointer, 16 )

    # Add 1 to x
    PADDD( x, Constant.uint32x4(1) )

    MOVDQU( [dstPointer], x )
    ADD( dstPointer, 16 )

    SUB( length, 4 )
    JNZ( 'loop' )

    RETURN()

print(assembler)
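A plain-Python reference model can help when checking the kernel's output. The sketch below mirrors what the loop above does, assuming the usual `PADDD` semantics (lane-wise 32-bit addition that wraps on overflow) and a length that is a non-zero multiple of 4. `add_1_reference` is a hypothetical helper written for this illustration, not part of PEACH-Py.

```python
def add_1_reference(src):
    """Return a new list with 1 added to every uint32 element,
    wrapping modulo 2**32 as a 32-bit lane-wise add would."""
    if len(src) == 0 or len(src) % 4 != 0:
        raise ValueError("length must be a non-zero multiple of 4")
    dst = []
    # Consume four elements per iteration, like one 16-byte SSE register.
    for offset in range(0, len(src), 4):
        block = src[offset:offset + 4]
        dst.extend((value + 1) & 0xFFFFFFFF for value in block)
    return dst


print(add_1_reference([0, 1, 41, 0xFFFFFFFF]))  # prints [1, 2, 42, 0]
```

Comparing the generated kernel against a model like this on inputs that include `0xFFFFFFFF` exercises the wraparound case as well as the ordinary one.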
Project details
Hashes for PeachPy-0.0.1.zip
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9eaa37c0a914900dd93051080a9510c488fff626a22cc9c855c35045e0ac2843 |
| MD5 | 9e41ae48b72537f7788482ddec6b9958 |
| BLAKE2b-256 | de53b576bafe2554e3ea952a1da2959f124d4d79906bddd82cd3275ff98ad498 |
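One way to use these digests is to hash the downloaded archive locally and compare the result with the published SHA256 value. The sketch below uses only the standard-library `hashlib` module. The helper names `sha256_of_file` and `matches_expected` are illustrative, and the file path in the usage note is an assumption about where the archive was saved.

```python
import hashlib

# SHA256 digest published for PeachPy-0.0.1.zip.
EXPECTED_SHA256 = "9eaa37c0a914900dd93051080a9510c488fff626a22cc9c855c35045e0ac2843"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`,
    reading it in chunks so large files need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True when the file's digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("PeachPy-0.0.1.zip")` would return `True` for an intact download saved under that name in the current directory.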