reprospect.tools.binaries.nvdisasm module
- class reprospect.tools.binaries.nvdisasm.Function(registers: dict[RegisterType, tuple[int, int]] | None = None)
Bases: TableMixin
Data structure holding resource usage information of a kernel, as extracted from a binary.
registers holds detailed register usage information per register type. Each entry is a tuple holding:
  - the length of the span of used registers, i.e., the maximum register index + 1
  - the number of registers actually used within that span
For instance, if a kernel uses registers R0, R1, and R3, then the entry for reprospect.tools.sass.decode.RegisterType.GPR will be (4, 3) because the span R0-R3 contains 4 registers, of which 3 are actually used.
Note
It is not decorated with dataclasses.dataclass() because of https://github.com/mypyc/mypyc/issues/1061.
- __init__(registers: dict[RegisterType, tuple[int, int]] | None = None) -> None
- to_table() -> Table
Convert the register usage to a rich.table.Table.
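The (span, count) convention used by registers can be illustrated with a minimal sketch. The helper span_and_count below is hypothetical and not part of reprospect; returning (0, 0) for an empty set of registers is an assumption made for this illustration.

```python
def span_and_count(used_indices):
    """Return (span length, number of registers used) for a collection of register indices."""
    used = set(used_indices)
    if not used:
        # Assumption for this sketch: no register used maps to (0, 0).
        return (0, 0)
    # The span runs from index 0 up to the highest used index, inclusive.
    return (max(used) + 1, len(used))


# A kernel using R0, R1 and R3: the span R0-R3 has 4 registers, 3 of which are used.
print(span_and_count([0, 1, 3]))
```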
- class reprospect.tools.binaries.nvdisasm.NVDisasm(file: pathlib.Path, arch: reprospect.tools.architecture.NVIDIAArch | None = None, demangler: type[reprospect.tools.binaries.demangle.CuppFilt | reprospect.tools.binaries.demangle.LlvmCppFilt] = <class 'reprospect.tools.binaries.demangle.CuppFilt'>)
Bases: object
Extract information from CUDA binaries using nvdisasm.
The main purpose of nvdisasm is to disassemble CUDA binary files. Beyond the raw disassembly, it can also annotate the disassembled SASS with additional information, such as register liveness ranges. nvdisasm provides liveness ranges for all register types: GPR, PRED, UGPR, UPRED; see also reprospect.tools.sass.decode.RegisterType.
This class provides functionality to parse this register liveness range information to deduce how many registers each kernel uses.
Note that the register usage information extracted by reprospect.tools.binaries.CuObjDump concerns only the reprospect.tools.sass.decode.RegisterType.GPR register type. Compared with reprospect.tools.binaries.CuObjDump, this class provides details for all register types.
Note that register liveness range information can also be obtained by parsing the SASS code extracted by reprospect.tools.binaries.CuObjDump. However, to implement such a parser, it is not sufficient to simply track the registers that appear in the SASS code. For certain instructions, operands span multiple consecutive registers, but only the first register index appears in the instruction string. For instance:
  - In STG.E desc[UR6][R6.64], R15, the memory address operand [R6.64] uses two consecutive registers, namely R6-R7, but only R6 appears explicitly.
  - In LDCU.64 UR8, c[0x0][0x3d8], the modifier 64 indicates that the destination is the two consecutive registers UR8-UR9, but only UR8 appears explicitly.
  - In IMAD.WIDE.U32 R2, R0, 0x4, R8, the modifier WIDE indicates that R2 and R8 are twice as wide as R0 and 0x4. Hence, the destination and the addend use R2-R3 and R8-R9, but only R2 and R8 appear explicitly.
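The three instructions above can be used to show why scanning for register names is not enough. The sketch below is only an illustration: naive_registers is a hypothetical helper, and the "documented" sets simply restate the register pairs given in the list above.

```python
import re

# Matches general-purpose (R<n>) and uniform (UR<n>) register names as whole tokens.
REGISTER = re.compile(r'\b(?:UR|R)\d+\b')


def naive_registers(instruction: str) -> set[str]:
    """Collect only the register names that literally appear in the instruction string."""
    return set(REGISTER.findall(instruction))


# Register pairs that each instruction actually touches, as stated in the list above.
documented = {
    'STG.E desc[UR6][R6.64], R15': {'R6', 'R7'},
    'LDCU.64 UR8, c[0x0][0x3d8]': {'UR8', 'UR9'},
    'IMAD.WIDE.U32 R2, R0, 0x4, R8': {'R2', 'R3', 'R8', 'R9'},
}

# Registers that a purely textual scan fails to see.
missed = {instr: regs - naive_registers(instr) for instr, regs in documented.items()}
for instr, regs in missed.items():
    print(f'{instr!r}: missed {sorted(regs)}')
```

In each case the textual scan misses the second register of a pair, which is the gap that the liveness annotations of nvdisasm close.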
There are also complexities such as tracking register usage across function calls. Consequently, to deduce the register usage, this class relies on parsing the register annotations provided by nvdisasm, rather than on implementing its own logic to infer register usage from dumped SASS code.
References:
- TABLE_BEGIN_END: Final[Pattern[str]] = re.compile('(?:\\.[A-Za-z0-9_]+:)?[ ]+\\/\\/ \\+[\\-\\+]+\\+$')
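As a quick check of what TABLE_BEGIN_END recognizes, the sketch below recompiles the pattern shown above and applies it to a few sample lines. The sample lines and the label .L_x_0 are made up for illustration, and the use of match (rather than search or fullmatch) is an assumption about how the pattern is applied.

```python
import re

# Same pattern as documented above, written as a raw string.
TABLE_BEGIN_END = re.compile(r'(?:\.[A-Za-z0-9_]+:)?[ ]+\/\/ \+[\-\+]+\+$')

lines = [
    '        // +--------------------+--------+----------------+',  # table border
    '.L_x_0: // +----+--+',                                         # border after a label
    '        // |        GPR         |  PRED  |      UGPR      |',  # header row, not a border
    '        IMAD.WIDE.U32 R2, R0, 0x4, R8 ;',                      # plain instruction
]

# Only the border lines, with or without a leading label, are expected to match.
matches = [bool(TABLE_BEGIN_END.match(line)) for line in lines]
print(matches)
```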
- __init__(file: pathlib.Path, arch: reprospect.tools.architecture.NVIDIAArch | None = None, demangler: type[reprospect.tools.binaries.demangle.CuppFilt | reprospect.tools.binaries.demangle.LlvmCppFilt] = <class 'reprospect.tools.binaries.demangle.CuppFilt'>) -> None
- Parameters:
arch – Optionally check that file is a CUDA binary file for that arch.
- __str__() -> str
Rich representation.
- extract_register_usage_from_liveness_range_info(mangled: Iterable[str]) -> None
Extract register usage from liveness range information.
- classmethod parse_sass_with_liveness_range_info(function_mangled: str, sass: Iterator[str]) -> Function
Parse the SASS with the liveness range information to extract the resource usage.
It typically looks like:
    // +--------------------+--------+----------------+
    // |         GPR        |  PRED  |      UGPR      |
    // | # 0 1 2 3 4 5 6 7  | # 0    | # 0 1 2 3 4 5  |
    // +--------------------+--------+----------------+
    // |                    |        |                |
    // | 1 ^                |        |                |
    // | 2 ^ :              |        |                |
    // | 2 : :              |        | 1 ^            |
    // | 2 v :              |        | 1 :            |
    // +--------------------+--------+----------------+
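One way to turn such a table into per-type (span, count) pairs is sketched below. This is not the parser implemented by this class: parse_liveness_table is a hypothetical helper, the sample table is made up, and the sketch assumes a fixed-width layout in which each state character sits directly under its register index. Reporting (0, 0) for a register type that is never used is also an assumption.

```python
import re


def parse_liveness_table(lines):
    """Return {register type: (span, count)} from a liveness table given as text lines."""
    # Keep the text after the '//' comment marker, then only the rows framed by '|'.
    rows = [line.split('//', 1)[1].strip() for line in lines if '//' in line]
    rows = [row.strip('|').split('|') for row in rows if row.startswith('|')]
    names, indices, *states = rows
    usage = {}
    for col, name in enumerate(names):
        # Character range occupied by each register index within its column.
        slots = [(int(m.group()), m.span()) for m in re.finditer(r'\d+', indices[col])]
        used = set()
        for row in states:
            cell = row[col]
            # A register counts as used when its slot holds any non-blank state character.
            used |= {idx for idx, (a, b) in slots if cell[a:b].strip()}
        usage[name.strip()] = (max(used) + 1, len(used)) if used else (0, 0)
    return usage


# Hypothetical table in which R0, R1, R3 and UR0 are live at some point.
table = [
    '        // +--------------------+--------+----------------+',
    '        // |        GPR         |  PRED  |      UGPR      |',
    '        // | # 0 1 2 3 4 5 6 7  | # 0    | # 0 1 2 3 4 5  |',
    '        // +--------------------+--------+----------------+',
    '        // |                    |        |                |',
    '        // | 1 ^                |        |                |',
    '        // | 2 : ^              |        |                |',
    '        // | 3 : :   ^          |        | 1 ^            |',
    '        // | 2 v     :          |        | 1 v            |',
    '        // +--------------------+--------+----------------+',
]
usage = parse_liveness_table(table)
print(usage)
```

The leading number in each row (the count of live registers on that line) is ignored here, because it sits under the # column rather than under a register index.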
- class reprospect.tools.binaries.nvdisasm.RegisterState(*values)
Bases: StrEnum
Register state, typically found in the output of nvdisasm.
References:
- ASSIGNMENT = '^'
- IN_USE = ':'
- NOT_IN_USE = ' '
- USAGE = 'v'
- USAGE_AND_REASSIGNMENT = 'x'
- __str__()
Return str(self).
- property used: bool
Whether the state corresponds to a state in which the register is in use.
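The mapping from table characters to states can be sketched with a stand-in enumeration that mirrors the member values listed above. The stand-in derives from str and Enum rather than StrEnum so that it runs on older Python versions, and its used logic (every state except NOT_IN_USE) is an assumption, not the confirmed behavior of the real property.

```python
from enum import Enum


class RegisterState(str, Enum):
    """Stand-in mirroring the documented member values; not the reprospect class itself."""
    ASSIGNMENT = '^'
    IN_USE = ':'
    NOT_IN_USE = ' '
    USAGE = 'v'
    USAGE_AND_REASSIGNMENT = 'x'

    @property
    def used(self) -> bool:
        # Assumption: every state except NOT_IN_USE counts as "in use".
        return self is not RegisterState.NOT_IN_USE


# Look up states by the characters that appear in a liveness table.
states = [RegisterState(ch) for ch in '^: vx']
print([(state.name, state.used) for state in states])
```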