ELF

class tests.tools.binaries.test_elf.TestCUDART

Bases: object

Tests for reprospect.tools.binaries.elf.ELF using the CUDA runtime shared library.

cudart(cmake_file_api: FileAPI) → Path

test_embedded_cubin(cudart: Path) → None

The CUDA runtime shared library does not contain embedded cubins.

test_shared_library(cudart: Path) → None

The CUDA runtime shared library is a shared library and not itself a cubin.

class tests.tools.binaries.test_elf.TestCuBLAS

Bases: object

Tests for reprospect.tools.binaries.elf.ELF using the cuBLAS shared library.

cublas(cmake_file_api: FileAPI) → CuBLAS

test_embedded_cubin(parameters: Parameters, cublas: CuBLAS, workdir: Path) → None

The cuBLAS shared library contains embedded cubins for many, but not all, architectures.

test_embedded_cubin_cuinfo_and_tkinfo(parameters: Parameters, cublas: CuBLAS, workdir: Path) → None

Retrieve the cuinfo and tkinfo note sections from a cuBLAS cubin.

test_shared_library(cublas: CuBLAS) → None

The cuBLAS shared library is a shared library and not itself a cubin.

class tests.tools.binaries.test_elf.TestGetComputeCapabilityFromEFlags

Bases: object

CC_E_FLAGS: Final[dict[int, int]] = {70: 4588870, 75: 4916555, 80: 5244240, 86: 5637462, 89: 5834073, 90: 5899610, 100: 100688898, 103: 100689666, 110: 100691458, 120: 100694018, 121: 100694274}

Values of e_flags obtained by calling cuobjdump --dump-elf on a cubin and reading the value of flags from the header for each architecture.
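Written in hexadecimal, the values above appear to follow two layouts: for compute capabilities up to 90 the capability number sits in the low byte (for example 70 gives 0x460546), while from 100 onward it sits in bits 8 to 15 and the top byte is non-zero (for example 100 gives 0x6006402). The sketch below is inferred only from these eleven values, not from the reprospect implementation or from NVIDIA documentation; `decode_cc` is a hypothetical helper and the top-byte test is an assumption that happens to separate the two groups.

```python
# Values copied from CC_E_FLAGS above.
CC_E_FLAGS = {
    70: 4588870, 75: 4916555, 80: 5244240, 86: 5637462, 89: 5834073,
    90: 5899610, 100: 100688898, 103: 100689666, 110: 100691458,
    120: 100694018, 121: 100694274,
}


def decode_cc(e_flags: int) -> int:
    """Hypothetical decoder inferred from the table, not the library's own code."""
    if e_flags >> 24:
        # Layout seen for 100 and above: the number sits in bits 8-15.
        return (e_flags >> 8) & 0xFF
    # Layout seen for 90 and below: the number sits in bits 0-7.
    return e_flags & 0xFF


for cc, e_flags in CC_E_FLAGS.items():
    # Every entry of the table round-trips under this assumed layout.
    assert decode_cc(e_flags) == cc, (cc, hex(e_flags))
```

The heuristic is only known to hold for the architectures listed; other toolkits or future architectures may encode e_flags differently.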

test(cc: int, e_flags: int) → None

test_get_compute_capability_from_e_flags(cc: int, e_flags: int) → None

class tests.tools.binaries.test_elf.TestNvInfo

Bases: object

Tests for reprospect.tools.binaries.elf.NvInfo.

DATA_0: Final[bytes] = b'\x047\x04\x00\x82\x00\x00\x00\x015\x00\x00\x04\n\x08\x00\t\x00\x00\x00`\x018\x00\x03\x198\x00\x04\x17\x0c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xf0\xe1\x00\x03\x1b\xff\x00\x03_\x00\x00\x04\x1c\x08\x00`\x00\x00\x00p\x01\x00\x00\x04\x05\x0c\x00\x01\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00\x04\x1e\x04\x00\x00\x00\x00\x00'
DATA_1: Final[bytes] = b'\x047\x04\x00\x82\x00\x00\x00\x015\x00\x00\x04\n\x08\x00\x0c\x00\x00\x00`\x01x\x00\x03\x19x\x00\x04\x17\x0c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xf0\xe1\x01\x03\x1b\xff\x00\x03_\x00\x00\x04\x1c\x04\x00\x10\x00\x00\x00'
DATA_2: Final[bytes] = b'\x047\x04\x00\x82\x00\x00\x00\x015\x00\x00\x04\n\x08\x00\x0f\x00\x00\x00`\x01x\x00\x03\x19x\x00\x04\x17\x0c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xf0\xe1\x01\x03\x1b\xff\x00\x03_\x00\x00\x04\x1c\x04\x00\x10\x00\x00\x00'
DATA_3: Final[bytes] = b'\x046\x04\x00\x01\x00\x00\x00\x047\x04\x00\x80\x00\x00\x00\x04\n\x08\x00\x02\x00\x00\x00`\x01\x1c\x00\x03\x19\x1c\x00\x04\x17\x0c\x00\x00\x00\x00\x00\x03\x00\x18\x00\x00\xf0\x11\x00\x04\x17\x0c\x00\x00\x00\x00\x00\x02\x00\x10\x00\x00\xf0!\x00\x04\x17\x0c\x00\x00\x00\x00\x00\x01\x00\x08\x00\x00\xf0!\x00\x04\x17\x0c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xf0\x11\x00\x03\x1b\xff\x00\x041\x04\x00\x10\x00\x00\x00\x04\x1c\x08\x00`\x00\x00\x00\xe0\x00\x00\x00'
test(version, parameters: Parameters, cmake_file_api: FileAPI, workdir: Path) → None

Extract the .nv.info.<mangled> section of the kernel.

test_parse_data_0() → None

Parse DATA_0.

test_parse_data_1() → None

Parse DATA_1.

test_parse_data_2() → None

Parse DATA_2.

test_parse_data_3() → None

Parse DATA_3.

cuobjdump parses DATA_3 as:

<0x1>
Attribute: EIATTR_SW_WAR
Format: EIFMT_SVAL
Value: 0x1
<0x2>
Attribute: EIATTR_CUDA_API_VERSION
Format: EIFMT_SVAL
Value: 0x80
<0x3>
Attribute: EIATTR_PARAM_CBANK
Format: EIFMT_SVAL
Value: 0x2 0x1c0160
<0x4>
Attribute: EIATTR_CBANK_PARAM_SIZE
Format: EIFMT_HVAL
Value: 0x1c
<0x5>
Attribute: EIATTR_KPARAM_INFO
Format: EIFMT_SVAL
Value: Index : 0x0 Ordinal : 0x3 Offset  : 0x18 Size    : 0x4
    Pointee's logAlignment : 0x0 Space : 0x0 cbank : 0x1f Parameter Space : CBANK
<0x6>
Attribute: EIATTR_KPARAM_INFO
Format: EIFMT_SVAL
Value: Index : 0x0 Ordinal : 0x2 Offset  : 0x10 Size    : 0x8
    Pointee's logAlignment : 0x0 Space : 0x0 cbank : 0x1f Parameter Space : CBANK
<0x7>
Attribute: EIATTR_KPARAM_INFO
Format: EIFMT_SVAL
Value: Index : 0x0 Ordinal : 0x1 Offset  : 0x8 Size    : 0x8
    Pointee's logAlignment : 0x0 Space : 0x0 cbank : 0x1f Parameter Space : CBANK
<0x8>
Attribute: EIATTR_KPARAM_INFO
Format: EIFMT_SVAL
Value: Index : 0x0 Ordinal : 0x0 Offset  : 0x0 Size    : 0x4
    Pointee's logAlignment : 0x0 Space : 0x0 cbank : 0x1f Parameter Space : CBANK
<0x9>
Attribute: EIATTR_MAXREG_COUNT
Format: EIFMT_HVAL
Value: 0xff
<0x10>
Attribute: EIATTR_INT_WARP_WIDE_INSTR_OFFSETS
Format: EIFMT_SVAL
Value: 0x10
<0x11>
Attribute: EIATTR_EXIT_INSTR_OFFSETS
Format: EIFMT_SVAL
Value: 0x60 0xe0
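Lining the bytes of DATA_3 up with the listing above suggests a simple record layout: a format byte, an attribute byte, and a little-endian 16-bit field. For EIFMT_HVAL the field is the value itself; for EIFMT_SVAL it is the payload size in bytes, and the payload follows. The sketch below walks DATA_3 under that assumption. It is not the reprospect.tools.binaries.elf.NvInfo implementation: `Entry` and `parse` are hypothetical names, and the numeric format and attribute codes are inferred purely from this one sample (code 0x01, taken here to be a value-less format, only occurs in DATA_0 to DATA_2).

```python
import struct
from typing import NamedTuple

# DATA_3 copied from above, split at the record boundaries assumed below.
DATA_3 = (
    b'\x046\x04\x00\x01\x00\x00\x00'
    b'\x047\x04\x00\x80\x00\x00\x00'
    b'\x04\n\x08\x00\x02\x00\x00\x00`\x01\x1c\x00'
    b'\x03\x19\x1c\x00'
    b'\x04\x17\x0c\x00\x00\x00\x00\x00\x03\x00\x18\x00\x00\xf0\x11\x00'
    b'\x04\x17\x0c\x00\x00\x00\x00\x00\x02\x00\x10\x00\x00\xf0!\x00'
    b'\x04\x17\x0c\x00\x00\x00\x00\x00\x01\x00\x08\x00\x00\xf0!\x00'
    b'\x04\x17\x0c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xf0\x11\x00'
    b'\x03\x1b\xff\x00'
    b'\x041\x04\x00\x10\x00\x00\x00'
    b'\x04\x1c\x08\x00`\x00\x00\x00\xe0\x00\x00\x00'
)

# Format codes (assumed from the sample bytes).
FMT_NVAL, FMT_HVAL, FMT_SVAL = 0x01, 0x03, 0x04

# Attribute codes inferred by matching DATA_3 against the cuobjdump listing.
ATTRIBUTES = {
    0x36: "EIATTR_SW_WAR",
    0x37: "EIATTR_CUDA_API_VERSION",
    0x0A: "EIATTR_PARAM_CBANK",
    0x19: "EIATTR_CBANK_PARAM_SIZE",
    0x17: "EIATTR_KPARAM_INFO",
    0x1B: "EIATTR_MAXREG_COUNT",
    0x31: "EIATTR_INT_WARP_WIDE_INSTR_OFFSETS",
    0x1C: "EIATTR_EXIT_INSTR_OFFSETS",
}


class Entry(NamedTuple):
    fmt: int
    attribute: int
    values: tuple  # raw integers; empty when the format carries no value


def parse(data: bytes) -> list:
    """Split a .nv.info blob into entries under the layout assumed above."""
    entries, offset = [], 0
    while offset < len(data):
        fmt, attribute, field = struct.unpack_from("<BBH", data, offset)
        offset += 4
        if fmt == FMT_SVAL:
            if field % 4:
                raise ValueError(f"unexpected payload size {field}")
            # Payload kept as raw 32-bit little-endian words, undecoded.
            values = struct.unpack_from(f"<{field // 4}I", data, offset)
            offset += field
        elif fmt == FMT_HVAL:
            values = (field,)
        elif fmt == FMT_NVAL:
            values = ()
        else:
            raise ValueError(f"unhandled format {fmt:#x}")
        entries.append(Entry(fmt, attribute, values))
    return entries


for entry in parse(DATA_3):
    print(ATTRIBUTES[entry.attribute], [hex(v) for v in entry.values])
```

Under these assumptions the walk yields eleven entries, matching the number of records cuobjdump reports. The EIATTR_KPARAM_INFO payloads are left as raw words; unpacking them into ordinal, offset and size fields is not attempted here.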
version() → Version

class tests.tools.binaries.test_elf.TestSaxpy

Bases: object

Tests for reprospect.tools.binaries.elf.ELF using an object file.

FILE: Final[Path] = PosixPath('/__w/reprospect/reprospect/tests/tools/assets/saxpy.cu')
object_file(parameters: Parameters, cmake_file_api: FileAPI, workdir: Path) → Path

Compile into object file.

pytestmark = [Mark(name='parametrize', args=('parameters', (Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.VOLTA: 'VOLTA'>, compute_capability=ComputeCapability(major=7, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.TURING: 'TURING'>, compute_capability=ComputeCapability(major=7, minor=5))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=6))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.ADA: 'ADA'>, compute_capability=ComputeCapability(major=8, minor=9))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.HOPPER: 'HOPPER'>, compute_capability=ComputeCapability(major=9, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=10, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=12, minor=0))))), kwargs={'ids': <class 'str'>})]
test_embedded_cubin_cuinfo_and_tkinfo(parameters: Parameters, object_file: Path, workdir: Path, cmake_file_api: FileAPI) → None

Retrieve the cuinfo and tkinfo note sections from a compiled output.

test_object_file_and_embedded_cubin(parameters: Parameters, object_file: Path, workdir: Path) → None

Check that the object file:

  • is a relocatable object file and not itself a cubin

  • contains an embedded cubin for the target architecture
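The distinction between a relocatable, a shared library and a cubin can be read from two fields of the standard ELF header: e_type (ET_REL is 1, ET_DYN is 3) and e_machine (EM_CUDA is 190). The following is a minimal sketch of such a check using only the standard library. It does not show how reprospect.tools.binaries.elf.ELF makes this decision; `describe` and `fake_header` are hypothetical helpers, and treating e_machine == EM_CUDA as the cubin marker is an assumption of this sketch.

```python
import struct

ET_REL, ET_DYN = 1, 3  # standard ELF e_type values
EM_X86_64 = 62         # standard ELF e_machine value for x86-64
EM_CUDA = 190          # standard ELF e_machine value for NVIDIA CUDA


def read_type_and_machine(header: bytes) -> tuple:
    """Return (e_type, e_machine) from the first 20 bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] is byte 5: 1 means little-endian, 2 means big-endian.
    order = "<" if header[5] == 1 else ">"
    # e_type and e_machine are 16-bit fields right after the 16-byte e_ident.
    return struct.unpack_from(order + "HH", header, 16)


def describe(header: bytes) -> str:
    e_type, e_machine = read_type_and_machine(header)
    if e_machine == EM_CUDA:
        return "cubin"  # assumption: any EM_CUDA ELF is treated as a cubin
    return {ET_REL: "relocatable", ET_DYN: "shared library"}.get(e_type, "other")


def fake_header(e_type: int, e_machine: int) -> bytes:
    """Synthetic 64-bit little-endian header prefix, for illustration only."""
    e_ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    return e_ident + struct.pack("<HH", e_type, e_machine)


print(describe(fake_header(ET_REL, EM_X86_64)))
print(describe(fake_header(ET_DYN, EM_X86_64)))
print(describe(fake_header(2, EM_CUDA)))
```

On a real file the same check would read the first 20 bytes, for example `describe(open(path, "rb").read(20))`. Finding the embedded cubin additionally requires walking the section headers, which this sketch leaves out.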

tests.tools.binaries.test_elf.check_version(version: int | str) → None

Check version against nvcc version.

Note

As of CUDA 13.1, tools report 13.0.

tests.tools.binaries.test_elf.get_cubin(arch: NVIDIAArch, cublas: CuBLAS, workdir: Path) → Path

tests.tools.binaries.test_elf.get_cuinfo_and_tkinfo(*, arch: NVIDIAArch, file: Path, version: Version = Version('12.8.1')) → tuple[CuInfo | None, TkInfo | None]

Extract cuinfo and tkinfo note sections.

It also checks whether each note section is expected to exist for the given arch and CUDA version.

tests.tools.binaries.test_elf.nvcc_version() → Version