cuobjdump

class tests.tools.binaries.test_cuobjdump.TestCuObjDump

Bases: object

Tests related to reprospect.tools.binaries.CuObjDump.

class TestCuBLAS

Bases: object

Play with the cuBLAS shared library.

pytestmark = [Mark(name='parametrize', args=('parameters', (Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.VOLTA: 'VOLTA'>, compute_capability=ComputeCapability(major=7, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.TURING: 'TURING'>, compute_capability=ComputeCapability(major=7, minor=5))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=6))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.ADA: 'ADA'>, compute_capability=ComputeCapability(major=8, minor=9))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.HOPPER: 'HOPPER'>, compute_capability=ComputeCapability(major=9, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=10, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=12, minor=0))))), kwargs={'ids': <class 'str'>})]
test_extract_functions(parameters: Parameters, workdir: Path, cmake_file_api: FileAPI) -> None

Extract a subset of all functions from a randomly picked cubin.

Note

Sometimes, cuobjdump still dumps the SASS of unwanted functions: even if --function=… is passed N mangled function names, it may output the SASS for more than N functions.
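One way to cope with extra functions in the dump is to post-filter the listing by function header. The sketch below is only an illustration, not the code used by this test: keep_requested is a hypothetical helper, the "Function : <name>" header format is an assumption about the SASS listing, and SAMPLE is hand-written rather than real cuobjdump output.

```python
import re

# Assumed header format of a SASS listing: "Function : <mangled name>".
FUNCTION_HEADER = re.compile(r"^\s*Function\s*:\s*(?P<name>\S+)\s*$")


def keep_requested(dump: str, wanted: set[str]) -> dict[str, list[str]]:
    """Group the dump lines per function, keeping only functions in `wanted`."""
    kept: dict[str, list[str]] = {}
    current = None  # name of the wanted function being read, if any
    for line in dump.splitlines():
        match = FUNCTION_HEADER.match(line)
        if match:
            name = match.group("name")
            current = name if name in wanted else None
            if current is not None:
                kept[current] = []
        elif current is not None:
            kept[current].append(line)
    return kept


# Hand-written sample (not real tool output): two functions, one requested.
SAMPLE = "\n".join([
    "    Function : _Z6say_hiv",
    "    /*0000*/  MOV R1, c[0x0][0x28] ;",
    "    Function : _Z5otherv",
    "    /*0000*/  EXIT ;",
])

result = keep_requested(SAMPLE, {"_Z6say_hiv"})
```

Here result only contains the entry for _Z6say_hiv, even though the sample listing holds two functions.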

Note

The symbol table of cubin files from cuBLAS may contain internal CUDA runtime helper functions with symbols such as $__internal_11_$__cuda_sm20_div_u64.

While these symbols appear in the ELF symbol table as valid STT_FUNC entries, cuobjdump cannot dump any SASS code for them.

This test therefore filters out any function whose name starts with $__internal.
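The filtering step can be sketched as follows. This is a minimal illustration under assumptions: pick_functions, its seed parameter and the way the subset is drawn are hypothetical and not taken from the test. Only the $__internal prefix and the example symbol come from the note above.

```python
import random

# Prefix of the internal CUDA runtime helpers mentioned in the note above.
INTERNAL_PREFIX = "$__internal"


def pick_functions(symbols: list[str], count: int, seed: int = 0) -> list[str]:
    """Drop internal helpers, then draw at most `count` names at random."""
    candidates = [name for name in symbols if not name.startswith(INTERNAL_PREFIX)]
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    return rng.sample(candidates, min(count, len(candidates)))


symbols = [
    "_Z6say_hiv",
    "$__internal_11_$__cuda_sm20_div_u64",
    "_Z12saxpy_kernelfPKfPfj",
]
picked = pick_functions(symbols, count=2)
```

With this input, picked holds the two regular kernels in some order, and the internal helper is never selected.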

class TestMany

Bases: object

When there are many kernels.

Note

__device__ functions have been inlined.

CPP_FILE: Final[Path] = PosixPath('/__w/reprospect/reprospect/tests/tools/binaries/assets/many.cpp')
CUDA_FILE: Final[Path] = PosixPath('/__w/reprospect/reprospect/tests/tools/binaries/assets/many.cu')
FUNCTIONS: Final[dict[str, str]] = {'_Z20vector_atomic_add_42PKfS0_Pfj': 'vector_atomic_add_42(const float *, const float *, float *, unsigned int)', '_Z6say_hiv': 'say_hi()'}
pytestmark = [Mark(name='parametrize', args=('parameters', (Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.VOLTA: 'VOLTA'>, compute_capability=ComputeCapability(major=7, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.TURING: 'TURING'>, compute_capability=ComputeCapability(major=7, minor=5))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=6))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.ADA: 'ADA'>, compute_capability=ComputeCapability(major=8, minor=9))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.HOPPER: 'HOPPER'>, compute_capability=ComputeCapability(major=9, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=10, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=12, minor=0))))), kwargs={'ids': <class 'str'>})]
test_extract_cubin_from_file(workdir, parameters: Parameters, cmake_file_api: FileAPI) -> None

Compile CPP_FILE as an executable, and extract the cubin from it.

test_extract_symbol_table(workdir, parameters: Parameters, cmake_file_api: FileAPI) -> None

Compile CPP_FILE as an executable, and extract the symbol table from it.

test_sass_from_object(workdir, parameters: Parameters, cmake_file_api: FileAPI) -> None

Compile CUDA_FILE as an object file and extract the SASS.
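The FUNCTIONS mapping pairs each mangled symbol with its demangled signature. For plain, non-nested functions, the Itanium C++ ABI encodes the name as _Z, then the name length in decimal, then the name itself. The sketch below recovers only that unqualified name. It is not a demangler (parameter types, namespaces and templates are out of scope), and unqualified_name is a hypothetical helper that is not part of reprospect.

```python
import re

# "_Z", a decimal length, then the rest of the symbol (name + parameter types).
SIMPLE_MANGLED = re.compile(r"^_Z(?P<length>\d+)(?P<rest>.*)$")


def unqualified_name(symbol: str) -> str:
    """Return the function name of a simple (non-nested) Itanium-mangled symbol."""
    match = SIMPLE_MANGLED.match(symbol)
    if match is None:
        raise ValueError(f"not a simple mangled function name: {symbol!r}")
    length = int(match.group("length"))
    rest = match.group("rest")
    if len(rest) < length:
        raise ValueError(f"truncated symbol: {symbol!r}")
    return rest[:length]


# Copied from the FUNCTIONS attribute documented above.
FUNCTIONS = {
    "_Z20vector_atomic_add_42PKfS0_Pfj":
        "vector_atomic_add_42(const float *, const float *, float *, unsigned int)",
    "_Z6say_hiv": "say_hi()",
}

names = {symbol: unqualified_name(symbol) for symbol in FUNCTIONS}
```

Each recovered name is the prefix of the corresponding demangled signature, up to the opening parenthesis.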

class TestSaxpy

Bases: object

When the kernel performs a saxpy.

CPP_FILE: Final[Path] = PosixPath('/__w/reprospect/reprospect/tests/tools/assets/saxpy.cpp')
CUDA_FILE: Final[Path] = PosixPath('/__w/reprospect/reprospect/tests/tools/assets/saxpy.cu')
SIGNATURE: Final[str] = 'saxpy_kernel(float, const float *, float *, unsigned int)'
SYMBOL: Final[str] = '_Z12saxpy_kernelfPKfPfj'
pytestmark = [Mark(name='parametrize', args=('parameters', (Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.VOLTA: 'VOLTA'>, compute_capability=ComputeCapability(major=7, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.TURING: 'TURING'>, compute_capability=ComputeCapability(major=7, minor=5))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=6))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.ADA: 'ADA'>, compute_capability=ComputeCapability(major=8, minor=9))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.HOPPER: 'HOPPER'>, compute_capability=ComputeCapability(major=9, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=10, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=12, minor=0))))), kwargs={'ids': <class 'str'>})]
test_extract_cubin_from_file(workdir, parameters: Parameters, cmake_file_api: FileAPI) -> None

Compile CPP_FILE as an executable, and extract the cubin from it.

test_extract_symbol_table(workdir, parameters: Parameters, cmake_file_api: FileAPI) -> None

Compile CPP_FILE as an executable, and extract the symbol table from it.

test_sass_from_object(workdir, parameters: Parameters, cmake_file_api: FileAPI) -> None

Compile CUDA_FILE as an object file, extract the SASS and analyse the resource usage.

static dump(*, file: Path, cuobjdump: CuObjDump) -> None
test_string_representation() -> None

Test reprospect.tools.binaries.CuObjDump.__str__().

class tests.tools.binaries.test_cuobjdump.TestFunction

Bases: object

Tests related to reprospect.tools.binaries.Function.

CODE: Final[str] = '        .headerflags    @"EF_CUDA_SM120 EF_CUDA_VIRTUAL_SM(EF_CUDA_SM120)"\n        /*0000*/                   LDC R1, c[0x0][0x37c]                &wr=0x0          ?trans1;           /* 0x0000df00ff017b82 */\n                                                                                                            /* 0x000e220000000800 */\n        /*0010*/                   S2R R0, SR_TID.X                     &wr=0x1          ?trans7;           /* 0x0000000000007919 */\n                                                                                                            /* 0x000e6e0000002100 */\n'
RU: Final[ResourceUsage] = ResourceUsage(register=10, constant={0: 924}, shared=0, local=0, sampler=0, stack=0, surface=0, texture=0)
SYMBOL: Final[str] = '_Z9my_kernelfPKfPfj'
test_string_representation() -> None

Test reprospect.tools.binaries.Function.__str__().
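To make the layout of the CODE attribute easier to read, the sketch below splits such a SASS listing into (offset, instruction) pairs. It is an illustration only: parse_sass and the regular expression are hypothetical, not how reprospect.tools.binaries.Function processes its code, and the sample reuses the CODE lines with their spacing condensed.

```python
import re

# An instruction line starts with its hexadecimal offset in a comment, e.g.
# "/*0010*/", followed by the instruction text up to the first ";".
INSTRUCTION = re.compile(r"^\s*/\*(?P<offset>[0-9a-fA-F]+)\*/\s+(?P<text>.+?)\s*;")


def parse_sass(code: str) -> list[tuple[int, str]]:
    """Return (byte offset, instruction text) pairs, skipping all other lines."""
    parsed = []
    for line in code.splitlines():
        match = INSTRUCTION.match(line)
        if match:
            text = " ".join(match.group("text").split())  # collapse padding
            parsed.append((int(match.group("offset"), 16), text))
    return parsed


# Same lines as the CODE attribute above, with the column padding condensed.
CODE = (
    '    .headerflags    @"EF_CUDA_SM120 EF_CUDA_VIRTUAL_SM(EF_CUDA_SM120)"\n'
    '    /*0000*/    LDC R1, c[0x0][0x37c]    &wr=0x0    ?trans1;    /* 0x0000df00ff017b82 */\n'
    '                                                             /* 0x000e220000000800 */\n'
    '    /*0010*/    S2R R0, SR_TID.X    &wr=0x1    ?trans7;    /* 0x0000000000007919 */\n'
    '                                                             /* 0x000e6e0000002100 */\n'
)

instructions = parse_sass(CODE)
```

The header flags line and the lines that only carry the second half of an encoding are skipped, which leaves two instructions at offsets 0x0000 and 0x0010.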

class tests.tools.binaries.test_cuobjdump.TestResourceUsage

Bases: object

Tests related to reprospect.tools.binaries.cuobjdump.ResourceUsage.
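As a rough illustration of what such a record holds, the sketch below parses one resource usage line into a small dataclass. Everything here is an assumption made for the example: Usage is a hypothetical stand-in whose fields mirror the RU value shown for TestFunction, not the actual ResourceUsage class, and the "REG:… CONSTANT[0]:…" line format is assumed rather than taken from this page.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Usage:
    """Hypothetical stand-in; field names mirror the RU value shown earlier."""
    register: int = 0
    constant: dict[int, int] = field(default_factory=dict)  # bank -> bytes
    shared: int = 0
    local: int = 0
    sampler: int = 0
    stack: int = 0
    surface: int = 0
    texture: int = 0


# One "KEY:value" or "KEY[bank]:value" token of the assumed line format.
TOKEN = re.compile(r"(?P<key>[A-Z]+)(?:\[(?P<bank>\d+)\])?:(?P<value>\d+)")

FIELDS = {
    "REG": "register", "SHARED": "shared", "LOCAL": "local", "SAMPLER": "sampler",
    "STACK": "stack", "SURFACE": "surface", "TEXTURE": "texture",
}


def parse_usage(line: str) -> Usage:
    """Fill a Usage record from one resource usage line; unknown keys are ignored."""
    usage = Usage()
    for match in TOKEN.finditer(line):
        value = int(match.group("value"))
        if match.group("key") == "CONSTANT" and match.group("bank") is not None:
            usage.constant[int(match.group("bank"))] = value
        elif match.group("key") in FIELDS:
            setattr(usage, FIELDS[match.group("key")], value)
    return usage


# Hand-written line chosen to match the RU value of TestFunction.
line = "REG:10 STACK:0 SHARED:0 LOCAL:0 CONSTANT[0]:924 TEXTURE:0 SURFACE:0 SAMPLER:0"
usage = parse_usage(line)
```

Constant memory is keyed by bank, which is why the record stores it as a dictionary rather than a single integer.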

class TestSaxpy

Bases: object

FILE: Final[Path] = PosixPath('/__w/reprospect/reprospect/tests/tools/assets/saxpy.cu')
SIGNATURE: Final[str] = '_Z12saxpy_kernelfPKfPfj'
pytestmark = [Mark(name='parametrize', args=('parameters', (Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.VOLTA: 'VOLTA'>, compute_capability=ComputeCapability(major=7, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.TURING: 'TURING'>, compute_capability=ComputeCapability(major=7, minor=5))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=6))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.ADA: 'ADA'>, compute_capability=ComputeCapability(major=8, minor=9))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.HOPPER: 'HOPPER'>, compute_capability=ComputeCapability(major=9, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=10, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=12, minor=0))))), kwargs={'ids': <class 'str'>})]
test(workdir, parameters: Parameters, cmake_file_api: FileAPI) -> None
class TestSharedMemory

Bases: object

FILE: Final[Path] = PosixPath('/__w/reprospect/reprospect/tests/tools/binaries/assets/shared_memory.cu')
SIGNATURE: Final[str] = '_Z20shared_memory_kernelPfj'
pytestmark = [Mark(name='parametrize', args=('parameters', (Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.VOLTA: 'VOLTA'>, compute_capability=ComputeCapability(major=7, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.TURING: 'TURING'>, compute_capability=ComputeCapability(major=7, minor=5))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=6))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.ADA: 'ADA'>, compute_capability=ComputeCapability(major=8, minor=9))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.HOPPER: 'HOPPER'>, compute_capability=ComputeCapability(major=9, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=10, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=12, minor=0))))), kwargs={'ids': <class 'str'>})]
test(workdir: Path, parameters: Parameters, cmake_file_api: FileAPI) -> None
class TestWideLoadStore

Bases: object

FILE: Final[Path] = PosixPath('/__w/reprospect/reprospect/tests/tools/binaries/assets/wide_load_store.cu')
SIGNATURE: Final[str] = '_Z22wide_load_store_kernelP15MyAlignedStructIdEPKS0_'
pytestmark = [Mark(name='parametrize', args=('parameters', (Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.VOLTA: 'VOLTA'>, compute_capability=ComputeCapability(major=7, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.TURING: 'TURING'>, compute_capability=ComputeCapability(major=7, minor=5))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.AMPERE: 'AMPERE'>, compute_capability=ComputeCapability(major=8, minor=6))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.ADA: 'ADA'>, compute_capability=ComputeCapability(major=8, minor=9))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.HOPPER: 'HOPPER'>, compute_capability=ComputeCapability(major=9, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=10, minor=0))), Parameters(arch=NVIDIAArch(family=<NVIDIAFamily.BLACKWELL: 'BLACKWELL'>, compute_capability=ComputeCapability(major=12, minor=0))))), kwargs={'ids': <class 'str'>})]
test(workdir: Path, parameters: Parameters, cmake_file_api: FileAPI) -> None
test() -> None