Dispatch
- class examples.kokkos.graph.example_dispatch.TestDispatch
Bases: CMakeAwareTestCase
Trace the CUDA API calls during Kokkos::Experimental::Graph stages.
It uses examples/kokkos/graph/example_dispatch.cpp.
- KOKKOS_TOOLS_NVTX_CONNECTOR_LIB
Used in TestNSYS.report().
- classmethod get_target_name() → str
- class examples.kokkos.graph.example_dispatch.TestNSYS
Bases: TestDispatch
nsys-focused analysis.
- static get(*, report: Report, kernels: DataFrame, label: str) → DataFrame
Get the kernels from the kernels table that are correlated to the cudaGraphLaunch API call made in the NVTX region label.
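The selection described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the implementation of get: the row layout (dictionaries with the hypothetical keys `text`, `name`, `start`, `end`, `correlation_id` and `stream_id`) and the helper name `kernels_for_graph_launch` are made up for the example, and the sample rows are invented data.

```python
def kernels_for_graph_launch(nvtx_ranges, api_calls, kernels, label):
    """Return the kernels correlated to a cudaGraphLaunch call made inside the NVTX range `label`.

    All row layouts are hypothetical: each row is a dict, and the key names
    are assumptions made for this sketch.
    """
    # Locate the NVTX range that carries the requested label.
    region = next(r for r in nvtx_ranges if r["text"] == label)

    # Keep the correlation IDs of cudaGraphLaunch calls fully enclosed in that range.
    launch_ids = {
        call["correlation_id"]
        for call in api_calls
        if call["name"].startswith("cudaGraphLaunch")
        and region["start"] <= call["start"]
        and call["end"] <= region["end"]
    }

    # A kernel belongs to the launch if it shares the launch's correlation ID.
    return [k for k in kernels if k["correlation_id"] in launch_ids]


# Invented sample rows, for illustration only.
nvtx_ranges = [{"text": "submit", "start": 100, "end": 200}]
api_calls = [
    {"name": "cudaGraphInstantiate", "start": 50, "end": 90, "correlation_id": 1},
    {"name": "cudaGraphLaunch", "start": 120, "end": 130, "correlation_id": 2},
    {"name": "cudaLaunchKernel", "start": 210, "end": 215, "correlation_id": 3},
]
kernels = [
    {"name": "node_A", "correlation_id": 2, "stream_id": 13},
    {"name": "node_B", "correlation_id": 2, "stream_id": 14},
    {"name": "other", "correlation_id": 3, "stream_id": 7},
]

selected = kernels_for_graph_launch(nvtx_ranges, api_calls, kernels, "submit")
print([k["name"] for k in selected])
```

Matching on the correlation ID rather than on timestamps keeps the selection robust when kernels execute after the API call has already returned.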
- pytestmark = [Mark(name='skipif', args=(True,), kwargs={'reason': 'needs a GPU'})]
- report() → Report
Analyse with nsys, use reprospect.tools.nsys.Cacher.
- test_streams(report: Report) → None
Each kernel gets a unique stream ID.
This means that, at the CUDA backend level, all nodes are presented to the kernel scheduler as independent and may be executed concurrently.
Note that CUDA does not provide a way to create a graph node and enforce the stream on which it will eventually run. This has motivated a refactoring of the Kokkos::Experimental::Graph API, see https://github.com/kokkos/kokkos/pull/8191.
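The property "each kernel gets a unique stream ID" can be checked along the following lines. This is a minimal sketch, not the body of test_streams: the helper name `have_unique_streams`, the `stream_id` key and the sample rows are assumptions made for illustration.

```python
def have_unique_streams(kernels):
    """Return True if no two kernels share a stream ID.

    `kernels` is a list of dicts with a hypothetical `stream_id` key.
    """
    stream_ids = [k["stream_id"] for k in kernels]
    # A set drops duplicates, so equal sizes mean every ID is distinct.
    return len(set(stream_ids)) == len(stream_ids)


# Invented rows standing in for the kernels of one graph launch.
graph_kernels = [
    {"name": "node_A", "stream_id": 13},
    {"name": "node_B", "stream_id": 14},
    {"name": "node_C", "stream_id": 15},
]

print(have_unique_streams(graph_kernels))
```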