reprospect.test.sass.instruction.half module
- class reprospect.test.sass.instruction.half.Fp16
Bases: object
Helper for FP16 matchers.
FP16 instructions typically operate on packed half-precision pairs: each 32-bit register holds two FP16 values, H0 in bits 0-15 and H1 in bits 16-31.
A single instruction can process both lanes in parallel.
Operands optionally carry a lane selector, e.g. .H0_H0 or .H1_H1.
R0.H0_H0 broadcasts the low half of R0 to both lanes.
R0.H1_H1 broadcasts the high half of R0 to both lanes.
References:
[HW17]
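The lane layout and the two broadcast selectors described above can be illustrated with a small bit-level sketch. The helper names pack_half2, unpack_half2 and select are hypothetical and are not part of reprospect; the sketch only assumes the H0/H1 bit positions stated above and uses the standard struct module's half-precision format.

```python
import struct
from typing import Tuple


def pack_half2(h0: float, h1: float) -> int:
    """Pack two FP16 values into a 32-bit word: H0 in bits 0-15, H1 in bits 16-31."""
    lo, hi = struct.unpack("<HH", struct.pack("<ee", h0, h1))
    return lo | (hi << 16)


def unpack_half2(word: int) -> Tuple[float, float]:
    """Return the (H0, H1) lanes of a 32-bit word as Python floats."""
    return struct.unpack("<ee", struct.pack("<HH", word & 0xFFFF, (word >> 16) & 0xFFFF))


def select(word: int, selector: str) -> int:
    """Emulate the .H0_H0 / .H1_H1 selectors: copy one half into both lanes."""
    lo, hi = word & 0xFFFF, (word >> 16) & 0xFFFF
    half = {"H0_H0": lo, "H1_H1": hi}[selector]
    return half | (half << 16)


r0 = pack_half2(1.5, -2.0)
low_broadcast = unpack_half2(select(r0, "H0_H0"))   # both lanes hold the low half
high_broadcast = unpack_half2(select(r0, "H1_H1"))  # both lanes hold the high half
```

Only the two selectors named in the text are modelled here; mixed selectors are left out because their lane order is not described above.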
- classmethod build_pattern_operand(*, half_sel: bool | None = None, math: bool | None = None, immediate: bool | None = None, captured: bool = True) -> str
Build pattern for an FP16 operand.
- classmethod mod(reg: str, *, half_sel: bool | None = None, captured: bool = True) -> str
Wrap a register pattern with a half selector modifier.
- classmethod regz_half_sel() -> str
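As a rough illustration of what wrapping a register pattern with a half selector could look like, here is a minimal sketch. The function with_half_sel is a hypothetical stand-in, not the library's mod implementation, and the three-way meaning of half_sel (None: selector optional, True: required, False: absent) is an assumption inferred from the ANY, INDIVIDUAL and PACKED patterns documented on this page.

```python
import re
from typing import Optional

# Selector suffix as it appears in the documented patterns.
HALF_SEL = r"\.H[01]_H[01]"
# Register pattern as it appears in the documented patterns.
REG = r"R(?:Z|\d+)"


def with_half_sel(reg: str, half_sel: Optional[bool] = None) -> str:
    """Hypothetical helper: append a half selector to a register pattern.

    Assumed semantics: None makes the selector optional, True requires it,
    False leaves the register pattern untouched.
    """
    if half_sel is None:
        return rf"{reg}(?:{HALF_SEL})?"
    if half_sel:
        return rf"{reg}{HALF_SEL}"
    return reg


required = re.compile(with_half_sel(REG, half_sel=True))
optional = re.compile(with_half_sel(REG))
absent = re.compile(with_half_sel(REG, half_sel=False))
```

With these three variants, R2.H0_H0 matches the required and optional forms, while a bare R2 matches the optional and absent forms.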
- class reprospect.test.sass.instruction.half.Fp16AddMatcher(*, packed: bool | None = None)
Bases: PatternMatcher
Matcher for 16-bit floating-point add (HADD2) instruction.
It may apply on __half:
HADD2 R0, R2.H0_H0, R3.H0_H0
or on __half2:
HADD2 R0, R2, R3
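The difference between the two forms can be sketched with the standard re module. The patterns below are simplified stand-ins written for this example: they drop the operand modifiers and immediates of the documented patterns, and they use distinct group names because re does not accept the repeated operands group that the regex package allows. The classify function is hypothetical.

```python
import re
from typing import Optional

REG = r"R(?:Z|\d+)"
HALF_SEL = r"\.H[01]_H[01]"

# Simplified approximations of the packed and individual forms (assumption:
# no negation, absolute value or immediate operands).
PACKED = re.compile(
    rf"HADD2(?:\.F32)? (?P<dst>R\d+), (?P<a>{REG}), (?P<b>{REG})"
)
INDIVIDUAL = re.compile(
    rf"HADD2(?:\.F32)? (?P<dst>R\d+), (?P<a>{REG}{HALF_SEL}), (?P<b>{REG}{HALF_SEL})"
)


def classify(instruction: str) -> Optional[str]:
    """Return 'individual', 'packed', or None if the line is not a recognised HADD2."""
    if INDIVIDUAL.fullmatch(instruction):
        return "individual"
    if PACKED.fullmatch(instruction):
        return "packed"
    return None


half_form = classify("HADD2 R0, R2.H0_H0, R3.H0_H0")
half2_form = classify("HADD2 R0, R2, R3")
```

The __half example carries a selector on both sources and lands in the individual form; the __half2 example uses bare registers and lands in the packed form.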
- PATTERN_ANY: Final[Pattern[str]] = regex.Regex('(?P<opcode>HADD2)(?:\\.(?P<modifiers>F32))? (?P<operands>(?P<dst>R[0-9]+)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?))', flags=regex.V0)
- PATTERN_INDIVIDUAL: Final[Pattern[str]] = regex.Regex('(?P<opcode>HADD2)(?:\\.(?P<modifiers>F32))? (?P<operands>(?P<dst>R[0-9]+)), (?P<operands>(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)\\.H[01]_H[01](?:\\|)?), (?P<operands>(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)\\.H[01]_H[01](?:\\|)?)', flags=regex.V0)
- PATTERN_PACKED: Final[Pattern[str]] = regex.Regex('(?P<opcode>HADD2)(?:\\.(?P<modifiers>F32))? (?P<operands>(?P<dst>R[0-9]+)), (?P<operands>R(?:Z|\\d+)), (?P<operands>R(?:Z|\\d+))', flags=regex.V0)
- TEMPLATE: Final[str] = '(?P<opcode>HADD2)(?:\\.(?P<modifiers>F32))? (?P<operands>(?P<dst>R[0-9]+)), {op1}, {op2}'
- __init__(*, packed: bool | None = None) -> None
- class reprospect.test.sass.instruction.half.Fp16FusedMulAddMatcher(*, packed: bool | None = None)
Bases: PatternMatcher
Matcher for 16-bit floating-point fused multiply add (HFMA2) instruction, such as:
HFMA2 R7, R2, R2, -RZ
HFMA2 R0, R0.H0_H0, R0.H1_H1, 3, 3
HFMA2 R0, R2, 5, 2, R0.H1_H1
Note
The FMA computes D = A * B + C. However, it may accept 4 or 5 operands.
4 operands (HFMA2 D, A, B, C)
The addend C is a register or a single immediate value applied to both lanes. Therefore, the following:
HFMA2 R7, R2, R2, R0
computes R7.H0 = R2.H0 * R2.H0 + R0.H0 and R7.H1 = R2.H1 * R2.H1 + R0.H1.
5 operands (HFMA2 E, A, B, C, D)
The addend is split per lane with separate immediate values. Therefore, the following:
HFMA2 R0, R1.H0_H0, R2.H1_H1, 3, 1
computes R0.H0 = R1.H0 * R2.H1 + 3.0 and R0.H1 = R1.H0 * R2.H1 + 1.0.
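The per-lane arithmetic of the 4-operand and 5-operand forms can be replayed with a small emulation. The functions hfma2 and broadcast are hypothetical helpers written for this example; they use ordinary Python floats rather than FP16 rounding, and the lane assignment of the two immediates follows the worked example in the note above.

```python
from typing import Optional, Tuple, Union

Lanes = Tuple[float, float]  # (H0, H1)


def broadcast(reg: Lanes, selector: str) -> Lanes:
    """Emulate the .H0_H0 / .H1_H1 selectors on a pair of lanes."""
    h0, h1 = reg
    return {"H0_H0": (h0, h0), "H1_H1": (h1, h1)}[selector]


def hfma2(a: Lanes, b: Lanes, c: Union[Lanes, float], d: Optional[float] = None) -> Lanes:
    """Per-lane A * B + addend.

    4 operands: c is a register (pair of lanes) or one immediate for both lanes.
    5 operands: c is added to lane H0 and d to lane H1 (as in the note's example).
    """
    if d is not None:
        addend = (c, d)
    elif isinstance(c, tuple):
        addend = c
    else:
        addend = (c, c)
    return (a[0] * b[0] + addend[0], a[1] * b[1] + addend[1])


# HFMA2 R7, R2, R2, R0 with R2 = (7, 4) and R0 = (1, 2)
r7 = hfma2((7.0, 4.0), (7.0, 4.0), (1.0, 2.0))  # (50.0, 18.0)

# HFMA2 R0, R1.H0_H0, R2.H1_H1, 3, 1 with R1 = (2, 5) and R2 = (7, 4)
r0 = hfma2(broadcast((2.0, 5.0), "H0_H0"), broadcast((7.0, 4.0), "H1_H1"), 3.0, 1.0)  # (11.0, 9.0)
```

In the second call both lanes multiply R1.H0 by R2.H1, so only the addend differs between the two results.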
Note
For HFMA2.MMA with 5 operands, operands 2 and 3 appear to be multipliers for each lane. The following results from dst += src, where operands 2 and 3 are both set to the immediate 1:
LDG.E.CONSTANT R5, [R4.64]
LDG.E R0, [R2.64]
HFMA2.MMA R7, R0, 1, 1, R5
STG.E [R2.64], R7
This likely means “multiply both lanes of R0 by 1 and add R5”.
- PATTERN_ANY: Final[Pattern[str]] = regex.Regex('(?P<opcode>HFMA2)(?:\\.(?P<modifiers>MMA))? (?P<operands>(?P<dst>R[0-9]+)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?))(?:, (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)))?', flags=regex.V0)
- PATTERN_INDIVIDUAL: Final[Pattern[str]] = regex.Regex('(?P<opcode>HFMA2)(?:\\.(?P<modifiers>MMA))? (?P<operands>(?P<dst>R[0-9]+)), (?P<operands>(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)\\.H[01]_H[01](?:\\|)?), (?P<operands>(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)\\.H[01]_H[01](?:\\|)?), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)\\.H[01]_H[01](?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?))(?:, (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)\\.H[01]_H[01](?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)))?', flags=regex.V0)
- PATTERN_PACKED: Final[Pattern[str]] = regex.Regex('(?P<opcode>HFMA2)(?:\\.(?P<modifiers>MMA))? (?P<operands>(?P<dst>R[0-9]+)), (?P<operands>(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\|)?), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?))(?:, (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)))?', flags=regex.V0)
- TEMPLATE: Final[str] = '(?P<opcode>HFMA2)(?:\\.(?P<modifiers>MMA))? (?P<operands>(?P<dst>R[0-9]+)), {op1}, {op2}, {op3}(?:, {op4})?'
- __init__(*, packed: bool | None = None) -> None
- Parameters:
packed – Whether the instruction is packed.
- class reprospect.test.sass.instruction.half.Fp16MinMaxMatcher(*, pmax: bool | None = None)
Bases: PatternMatcher
Matcher for 16-bit floating-point min-max (HMNMX2) instruction.
- PATTERN_ANY: Final[Pattern[str]] = regex.Regex('(?P<opcode>HMNMX2) (?P<operands>(?P<dst>R[0-9]+)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)), (?P<operands>!?PT)', flags=regex.V0)
- PATTERN_MAX: Final[Pattern[str]] = regex.Regex('(?P<opcode>HMNMX2) (?P<operands>(?P<dst>R[0-9]+)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)), (?P<operands>!PT)', flags=regex.V0)
- PATTERN_MIN: Final[Pattern[str]] = regex.Regex('(?P<opcode>HMNMX2) (?P<operands>(?P<dst>R[0-9]+)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)), (?P<operands>PT)', flags=regex.V0)
- __init__(*, pmax: bool | None = None) -> None
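Judging from PATTERN_MIN and PATTERN_MAX, the trailing predicate operand is what separates the two operations: PT in the minimum pattern and !PT in the maximum pattern. The sketch below shows that distinction with the standard re module. The pattern is a simplified approximation (register operands with an optional selector only), and min_or_max is a hypothetical helper, not part of reprospect.

```python
import re
from typing import Optional

# Simplified operand: a register with an optional half selector (assumption:
# no modifiers or immediates, unlike the documented patterns).
OPERAND = r"R(?:Z|\d+)(?:\.H[01]_H[01])?"

HMNMX2 = re.compile(
    rf"HMNMX2 (?P<dst>R\d+), (?P<a>{OPERAND}), (?P<b>{OPERAND}), (?P<pred>!?PT)"
)


def min_or_max(instruction: str) -> Optional[str]:
    """Return 'min' for a PT predicate, 'max' for !PT, None if nothing matches."""
    match = HMNMX2.fullmatch(instruction)
    if match is None:
        return None
    return "max" if match["pred"] == "!PT" else "min"


as_min = min_or_max("HMNMX2 R0, R2, R3, PT")
as_max = min_or_max("HMNMX2 R0, R2.H0_H0, R3.H0_H0, !PT")
```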
- class reprospect.test.sass.instruction.half.Fp16MulMatcher(*, packed: bool | None = None)
Bases: PatternMatcher
Matcher for 16-bit floating-point multiply (HMUL2) instruction.
It may apply on __half:
HMUL2 R0, R2.H0_H0, R3.H0_H0
or on __half2:
HMUL2 R0, R2, R3
- PATTERN_ANY: Final[Pattern[str]] = regex.Regex('(?P<opcode>HMUL2) (?P<operands>(?P<dst>R[0-9]+)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?)), (?P<operands>(?:(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)(?:\\.H[01]_H[01])?(?:\\|)?|(?:-?\\d+)(?:\\.\\d*)?(?:[eE][-+]?\\d+)?))', flags=regex.V0)
- PATTERN_INDIVIDUAL: Final[Pattern[str]] = regex.Regex('(?P<opcode>HMUL2) (?P<operands>(?P<dst>R[0-9]+)), (?P<operands>(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)\\.H[01]_H[01](?:\\|)?), (?P<operands>(?:(?:!|\\-\\||\\-|\\~|\\|))?R(?:Z|\\d+)\\.H[01]_H[01](?:\\|)?)', flags=regex.V0)
- PATTERN_PACKED: Final[Pattern[str]] = regex.Regex('(?P<opcode>HMUL2) (?P<operands>(?P<dst>R[0-9]+)), (?P<operands>R(?:Z|\\d+)), (?P<operands>R(?:Z|\\d+))', flags=regex.V0)
- __init__(*, packed: bool | None = None) -> None
- Parameters:
packed – Whether the instruction is packed.
References: