Add WebAssembly SIMD STRSM and DTRSM kernels#5760

Merged
martin-frbg merged 2 commits into OpenMathLib:develop from teddygood:wasm-trsm on Apr 16, 2026

@teddygood teddygood commented Apr 16, 2026

[Images: strsm_multiplot, dtrsm_multiplot]

This PR adds WebAssembly SIMD TRSM kernels for WASM128_GENERIC and wires them up for STRSM and DTRSM.

The overall blocked TRSM structure is unchanged: the existing generic control flow is kept, and only the small triangular solve path is specialized for the WASM-friendly fixed-size cases used inside the kernel.
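For readers unfamiliar with the "small triangular solve" step, here is a minimal scalar sketch in portable C. It is not the code from this PR. The function name `solve_lower_4xn`, the 4x4 tile size, the column-major layout, and the convention that the diagonal of the packed triangle already holds reciprocals are all assumptions made for illustration. In a SIMD version, the inner update loop over `k` is the part that would naturally map to 128-bit lanes (4 floats or 2 doubles per vector).

```c
/* Hypothetical fixed tile size for this sketch. */
#define TILE 4

/*
 * Solve L * X = B in place, where L is a TILE x TILE lower-triangular
 * block and B has n columns.
 *
 * a   : column-major TILE x TILE block; ASSUMPTION: a[i + i*TILE]
 *       stores 1 / L[i][i], so no division is needed in the loop.
 * b   : column-major TILE x n right-hand sides, overwritten with X.
 * ldb : leading dimension of b (>= TILE).
 */
static void solve_lower_4xn(const float *a, float *b, int n, int ldb)
{
    for (int j = 0; j < n; j++) {
        float *x = b + j * ldb;
        for (int i = 0; i < TILE; i++) {
            /* Finish row i: multiply by the pre-inverted diagonal. */
            x[i] *= a[i + i * TILE];
            /* Eliminate x[i] from the remaining rows of this column. */
            for (int k = i + 1; k < TILE; k++)
                x[k] -= a[k + i * TILE] * x[i];
        }
    }
}
```

Because the trip counts are fixed at compile time, a compiler (or a hand-written intrinsic version) can fully unroll the two inner loops, which is the kind of fixed-size specialization the description refers to.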

To make the target-specific selections take effect reliably, this PR also changes the base kernel/wasm/KERNEL defaults so that WASM128_GENERIC can override the GEMM/TRSM kernel entries.

In local WASM benchmarks with Pyodide/Emscripten, direct calls to cblas_strsm were about 2.3x–3.4x faster than the current generic WASM128_GENERIC path across the LN/LT/RN/RT cases for the sizes tested (128, 160, 192, 256, 384). Direct calls to cblas_dtrsm were about 1.6x–2.2x faster across the LN/LT/RN/RT cases tested (128, 192, 256, 384).

@martin-frbg martin-frbg added this to the 0.3.33 milestone Apr 16, 2026
@teddygood teddygood changed the title Wasm trsm Add WebAssembly SIMD STRSM and DTRSM kernels Apr 16, 2026
@martin-frbg martin-frbg merged commit b77cd0a into OpenMathLib:develop Apr 16, 2026
100 of 102 checks passed