Add OpenCL implementation

Sadly, this is bottlenecked by VRAM latency, since global memory is uncached on my Nvidia system, so performance is only on par with rust-safe.
