Adding SIMD Sum #133

michaelciraci · 2024-10-11T02:07:24Z

Right now, Complex sum for floats does not vectorize because the real and imaginary parts are interleaved. This PR uses an intermediate data type during the sum so that it can vectorize. There is no unsafe, and on my computer I get roughly a 3.5x speed improvement for f32:

sum_simd                time:   [5.6591 µs 5.7808 µs 5.9707 µs]
Found 10 outliers among 100 measurements (10.00%)
  4 (4.00%) high mild
  6 (6.00%) high severe

sum_scalar              time:   [19.787 µs 20.187 µs 20.938 µs]
Found 15 outliers among 100 measurements (15.00%)
  3 (3.00%) high mild
  12 (12.00%) high severe

I made a repo if you want to test the results yourself: https://github.com/michaelciraci/num-complex-simd-comparison
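To make the idea concrete, here is a minimal sketch of one way to sum with an intermediate accumulator. It is not necessarily what this PR does: `Complex32` is a stand-in for `num_complex::Complex<f32>`, and `LANES`, `sum_scalar`, and `sum_lanes` are hypothetical names chosen for illustration. The point is that several independent partial sums remove the serial dependency between additions, which gives the compiler room to auto-vectorize.

```rust
/// Minimal stand-in for `num_complex::Complex<f32>`
/// (assumption: same `re`/`im` fields; not the real type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Complex32 {
    pub re: f32,
    pub im: f32,
}

/// Number of complex values accumulated in parallel (hypothetical choice).
const LANES: usize = 8;

/// Baseline: a single running total, so each addition depends on the previous one.
pub fn sum_scalar(xs: &[Complex32]) -> Complex32 {
    xs.iter().fold(Complex32 { re: 0.0, im: 0.0 }, |acc, x| Complex32 {
        re: acc.re + x.re,
        im: acc.im + x.im,
    })
}

/// Sketch: keep LANES independent partial sums for the real and imaginary
/// parts, then combine the partial sums once at the end.
pub fn sum_lanes(xs: &[Complex32]) -> Complex32 {
    let mut re = [0.0f32; LANES];
    let mut im = [0.0f32; LANES];

    let mut chunks = xs.chunks_exact(LANES);
    for chunk in &mut chunks {
        for i in 0..LANES {
            re[i] += chunk[i].re;
            im[i] += chunk[i].im;
        }
    }

    // Fold the leftover elements (fewer than LANES) into the first lanes.
    for (i, x) in chunks.remainder().iter().enumerate() {
        re[i] += x.re;
        im[i] += x.im;
    }

    Complex32 {
        re: re.iter().sum(),
        im: im.iter().sum(),
    }
}
```

Whether the inner loop actually compiles to SIMD instructions depends on the target and optimization level, so the benchmark numbers above should not be expected from this sketch as-is.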

However, this would technically be a breaking change, because it changes the order in which the floats are summed, and floating-point addition is not associative (float1 + float2 + float3 may not equal float3 + float2 + float1).
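A small illustration of the ordering concern, with hypothetical helper names and values picked so that the rounding is easy to follow: near 1e8 an f32 can only represent multiples of 8, so adding 1.0 to -1e8 is lost to rounding.

```rust
/// Sums left to right: ((a + b) + c).
pub fn sum_forward(xs: &[f32]) -> f32 {
    xs.iter().fold(0.0, |acc, &x| acc + x)
}

/// Sums right to left: ((c + b) + a).
pub fn sum_reverse(xs: &[f32]) -> f32 {
    xs.iter().rev().fold(0.0, |acc, &x| acc + x)
}

/// Example input: the large values cancel exactly when added first,
/// but absorb the 1.0 when it is added first.
pub const EXAMPLE: [f32; 3] = [1.0e8, -1.0e8, 1.0];
```

With `EXAMPLE`, `sum_forward` returns 1.0 while `sum_reverse` returns 0.0. A lane-based sum reorders additions in a similar way, so results can differ from the current sequential sum in the last bits, or more for badly conditioned inputs.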

On the other hand, this might be an opportunity to add a SIMD feature for floats.

I have held off on implementing a SIMD product until I know which route you want to go down (if you are interested at all).

Adding SIMD sum

6d3782a
