
reduce seems slow #1107

@DiamonDinoia

Hi all! Thank you for developing this library. It is super useful in multiple projects!

I noticed something that might be an issue, but I am not sure.

I have code that multiplies a complex-valued array (a) by a real-valued array (ker).
Basically, each element of ker has to be applied twice: once to the real part and once to the imaginary part of the matching element of a.
My code is as follows:

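template <class Batch>
auto func(const Batch& a1, const Batch& a2, const Batch& ker) {
    // a1 and a2 are real batches holding the (re, im) data of a; ker is real.
    // This trick halves the number of loads for ker, which is also the
    // reason why I use a1 and a2 instead of a.
    const auto low  = xsimd::zip_lo(ker, ker);
    const auto high = xsimd::zip_hi(ker, ker);
    const auto res0 = a1 * low;
    const auto res1 = a2 * high;
    // ... (remainder of the function not shown)
}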
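To make the data layout explicit, here is a minimal portable sketch of what the zip trick does, assuming `a` is stored as interleaved (re, im) pairs. `Reg`, `zip_lo_model`, `zip_hi_model` and `mul` are hypothetical stand-ins written with `std::array`; they only model the lane ordering and are not the xsimd implementation.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a SIMD register with N float lanes.
template <std::size_t N>
using Reg = std::array<float, N>;

// Models zip_lo: interleaves the low halves of x and y (x0, y0, x1, y1, ...).
template <std::size_t N>
Reg<N> zip_lo_model(const Reg<N>& x, const Reg<N>& y) {
    Reg<N> r{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        r[2 * i]     = x[i];
        r[2 * i + 1] = y[i];
    }
    return r;
}

// Models zip_hi: interleaves the high halves of x and y.
template <std::size_t N>
Reg<N> zip_hi_model(const Reg<N>& x, const Reg<N>& y) {
    Reg<N> r{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        r[2 * i]     = x[N / 2 + i];
        r[2 * i + 1] = y[N / 2 + i];
    }
    return r;
}

// Lane-wise multiplication, standing in for operator* on batches.
template <std::size_t N>
Reg<N> mul(const Reg<N>& x, const Reg<N>& y) {
    Reg<N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        r[i] = x[i] * y[i];
    }
    return r;
}
```

Zipping `ker` with itself duplicates every kernel value, so one load of `ker` covers two registers of interleaved complex data: `zip_lo_model(ker, ker)` lines up with `a1` and `zip_hi_model(ker, ker)` with `a2`.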

What I noticed is that the original implementation of reduce_add can be optimized on my machine. Is it possible to have a split function that returns the low and high halves of a batch? By doing split + add multiple times, my code is 7 times faster.
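For readers unfamiliar with the idea, the split + add reduction can be sketched in portable C++ as follows. This is only a scalar model under stated assumptions: `Halves`, `split` and `reduce_add_by_halving` are hypothetical names, `std::array` stands in for a SIMD register, and the lane count is assumed to be a power of two. It is not the code in the linked branch.

```cpp
#include <array>
#include <cstddef>

// Hypothetical result type: the low and high halves of an N-lane register.
template <class T, std::size_t N>
struct Halves {
    std::array<T, N / 2> lo;
    std::array<T, N / 2> hi;
};

// Hypothetical split: copies lanes [0, N/2) into lo and [N/2, N) into hi.
template <class T, std::size_t N>
Halves<T, N> split(const std::array<T, N>& v) {
    static_assert(N >= 2 && (N & (N - 1)) == 0, "N must be a power of two");
    Halves<T, N> h{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        h.lo[i] = v[i];
        h.hi[i] = v[i + N / 2];
    }
    return h;
}

// Base case: a single lane is already the sum.
template <class T>
T reduce_add_by_halving(const std::array<T, 1>& v) {
    return v[0];
}

// Recursive case: split, add the two halves lane-wise, reduce the result.
template <class T, std::size_t N>
T reduce_add_by_halving(const std::array<T, N>& v) {
    const Halves<T, N> h = split(v);
    std::array<T, N / 2> sum{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        sum[i] = h.lo[i] + h.hi[i];
    }
    return reduce_add_by_halving(sum);
}
```

Each step halves the width, so an 8-lane register needs three split + add rounds. On real hardware each round would map to a half-width extract plus one vector add, which is where a dedicated split function would help.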

I have pushed the benchmarks here:
https://github.com/DiamonDinoia/cpp-learning/tree/master/xsimd

This gives the following performance:

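| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark |
|------:|-----:|-----:|-------:|-------:|----:|-------:|------:|------:|:----------|
| 6.96 | 143,690,879.59 | 0.6% | 19.00 | 21.47 | 0.885 | 0.00 | 0.0% | 0.01 | add+store |
| 2.31 | 432,949,727.65 | 0.6% | 24.00 | 7.11 | 3.374 | 0.00 | 0.0% | 0.01 | hsum |
| 3.81 | 262,211,901.24 | 0.1% | 36.00 | 11.75 | 3.064 | 2.00 | 0.0% | 0.01 | reduce_add |
| 2.59 | 385,491,672.62 | 0.2% | 20.00 | 7.99 | 2.503 | 0.00 | 0.0% | 0.01 | union pun |
| 1.18 | 846,618,297.70 | 0.9% | 17.00 | 3.64 | 4.672 | 0.00 | 0.0% | 0.01 | double union pun |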

I tweaked master a bit in https://github.com/DiamonDinoia/xsimd/tree/hadd-tweaks and got:

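| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark |
|------:|-----:|-----:|-------:|-------:|----:|-------:|------:|------:|:----------|
| 7.00 | 142,933,991.35 | 0.9% | 19.00 | 21.50 | 0.884 | 0.00 | 0.0% | 0.01 | add+store |
| 2.27 | 439,741,444.70 | 0.9% | 24.00 | 6.99 | 3.434 | 0.00 | 0.0% | 0.01 | hsum |
| 2.99 | 334,267,996.40 | 1.5% | 36.00 | 9.15 | 3.935 | 2.00 | 0.0% | 0.01 | reduce_add |
| 2.09 | 478,101,632.03 | 1.2% | 28.00 | 6.44 | 4.346 | 2.00 | 0.0% | 0.01 | union pun |
| 1.05 | 956,625,856.43 | 1.6% | 17.00 | 3.21 | 5.289 | 0.00 | 0.0% | 0.01 | double union pun |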

Thanks,
Marco
