[oneDPL][ranges] support size limit for output for merge algorithm #1942

MikeDvorskiy · 2024-11-20T14:30:46Z

[oneDPL][ranges] support size limit for output for merge algorithm.
The change is according to https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3179r2.html#range_as_output

serial pattern
parallel pattern (tbb)
parallel pattern (openMP)
parallel pattern (serial backend)
parallel pattern (DPCPP backend)

Update: Moved to draft status because of a design issue: the merge patterns return different types (__result_and_scratch_storage / __result_and_scratch_storage_base). One option is to have a single common __result_and_scratch_storage type for all needs (at least for the dpcpp merge patterns).
Update 2: the issue mentioned above has been resolved.
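
For readers unfamiliar with the "range as output" idea, here is a minimal serial sketch of a merge that stops when the output is full and reports how far it got in each input. It is not the oneDPL implementation; the name bounded_merge and its signature are hypothetical and chosen only for illustration.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper (not oneDPL code): merges the sorted ranges
// [first1, last1) and [first2, last2) into [out, out_last), stopping as soon
// as the output is full. Returns the positions in both inputs where merging
// stopped, so the caller knows what was left unconsumed.
template <typename It1, typename It2, typename ItOut, typename Comp = std::less<>>
std::pair<It1, It2>
bounded_merge(It1 first1, It1 last1, It2 first2, It2 last2, ItOut out, ItOut out_last, Comp comp = {})
{
    while (out != out_last)
    {
        if (first1 == last1)
        {
            if (first2 == last2)
                break; // both inputs are exhausted
            *out++ = *first2++;
        }
        else if (first2 == last2 || !comp(*first2, *first1))
        {
            // On ties, take from the first range to keep the merge stable.
            *out++ = *first1++;
        }
        else
        {
            *out++ = *first2++;
        }
    }
    return {first1, first2};
}
```

With inputs {1, 3, 5, 7} and {2, 4, 6} and an output of size 4, this sketch writes {1, 2, 3, 4} and returns iterators pointing at index 2 in both inputs.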

SergeyKopienko · 2025-01-15T11:01:50Z

include/oneapi/dpl/pstl/algorithm_impl.h

@@ -2948,6 +2949,49 @@ __pattern_remove_if(__parallel_tag<_IsVector> __tag, _ExecutionPolicy&& __exec,
 // merge
 //------------------------------------------------------------------------

+template<typename It1, typename It2, typename ItOut, typename _Comp>
+std::pair<It1, It2>
+__brick_merge_2(It1 __it_1, It1 __it_1_e, It2 __it_2, It2 __it_2_e, ItOut __it_out, ItOut __it_out_e, _Comp __comp,


The existing implementation of __serial_merge is probably faster than this.

SergeyKopienko · 2025-01-15T11:03:52Z

include/oneapi/dpl/pstl/hetero/algorithm_ranges_impl_hetero.h

-    auto __n = __n1 + __n2;
-    if (__n == 0)
-        return 0;
+    if (__rng3.size() == 0)


Suggested change

-    if (__rng3.size() == 0)
+    if (__rng3.empty())

SergeyKopienko · 2025-01-15T11:05:11Z

include/oneapi/dpl/pstl/hetero/algorithm_ranges_impl_hetero.h

-    if (__n == 0)
-        return 0;
+    if (__rng3.size() == 0)
+        return {0, 0};

    //To consider the direct copying pattern call in case just one of sequences is empty.
    if (__n1 == 0)


We can add an optimization here for the case when last(rng1) < first(rng2).
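
A rough host-side sketch of what such a fast path could look like, assuming sorted std::vector<int> inputs and a size-limited output. The names consumed and merge_with_disjoint_fast_path are hypothetical, and this is not how the hetero pattern is written. The check uses "not first(b) < last(a)" rather than a strict less-than, so equal boundary elements also take the fast path without breaking stability.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical result type: how many elements were taken from each input.
struct consumed
{
    std::size_t from_a;
    std::size_t from_b;
};

// Hypothetical illustration (not oneDPL code). Both inputs must be sorted.
inline consumed
merge_with_disjoint_fast_path(const std::vector<int>& a, const std::vector<int>& b, std::vector<int>& out)
{
    // The output size limits how many elements can be produced.
    const std::size_t n = std::min(a.size() + b.size(), out.size());
    if (n == 0)
        return {0, 0};

    // Fast path: one input is empty, or no element of b is less than the last
    // element of a. The merge then degenerates into two plain copies.
    if (a.empty() || b.empty() || !(b.front() < a.back()))
    {
        const std::size_t n_a = std::min(a.size(), n);
        const std::size_t n_b = n - n_a;
        std::copy_n(a.begin(), n_a, out.begin());
        std::copy_n(b.begin(), n_b, out.begin() + n_a);
        return {n_a, n_b};
    }

    // General path: element-wise stable merge, bounded by n.
    std::size_t i = 0, j = 0;
    for (std::size_t k = 0; k < n; ++k)
    {
        if (j == b.size() || (i < a.size() && !(b[j] < a[i])))
            out[k] = a[i++];
        else
            out[k] = b[j++];
    }
    return {i, j};
}
```

Whether the extra comparison pays off on a device depends on how often the inputs are actually disjoint; the sketch only shows that the result and the consumed counts stay the same as in the general path.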

SergeyKopienko · 2025-01-15T11:08:49Z

include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_merge.h

 {
    const _Index __rng1_size = std::min<_Index>(__n1 > __start1 ? __n1 - __start1 : _Index{0}, __chunk);
    const _Index __rng2_size = std::min<_Index>(__n2 > __start2 ? __n2 - __start2 : _Index{0}, __chunk);
    const _Index __rng3_size = std::min<_Index>(__rng1_size + __rng2_size, __chunk);

    const _Index __rng1_idx_end = __start1 + __rng1_size;
    const _Index __rng2_idx_end = __start2 + __rng2_size;
-    const _Index __rng3_idx_end = __start3 + __rng3_size;
+    const _Index __rng3_idx_end = std::min<_Index>(__n3, __start3 + __rng3_size);


This looks like a logical error: __n3 is the size, but __rng3_idx_end is the last index.
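
Whether the clamp is wrong depends on the index convention. The sketch below assumes that the end index is one-past-the-end (half-open), which the form `__start3 + __rng3_size` suggests but which should be confirmed against the loop that consumes it. Under that assumption, clamping to the size is consistent; with an inclusive last index the bound would have to be the size minus one. The helper name chunk_end is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: returns the one-past-the-end output index of a chunk
// that starts at start3 and spans chunk_size elements, clamped so that it
// never exceeds the output size n3. Valid indices are [start3, chunk_end).
constexpr std::size_t
chunk_end(std::size_t start3, std::size_t chunk_size, std::size_t n3)
{
    return std::min(n3, start3 + chunk_size);
}
```

For example, with n3 = 10, a chunk starting at index 8 with 4 elements ends at 10, so a loop of the form `for (i = start3; i < end; ++i)` writes only indices 8 and 9.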

SergeyKopienko · 2025-01-15T11:09:13Z

include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_merge.h

            });
-        // We should return the same thing in the second param of __future for compatibility


Please restore this comment

SergeyKopienko · 2025-01-15T11:12:08Z

include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_merge.h

@@ -320,8 +335,13 @@ struct __parallel_merge_submitter_large<_IdType, _CustomName,
                        __start = __base_diagonals_sp_global_ptr[__diagonal_idx];
                    }

-                    __serial_merge(__rng1, __rng2, __rng3, __start.first, __start.second, __i_elem,
-                                   __nd_range_params.chunk, __n1, __n2, __comp);
+                    auto __ends = __serial_merge(__rng1, __rng2, __rng3, __start.first, __start.second, __i_elem,


const auto

We know the return type here, so why use auto?

SergeyKopienko · 2025-01-15T11:14:06Z

include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_merge.h

@@ -391,7 +415,7 @@ __parallel_merge(oneapi::dpl::__internal::__device_backend_tag, _ExecutionPolicy

    using __value_type = oneapi::dpl::__internal::__value_t<_Range3>;

-    const std::size_t __n = __rng1.size() + __rng2.size();
+    const std::uint64_t __n = std::min<std::uint64_t>(__rng1.size() + __rng2.size(), __rng3.size());


Why can't we use std::size_t here, as before?

SergeyKopienko · 2025-01-15T11:15:18Z

include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_utils.h

@@ -522,6 +522,7 @@ struct __usm_or_buffer_accessor
 struct __result_and_scratch_storage_base
 {
    virtual ~__result_and_scratch_storage_base() = default;
+    virtual std::size_t __get_data(sycl::event, std::size_t* __p_buf) const = 0;


Since __result_and_scratch_storage_base already has __ in its name, I believe an additional __ isn't required in the method name.

SergeyKopienko · 2025-01-15T11:17:06Z

include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_utils.h

+        return 0;
+    }
+
+    virtual std::size_t __get_data(sycl::event __event, std::size_t* __p_buf) const override


Technically, this declaration is correct.
But for consistency with the rest of our code: as far as I have seen, virtual isn't used together with override.
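
As a small illustration of the point, a function marked override is already virtual, so the keyword can be dropped without changing behavior. The types below are hypothetical stand-ins, and the sycl::event parameter is omitted to keep the sketch free of SYCL dependencies.

```cpp
#include <cstddef>

// Hypothetical stand-in for the storage base class.
struct storage_base
{
    virtual ~storage_base() = default;
    virtual std::size_t get_data(std::size_t* p_buf) const = 0;
};

// Hypothetical derived class holding a single value.
struct storage : storage_base
{
    std::size_t value = 0;

    // `override` without `virtual`: the function is still virtual because it
    // overrides a virtual function of the base class, and the compiler still
    // checks that the signature matches.
    std::size_t get_data(std::size_t* p_buf) const override
    {
        *p_buf = value;
        return 1; // number of elements written
    }
};
```

A call through a `const storage_base&` still dispatches to `storage::get_data`.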

MikeDvorskiy marked this pull request as draft November 20, 2024 14:30

MikeDvorskiy force-pushed the dev/mdvorski/merge_sized_output branch 9 times, most recently from 33cd332 to d443dbe Compare November 27, 2024 12:03

MikeDvorskiy force-pushed the dev/mdvorski/merge_sized_output branch 4 times, most recently from 9ebcfb6 to 0066210 Compare November 28, 2024 11:55

MikeDvorskiy marked this pull request as ready for review November 28, 2024 15:24

MikeDvorskiy requested review from dmitriy-sobolev and danhoeflinger November 28, 2024 16:29

MikeDvorskiy force-pushed the dev/mdvorski/merge_sized_output branch 7 times, most recently from 3f648a7 to 5b078ad Compare November 29, 2024 17:24

dmitriy-sobolev reviewed Dec 19, 2024

MikeDvorskiy force-pushed the dev/mdvorski/merge_sized_output branch from 98a7acb to c81b4c1 Compare December 23, 2024 13:50

MikeDvorskiy marked this pull request as draft December 29, 2024 10:00

MikeDvorskiy force-pushed the dev/mdvorski/merge_sized_output branch from 76c3c16 to c0c8ba4 Compare January 14, 2025 13:49

MikeDvorskiy marked this pull request as ready for review January 14, 2025 13:49

[oneDPL][make] + usage ONEAPI_DEVICE_SELECTOR variable

c1ff14b

MikeDvorskiy force-pushed the dev/mdvorski/merge_sized_output branch from c0c8ba4 to ffea24a Compare January 14, 2025 13:51

MikeDvorskiy added this to the 2022.8.0 milestone Jan 14, 2025

MikeDvorskiy force-pushed the dev/mdvorski/merge_sized_output branch 3 times, most recently from 90b640f to 02ed111 Compare January 14, 2025 17:37

[oneDPL][ranges][merge] support size limit for output

45a9bef

MikeDvorskiy force-pushed the dev/mdvorski/merge_sized_output branch from 02ed111 to 45a9bef Compare January 15, 2025 10:04

SergeyKopienko reviewed Jan 15, 2025
