Inconsistent behavior of correlated subqueries involving aggregations #18979

zhaner08 · 2023-09-08T22:04:27Z

Sample query:

with
  tmp as (select * from (VALUES (1,3),(2,5)) t(k,v))
select
  (select 3 from tmp t1 where t1.k=t0.k and t1.k>1 group by t1.v)
from (select k, v from tmp) t0;

This query works fine and returns 2 rows: one with a value and one with NULL (due to t1.k>1).

When the grouping column is changed to t1.k, which is the correlated column, only a single row is returned:

with
  tmp as (select * from (VALUES (1,3),(2,5)) t(k,v))
select
  (select 3 from tmp t1 where t1.k=t0.k and t1.k>1 group by t1.k)
from (select k, v from tmp) t0;

Looking through the optimizer, it seems that the first one triggers TransformCorrelatedScalarSubquery before TransformCorrelatedJoinToJoin, which creates an inner join instead of a left join, with the filter predicate t1.k>1 pushed down to the outer query, so the row is eliminated.

If the aggregation is removed, both queries return the same result.
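For comparison, here is a hand-written sketch of the left-join form that a correlated scalar subquery is normally expected to be equivalent to. This is not the plan the optimizer produces. The alias `s` and the column name `c` are made up for illustration. Because the subquery groups by t1.k, it yields at most one row per key, so under standard SQL semantics the left join should keep both outer rows and return NULL for k=1.

```sql
-- Manual left-join rewrite of the second query (illustrative only;
-- `s` and `c` are hypothetical names, not taken from the optimizer).
with
  tmp as (select * from (VALUES (1,3),(2,5)) t(k,v))
select
  s.c
from (select k, v from tmp) t0
left join (
  -- Uncorrelated version of the subquery: the filter stays on the inner side.
  select t1.k, 3 as c
  from tmp t1
  where t1.k > 1
  group by t1.k
) s on s.k = t0.k;
```

Replacing `left join` with `join` in this sketch drops the k=1 row, which matches the single-row result described above.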
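One reading of "aggregation is removed" is that the `group by` clause is dropped from the scalar subquery. Under that assumption, the variant would look like the sketch below. Since k is unique in tmp, the subquery still yields at most one row per outer row.

```sql
-- Same query with the GROUP BY dropped (assumed interpretation of
-- "aggregation is removed"; not copied from the original report).
with
  tmp as (select * from (VALUES (1,3),(2,5)) t(k,v))
select
  (select 3 from tmp t1 where t1.k=t0.k and t1.k>1)
from (select k, v from tmp) t0;
```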


martint · 2023-09-13T17:30:34Z

I believe this is related to, or even the same as, the issue reported and fixed here: #19002

zhaner08 · 2023-10-11T18:08:09Z

Looks like it is the same issue, closing this one

zhaner08 closed this as completed Oct 11, 2023
