[fix](mtmv) Infer null-reject from INNER JoinEdge for multi-hop outer join MV rewrite #62492
Open
seawinde wants to merge 1 commit into apache:master
Contributor
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Member
Author
run buildall
…oin MV rewrite

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary: In multi-hop LEFT JOIN MV rewrite (e.g., fact LEFT JOIN dim1 LEFT JOIN dim2), when the query has a WHERE clause that null-rejects the outermost table (dim2), EliminateOuterJoin converts all LEFT JOINs to INNER. However, containsNullRejectSlot only checked filter predicates for NOT NULL proof, which only covers the outermost table slots. The intermediate table (dim1) slots had no NOT NULL evidence, causing "Predicate compensate fail".

The fix reads INNER JoinEdge conditions from the query HyperGraph. After EliminateOuterJoin converts LEFT→INNER, JoinEdge objects retain their INNER type and join condition expressions even though EliminateNotNull removes filter-level NOT NULL predicates. ExpressionUtils.inferNotNullSlots extracts NOT NULL slots from these INNER join conditions, covering all intermediate join tables.

### Release note

Fix multi-hop LEFT JOIN materialized view transparent rewrite failure when WHERE clause only references the outermost dimension table.

### Check List (For Author)

- Test: Unit Test (NullRejectInferenceTest) / Regression test (outer_join_two_hop_null_reject)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
f2f6c8a to 488c34d
Member
Author
run buildall
Contributor
FE Regression Coverage Report
Increment line coverage
What problem does this PR solve?
Issue Number: N/A
Related PR: #30374
Problem Summary:
In multi-hop LEFT JOIN materialized view transparent rewrite (e.g., `fact LEFT JOIN dim1 LEFT JOIN dim2`), when the query has a WHERE clause that null-rejects only the outermost dimension table (e.g., `WHERE dim2.col = 'value'`), the MV rewrite fails with "Predicate compensate fail".

Root cause: In `AbstractMaterializedViewRule.containsNullRejectSlot()`, the original code only checked filter predicates (`queryPredicates`) for NOT NULL evidence. After the Nereids rewrite pipeline runs:

- `EliminateOuterJoin` converts all eligible LEFT JOINs → INNER (cascading through `InferJoinNotNull` across multiple passes)
- `EliminateNotNull` unconditionally removes all generated NOT NULL predicates (`isGeneratedIsNotNull=true`)

By the time MV rewrite (exploration phase) runs, the query plan has INNER JOINs but zero NOT NULL filter predicates. The only surviving predicate is the user's WHERE clause (e.g., `dim2.region_name = 'West'`), which can only prove NOT NULL for outermost dim2 slots, leaving intermediate dim1 slots uncovered.

Fix: Read INNER JoinEdge conditions directly from the query HyperGraph. After `EliminateOuterJoin` converts LEFT→INNER, JoinEdge objects retain their INNER type and join condition expressions even though `EliminateNotNull` removes filter-level NOT NULL predicates. `ExpressionUtils.inferNotNullSlots()` extracts NOT NULL slots from these INNER join conditions, covering all intermediate join tables.

- `AbstractMaterializedViewRule.java`, `containsNullRejectSlot()`: Add loop over INNER JoinEdges to collect NOT NULL slots from join conditions via `inferNotNullSlots`. Also add `shuttleExpressionWithLineage` for correct slot-level mapping.
- `NullRejectInferenceTest.java` (new): `predicatesCompensate` succeeds
- `outer_join_two_hop_null_reject.groovy` (new)

2-hop example walkthrough:
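The inference described above can be sketched in miniature. The Java below is a toy model under stated assumptions, not the Doris implementation: `JoinType`, `JoinEdge`, `collectNotNullSlots`, `coversTable`, and the join-key column names (`fact.dim1_id`, `dim1.id`, `dim1.dim2_id`, `dim2.id`) are hypothetical stand-ins, and slots are plain strings. Only `dim2.region_name` is taken from the description. It shows why filter predicates alone leave dim1 uncovered, and why reading the equality conditions of INNER join edges covers it.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these types are the real Doris classes.
class NullRejectSketch {
    enum JoinType { INNER, LEFT_OUTER }

    // Hypothetical join edge carrying one equality condition: leftSlot = rightSlot.
    static final class JoinEdge {
        final JoinType type;
        final String leftSlot;
        final String rightSlot;

        JoinEdge(JoinType type, String leftSlot, String rightSlot) {
            this.type = type;
            this.leftSlot = leftSlot;
            this.rightSlot = rightSlot;
        }
    }

    // Collects slots with NOT NULL evidence: slots referenced by null-rejecting
    // filters, plus both sides of every INNER join equality. An equality is never
    // true when either operand is NULL, so rows surviving an INNER join have
    // non-null values on both sides. LEFT_OUTER edges contribute nothing.
    static Set<String> collectNotNullSlots(List<String> filterSlots, List<JoinEdge> edges) {
        Set<String> notNull = new LinkedHashSet<>(filterSlots);
        for (JoinEdge edge : edges) {
            if (edge.type == JoinType.INNER) {
                notNull.add(edge.leftSlot);
                notNull.add(edge.rightSlot);
            }
        }
        return notNull;
    }

    // True if at least one NOT NULL slot belongs to the given table.
    static boolean coversTable(Set<String> notNullSlots, String table) {
        String prefix = table + ".";
        for (String slot : notNullSlots) {
            if (slot.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Query plan after both LEFT JOINs were converted to INNER.
        List<JoinEdge> edges = Arrays.asList(
                new JoinEdge(JoinType.INNER, "fact.dim1_id", "dim1.id"),
                new JoinEdge(JoinType.INNER, "dim1.dim2_id", "dim2.id"));
        // The only surviving filter: dim2.region_name = 'West'.
        List<String> filterSlots = Arrays.asList("dim2.region_name");

        // Filter-only evidence: dim1 has no NOT NULL slot.
        System.out.println(coversTable(new LinkedHashSet<>(filterSlots), "dim1")); // false
        // Adding INNER join edge conditions: dim1 is covered.
        System.out.println(coversTable(collectNotNullSlots(filterSlots, edges), "dim1")); // true
    }
}
```

In this sketch an edge that is still `LEFT_OUTER` is skipped, since its condition says nothing about rows that were padded with NULLs.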
Release note
Fix multi-hop LEFT JOIN materialized view transparent rewrite failure when the WHERE clause only references the outermost dimension table.
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)