
fix: parse EXCLUDE column_list after non-star columns in Redshift #7019

Closed
veeceey wants to merge 1 commit into tobymao:main from veeceey:fix/issue-6963-redshift-exclude-parsing

@veeceey veeceey commented Feb 8, 2026

Summary

Fixes #6963

In Redshift, the EXCLUDE clause can appear at the end of the entire SELECT projection list, not just immediately after *. For example, this is valid Redshift SQL:

SELECT *, 4 AS col4 EXCLUDE (col2, col3) FROM (SELECT 1 AS col1, 2 AS col2, 3 AS col3);

Previously this raised ParseError: Invalid expression / Unexpected token because the parser only recognized EXCLUDE as part of the Star expression (i.e., * EXCLUDE (...)).

Changes

  • expressions.py: Added exclude field to Select to store the trailing EXCLUDE column list
  • parser.py: Changed _parse_projections return type to include an optional exclude list
  • redshift.py (Parser): Override _parse_projections to detect and parse a trailing EXCLUDE clause after all projections
  • redshift.py (Generator): Set STAR_EXCEPT = "EXCLUDE" and STAR_EXCLUDE_REQUIRES_DERIVED_TABLE = False for correct Redshift output
  • generator.py: Added STAR_EXCLUDE_REQUIRES_DERIVED_TABLE flag; when True (default), transpiles to a derived table with * EXCEPT (...); when False (Redshift), emits EXCLUDE directly
  • tsql.py: Updated _parse_projections override to match new return type
  • test_redshift.py: Added test cases for identity, normalization, and cross-dialect transpilation
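
The parsing change can be illustrated with a small standalone sketch. This is not the sqlglot implementation: `tokenize` and `parse_projections` below are hypothetical helpers written for illustration, and they only handle simple projection lists. The idea is the same as in the PR, though: read all projections first, then look for a trailing EXCLUDE and return its column list alongside them.

```python
import re
from typing import List, Optional, Tuple

# Toy tokenizer: identifiers, integers, '*', ',', '(' and ')'.
TOKEN_RE = re.compile(r"\s*([A-Za-z_][A-Za-z_0-9]*|\d+|\*|,|\(|\))")


def tokenize(sql: str) -> List[str]:
    sql = sql.strip()
    tokens, pos = [], 0
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if not match:
            raise ValueError(f"Unexpected character at {pos}: {sql[pos]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_projections(projection_sql: str) -> Tuple[List[str], Optional[List[str]]]:
    """Return (projections, exclude) for the text between SELECT and FROM.

    Hypothetical helper: `exclude` is None when no trailing EXCLUDE is present.
    """
    tokens = tokenize(projection_sql)

    # Locate the first EXCLUDE keyword that is not nested inside parentheses.
    depth, exclude_at = 0, len(tokens)
    for i, tok in enumerate(tokens):
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
        elif depth == 0 and tok.upper() == "EXCLUDE":
            exclude_at = i
            break

    # Split everything before EXCLUDE on top-level commas.
    projections, current, depth = [], [], 0
    for tok in tokens[:exclude_at]:
        if tok == "," and depth == 0:
            projections.append(" ".join(current))
            current = []
            continue
        depth += (tok == "(") - (tok == ")")
        current.append(tok)
    if current:
        projections.append(" ".join(current))

    # Accept both `EXCLUDE (a, b)` and the unparenthesized `EXCLUDE a`.
    exclude = None
    if exclude_at < len(tokens):
        exclude = [t for t in tokens[exclude_at + 1:] if t not in {"(", ")", ","}]
    return projections, exclude


print(parse_projections("*, 4 AS col4 EXCLUDE (col2, col3)"))
```

Returning the exclude list separately from the projections mirrors the changed return type described above, so the caller can attach it to the enclosing SELECT rather than to a single `*` expression.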

Transpilation behavior

| Input (Redshift) | Output dialect | Result |
| --- | --- | --- |
| `SELECT *, 4 AS col4 EXCLUDE (col2, col3) FROM t` | redshift | `SELECT *, 4 AS col4 EXCLUDE (col2, col3) FROM t` |
| `SELECT *, 4 AS col4 EXCLUDE (col2, col3) FROM t` | default | `SELECT * EXCEPT (col2, col3) FROM (SELECT *, 4 AS col4 FROM t)` |
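
The two generation paths can be sketched in a few lines of plain Python. This is a simplified illustration, not the generator code from the PR: `select_sql` is a hypothetical function, and its `star_except` and `requires_derived_table` parameters stand in for the `STAR_EXCEPT` and `STAR_EXCLUDE_REQUIRES_DERIVED_TABLE` settings named in the change list.

```python
from typing import List, Optional


def select_sql(
    projections: List[str],
    exclude: Optional[List[str]],
    source: str,
    star_except: str = "EXCEPT",
    requires_derived_table: bool = True,
) -> str:
    """Render a SELECT with an optional trailing exclude list (illustrative only)."""
    columns = ", ".join(projections)
    if not exclude:
        return f"SELECT {columns} FROM {source}"

    excluded = ", ".join(exclude)
    if not requires_derived_table:
        # Dialects that accept a trailing clause (Redshift here) emit it directly.
        return f"SELECT {columns} {star_except} ({excluded}) FROM {source}"

    # Otherwise wrap the original query and apply the exclusion to `*` outside it.
    return f"SELECT * {star_except} ({excluded}) FROM (SELECT {columns} FROM {source})"


projections, exclude = ["*", "4 AS col4"], ["col2", "col3"]

# Redshift-style output: trailing EXCLUDE kept in place.
print(select_sql(projections, exclude, "t", star_except="EXCLUDE", requires_derived_table=False))

# Default output: derived table with * EXCEPT (...).
print(select_sql(projections, exclude, "t"))
```

The derived-table form keeps the semantics intact for dialects that only support the exclusion directly after `*`: the inner query produces all columns, including the added `col4`, and the outer query drops the excluded ones.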

Test plan

  • All 926 existing tests pass (excluding test_executor and test_optimizer which require duckdb)
  • Added identity tests for EXCLUDE after non-star columns
  • Added normalization test (unparenthesized EXCLUDE)
  • Added cross-dialect transpilation test
  • Verified * EXCLUDE (...) (star-only) still works as before

In Redshift, the EXCLUDE clause can appear at the end of the entire
projection list, not just immediately after a `*`. For example:
`SELECT *, 4 AS col4 EXCLUDE (col2, col3) FROM t`

This was previously raising a ParseError because the parser only
recognized EXCLUDE as part of the Star expression parsing.

The fix adds an `exclude` field to Select, overrides `_parse_projections`
in the Redshift parser to extract a trailing EXCLUDE clause, and handles
generation by either emitting the EXCLUDE directly (for Redshift) or
wrapping in a derived table with `* EXCEPT (...)` (for other dialects).

Fixes tobymao#6963

veeceey commented Feb 8, 2026

Manual Testing Results

I've tested the fix for parsing a trailing EXCLUDE clause after aliased expressions in the Redshift dialect. All tests passed.

Test 1: Issue SQL Parsing

Original SQL from issue:

SELECT *, 4 AS col4 EXCLUDE (col2, col3) FROM (SELECT 1 AS col1, 2 AS col2, 3 AS col3)

Parses successfully - no more parse errors

Test 2: Round-trip Verification

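import sqlglot

# Original SQL from the issue
sql = "SELECT *, 4 AS col4 EXCLUDE (col2, col3) FROM (SELECT 1 AS col1, 2 AS col2, 3 AS col3)"
# Parse and regenerate
parsed = sqlglot.parse_one(sql, dialect='redshift')
regenerated = parsed.sql(dialect='redshift')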

Preserves syntax correctly:

SELECT *, 4 AS col4 EXCLUDE (col2, col3) FROM (SELECT 1 AS col1, 2 AS col2, 3 AS col3)

Test 3: Cross-Dialect Transpilation

# Spark
SELECT * EXCEPT (col2, col3) FROM (SELECT *, 4 AS col4 FROM (SELECT 1 AS col1, 2 AS col2, 3 AS col3))

# BigQuery
SELECT * EXCEPT (col2, col3) FROM (SELECT *, 4 AS col4 FROM (SELECT 1 AS col1, 2 AS col2, 3 AS col3))

# DuckDB
SELECT * EXCLUDE (col2, col3) FROM (SELECT *, 4 AS col4 FROM (SELECT 1 AS col1, 2 AS col2, 3 AS col3))

Correctly transpiles to dialect-specific syntax (EXCEPT vs EXCLUDE)

Test 4: Edge Cases

All edge cases passed:

  1. Single column EXCLUDE:

    SELECT *, 5 AS col5 EXCLUDE (col1) FROM (SELECT 1 AS col1, 2 AS col2)

    ✅ Parses and regenerates correctly

  2. Unparenthesized EXCLUDE:

    SELECT *, 5 AS col5 EXCLUDE col1 FROM (SELECT 1 AS col1, 2 AS col2)

    ✅ Parses and normalizes to parenthesized form

  3. Multiple columns in EXCLUDE:

    SELECT *, 6 AS col6 EXCLUDE (col1, col2, col3, col4) FROM (...)

    ✅ Handles multiple columns correctly

  4. EXCLUDE on non-wildcard column:

    SELECT col1, col2 EXCLUDE (col3) FROM table1

    ✅ Works on regular column expressions

  5. EXCLUDE with complex aliases:

    SELECT *, new_col AS alias_name EXCLUDE (col1) FROM (...)

    ✅ Handles aliases correctly

All tests confirm the fix resolves the original issue and handles various edge cases properly.

VaggelisD (Collaborator) commented

Hey @veeceey, we really appreciate the contribution but there's currently an open PR working towards the fix so I'll go ahead and close this one.

@VaggelisD VaggelisD closed this Feb 9, 2026

Development

Successfully merging this pull request may close these issues.

Redshift parsing fails for valid sql using exclude column_list
