Add arrow_field(expr) scalar UDF#21389

Open
adriangb wants to merge 4 commits into apache:main from pydantic:add-arrow-field-udf

@adriangb
Contributor

@adriangb adriangb commented Apr 5, 2026

Which issue does this PR close?

Rationale for this change

DataFusion has individual introspection functions (arrow_typeof, arrow_metadata, is_nullable) but no single function that returns the complete Arrow Field representation. Having arrow_field(expr) that returns a struct with all field info avoids multiple function calls when you need the full picture, and provides a natural complement to the existing introspection suite.

What changes are included in this PR?

Adds arrow_field(expr) scalar UDF that returns a struct:

> SELECT arrow_field(x) FROM my_table;
{name: x, data_type: Int32, nullable: false, metadata: {}}

> SELECT arrow_field(x)['data_type'] FROM my_table;
Int32

The returned struct has four fields:

  • name (Utf8) — the field name
  • data_type (Utf8) — the Arrow data type, rendered as a string
  • nullable (Boolean) — whether the field is nullable
  • metadata (Map<Utf8, Utf8>) — the field metadata

Individual fields are accessible via bracket syntax.
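
To make the shape of the returned value concrete, here is a minimal std-only Rust sketch that models the four-field struct and the bracket-style lookup described above. It is not the code in arrow_field.rs: `FieldInfo`, `get`, `format_metadata`, and `example_field` are hypothetical names used only for illustration, and the real UDF presumably builds an Arrow struct array rather than a plain Rust struct.

```rust
use std::collections::BTreeMap;
use std::fmt;

/// Hypothetical stand-in for the struct returned by arrow_field(expr).
/// Assumption: data_type is carried as a string, as the field list states.
#[derive(Debug, Clone, PartialEq)]
struct FieldInfo {
    name: String,
    data_type: String,
    nullable: bool,
    metadata: BTreeMap<String, String>,
}

/// Renders metadata as `{k: v, ...}`; an empty map renders as `{}`.
fn format_metadata(metadata: &BTreeMap<String, String>) -> String {
    let entries: Vec<String> = metadata
        .iter()
        .map(|(k, v)| format!("{k}: {v}"))
        .collect();
    format!("{{{}}}", entries.join(", "))
}

impl FieldInfo {
    /// Mimics bracket access such as arrow_field(x)['data_type'].
    /// Returns None for a key that is not one of the four fields.
    fn get(&self, key: &str) -> Option<String> {
        match key {
            "name" => Some(self.name.clone()),
            "data_type" => Some(self.data_type.clone()),
            "nullable" => Some(self.nullable.to_string()),
            "metadata" => Some(format_metadata(&self.metadata)),
            _ => None,
        }
    }
}

impl fmt::Display for FieldInfo {
    /// Formats the value the way the SQL example output above prints it.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "{{name: {}, data_type: {}, nullable: {}, metadata: {}}}",
            self.name,
            self.data_type,
            self.nullable,
            format_metadata(&self.metadata)
        )
    }
}

/// Builds the value from the first SQL example: a non-nullable Int32 column x.
fn example_field() -> FieldInfo {
    FieldInfo {
        name: "x".to_string(),
        data_type: "Int32".to_string(),
        nullable: false,
        metadata: BTreeMap::new(),
    }
}

fn main() {
    let field = example_field();
    println!("{field}");
    println!("{}", field.get("data_type").unwrap());
}
```

Running `main` prints the same two results as the SQL examples above. Keeping `data_type` as a string mirrors the field list; a typed representation would be a different design from the one this PR describes.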

Files:

  • datafusion/functions/src/core/arrow_field.rs — new UDF implementation
  • datafusion/functions/src/core/mod.rs — registration
  • datafusion/sqllogictest/test_files/arrow_field.slt — tests

Are these changes tested?

Yes. The sqllogictest file covers literals (int, null, bool, string, float, list), table columns, nullability, and struct field access.

Are there any user-facing changes?

New SQL function arrow_field(expr) is available.

🤖 Generated with Claude Code

Adds a new introspection function that returns a struct containing the
complete Arrow Field information for any expression: name, data_type,
nullable, and metadata. This unifies what `arrow_typeof`,
`arrow_metadata`, and `is_nullable` provide individually.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added the sqllogictest (SQL Logic Tests (.slt)) and functions (Changes to functions implementation) labels Apr 5, 2026
query ?
SELECT arrow_field(ARRAY[1,2,3])
----
{name: lit, data_type: List(Int64), nullable: false, metadata: {}}
Member

It would be good to have tests for more complex types like Map and Struct too! Maybe a Dictionary too.

Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
@martin-g
Member

martin-g commented Apr 7, 2026

My suggestions broke the build :-/ Sorry!

@adriangb adriangb marked this pull request as ready for review April 7, 2026 12:41
@adriangb
Contributor Author

adriangb commented Apr 7, 2026

no worries @martin-g, fixed now!

@github-actions github-actions bot added the documentation (Improvements or additions to documentation) label Apr 7, 2026
Labels

documentation (Improvements or additions to documentation), functions (Changes to functions implementation), sqllogictest (SQL Logic Tests (.slt))

2 participants