End-to-end SQL support: schema discovery, SELECT * expansion, DataFrame bridge, helpers, guardrails by Copilot · Pull Request #141 · microsoft/PowerPlatform-DataverseClient-Python

Copilot · 2026-03-13T16:12:41Z

SQL developers working with Dataverse shouldn't need to learn OData or the Web API. This PR closes that gap by making the Python SDK's SQL endpoint usable end-to-end -- from schema discovery, through querying (including SELECT *, JOINs, aggregates), to writing results back via DataFrames.

Live-validated against Aurora VM: 66/66 SQL tests + 15-table JOIN confirmed + all anti-patterns guarded.

What's new

Schema discovery APIs

list_columns(table), list_relationships(), list_table_relationships(table) (OneToMany + ManyToOne + ManyToMany)

SELECT * auto-expansion

SDK transparently rewrites SELECT * into explicit column names (server blocks * directly). Warns when used with JOINs.

DataFrame SQL bridge

client.dataframe.sql() -- SQL to pandas DataFrame. Fully bidirectional: SQL read -> pandas transform -> SDK write-back.

SQL helpers (`client.query`)

sql_columns, sql_select, sql_joins, sql_join -- auto-discover columns and JOINs from metadata.

OData helpers (`client.query`)

odata_select, odata_expands, odata_expand, odata_bind -- auto-discover nav properties, build @odata.bind.

Auto-lowercase table names in `create_lookup_field`

_build_lookup_field_models now automatically lowercases referencing_table and referenced_table to valid Dataverse logical names. Callers no longer need manual .lower() calls. lookup_field_name is preserved as-is (it is a SchemaName).
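A minimal sketch of the normalization described above, assuming a hypothetical helper named `normalize_lookup_args` (it is not the SDK's `_build_lookup_field_models`, and the sample names are made up): the two table names become lowercase logical names, while the lookup field's SchemaName keeps its casing.

```python
def normalize_lookup_args(
    referencing_table: str, referenced_table: str, lookup_field_name: str
) -> dict:
    """Hypothetical helper: lowercase the table names, preserve the SchemaName."""
    return {
        "referencing_table": referencing_table.lower(),
        "referenced_table": referenced_table.lower(),
        # SchemaName is case-sensitive, so it is passed through untouched.
        "lookup_field_name": lookup_field_name,
    }


# Illustrative values only.
args = normalize_lookup_args("new_Project", "Account", "new_AccountId")
```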

SQL safety guardrails (3 layers of protection)

Layer 1 -- Server rejects (user cannot do these at all):

JOIN without ON -- server error
INSERT/UPDATE/DELETE -- server error
SELECT * -- server error (SDK rewrites transparently)
TOP > 5000 -- server error
RIGHT/FULL OUTER/CROSS JOIN -- server error
Subqueries -- server error

Layer 2 -- SDK blocks or warns (before hitting the server):

INSERT/UPDATE/DELETE -- ValidationError with clear message
FROM a, b (cartesian) -- UserWarning (server allows but produces N*M intermediate rows)
LIKE '%value' -- UserWarning (full table scan advisory)
SELECT * with JOIN -- UserWarning (partial expansion advisory)
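The two advisory checks in Layer 2 (comma-separated FROM and leading-wildcard LIKE) could look roughly like the sketch below. The regexes and the `advise` function are assumptions for illustration; the SDK's actual patterns may differ.

```python
import re
import warnings

# Hypothetical patterns, not the SDK's own.
_LEADING_WILDCARD_RE = re.compile(r"\bLIKE\s+N?'%", re.IGNORECASE)
_COMMA_JOIN_RE = re.compile(
    r"\bFROM\s+\w+(?:\s+(?:AS\s+)?\w+)?\s*,\s*\w+", re.IGNORECASE
)


def advise(sql: str) -> list[str]:
    """Emit one UserWarning per detected anti-pattern and return the messages."""
    messages = []
    if _COMMA_JOIN_RE.search(sql):
        messages.append(
            "Comma-separated FROM produces a cartesian product; use JOIN ... ON."
        )
    if _LEADING_WILDCARD_RE.search(sql):
        messages.append("Leading-wildcard LIKE forces a full table scan.")
    for message in messages:
        warnings.warn(message, UserWarning, stacklevel=2)
    return messages
```

Warning instead of raising keeps these queries runnable, which matches the description: the server accepts them, so the SDK only advises.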

Layer 3 -- Documentation warns (sql_examples.py Section 32):

Cartesian products, leading-wildcard LIKE, unfiltered system tables, SELECT * on wide tables, deep JOINs without TOP

Anti-patterns and best practices (Section 32 of examples)

Dedicated section documenting 5 SQL anti-patterns with clear explanations of WHY they're dangerous (shared database impact) and HOW to fix them. The SDK's guardrails automatically catch patterns 1 and 2.

Key findings from live testing

SQL JOINs use the raw attribute name (parentcustomerid), not the _value-suffixed form
No server depth limit on JOINs -- tested up to 15 tables (all PASS, ~0.6s)
The 15-table depth exceeds OData's 10-level $expand limit (SQL JOINs are flat, not nested)
JOINs with proper ON clauses use indexed FK lookups -- safe for production
Server enforces: 5000-row cap, ON clause required for JOINs, blocks write SQL

Test coverage

1042 unit tests passing
66 live validation tests (custom + built-in tables + all helpers)
15-table JOIN depth confirmed on Aurora VM
SQL vs OData comparison with live benchmark in examples

Comprehensive examples (`sql_examples.py` -- 34 sections, 1194 lines)

saurabhrb · 2026-03-13T16:24:08Z

@copilot
Update the PR description to also show the case where no columns are selected, i.e., with the wildcard *.

Copilot · 2026-03-13T16:25:29Z

Updated the PR description to show the wildcard case — calling list_columns("account") with no select parameter returns all column properties (equivalent to *). The description now clearly distinguishes between "all properties" and "narrowed via select".

Copilot

Pull request overview

Adds public schema discovery methods under client.tables so SDK users can programmatically discover table columns (attributes) and relationship definitions, which is particularly helpful when authoring client.query.sql() queries.

Changes:

Added TableOperations.list_columns(), list_relationships(), and list_table_relationships() public APIs that delegate to data-layer helpers.
Implemented the corresponding data-layer methods _ODataClient._list_columns() and _RelationshipOperationsMixin._list_relationships() / _list_table_relationships().
Added unit tests and README examples for the new discovery APIs.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

Show a summary per file

| File | Description |
| --- | --- |
| `src/PowerPlatform/Dataverse/operations/tables.py` | Adds the three new public `client.tables.*` schema discovery methods and docstrings/examples. |
| `src/PowerPlatform/Dataverse/data/_odata.py` | Implements `_list_columns()` to fetch `EntityDefinitions({MetadataId})/Attributes`. |
| `src/PowerPlatform/Dataverse/data/_relationships.py` | Implements `_list_relationships()` and `_list_table_relationships()` (1:N + N:N) metadata discovery. |
| `tests/unit/test_tables_operations.py` | Verifies the new public methods delegate correctly to the underlying `_odata` methods. |
| `tests/unit/data/test_relationships.py` | Adds data-layer tests for `_list_relationships()` and `_list_table_relationships()`. |
| `tests/unit/data/test_list_columns.py` | New data-layer tests for `_ODataClient._list_columns()`. |
| `README.md` | Documents usage examples for the new schema discovery APIs. |

…ith tests and README examples Co-authored-by: saurabhrb <32964911+saurabhrb@users.noreply.github.com>

…me bridge, guardrails, examples

- Schema discovery APIs: list_columns, list_relationships, list_table_relationships
- SELECT * auto-expansion via list_columns (server blocks SELECT *)
- client.dataframe.sql() for SQL -> DataFrame bridge
- SQL safety guardrails: block writes, auto-inject TOP 5000, warn on leading-wildcard LIKE and implicit cross joins
- Comprehensive sql_examples.py (28 sections): JOINs, aggregates, GROUP BY, DISTINCT, OFFSET FETCH, polymorphic lookups, audit trail, SQL read -> DataFrame transform -> SDK write-back
- Updated query.sql() docstring with accurate SQL capabilities
- 732 tests (was 318), all passing

…dd SELECT * + JOIN warning

Live-tested against Aurora VM (aurorabapenv71aff.crm10.dynamics.com):
- Server auto-caps at 5000 rows without TOP -> no client-side injection needed
- Server blocks SELECT * on both single-table and JOIN queries -> SDK expansion confirmed needed
- Added warning when SELECT * used with JOIN (expansion only includes first table columns)

732 tests passing.

New methods on client.query for SQL-first developers:
- sql_columns(table) -> simplified column metadata list
- sql_select(table) -> comma-separated column list for SELECT
- sql_joins(table) -> all possible JOINs with ready-to-use clauses
- sql_join(from, to) -> auto-generated JOIN clause between tables

Key finding from live testing: SQL JOINs use the raw attribute name (e.g. parentcustomerid), NOT the _value suffix. The ReferencingAttribute from relationship metadata matches exactly.

13 new unit tests, 745 total passing.
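As a rough illustration of how a JOIN clause can be derived from relationship metadata, the sketch below uses a hypothetical `build_join_clause` function and a hand-written relationship record. The property names follow Dataverse relationship metadata; the function itself is an assumption and is not the SDK's `sql_join()`.

```python
def build_join_clause(relationship: dict, from_alias: str, to_alias: str) -> str:
    """Build a JOIN clause from a many-to-one relationship record (hypothetical)."""
    target = relationship["ReferencedEntity"]
    # Raw attribute name, e.g. parentcustomerid (no _value suffix in SQL).
    foreign_key = relationship["ReferencingAttribute"]
    primary_key = relationship["ReferencedAttribute"]
    return (
        f"JOIN {target} {to_alias} "
        f"ON {from_alias}.{foreign_key} = {to_alias}.{primary_key}"
    )


# Illustrative record; in practice this would come from relationship metadata.
relationship = {
    "ReferencingEntity": "contact",
    "ReferencingAttribute": "parentcustomerid",
    "ReferencedEntity": "account",
    "ReferencedAttribute": "accountid",
}
clause = build_join_clause(relationship, from_alias="c", to_alias="a")
```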

New methods on client.query for OData users (parallel to SQL helpers):
- odata_select(table) -> list[str] for records.get(select=)
- odata_expands(table) -> all navigation properties with entity sets
- odata_expand(from, to) -> PascalCase nav property name for expand=
- odata_bind(from, to, id) -> @odata.bind dict for create/update payloads

These eliminate the most error-prone parts of OData queries:
- No more guessing PascalCase navigation property names for $expand
- No more manually constructing @odata.bind with entity set names
- Column discovery matches records.get(select=) format directly

11 new unit tests, 756 total passing.
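For readers unfamiliar with the `@odata.bind` shape, here is a small sketch of the payload such a helper would produce. `build_odata_bind` is a hypothetical stand-in that takes the navigation property and entity set name as plain arguments; the real `odata_bind` is described as discovering them from metadata.

```python
def build_odata_bind(nav_property: str, entity_set: str, record_id: str) -> dict:
    """Return a single-entry dict that binds a lookup to an existing record."""
    return {f"{nav_property}@odata.bind": f"/{entity_set}({record_id})"}


# Illustrative values: a contact's parent account lookup and a placeholder GUID.
payload = build_odata_bind(
    "parentcustomerid_account",
    "accounts",
    "00000000-0000-0000-0000-000000000001",
)
```

The resulting dict can be merged into a create or update payload alongside the record's other fields.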

_list_table_relationships now fetches all 3 relationship types:
- OneToManyRelationships (incoming: where table is referenced)
- ManyToOneRelationships (outgoing: where table has a lookup) <-- NEW
- ManyToManyRelationships

This was critical for sql_joins() and odata_expands(), which need outgoing lookups. Without ManyToOne, custom table lookups were invisible to the helpers.

Also fixed: create_lookup_field requires lowercase logical names (not SchemaName) -- documented in copilot-instructions.

Live-validated: 47/47 SQL tests passed on Aurora VM. Unit tests: 756 passed.

Columns with AttributeOf set are auto-generated display names (e.g. createdbyname, createdbyyominame) that cause errors in OData $select and are not real data columns. Now filtered out.

Also: _list_table_relationships docstring updated for ManyToOne.

756 unit tests passing.
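The AttributeOf filter can be sketched as below, assuming column metadata arrives as dicts with `LogicalName` and `AttributeOf` keys. The `real_columns` name and the sample rows are illustrative, not taken from the SDK.

```python
def real_columns(attributes: list[dict]) -> list[str]:
    """Keep only columns that are not derived from another attribute."""
    return [a["LogicalName"] for a in attributes if not a.get("AttributeOf")]


# Illustrative metadata rows.
sample = [
    {"LogicalName": "createdby", "AttributeOf": None},
    {"LogicalName": "createdbyname", "AttributeOf": "createdby"},
    {"LogicalName": "createdbyyominame", "AttributeOf": "createdby"},
    {"LogicalName": "name", "AttributeOf": None},
]
columns = real_columns(sample)
```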

… depth

- Moved write statement check (INSERT/UPDATE/DELETE) before _extract_logical_table() so users get the clear 'SQL endpoint is read-only' error instead of the confusing 'no FROM clause' ValueError
- Live-tested JOIN depth: 8-table JOINs confirmed working on Aurora VM
- Live-tested nested polymorphic: customerid -> account -> owner -> businessunit chain works
- Fixed validate_sql_full.py: lookup existence check (no more 95s waits), COUNT assertion relaxed for accumulated data, AttributeOf filter
- 66/66 validation tests passed, 12/12 JOIN depth tests passed
- 756 unit tests passing

1. Remove unused imports (patch, PropertyMock) from test_select_star_expansion
2. Fix JOIN depth '6+' -> 'no depth limit' in query.py docstring
3. Fix stale 'auto-inject TOP' comment in _odata.py
4. Fix Learn-incompatible docstring types in tables.py list_columns
5. Fix Learn-incompatible docstring types in tables.py list_relationships
6. Fix list_table_relationships docstring: add ManyToOne
7. Fix cross-join regex to catch unaliased FROM account, contact
8. Fix write regex to catch comment-prefixed writes (/* */ and --)
9. Narrow odata_expands exception handling (KeyError/AttributeError/ValueError)
10. Consistent JOIN depth claims across all docs

770 unit tests passing.

saurabhrb · 2026-03-22T05:29:01Z

All 11 review comments addressed in commit a595a72:

Removed unused imports (patch, PropertyMock) from test_select_star_expansion.py
Fixed JOIN depth claim from '6+' to 'no depth limit' in query.py docstring
Fixed stale 'auto-inject TOP' comment in _odata.py
Fixed Learn-incompatible docstring types in tables.py list_columns() -- uses bracket generics now
Fixed Learn-incompatible docstring types in tables.py list_relationships()
Fixed list_table_relationships() docstring to include ManyToOne (matches implementation)
Broadened cross-join regex to catch both aliased and unaliased comma joins (FROM account, contact)
Improved write regex to catch comment-prefixed writes (/* */DELETE, --\nDELETE)
Narrowed odata_expands() exception handling to (KeyError, AttributeError, ValueError)
Consistent JOIN depth claims across all documentation
README comment about one-to-many+many-to-many was already corrected in earlier commits

770 unit tests passing.

…taverse API)

Dataverse Web API supports EntityDefinitions({id})/ManyToOneRelationships as a standard metadata endpoint. ManyToOne is the outgoing lookup perspective (child -> parent), complementing OneToMany (parent -> child).
- README: 'one-to-many + many-to-many' -> 'one-to-many + many-to-one + many-to-many'
- _relationships.py docstring updated to match
- copilot-instructions.md: added SKILL.md reference to governance list

770 unit tests passing.

…ributeOf test

Critical fixes:
- tables.py: Learn-incompatible ':class:`list` of :class:`str`' -> 'list[str]'
- _odata.py, _relationships.py, _error_codes.py: add __all__ = [] per SKILL.md

Important fixes:
- _odata.py: _delete_multiple param 'logical_name' -> 'table_schema_name'
- _odata.py: _lowercase_list docstring removed misleading \ mention
- _odata.py: _query_sql guardrails comment clarified (defense-in-depth)

New test:
- test_excludes_attribute_of_columns: verifies computed display-name columns (with AttributeOf set) are filtered from sql_columns()

771 unit tests passing.

…ng approach

CodeQL found exponential backtracking in _SQL_WRITE_RE, which used nested quantifiers for SQL comment matching: (?:/\*.*?\*/\s*|...)*

Fix: separate comment stripping (_SQL_COMMENT_RE) from write detection.
1. Strip SQL comments with a safe non-backtracking regex
2. Check for write keywords with the simple anchored regex

The comment regex uses [^*]*\*+(?:[^/*][^*]*\*+)*/ which is the standard safe pattern for matching C-style block comments.

3 new tests for comment-prefixed write detection. 774 unit tests passing.
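The two-step approach from this commit message (strip comments first, then run a simple anchored check) can be sketched as follows. The block-comment pattern is the one quoted in the commit; the `--` line-comment branch, the exact write regex, and the `is_write_statement` name are assumptions and may not match the SDK's code.

```python
import re

# Block comments use the pattern quoted in the commit message;
# the `--` branch (line comments to end of line) is an assumption.
_SQL_COMMENT_RE = re.compile(r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|--[^\n]*")
# Anchored at the start, so the keyword must open the statement.
_SQL_WRITE_RE = re.compile(r"^\s*(?:INSERT|UPDATE|DELETE)\b", re.IGNORECASE)


def is_write_statement(sql: str) -> bool:
    """Return True if the statement begins with a write keyword once comments are removed."""
    without_comments = _SQL_COMMENT_RE.sub(" ", sql).strip()
    return bool(_SQL_WRITE_RE.search(without_comments))
```

Because the comment regex contains no nested quantifier that can match the same text two ways, and the write regex is anchored, neither step has the backtracking problem CodeQL flagged.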

Copilot

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 5 comments.

- sql_joins(): fix docstring example to not use aliases with join_clause (join_clause uses full table name; use sql_join() for aliased queries)
- sql_examples: replace _value JOIN pattern with sql_join() helper
- list_table_relationships: docstring now mentions ManyToOne
- remove unused VALIDATION_SQL_CROSS_JOIN_BLOCKED import
- PR description: cartesian updated from ValidationError to UserWarning

t54 · 2026-04-09T15:09:04Z

Commenting for posterity and in support of the merge.

Resolved conflicts in _odata.py (kept write-blocking + guardrails from HEAD, adopted _execute_raw/_build_sql pattern from main for proper URL encoding; merged all error code imports) and test_sql_parse.py (kept both JOIN extraction tests from HEAD and _build_sql URL encoding tests from main).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…tests

… .lower() from example Agent-Logs-Url: https://github.com/microsoft/PowerPlatform-DataverseClient-Python/sessions/508cd2e9-7649-42ec-b5c4-2768fea34725 Co-authored-by: saurabhrb <32964911+saurabhrb@users.noreply.github.com>

dmitmikh · 2026-04-11T00:53:19Z

+            print(f"  {r.get('new_code', ''):<12s}  Budget={r.get('new_budget')}  Active={r.get('new_active')}")
+
+        # ==============================================================
+        # 5. SELECT * (auto-expanded by SDK)


It would be better not to auto-expand these because for wide entities with a lot of logical fields this results in extremely expensive queries. We deliberately reject SELECT * in the SQL API even though it would be trivial to implement because legacy APIs already know how to select all columns.

makes sense.

…sion to prevent expensive wildcard selects on wide entities
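A client-side check for the wildcard could look like the sketch below. `ValidationError` here is a local stand-in class, and the regex and function names are assumptions; the sketch only catches `*` when it directly follows SELECT (optionally after DISTINCT or TOP n), which may be narrower or broader than what the SDK does.

```python
import re


class ValidationError(ValueError):
    """Local stand-in for the SDK's ValidationError (hypothetical definition)."""


# Matches SELECT *, SELECT DISTINCT *, and SELECT TOP n *; COUNT(*) is not matched.
_SELECT_STAR_RE = re.compile(
    r"\bSELECT\s+(?:DISTINCT\s+)?(?:TOP\s+\d+\s+)?\*", re.IGNORECASE
)


def has_select_star(sql: str) -> bool:
    """Return True if the query selects every column with a bare wildcard."""
    return bool(_SELECT_STAR_RE.search(sql))


def reject_select_star(sql: str) -> None:
    """Raise before the query is sent, so wide tables are never fully expanded."""
    if has_select_star(sql):
        raise ValidationError("SELECT * is not supported; list the columns explicitly.")
```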

abelmilash-msft · 2026-04-23T01:24:05Z

+        # Block write statements FIRST (before table extraction, since
+        # UPDATE/INSERT/DELETE don't have FROM clauses).
+        # Strip SQL comments to catch e.g. /**/DELETE or --\\nDELETE.
+        sql_no_comments = self._SQL_COMMENT_RE.sub(" ", sql).strip()
+        if self._SQL_WRITE_RE.search(sql_no_comments):
+            raise ValidationError(
+                "SQL endpoint is read-only. Use client.records or "
+                "client.dataframe for write operations "
+                "(INSERT/UPDATE/DELETE are not supported).",
+                subcode=VALIDATION_SQL_WRITE_BLOCKED,
+            )


This is also done in self._sql_guardrails(sql) below. If it is a duplicate check, it can be removed from here.

abelmilash-msft · 2026-04-23T01:41:48Z

+            try:
+                with self._client._scoped_odata() as od:
+                    target_set = od._entity_set_from_schema_name(target)
+            except (KeyError, AttributeError, ValueError):


Should this also catch MetadataError which is raised by od._entity_set_from_schema_name(target) when entity set name resolution fails?
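To make the question concrete, here is a sketch of what widening the except clause would change. `MetadataError` and `resolve_entity_set` are local stand-ins, and skipping unresolved targets is an assumed behaviour; the handler body is not visible in the hunk above.

```python
class MetadataError(Exception):
    """Local stand-in for the SDK's MetadataError (hypothetical definition)."""


def resolve_entity_set(target: str, known: dict) -> str:
    """Stand-in resolver that raises MetadataError for unknown tables."""
    if target not in known:
        raise MetadataError(f"Unable to resolve entity set for '{target}'")
    return known[target]


def safe_entity_sets(targets: list, known: dict) -> dict:
    """Resolve each target, skipping those whose entity set cannot be found."""
    resolved = {}
    for target in targets:
        try:
            resolved[target] = resolve_entity_set(target, known)
        except (KeyError, AttributeError, ValueError, MetadataError):
            continue  # without MetadataError in the tuple, this would propagate
    return resolved


result = safe_entity_sets(["account", "missing_table"], {"account": "accounts"})
```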

abelmilash-msft · 2026-04-23T01:46:32Z

 VALIDATION_SQL_NOT_STRING = "validation_sql_not_string"
 VALIDATION_SQL_EMPTY = "validation_sql_empty"
+VALIDATION_SQL_WRITE_BLOCKED = "validation_sql_write_blocked"
+VALIDATION_SQL_CROSS_JOIN_BLOCKED = "validation_sql_cross_join_blocked"


Nit: I didn't see VALIDATION_SQL_CROSS_JOIN_BLOCKED used in other places. I think it can be removed if not needed.

- Updated SQL query examples to use correct field names for ownerid and related entities.
- Removed unsupported SQL validation error code.
- Enhanced parameter handling for OneToMany and ManyToMany relationships.
- Improved test coverage for metadata error handling in entity set resolution.
- Fix integration tests

abelmilash-msft · 2026-04-23T05:26:47Z

Should the PR title also be updated to remove SELECT * expansion?

abelmilash-msft · 2026-04-23T06:43:38Z

+                continue
+
+            # Generate a short alias for the target table
+            alias = target[0] if target else "j"


Could this cause an issue/bug if two tables have the same first letter in their names (e.g. account, annotation) or when multiple lookups point to the same table (e.g. ownerid, createdby both pointing to systemuser)? This may be worth testing to check the request doesn't fail in such cases.
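One possible way to avoid the collision raised here is to add a numeric suffix whenever the first-letter alias is already taken. `assign_aliases` is a hypothetical helper written for illustration; it is not code from this PR.

```python
def assign_aliases(targets: list[str]) -> list[str]:
    """Return one alias per JOIN target, unique even when targets repeat."""
    used: set[str] = set()
    aliases = []
    for target in targets:
        base = target[0] if target else "j"
        alias, suffix = base, 1
        # Append 2, 3, ... until the alias is unused.
        while alias in used:
            suffix += 1
            alias = f"{base}{suffix}"
        used.add(alias)
        aliases.append(alias)
    return aliases


# account/annotation share a first letter; systemuser appears twice
# (e.g. via ownerid and createdby).
aliases = assign_aliases(["account", "annotation", "systemuser", "systemuser"])
```

This sketch does not account for the alias already used by the FROM table, which would need to be seeded into `used` as well.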

Copilot AI assigned Copilot and saurabhrb Mar 13, 2026

Copilot started work on behalf of saurabhrb March 13, 2026 16:12

Copilot AI changed the title ~~[WIP] Add public API for discovering entity/table column definitions~~ Add public schema discovery APIs: list_columns, list_relationships, list_table_relationships Mar 13, 2026

Copilot AI requested a review from saurabhrb March 13, 2026 16:22

Copilot finished work on behalf of saurabhrb March 13, 2026 16:22

Copilot started work on behalf of saurabhrb March 13, 2026 16:24

Copilot finished work on behalf of saurabhrb March 13, 2026 16:25

This was referenced Mar 13, 2026

Closed — see #144 #142

Closed

Closed — see #144 #143

Closed

Community Contribution Intake — Triage #1 (2026-03-13) #144

Open

saurabhrb unassigned Copilot Mar 20, 2026

saurabhrb marked this pull request as ready for review March 20, 2026 23:44

saurabhrb requested a review from a team as a code owner March 20, 2026 23:44

Copilot AI review requested due to automatic review settings March 20, 2026 23:44

Copilot started reviewing on behalf of saurabhrb March 20, 2026 23:45

Copilot AI reviewed Mar 20, 2026

Copilot AI and others added 2 commits March 20, 2026 16:54

Initial plan

5b08ea0

Add list_columns, list_relationships, list_table_relationships APIs w…

d61ffca

…ith tests and README examples Co-authored-by: saurabhrb <32964911+saurabhrb@users.noreply.github.com>

saurabhrb changed the title ~~Add public schema discovery APIs: list_columns, list_relationships, list_table_relationships~~ End-to-end SQL support: schema discovery, SELECT * expansion, DataFrame bridge, and comprehensive examples Mar 21, 2026

saurabhrb force-pushed the copilot/add-public-api-for-metadata branch from e9a25aa to 7f06533 Compare March 21, 2026 05:30

Saurabh Badenkal added 6 commits March 20, 2026 22:35

github-advanced-security AI found potential problems Mar 22, 2026

Comment thread src/PowerPlatform/Dataverse/data/_odata.py Fixed

Saurabh Badenkal added 2 commits March 21, 2026 22:31

saurabhrb requested a review from Copilot March 22, 2026 06:30

Copilot started reviewing on behalf of saurabhrb March 22, 2026 06:30

Copilot AI reviewed Mar 22, 2026

saurabhrb approved these changes Mar 31, 2026

Samson Gebre and others added 3 commits April 9, 2026 08:57

fix: update SQL parameter extraction to use URL parsing in query SQL …

aa896b8

…tests

fix: integration tests for SQL examples

f4afbef

saurabhrb requested changes Apr 9, 2026

Comment thread examples/advanced/sql_examples.py Outdated

Comment thread examples/advanced/sql_examples.py Outdated

Copilot started work on behalf of saurabhrb April 9, 2026 18:45

Copilot finished work on behalf of saurabhrb April 9, 2026 18:53

Copilot AI requested a review from saurabhrb April 9, 2026 18:53

saurabhrb approved these changes Apr 9, 2026

dmitmikh reviewed Apr 11, 2026

sagebree and others added 2 commits April 13, 2026 16:43

Merge branch 'main' into copilot/add-public-api-for-metadata

5942c4a

Block SELECT * queries with ValidationError — intentional design deci…

f499a30

…sion to prevent expensive wildcard selects on wide entities

abelmilash-msft reviewed Apr 23, 2026


8 participants
