[PECOBLR-1383] Add statement execution hooks for telemetry collection by samikshya-db · Pull Request #321 · databricks/databricks-sql-go

samikshya-db · 2026-01-30T10:33:19Z

Summary

This stacked PR builds on #320 and adds statement execution hooks to complete end-to-end telemetry collection.

Stack: Part 3 of 3

Base: [PECOBLR-1143] Implement telemetry Phase 4-5: Export infrastructure and opt-in configuration #319 (PECOBLR-1143 - Phases 4-5)
Previous: [PECOBLR-1381][PECOBLR-1382] Implement telemetry Phase 6-7: Collection, Aggregation & Driver Integration #320 (PECOBLR-1381-1382 - Phases 6-7)
This PR: PECOBLR-1383 (Statement execution hooks)

Changes

Exported Methods for Driver Integration

telemetry/interceptor.go

✅ Exported BeforeExecute() - starts metric tracking for a statement
✅ Exported AfterExecute() - records metric with timing and error info
✅ Exported AddTag() - adds tags to current metric context
✅ Exported CompleteStatement() - marks statement complete and flushes

Statement Execution Hooks

connection.go

✅ Added hooks to QueryContext():
- Calls BeforeExecute() with statement ID from operation handle GUID
- Uses defer to call AfterExecute() and CompleteStatement()
✅ Added hooks to ExecContext():
- Calls BeforeExecute() with statement ID
- Proper error handling (includes stagingErr)
- Uses defer to call AfterExecute() and CompleteStatement()

Documentation

telemetry/DESIGN.md

✅ Updated Phase 6 to mark as completed
✅ Added statement execution hooks to Phase 7 checklist

Integration Flow

Connection.QueryContext()
    ↓
BeforeExecute(statementID) → creates metricContext with startTime
    ↓
[Statement Execution]
    ↓
AfterExecute(err) → records metric with latency and error
    ↓
CompleteStatement(statementID, failed) → flushes aggregated metrics

Testing

All tests passing ✅

✅ 99 telemetry tests (2.018s)
✅ All driver tests (58.576s)
✅ No breaking changes
✅ Telemetry properly disabled when not configured

End-to-End Telemetry

With this PR, the telemetry system is fully functional end-to-end:

✅ Collection - Metrics collected from QueryContext/ExecContext
✅ Aggregation - Statement-level aggregation with batching
✅ Circuit Breaker - Protection against failing endpoints
✅ Export - HTTP POST with retry and exponential backoff
✅ Feature Flags - Server-side control with 5-level priority
✅ Resource Management - Per-host clients with reference counting

Related Issues

Builds on: [PECOBLR-1381][PECOBLR-1382] Implement telemetry Phase 6-7: Collection, Aggregation & Driver Integration #320 (PECOBLR-1381-1382)
Implements: PECOBLR-1383 (Statement execution hooks) ✅

Checklist

This commit completes the telemetry implementation by adding hooks to QueryContext and ExecContext methods to collect actual metrics. Changes: - Export BeforeExecute(), AfterExecute(), CompleteStatement() methods in telemetry.Interceptor for use by driver package - Add telemetry hooks to connection.QueryContext(): - Call BeforeExecute() with statement ID from operation handle GUID - Use defer to call AfterExecute() and CompleteStatement() - Add telemetry hooks to connection.ExecContext(): - Call BeforeExecute() with statement ID from operation handle GUID - Use defer to call AfterExecute() and CompleteStatement() - Handle both err and stagingErr for proper error reporting - Update DESIGN.md: - Mark Phase 6 as completed (all checklist items) - Add statement execution hooks to Phase 7 checklist Testing: - All 99 telemetry tests passing - All driver tests passing (58.576s) - No breaking changes to existing functionality This enables end-to-end telemetry collection from statement execution through aggregation and export to the Databricks telemetry service. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

These functions/types are now used by the exported BeforeExecute, AfterExecute, and CompleteStatement methods wired into connection.go, so the unused suppression directives are no longer needed. Co-authored-by: samikshya-chand_data

vikrantpuppala

let's add some tests, thanks

vikrantpuppala · 2026-04-05T13:04:07Z

connection.go

+	var statementID string
+	if c.telemetry != nil && exStmtResp != nil && exStmtResp.OperationHandle != nil && exStmtResp.OperationHandle.OperationId != nil {
+		statementID = client.SprintGuid(exStmtResp.OperationHandle.OperationId.GUID)
+		ctx = c.telemetry.BeforeExecute(ctx, statementID)


shouldn't this be before runQuery?

we should write some correctness tests. how do we know the values telemetry is sending are even correct (so that we avoid bugs like these)

Thanks for the review @vikrantpuppala

The follow up PR #322 already covers a lot of mock based tests on telemetry. And, it also addresses the runQuery issue in QueryContext, I will add more of these correctness tests in #322 PR.

can we fix the fact that ctx = c.telemetry.BeforeExecute(ctx, statementID) needs to be called before runQuery?

Accidentally deleted my previous comment inplace of editing it :

Both ExecContext and QueryContext now capture the complete query execution time in telemetry
for both PRs.

vikrantpuppala · 2026-04-05T13:08:11Z

connection.go

+			}
+			c.telemetry.AfterExecute(ctx, finalErr)
+			c.telemetry.CompleteStatement(ctx, statementID, finalErr != nil)
+		}()


if the CloseOperation call fails below, that error is logged but never reflected in telemetry so we're missing capturing some errors in telemetry. Not sure if that's intended?

Good point, fixing this.

connection.go

vikrantpuppala · 2026-04-05T13:13:14Z

telemetry/aggregator.go


 	case "error":
 		// Check if terminal error
 		if metric.errorType != "" && isTerminalError(&simpleError{msg: metric.errorType}) {


why are we using simpleError here?

This is the most light-weight string mapping of the error in this (potentially) hot-path. Let me know if you have any specific suggestion though.

@vikrantpuppala

… CloseOperation error tracking This commit addresses two key review comments from @vikrantpuppala on PR #321: 1. **ExecContext timing fix**: Capture execution start time BEFORE running query - Now captures `executeStart := time.Now()` before `runQuery()` call - Uses `BeforeExecuteWithTime()` with pre-captured timestamp - Matches the pattern already implemented in QueryContext - Ensures telemetry accurately measures actual query execution time 2. **CloseOperation error tracking**: Capture cleanup errors in telemetry - Added `closeOpErr` variable to track CloseOperation failures - Includes CloseOperation errors in telemetry's deferred function - Provides observability for resource cleanup issues - Operation still returns success to caller (cleanup is best-effort) These changes ensure telemetry captures the complete statement lifecycle, including both execution timing and cleanup operations, without impacting the caller's error handling semantics. Co-authored-by: Isaac

@vikrantpuppala

This addresses @vikrantpuppala's review comment: "if the CloseOperation call fails below, that error is logged but never reflected in telemetry so we're missing capturing some errors in telemetry" Changes: - Added closeOpErr variable to capture CloseOperation failures - Include CloseOperation errors in telemetry's deferred function - Provides observability for resource cleanup issues - Operation still returns success to caller (cleanup is best-effort) Note: The timing fix ("shouldn't this be before runQuery?") will be addressed in the follow-up PR once BeforeExecuteWithTime infrastructure is available.

@vikrantpuppala

… CloseOperation error tracking This commit addresses two key review comments from @vikrantpuppala on PR #321: 1. **ExecContext timing fix**: Capture execution start time BEFORE running query - Now captures `executeStart := time.Now()` before `runQuery()` call - Uses `BeforeExecuteWithTime()` with pre-captured timestamp - Matches the pattern already implemented in QueryContext - Ensures telemetry accurately measures actual query execution time 2. **CloseOperation error tracking**: Capture cleanup errors in telemetry - Added `closeOpErr` variable to track CloseOperation failures - Includes CloseOperation errors in telemetry's deferred function - Provides observability for resource cleanup issues - Operation still returns success to caller (cleanup is best-effort) These changes ensure telemetry captures the complete statement lifecycle, including both execution timing and cleanup operations, without impacting the caller's error handling semantics. Co-authored-by: Isaac

@vikrantpuppala

…cution This addresses the second part of @vikrantpuppala's review comment: "shouldn't this be before runQuery?" Changes: - Capture executeStart = time.Now() BEFORE calling runQuery() - Use BeforeExecuteWithTime() with the pre-captured timestamp - Ensures telemetry measures actual query execution time accurately Without this fix, telemetry would miss ~100-1000μs of execution time (the time between query start and getting the operation handle). Now ExecContext matches the pattern already implemented in QueryContext.

- Add sessionID field to metricContext struct - Update BeforeExecute to accept sessionID parameter - Add BeforeExecuteWithTime method for custom start times - Update connection.go to pass sessionID in BeforeExecute call This enables proper session tracking in telemetry and allows capturing accurate execution times by providing a custom start time.

Capture execution start time before runQuery and use BeforeExecuteWithTime to ensure telemetry accurately reflects actual query execution time. This completes the timing fix for both ExecContext and QueryContext.

samikshya-db mentioned this pull request Jan 30, 2026

[PECOBLR-1384] Complete telemetry implementation: Phases 8-10 #322

Open

16 tasks

samikshya-db force-pushed the stack/PECOBLR-1381-1382-telemetry-phase6-7 branch from db32fa3 to f388244 Compare February 5, 2026 07:33

samikshya-db force-pushed the stack/PECOBLR-1383-telemetry-execution-hooks branch from f1c4641 to a5ed499 Compare February 5, 2026 07:33

samikshya-db requested review from jadewang-db and sreekanth-db February 5, 2026 10:01

samikshya-db force-pushed the stack/PECOBLR-1383-telemetry-execution-hooks branch from a5ed499 to 36e8f99 Compare March 18, 2026 13:03

samikshya-db force-pushed the stack/PECOBLR-1383-telemetry-execution-hooks branch 2 times, most recently from ef71fd9 to 3b9923d Compare April 2, 2026 11:13

Base automatically changed from stack/PECOBLR-1381-1382-telemetry-phase6-7 to main April 2, 2026 11:19

samikshya-db force-pushed the stack/PECOBLR-1383-telemetry-execution-hooks branch from 3b9923d to 8fe174d Compare April 2, 2026 11:36

Remove stale nolint:unused directives

0af46e7

These functions/types are now used by the exported BeforeExecute, AfterExecute, and CompleteStatement methods wired into connection.go, so the unused suppression directives are no longer needed. Co-authored-by: samikshya-chand_data

samikshya-db requested a review from vikrantpuppala April 2, 2026 12:55

vikrantpuppala reviewed Apr 6, 2026

View reviewed changes

Merge branch 'main' into stack/PECOBLR-1383-telemetry-execution-hooks

4e38383

samikshya-db requested a review from vikrantpuppala April 6, 2026 11:41

samikshya-db added 3 commits April 6, 2026 11:53

Fix QueryContext telemetry timing issue

3cbc0ae

Capture execution start time before runQuery and use BeforeExecuteWithTime to ensure telemetry accurately reflects actual query execution time. This completes the timing fix for both ExecContext and QueryContext.

vikrantpuppala approved these changes Apr 6, 2026

View reviewed changes

samikshya-db merged commit 6dd935f into main Apr 6, 2026
2 of 3 checks passed

samikshya-db deleted the stack/PECOBLR-1383-telemetry-execution-hooks branch April 6, 2026 12:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[PECOBLR-1383] Add statement execution hooks for telemetry collection#321

[PECOBLR-1383] Add statement execution hooks for telemetry collection#321
samikshya-db merged 7 commits intomainfrom
stack/PECOBLR-1383-telemetry-execution-hooks

samikshya-db commented Jan 30, 2026

Uh oh!

vikrantpuppala left a comment

Uh oh!

vikrantpuppala Apr 5, 2026

Uh oh!

vikrantpuppala Apr 5, 2026

Uh oh!

samikshya-db Apr 6, 2026

Uh oh!

vikrantpuppala Apr 6, 2026

Uh oh!

samikshya-db Apr 6, 2026

Uh oh!

vikrantpuppala Apr 5, 2026

Uh oh!

samikshya-db Apr 6, 2026

Uh oh!

Uh oh!

vikrantpuppala Apr 5, 2026

Uh oh!

samikshya-db Apr 6, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

samikshya-db commented Jan 30, 2026

Summary

Changes

Exported Methods for Driver Integration

Statement Execution Hooks

Documentation

Integration Flow

Testing

End-to-End Telemetry

Related Issues

Checklist

Uh oh!

vikrantpuppala left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants