Skip to content

Reindexer: Mock reindex function to reduce parallel contention#1210

Open
brandur wants to merge 1 commit intomasterfrom
brandur-mock-reindex
Open

Reindexer: Mock reindex function to reduce parallel contention#1210
brandur wants to merge 1 commit intomasterfrom
brandur-mock-reindex

Conversation

@brandur
Copy link
Copy Markdown
Contributor

@brandur brandur commented Apr 13, 2026

This one's aimed at fixing an intermittently failing test case:

https://github.com/riverqueue/river/actions/runs/24322049420/job/71009998068?pr=1208

--- FAIL: TestReindexer (0.00s)
    --- FAIL: TestReindexer/ReindexesConfiguredIndexes (10.07s)
        reindexer_test.go:219: Reusing idle postgres schema "maintenance_2026_04_13t01_54_14_schema_04" [user facing: "maintenance_2026_04_13t01_54_14_schema_04"] after cleaning in 26.436456ms [4 generated] [7 reused]
        test_signal.go:95: timed out waiting on test signal after 10s
        logger.go:256: time=2026-04-13T01:54:27.581Z level=INFO msg="maintenance.Reindexer: Signaled to stop during index build; attempting to clean up concurrent artifacts"
        riverdbtest.go:293: Checked in postgres schema "maintenance_2026_04_13t01_54_14_schema_04"; 1 idle schema(s) [4 generated] [10 reused]
    --- FAIL: TestReindexer/ReindexesMinimalSubsetofIndexes (10.14s)
        reindexer_test.go:183: Reusing idle postgres schema "maintenance_2026_04_13t01_54_14_schema_01" [user facing: "maintenance_2026_04_13t01_54_14_schema_01"] after cleaning in 28.042877ms [4 generated] [10 reused]
        test_signal.go:95: timed out waiting on test signal after 10s
        reindexer_test.go:211:
                Error Trace:	/home/runner/work/river/river/internal/maintenance/reindexer_test.go:211
                Error:      	Should be false
                Test:       	TestReindexer/ReindexesMinimalSubsetofIndexes
        logger.go:256: time=2026-04-13T01:54:28.444Z level=INFO msg="maintenance.Reindexer: Signaled to stop during index build; attempting to clean up concurrent artifacts"
        riverdbtest.go:293: Checked in postgres schema "maintenance_2026_04_13t01_54_14_schema_01"; 1 idle schema(s) [5 generated] [24 reused]
FAIL
FAIL	github.com/riverqueue/river/internal/maintenance	18.764s

I'm diagnosing with Claude's help here, but what appears to be happening
is that although a reindex operation in Postgres is often fast, it is
still a heavy operation, and can slow down even further when there's a
lot of concurrent activity hammering a database.

Many reindexer test cases run in parallel, and it appears that was
happening here is that we got a reindex that exceeded our maximum
timeout of 10x. We have some evidence this during the test run from the
runtime of 10.07s and the line:

Signaled to stop during index build; attempting to clean up concurrent artifacts

Here, I'm putting forward a solution proposed by Claude, which is to
mock out the reindex operation, especially where we have a number of
reindexes going in parallel. The tests ReindexOneSuccess and
ReindexSkippedWithReindexArtifact still put load on the real reindex
operation, so we're not going full mock here, and should see our
intermittency considerably reduced while still being confident that
everything still works.

This one's aimed at fixing an intermittently failing test case:

https://github.com/riverqueue/river/actions/runs/24322049420/job/71009998068?pr=1208

    --- FAIL: TestReindexer (0.00s)
        --- FAIL: TestReindexer/ReindexesConfiguredIndexes (10.07s)
            reindexer_test.go:219: Reusing idle postgres schema "maintenance_2026_04_13t01_54_14_schema_04" [user facing: "maintenance_2026_04_13t01_54_14_schema_04"] after cleaning in 26.436456ms [4 generated] [7 reused]
            test_signal.go:95: timed out waiting on test signal after 10s
            logger.go:256: time=2026-04-13T01:54:27.581Z level=INFO msg="maintenance.Reindexer: Signaled to stop during index build; attempting to clean up concurrent artifacts"
            riverdbtest.go:293: Checked in postgres schema "maintenance_2026_04_13t01_54_14_schema_04"; 1 idle schema(s) [4 generated] [10 reused]
        --- FAIL: TestReindexer/ReindexesMinimalSubsetofIndexes (10.14s)
            reindexer_test.go:183: Reusing idle postgres schema "maintenance_2026_04_13t01_54_14_schema_01" [user facing: "maintenance_2026_04_13t01_54_14_schema_01"] after cleaning in 28.042877ms [4 generated] [10 reused]
            test_signal.go:95: timed out waiting on test signal after 10s
            reindexer_test.go:211:
                    Error Trace:	/home/runner/work/river/river/internal/maintenance/reindexer_test.go:211
                    Error:      	Should be false
                    Test:       	TestReindexer/ReindexesMinimalSubsetofIndexes
            logger.go:256: time=2026-04-13T01:54:28.444Z level=INFO msg="maintenance.Reindexer: Signaled to stop during index build; attempting to clean up concurrent artifacts"
            riverdbtest.go:293: Checked in postgres schema "maintenance_2026_04_13t01_54_14_schema_01"; 1 idle schema(s) [5 generated] [24 reused]
    FAIL
    FAIL	github.com/riverqueue/river/internal/maintenance	18.764s

I'm diagnosing with Claude's help here, but what appears to be happening
is that although a reindex operation in Postgres is often fast, it is
still a heavy operation, and can slow down even further when there's a
lot of concurrent activity hammering a database.

Many reindexer test cases run in parallel, and it appears that was
happening here is that we got a reindex that exceeded our maximum
timeout of 10x. We have some evidence this during the test run from the
runtime of 10.07s and the line:

    Signaled to stop during index build; attempting to clean up concurrent artifacts

Here, I'm putting forward a solution proposed by Claude, which is to
mock out the reindex operation, especially where we have a number of
reindexes going in parallel. The tests `ReindexOneSuccess` and
`ReindexSkippedWithReindexArtifact` still put load on the real reindex
operation, so we're not going full mock here, and should see our
intermittency considerably reduced while still being confident that
everything still works.
@brandur brandur force-pushed the brandur-mock-reindex branch from 74790ee to 9bb0e45 Compare April 13, 2026 05:25
@brandur brandur requested a review from bgentry April 13, 2026 05:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant