Batch concat_ws calls to support tables with 100+ columns #80

mason-sharp merged 1 commit into main from
Conversation
PostgreSQL limits functions to 100 arguments. The row-hashing expression
used concat_ws('|', col1, ..., colN), which fails on wide tables. Split
column expressions into groups of 99 with nested concat_ws calls. This
affects both table-diff and merkle tree hash queries.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 Walkthrough

Added batching logic for PostgreSQL concat_ws function calls to handle the 100-argument limit. Introduced a

Changes
🧹 Nitpick comments (3)
db/queries/queries.go (1)
583-599: Document the single-depth nesting assumption and guard against empty input.

Two minor gaps worth addressing:

Single-depth nesting: The outer `concat_ws` itself has no overflow protection: if `len(batches)` ever exceeded 99, it would hit the same PostgreSQL 100-arg limit. In practice this can't happen because PostgreSQL's column limit (~1600) means at most `⌈1600/99⌉ = 17` batches, but the assumption is implicit. A brief comment would prevent confusion if this function is ever reused outside its current context.

Empty-slice precondition: `concatWSBatched([]string{})` emits `concat_ws('|', )`, which is invalid SQL. The only call site (`buildRowTextExpr`) guards this with an early return, but the function itself carries no documented precondition.

♻️ Proposed improvements

```diff
 // concatWSBatched produces a concat_ws('|', ...) expression. When len(exprs)
 // exceeds 99 (the max value-arguments per concat_ws call, since the separator
 // takes one slot), it splits the expressions into batches and nests the calls.
+//
+// Only one level of nesting is applied. This is safe because PostgreSQL's
+// maximum column count (~1600) means at most ceil(1600/99)=17 inner batches,
+// keeping the outer concat_ws well within the 100-argument limit.
+//
+// Precondition: exprs must not be empty.
 func concatWSBatched(exprs []string) string {
 	const maxArgs = 99 // 100 total - 1 for the separator
+	if len(exprs) == 0 {
+		return "''"
+	}
+
 	if len(exprs) <= maxArgs {
 		return fmt.Sprintf("concat_ws('|', %s)", strings.Join(exprs, ", "))
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@db/queries/queries.go` around lines 583 - 599: The concatWSBatched function needs a short doc comment stating it assumes only single-depth nesting (the outer concat_ws is not batched and therefore must not exceed PostgreSQL's ~100-arg limit in this usage) and must guard against an empty input slice; update concatWSBatched to return an empty string immediately when len(exprs) == 0 (or otherwise no-op) to avoid emitting invalid SQL like "concat_ws('|', )", and add the comment near the const maxArgs / batches logic referencing concatWSBatched and batches so future readers understand the single-depth assumption.

db/queries/queries_test.go (2)
457-468: Add expression-presence check to the 200-expression subtest.

Unlike the 100-expression subtest, this case only verifies the number of `concat_ws` calls, not that all 200 expressions appear in the output. A bug that drops the third batch entirely would still pass.

♻️ Proposed addition

```diff
 	if strings.Count(result, "concat_ws(") != 4 {
 		t.Errorf("expected 4 concat_ws calls for 200 exprs, got %d in: %s",
 			strings.Count(result, "concat_ws("), result)
 	}
+	for _, e := range exprs {
+		if !strings.Contains(result, e) {
+			t.Errorf("missing expression %s in result", e)
+		}
+	}
 })
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@db/queries/queries_test.go` around lines 457 - 468: Update the "200 expressions nests into 3 inner batches" subtest to also assert that all expressions are present in the result: after calling concatWSBatched(exprs) iterate the expected expression names (e.g., "col0".."col199") and assert each appears in result (or assert the total occurrences of "col" labels equals 200). Modify the test that calls concatWSBatched to include this presence check so a dropped batch would fail; reference the concatWSBatched call and the local exprs slice when adding the assertions.

449-455: Substring collision weakens the expression-presence check.

Column names `col0`–`col99` include prefix collisions: `"col1"` is a substring of `"col10"`–`"col19"`, so `strings.Contains(result, "col1")` passes even if `col1` is absent but any of `col10`–`col19` is present. The test will not catch a regression where a specific column is silently dropped.

♻️ More precise presence check

```diff
-	// Every expression should be present
-	for _, e := range exprs {
-		if !strings.Contains(result, e) {
-			t.Errorf("missing expression %s in result", e)
-		}
-	}
+	// Every expression should be present — check for the exact token
+	// (comma or parenthesis delimited) to avoid substring false-positives.
+	for _, e := range exprs {
+		// A column appears as ", colN" or "(colN" inside the concat_ws calls.
+		if !strings.Contains(result, ", "+e) && !strings.Contains(result, "("+e) {
+			t.Errorf("missing expression %s in result: %s", e, result)
+		}
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@db/queries/queries_test.go` around lines 449 - 455, The check using strings.Contains(result, e) is vulnerable to substring collisions (e.g., "col1" matching "col10"); update the loop that iterates over exprs and checks result to use a word-boundary regex match instead: compile regexp.MustCompile(`\b` + regexp.QuoteMeta(e) + `\b`) and use its MatchString(result) in place of strings.Contains; ensure you add the regexp import and use exprs/result as before so each column name is matched as a whole token.