Skip to content

doc: eql indexing walkthrough#167

Open
tobyhede wants to merge 2 commits intomainfrom
walkthrough
Open

doc: eql indexing walkthrough#167
tobyhede wants to merge 2 commits intomainfrom
walkthrough

Conversation

@tobyhede
Copy link
Contributor

@tobyhede tobyhede commented Mar 3, 2026

Summary by CodeRabbit

  • Documentation
    • Added a comprehensive walkthrough on PostgreSQL indexing in EQL: architecture and query execution flows; index term types and ranking; operator-class and indexing integration; hashing/equality and determinism considerations; STE vector and GIN handling; cryptographic notes and end-to-end examples for equality, range, LIKE, and containment queries.

@coderabbitai
Copy link

coderabbitai bot commented Mar 3, 2026

📝 Walkthrough

Walkthrough

Added a new Markdown walkthrough, "PostgreSQL Indexing in EQL: Complete Walkthrough," documenting index term embedding for encrypted JSONB payloads, operator classes and definitions, compare/hash functions, GIN containment, STE vector handling, index term types, and end-to-end query execution flows for equality, range, LIKE, and containment queries.

Changes

Cohort / File(s) Summary
Documentation
INDEX_WALKTHROUGH.md
New comprehensive walkthrough describing PostgreSQL indexing architecture for encrypted JSONB in EQL, operator class and operator definitions, compare/hash function contracts, index term types (b3, hm, bf, ob, ocf, ocv), STE vector handling, GIN containment via jsonb_array, and detailed query execution flows for equality, range, LIKE, and containment queries.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

🐇 I nibble on index lines so neat,

encrypting carrots in JSONB sweet.
Operators hop and functions play,
Queries find crumbs along the way.
A little rabbit cheers the new guide today.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'doc: eql indexing walkthrough' directly and clearly describes the main change—adding comprehensive documentation for EQL indexing.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch walkthrough

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
INDEX_WALKTHROUGH.md (2)

480-481: Consider tracking the IMMUTABLE declaration for operator wrapper functions.

The documentation notes that operator wrapper functions (eql_v2."<", eql_v2.">", etc.) could be declared IMMUTABLE since they only call IMMUTABLE functions. While the current implementation relies on PostgreSQL defaults, explicitly declaring these functions as IMMUTABLE might enable better query optimization.

Would you like me to open an issue to track this potential enhancement?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@INDEX_WALKTHROUGH.md` around lines 480 - 481, The operator wrapper functions
(eql_v2."<", eql_v2.">", etc.) and ordering helpers (eql_v2.lt, eql_v2.gt,
eql_v2.lte, eql_v2.gte) should be explicitly marked IMMUTABLE in their CREATE
FUNCTION declarations; update the SQL/DDL files where those functions are
defined to add the IMMUTABLE attribute to each corresponding CREATE FUNCTION
statement so PostgreSQL can use stronger optimizations, and run any relevant
test or migration steps to ensure no behavioral change.

7-524: Consider adding language specifiers to fenced code blocks.

Multiple fenced code blocks throughout the document lack language specifiers. For ASCII diagrams, consider using text or leaving them unlabeled with explicit comments. For pseudo-code sections (compare function flow, hash function flow, etc.), consider using text or pseudocode identifiers.

Example:

  • Line 7: ASCII diagram → ```text
  • Line 141: Compare function flow → ```text
  • Line 189: Hash function flow → ```text

This improves markdown rendering consistency and addresses the markdownlint warnings.

Note: The LanguageTool grammar hint about "btree" at line 73 is a false positive—lowercase "btree" is the correct PostgreSQL terminology for the B-tree access method.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@INDEX_WALKTHROUGH.md` around lines 7 - 524, Summary: Several fenced code
blocks in INDEX_WALKTHROUGH.md lack language specifiers causing markdownlint
warnings; add appropriate specifiers. Fix: locate the top ASCII diagram and all
pseudo-code blocks (e.g., the initial ASCII diagram, the "Compare function —
Heart of B-tree Indexing" flow, the "Hash function — Hash/Equality Contract"
flow, and other code-like blocks) and add language hints such as ```text for
ASCII/art and ```pseudocode or ```text for flow/pseudo-code; also add ```sql to
SQL snippets and ```text to JSON/diagram blocks where appropriate (notably
sections describing eql_v2.compare, eql_v2.hash_encrypted, eql_v2.jsonb_array,
and the tables/diagrams), keeping "btree" lowercase as-is.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@INDEX_WALKTHROUGH.md`:
- Around line 480-481: The operator wrapper functions (eql_v2."<", eql_v2.">",
etc.) and ordering helpers (eql_v2.lt, eql_v2.gt, eql_v2.lte, eql_v2.gte) should
be explicitly marked IMMUTABLE in their CREATE FUNCTION declarations; update the
SQL/DDL files where those functions are defined to add the IMMUTABLE attribute
to each corresponding CREATE FUNCTION statement so PostgreSQL can use stronger
optimizations, and run any relevant test or migration steps to ensure no
behavioral change.
- Around line 7-524: Summary: Several fenced code blocks in INDEX_WALKTHROUGH.md
lack language specifiers causing markdownlint warnings; add appropriate
specifiers. Fix: locate the top ASCII diagram and all pseudo-code blocks (e.g.,
the initial ASCII diagram, the "Compare function — Heart of B-tree Indexing"
flow, the "Hash function — Hash/Equality Contract" flow, and other code-like
blocks) and add language hints such as ```text for ASCII/art and ```pseudocode
or ```text for flow/pseudo-code; also add ```sql to SQL snippets and ```text to
JSON/diagram blocks where appropriate (notably sections describing
eql_v2.compare, eql_v2.hash_encrypted, eql_v2.jsonb_array, and the
tables/diagrams), keeping "btree" lowercase as-is.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8611719 and b08d85a.

📒 Files selected for processing (1)
  • INDEX_WALKTHROUGH.md

Dual-verification review found 22 issues across two independent reviewers.
Fixes applied:

Accuracy fixes:
- Fix operator table: < uses compare() = -1, > uses compare() = 1
- Fix ORE block algorithm: get_bit()/get_byte(), not bit()/byte()
- Qualify IMMUTABLE/STRICT/PARALLEL SAFE scope (not all functions)
- Fix version field type ("2" string → 2 integer)
- Fix "XOR" → modular addition for ORE indicator computation

Completeness fixes:
- Add new Section 2: Supported Index Types (btree, hash, GIN)
- Add missing <@ operator to operator table with commutator
- Add MERGES documentation note for all operators
- Add JOIN selectivity (scalarltjoinsel/eqjoinsel) to operator table
- Add <@ operator description in STE vector section
- Add ocf/ocv to Section 11.1 equality flow diagram
- Add ob exclusion explanation for GIN jsonb_array()
- Add ORE u64_8_256 naming convention explanation
- Add STE vector nesting clarification note

Clarity improvements:
- Simplify GIN section: lead with practical usage, push internals to subsection
- Fix Section 6 compare call chain (term extraction steps)
- Improve CLLW comparison description (cryptographic, not plain byte compare)
- Rephrase hash/equality contract explanation (broadest common denominator)
- Renumber all sections 1-14 sequentially
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
INDEX_WALKTHROUGH.md (1)

7-7: Add language tags to fenced code blocks for lint compliance.

Multiple fenced blocks are missing a language identifier (MD040). Please annotate them (text, sql, etc.) so docs tooling passes consistently.

Proposed fix pattern
-```
+```text
  ┌─────────────────────────────────────────────────────────────────┐
  ...
-```
+```

Also applies to: 161-161, 199-199, 247-247, 263-263, 282-282, 336-336, 407-407, 423-423, 433-433, 466-466, 480-480, 498-498, 548-548

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@INDEX_WALKTHROUGH.md` at line 7, Add explicit language tags to each fenced
code block in INDEX_WALKTHROUGH.md (e.g., replace ``` with ```text or ```sql as
appropriate) to satisfy MD040; apply this change to all occurrences referenced
in the review (including the blocks listed) by updating the opening fence to
include the correct language identifier while leaving the block content
unchanged, following the proposed fix pattern (use ```text for plain
ASCII/diagram blocks and more specific tags where content indicates a language).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@INDEX_WALKTHROUGH.md`:
- Around line 429-431: The subsection numbering under the "## 11. Query
Execution Flow — End to End" section is inconsistent (subsections are labeled
"### 12.1", "### 12.2", "### 12.3", then "### 11.4"); update the subsection
headings so they use a consistent "11.x" sequence (e.g., change "### 12.1
Equality Query (`WHERE col = ?`)" and the other "12.x" headings to "### 11.1",
"### 11.2", "### 11.3", and ensure the last one becomes "### 11.4"), locate and
fix the other occurrences by searching for the subsection headings (the visible
strings like "Equality Query (`WHERE col = ?`)" and the following subsection
titles) at the noted nearby blocks (also apply the same renumbering to the
headings around lines referenced in the comment) so the entire "Query Execution
Flow" section uses consistent 11.x numbering.
- Line 568: Update the summary diagram to correct the typo: replace the
incorrect function name "hash_encryp()" with the correct "hash_encrypted()"
wherever it appears (e.g., in the diagram line containing "compare() │ │
hash_encryp() │ │ → jsonb[] @>"). Ensure the name matches other documentation
references to the hash_encrypted() function for consistency.

---

Nitpick comments:
In `@INDEX_WALKTHROUGH.md`:
- Line 7: Add explicit language tags to each fenced code block in
INDEX_WALKTHROUGH.md (e.g., replace ``` with ```text or ```sql as appropriate)
to satisfy MD040; apply this change to all occurrences referenced in the review
(including the blocks listed) by updating the opening fence to include the
correct language identifier while leaving the block content unchanged, following
the proposed fix pattern (use ```text for plain ASCII/diagram blocks and more
specific tags where content indicates a language).

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b08d85a and f0b6dad.

📒 Files selected for processing (1)
  • INDEX_WALKTHROUGH.md

Comment on lines +429 to +431
## 11. Query Execution Flow — End to End

### 12.1 Equality Query (`WHERE col = ?`)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix subsection numbering under Query Execution Flow.

The subsection sequence is inconsistent (12.1, 12.2, 12.3, then 11.4) under section 11, which makes the walkthrough harder to follow.

Suggested numbering correction
-### 12.1 Equality Query (`WHERE col = ?`)
+### 11.1 Equality Query (`WHERE col = ?`)

-### 12.2 Range Query (`WHERE col > ? AND col < ?`)
+### 11.2 Range Query (`WHERE col > ? AND col < ?`)

-### 12.3 LIKE Query (`WHERE col ~~ ?`)
+### 11.3 LIKE Query (`WHERE col ~~ ?`)

-### 11.4 GIN Containment Query
+### 11.4 GIN Containment Query

Also applies to: 464-466, 478-480, 496-496

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@INDEX_WALKTHROUGH.md` around lines 429 - 431, The subsection numbering under
the "## 11. Query Execution Flow — End to End" section is inconsistent
(subsections are labeled "### 12.1", "### 12.2", "### 12.3", then "### 11.4");
update the subsection headings so they use a consistent "11.x" sequence (e.g.,
change "### 12.1 Equality Query (`WHERE col = ?`)" and the other "12.x" headings
to "### 11.1", "### 11.2", "### 11.3", and ensure the last one becomes "###
11.4"), locate and fix the other occurrences by searching for the subsection
headings (the visible strings like "Equality Query (`WHERE col = ?`)" and the
following subsection titles) at the noted nearby blocks (also apply the same
renumbering to the headings around lines referenced in the comment) so the
entire "Query Execution Flow" section uses consistent 11.x numbering.

┌────────▼────────┐ ┌───────▼───────┐ ┌───────▼────────┐
│ btree opclass │ │ hash opclass │ │ GIN (expr) │
│ FUNCTION 1 │ │ FUNCTION 1 │ │ jsonb_array() │
│ compare() │ │ hash_encryp() │ │ → jsonb[] @> │
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Correct function name typo in summary diagram.

hash_encryp() appears to be a typo; elsewhere the function is documented as hash_encrypted().

Proposed fix
-     │  compare()      │ │ hash_encryp() │  │  → jsonb[] @>  │
+     │  compare()      │ │ hash_encrypted() │  │  → jsonb[] @>  │
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
│ compare() │ │ hash_encryp() │ │ → jsonb[] @> │
│ compare() │ │ hash_encrypted() │ │ → jsonb[] @> │
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@INDEX_WALKTHROUGH.md` at line 568, Update the summary diagram to correct the
typo: replace the incorrect function name "hash_encryp()" with the correct
"hash_encrypted()" wherever it appears (e.g., in the diagram line containing
"compare() │ │ hash_encryp() │ │ → jsonb[] @>"). Ensure the name matches other
documentation references to the hash_encrypted() function for consistency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant