Feature/vector abstraction #1
Open
phlealr wants to merge 6 commits into
PR1 of the vector-store refactor.

Introduces a Driver abstraction (src/driver/Driver.ts) and re-wraps the existing OpenSearch logic as the 'opensearch' driver (src/driver/OpensearchDriver.ts) behind a vendor-agnostic plugin shell (src/VectorStore.ts). Adapting OpenSearch was minimal: AwsSigv4Signer setup, save/load/list/remove handlers, and buildQuery preserved as-is.

Caller selects the driver via driver: '<name>' (or the new DriverName enum). Future drivers (pgvector in PR2) need a new file under src/driver/ + one line in the internal registry + one enum entry; the Record<DriverName, ...> shape forces all three to stay in sync at compile time.

Adds:
- docker-compose.yml with pgvector/pgvector:pg16 (for PR2)
- pg + @types/pg deps (other deps unchanged)
- VectorStore.test.ts smoke + utils tests (11 passing)
- Backward-compat alias: options.index continues to work as options.table

Existing OpenSearch tests stay env-var gated against live AWS (skip cleanly without SENECA_OPENSEARCH_TEST_NODE / _INDEX).

CI: matrix narrowed to ubuntu-latest + Node 20/22, with a pgvector service for the upcoming PR2 integration tests.

README is unchanged in this commit and will be rewritten alongside the pgvector driver work.

Package renamed: @seneca/opensearch-store -> @seneca/vector-store.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
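The compile-time coupling between the DriverName enum and the registry can be sketched as below. This is only an illustration of the Record<DriverName, ...> technique: the Driver methods shown, stubDriver, DriverFactory and createDriver are hypothetical names, not the actual contents of src/driver/Driver.ts or src/VectorStore.ts.

```typescript
// Hypothetical sketch: the enum values match the driver names in this PR,
// everything else is illustrative.
enum DriverName {
  OpenSearch = 'opensearch',
  Pgvector = 'pgvector',
}

// Assumed minimal contract; the real Driver interface has more methods.
interface Driver {
  connect(): Promise<void>
  close(): Promise<void>
}

type DriverFactory = (options: Record<string, unknown>) => Driver

// Stub standing in for OpensearchDriver / PgvectorDriver.
function stubDriver(): Driver {
  return {
    connect: async () => {},
    close: async () => {},
  }
}

// Record<DriverName, ...> requires exactly one entry per enum member:
// adding an enum member without a registry line fails to compile,
// and so does a registry line without an enum member.
const registry: Record<DriverName, DriverFactory> = {
  [DriverName.OpenSearch]: () => stubDriver(),
  [DriverName.Pgvector]: () => stubDriver(),
}

function createDriver(
  driverName: string,
  options: Record<string, unknown> = {},
): Driver {
  // Runtime check for callers that pass a plain string such as driver: 'pgvector'.
  if (!Object.prototype.hasOwnProperty.call(registry, driverName)) {
    throw new Error(`unknown driver: ${driverName}`)
  }
  return registry[driverName as DriverName](options)
}
```

Note that the type only ties the enum and the registry together; the driver file itself is pulled in indirectly, because the registry entry has to reference it.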
PgvectorDriver implements the full Driver contract:
- connect/close over a pg Pool
- upsert (INSERT ... ON CONFLICT), get (metadata only, no embedding)
- query: KNN cosine via the <=> operator (score = 1 - distance), equality filters over jsonb, filter-only, and empty paths
- remove / removeQuery (DELETE by id / by filter; the all$ path)

Vectors are encoded with pgvector.toSql; LIMIT is parameterized; table and filter-key identifiers pass a strict regex guard; ids are generated per-driver when the caller omits one.

Plugin: pgvector registered in the driver registry + DriverName enum; save validates the vector dim per canon.

Tests restructured into two layers that match what this plugin actually owns:
- translation (test/VectorStore.test.ts, MockDriver, no DB): verifies the plugin maps the Seneca entity API <-> Driver interface correctly: save strips id/vector into metadata, dim validation, load mapping, list filter partitioning + vector$ directive + custom$.score, remove/removeQuery dispatch.
- driver integration (test/driver/PgvectorDriver.test.ts, real pgvector via docker): SQL/KNN ordering/score range/encoding/delete/identifier guards.

OpenSearch live-AWS tests stay env-gated (skip without credentials).

Stop tracking dist/ (now gitignored; it is a build artifact).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
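For readers unfamiliar with pgvector, the query shape described in this commit could look roughly like the sketch below. It is not the PgvectorDriver code: the column names (id, metadata, embedding), the helper names and the exact regex are assumptions. Only the <=> cosine-distance operator, score = 1 - distance, the parameterized LIMIT and the identifier guard come from the commit message.

```typescript
// Hypothetical helpers; column names id / metadata / embedding are assumed.
const IDENT_RE = /^[A-Za-z_][A-Za-z0-9_]*$/

// Identifiers cannot be sent as bind parameters, so they are validated instead.
function assertIdent(ident: string): string {
  if (!IDENT_RE.test(ident)) throw new Error(`invalid identifier: ${ident}`)
  return ident
}

// pgvector accepts vectors as text literals of the form '[1,2,3]'.
function toVectorLiteral(vector: number[]): string {
  return `[${vector.join(',')}]`
}

interface BuiltQuery {
  text: string
  values: unknown[]
}

function buildKnnQuery(
  table: string,
  vector: number[],
  filters: Record<string, string>,
  limit: number,
): BuiltQuery {
  const values: unknown[] = [toVectorLiteral(vector)]
  const where: string[] = []
  for (const [key, value] of Object.entries(filters)) {
    values.push(value)
    // ->> extracts a jsonb field as text; the key is guarded, the value is a parameter.
    where.push(`metadata ->> '${assertIdent(key)}' = $${values.length}`)
  }
  values.push(limit)
  const text =
    `SELECT id, metadata, 1 - (embedding <=> $1::vector) AS score ` +
    `FROM ${assertIdent(table)} ` +
    (where.length ? `WHERE ${where.join(' AND ')} ` : '') +
    `ORDER BY embedding <=> $1::vector ` +
    `LIMIT $${values.length}`
  return { text, values }
}
```

Ordering by the raw distance (ascending) and reporting 1 - distance as the score means the most similar rows come first with the highest scores. The returned text and values would be passed to a pg query call together.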
Generalize the OpenSearch template README to the vendor-agnostic vector store: same Seneca entity API across drivers, driver selected via the `driver` option. Documents the driver registry (opensearch, pgvector), vector save / KNN similarity (directive$.vector$) / equality filters / remove, pgvector SQL setup, per-driver differences, and the docker-compose test workflow. Keeps the original README's structure and tone.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The plugin opened the DB connection from a seneca-promisify `prepare`
hook, which caused the store's `close` cmd to fire spuriously during
plugin init — tearing the pool back down so every save$/load$/list$
threw "connect() has not been called". The mock-driver tests missed it
(MockDriver methods don't need a live connection) and the pgvector
tests bypassed Seneca entirely.
Open the connection from the canonical store init action instead
(`seneca.add({init: store.name, tag: meta.tag}, ...)`), matching
seneca-postgres-store: Seneca runs it during ready(), before routing
store messages, so the connection is live before the first op. The
store `close` cmd ends the pool.
Also:
- Driver.remove/removeQuery are now required (every driver implements
them); drop the "driver does not support remove" guards and the
MockDriverNoRemove test scaffolding.
- Add test/VectorStorePg.test.ts: exercises the abstraction (the Seneca
entity API) against a real pgvector backend, gated on
SENECA_VECTOR_PG_URL. This is what the task asked for and what was
missing — it now passes (and originally exposed the lifecycle bug).
52 tests pass with SENECA_VECTOR_PG_URL set; 25 pass + the rest skip
cleanly without it.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
senecajs/todo-to-issue-action@master fails on every push (urllib3 TypeError in the action's Docker image; stale inputs). Disable the workflow rather than run a broken action: manual-only trigger + an always-skipped job, with a header comment explaining the breakage and how to re-enable via the maintained upstream alstr/todo-to-issue-action@v5.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Notes: I tried to handle Postgres the same way store-postgres does; I'm not sure it was the best approach.
I added two new options alongside the existing OpenSearch ones: pg, which receives the connection credentials, and driver, which lets the caller choose which driver to use.
I made some updates to the yml files, so let me know if I shouldn't have.
What's in here
translates Seneca entity messages into driver calls and maps results back into entity shape.
AND-combined with similarity, parameterised SQL with an identifier guard against injection. Schema is not managed by the plugin (documented
DDL).
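The translation step mentioned above (entity data in, driver record out, and back) might be sketched as follows. The record shape, the field name vector and both function names are hypothetical; the real Driver contract and the plugin's handling of entity metadata may differ.

```typescript
// Hypothetical shapes, for illustration only.
interface DriverRecord {
  id: string | undefined
  vector: number[]
  metadata: Record<string, unknown>
}

// Save path: split id and vector out of the entity data; the rest becomes metadata.
function toDriverRecord(
  entityData: Record<string, unknown>,
  dim: number,
): DriverRecord {
  const { id, vector, ...metadata } = entityData
  // Dimension check before anything reaches the backend.
  if (!Array.isArray(vector) || vector.length !== dim) {
    throw new Error(`vector must be an array of length ${dim}`)
  }
  return {
    id: typeof id === 'string' ? id : undefined,
    vector: vector as number[],
    metadata,
  }
}

// Load path: fold a stored row back into flat entity data (no embedding returned).
function toEntityData(row: {
  id: string
  metadata: Record<string, unknown>
}): Record<string, unknown> {
  return { ...row.metadata, id: row.id }
}
```

For example, toDriverRecord({ id: 'a1', vector: [1, 2, 3], title: 'hello' }, 3) yields a record whose metadata is { title: 'hello' }, and a vector of the wrong length is rejected before the driver is called.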
Connection lifecycle fix
The plugin originally opened the DB connection from a seneca-promisify prepare hook. That caused the store's close cmd to fire spuriously
during plugin init, tearing the pool back down — so every save$/load$/list$ threw connect() has not been called. The mock-driver tests
didn't catch it (mock methods don't need a live connection) and the pgvector tests bypassed Seneca.
Now the connection is opened from the canonical store init action (seneca.add({ init: store.name, tag: meta.tag }, …)), exactly like
seneca-postgres-store: Seneca runs it during ready(), before routing store messages, so the connection is live before the first op; the
store close cmd ends the pool.
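A rough sketch of the registration pattern described above, using minimal stand-in types so it is self-contained. SenecaLike, InitAction, ConnectableDriver and registerInit are hypothetical and are not Seneca's real typings; only the { init: store.name, tag: meta.tag } pattern is taken from this PR.

```typescript
// Stand-in types, NOT the real Seneca API surface.
type InitAction = (
  msg: Record<string, unknown>,
  reply: (err?: Error) => void,
) => void

interface SenecaLike {
  add(pattern: Record<string, unknown>, action: InitAction): void
}

interface ConnectableDriver {
  connect(): Promise<void>
  close(): Promise<void>
}

// Register the store init action. Nothing connects at registration time;
// the connection is opened only when the init message is delivered.
function registerInit(
  seneca: SenecaLike,
  storeName: string,
  tag: string | undefined,
  driver: ConnectableDriver,
): void {
  seneca.add({ init: storeName, tag }, (_msg, reply) => {
    driver.connect().then(
      () => reply(),
      (err: unknown) =>
        reply(err instanceof Error ? err : new Error(String(err))),
    )
  })
}
```

The point of the sketch is the ordering: connect runs inside the init action and reports completion through reply, so the framework can wait for it before routing store messages.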
Testing
Two-level suite:
A local pgvector is provided via Docker Compose:
docker compose up -d
export SENECA_VECTOR_PG_URL=postgres://postgres:postgres@localhost:5432/postgres
npm test
The translation tests run with no backend; the pgvector tests skip cleanly unless SENECA_VECTOR_PG_URL is set. 52 tests pass with the DB;
25 pass + the rest skip without it.