Issues with Persisting and Retrieving Keyword Index in PostgreSQL #14348
Replies: 3 comments 1 reply
Hey there, @kwyszynski-cervello! I'm here to help you out with any bugs, questions, or contributions. Let's tackle this issue together! 🤖

To ensure that keyword indexes are properly stored and retrieved across sessions in LlamaIndex version 0.10.42 when using a PostgreSQL database, you need to implement a persistent storage mechanism for the keyword table. Here is an example of how you can achieve this:
```python
import json

import psycopg2

# Assumed import path for LlamaIndex 0.10.x; adjust for your version.
from llama_index.core.data_structs import KeywordTable


def save_keyword_table_to_postgres(keyword_table: KeywordTable, connection_string: str):
    conn = psycopg2.connect(connection_string)
    cursor = conn.cursor()
    # KeywordTable.table maps keywords to *sets* of node ids; sets are not
    # JSON-serializable, so convert each set to a sorted list first.
    keyword_table_json = json.dumps(
        {keyword: sorted(node_ids) for keyword, node_ids in keyword_table.table.items()}
    )
    # Insert the keyword table, or update it if the row already exists.
    cursor.execute(
        """
        INSERT INTO keyword_tables (id, table_data)
        VALUES (%s, %s)
        ON CONFLICT (id) DO UPDATE SET table_data = EXCLUDED.table_data
        """,
        ("keyword_table_id", keyword_table_json),
    )
    conn.commit()
    cursor.close()
    conn.close()
```
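The snippet above assumes a `keyword_tables` table already exists. The schema it implies and the `ON CONFLICT` upsert behavior can be sketched with the standard library's `sqlite3` standing in for Postgres (same upsert form; the actual Postgres column types, e.g. `JSONB` for `table_data`, are an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Minimal schema assumed by the save/load helpers: a primary key plus the
# serialized JSON payload (in Postgres this might be TEXT or JSONB).
cur.execute("CREATE TABLE keyword_tables (id TEXT PRIMARY KEY, table_data TEXT)")

upsert = """
    INSERT INTO keyword_tables (id, table_data)
    VALUES (?, ?)
    ON CONFLICT (id) DO UPDATE SET table_data = excluded.table_data
"""
# Running the upsert twice with the same id updates the row in place
# instead of raising a uniqueness error.
cur.execute(upsert, ("keyword_table_id", '{"postgres": ["node-1"]}'))
cur.execute(upsert, ("keyword_table_id", '{"postgres": ["node-1", "node-2"]}'))
conn.commit()

cur.execute("SELECT table_data FROM keyword_tables WHERE id = ?", ("keyword_table_id",))
row = cur.fetchone()
```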
```python
import json

import psycopg2

# Assumed import path for LlamaIndex 0.10.x; adjust for your version.
from llama_index.core.data_structs import KeywordTable


def load_keyword_table_from_postgres(connection_string: str) -> KeywordTable:
    conn = psycopg2.connect(connection_string)
    cursor = conn.cursor()
    # Retrieve the serialized keyword table from the database.
    cursor.execute(
        "SELECT table_data FROM keyword_tables WHERE id = %s",
        ("keyword_table_id",),
    )
    result = cursor.fetchone()
    cursor.close()
    conn.close()
    if result:
        keyword_table_dict = json.loads(result[0])
        # Restore the sets of node ids that KeywordTable expects as values.
        return KeywordTable(table={k: set(v) for k, v in keyword_table_dict.items()})
    return KeywordTable(table={})
```
```python
# Assumed import path for LlamaIndex 0.10.x; adjust for your version.
from llama_index.core import KeywordTableIndex


class PersistentKeywordTableIndex(KeywordTableIndex):
    def __init__(self, connection_string: str, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.connection_string = connection_string
        # Replace the in-memory index struct with the one stored in Postgres.
        self._index_struct = load_keyword_table_from_postgres(self.connection_string)

    def save(self):
        save_keyword_table_to_postgres(self._index_struct, self.connection_string)
```

With these implementations, the keyword indexes are stored in a PostgreSQL database and can be retrieved across sessions. You can call the `save` method whenever you want to persist the current state of the index.
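The set-to-list conversion is the subtle part of the snippets above: `json.dumps` raises `TypeError` on Python sets, which `KeywordTable.table` uses for its values. A minimal, standalone sketch of the round trip, with plain dicts standing in for the real `KeywordTable`:

```python
import json


def serialize_keyword_table(table: dict) -> str:
    # Sets are not JSON-serializable, so convert each value to a sorted list.
    return json.dumps({keyword: sorted(node_ids) for keyword, node_ids in table.items()})


def deserialize_keyword_table(data: str) -> dict:
    # Restore the sets that the keyword table expects as values.
    return {keyword: set(node_ids) for keyword, node_ids in json.loads(data).items()}


# Round trip: the restored mapping equals the original, sets included.
table = {"postgres": {"node-1", "node-2"}, "index": {"node-3"}}
restored = deserialize_keyword_table(serialize_keyword_table(table))
```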
Did you find a solution to this? I also have trouble getting the keyword table index to work.
I believe I found the issue. It looks like there are two things happening here: the `StorageContext` is missing an index store, and the index is being rebuilt each session instead of loaded. Here is how to fix it:

**1. Update your Database Configuration**

You need to initialize a `PostgresIndexStore` and add it to your `StorageContext`:

```python
from llama_index.storage.index_store.postgres import PostgresIndexStore
from llama_index.core import load_index_from_storage

# ... (Your existing params and imports for the vector store and docstore) ...

# Define params for the Index Store
index_db_params = {
    "host": "...",
    # ... rest of your connection details ...
    "table_name": "index_store",  # Choose a table name
}

# Initialize the stores
vector_store = PGVectorStore.from_params(**vector_db_params)
docstore = PostgresDocumentStore.from_params(**doc_db_params)
index_store = PostgresIndexStore.from_params(**index_db_params)  # <--- Add this

# Add index_store to the context
storage_context = StorageContext.from_defaults(
    vector_store=vector_store,
    docstore=docstore,
    index_store=index_store,
)
```

**2. Update your Retrieval Code**

Instead of re-creating the index from your documents each session, load it from the storage context:

```python
# Re-initialize the storage_context with the connections (as shown above)

# Load the index from Postgres
keyword_index = load_index_from_storage(
    storage_context,
    service_context=service_context,
)
```

This should ensure the keyword mappings persist in Postgres alongside your vectors and documents. Hope this helps!
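Since the vector store, docstore, and index store all point at the same database here, it can help to keep one credentials dict and derive the three `*_db_params` dicts from it. A small hypothetical helper (not part of LlamaIndex or psycopg2) that filters such a dict down to connection keys, dropping store-specific entries like `table_name`:

```python
def connect_kwargs(params: dict) -> dict:
    """Filter a params dict down to common connection keys.

    Hypothetical helper: lets the vector store, docstore, and index store
    share one credentials dict while dropping store-specific keys such as
    "table_name" before passing the rest to a DB driver.
    """
    allowed = {"host", "port", "database", "user", "password"}
    return {key: value for key, value in params.items() if key in allowed}


creds = connect_kwargs({
    "host": "localhost",
    "port": 5432,
    "database": "vector_db",
    "user": "postgres",
    "password": "secret",
    "table_name": "index_store",  # store-specific, dropped by the helper
})
# creds could now be passed on, e.g. psycopg2.connect(**creds)
```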
Hello,
I am using LlamaIndex version 0.10.42 and have encountered an issue where keyword indexes are not being stored or retrieved from a PostgreSQL database across different sessions. My documents and nodes are processed and stored in both docstore and vector store without issues, and other data types are indexing correctly within the same database setup.
Issue Description:
When I ingest and attempt to persist the keyword index within the same session or execution, everything functions correctly: I can retrieve the required keywords and their links to individual nodes. However, if I close the session and later attempt to retrieve the keyword index in a new session, the retrieval process returns nothing. There seems to be no keyword data stored persistently in either the docstore or the vector store, despite no error messages being produced.
Code for Processing and Persisting the Keyword Index:
Code for Retrieving the Keyword Index:
Database Configuration:
I have set up separate configurations for the vector store and the docstore which work perfectly in other applications. Here is the corrected configuration:
Could you provide any insights or suggestions on how to ensure that keyword indexes are properly stored and retrieved across sessions? Is there additional configuration required for keyword data that might differ from other types of data?
Thank you for your help.