Conversation
⚠️ Warning: Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 21 minutes and 35 seconds.
⌛ How to resolve this issue? We recommend that you space out your commits to avoid hitting the rate limit.
🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source, and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (2)
📝 Walkthrough
This change adds course synchronization difference tracking to the system.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Made-with: Cursor
Force-pushed 3d0803c to daa53cf (Compare)
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ferry/database/generate_changelog.py`:
- Around line 340-349: The current loop only processes
diff["courses"]["changed_rows"], so brand-new courses
(diff["courses"]["added_rows"]) never get field-level (old, new) entries and are
dropped downstream; extend the logic to also iterate
diff["courses"]["added_rows"] and populate course_id_to_changes for each new
course_id (respecting computed_columns["courses"]), using old_val = None (or an
appropriate sentinel) and new_val from courses_new_indexed.loc[course_id,
column]; ensure you initialize course_id_to_changes[course_id] as with changed
rows and reuse the same column filtering as in the existing loop.
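The addition suggested above can be sketched in isolation as follows. This is a minimal, hypothetical illustration: `computed_columns`, `courses_new_indexed`, and the shape of `diff` are stand-ins mirroring the names in the prompt, not the project's actual definitions in `generate_changelog.py`:

```python
import pandas as pd

# Hypothetical stand-ins for the structures named in the prompt.
computed_columns = {"courses": {"average_rating"}}
courses_new = pd.DataFrame(
    [{"course_id": 2, "title": "New Course", "average_rating": 4.5}]
)
courses_new_indexed = courses_new.set_index("course_id")
diff = {"courses": {"added_rows": courses_new}}

course_id_to_changes: dict[int, dict[str, tuple]] = {}

# The existing loop over changed_rows would run first; this extra pass
# gives brand-new courses the same (old, new) field-level entries,
# with old_val = None since the course did not exist before.
for _, row in diff["courses"]["added_rows"].iterrows():
    course_id = int(row["course_id"])
    changes = course_id_to_changes.setdefault(course_id, {})
    for column in courses_new_indexed.columns:
        if column in computed_columns["courses"]:
            continue
        changes[column] = (None, courses_new_indexed.loc[course_id, column])

print(course_id_to_changes)
```

With the sample data above, the computed `average_rating` column is skipped and the new course gets a `(None, new_value)` pair per remaining field.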
In `@ferry/database/sync_db_courses.py`:
- Around line 833-840: The UPDATE that builds ids_sql from changed_course_ids
should use a bound expanding parameter instead of string-concatenating IDs;
update the conn.execute call that references changed_course_ids/ids_sql to use
an expanding bind (e.g., bindparam('ids', expanding=True) or SQLAlchemy's ":ids"
expanding param) and pass {"ids": changed_course_ids} so the query becomes
"WHERE last_sync_diff IS NOT NULL AND course_id NOT IN :ids" (or equivalent with
bindparam) to avoid huge SQL text and enable parameter binding for the course_id
list.
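For reference, SQLAlchemy's expanding bind parameter described above can be exercised like this. This is a self-contained sketch (assuming SQLAlchemy 1.4+), using an in-memory SQLite database purely for illustration; the project's actual target and schema differ:

```python
from sqlalchemy import bindparam, create_engine, text

engine = create_engine("sqlite:///:memory:")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE courses (course_id INTEGER, last_sync_diff TEXT)"))
    conn.execute(text("INSERT INTO courses VALUES (1,'{}'),(2,'{}'),(3,'{}')"))

    changed_course_ids = [1, 3]
    # The expanding bindparam renders :ids as a properly bound
    # (?, ?, ...) list, avoiding string-concatenated SQL.
    stmt = text(
        "UPDATE courses SET last_sync_diff = NULL "
        "WHERE last_sync_diff IS NOT NULL AND course_id NOT IN :ids"
    ).bindparams(bindparam("ids", expanding=True))
    conn.execute(stmt, {"ids": changed_course_ids})

    cleared = conn.execute(
        text("SELECT course_id FROM courses WHERE last_sync_diff IS NULL")
    ).scalars().all()
    print(cleared)
```

Only course 2, which is absent from `changed_course_ids`, has its `last_sync_diff` cleared.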
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 44730d87-aad1-4c1c-9199-4a6ce5180b58
📒 Files selected for processing (3)
- ferry/database/generate_changelog.py
- ferry/database/models.py
- ferry/database/sync_db_courses.py
```python
register_junction_changes(
    course_id_to_changes,
    diff,
    tables_old,
    tables,
    "listings",
    ["course_id"],
)
```
Handle listing course_id moves on both courses.
For a listing that keeps the same listing_id but changes course_id, this path only registers the new course via changed_rows. The old course never gets the corresponding listing removal, so cross-listing merges/splits leave one side's changelog and last_sync_diff incomplete.
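One way to cover both sides of such a move can be sketched as below. This is a hypothetical illustration only: `register_listing_moves` and the `listings_removed`/`listings_added` keys are invented names, not the project's actual changelog API:

```python
# Hypothetical sketch: when a listing keeps its listing_id but moves to a
# different course_id, record the change on both the old and new course
# so neither side's changelog is left incomplete.
def register_listing_moves(course_id_to_changes, moved_listings):
    """moved_listings: iterable of (listing_id, old_course_id, new_course_id)."""
    for listing_id, old_cid, new_cid in moved_listings:
        if old_cid == new_cid:
            continue
        # A removal on the course the listing left...
        course_id_to_changes.setdefault(old_cid, {}).setdefault(
            "listings_removed", []
        ).append(listing_id)
        # ...and an addition on the course it joined.
        course_id_to_changes.setdefault(new_cid, {}).setdefault(
            "listings_added", []
        ).append(listing_id)
    return course_id_to_changes

changes = register_listing_moves({}, [(10, 1, 2)])
print(changes)
```

Here listing 10 moving from course 1 to course 2 produces a removal entry on course 1 and an addition entry on course 2.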
Actionable comments posted: 1
♻️ Duplicate comments (1)
ferry/database/sync_db_courses.py (1)
874-882: ⚠️ Potential issue | 🟡 Minor
Keep the `last_sync_diff` clear query parameterized.
This still string-builds the `NOT IN (...)` list, which bypasses bound parameters and can generate very large SQL on big syncs. Please switch it to a bound array / expanding parameter.
Proposed fix

```diff
 changed_course_ids = list(course_sync_diff_by_id.keys())
 if changed_course_ids:
-    ids_sql = ",".join(str(int(cid)) for cid in changed_course_ids)
     conn.execute(
         text(
-            f"UPDATE courses SET last_sync_diff = NULL "
-            f"WHERE last_sync_diff IS NOT NULL AND course_id NOT IN ({ids_sql})"
-        )
+            "UPDATE courses "
+            "SET last_sync_diff = NULL "
+            "WHERE last_sync_diff IS NOT NULL "
+            "AND NOT (course_id = ANY(CAST(:changed_course_ids AS int[])))"
+        ),
+        {"changed_course_ids": [int(cid) for cid in changed_course_ids]},
     )
 else:
     conn.execute(
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ferry/database/sync_db_courses.py` around lines 874 - 882, The current UPDATE in sync_db_courses.py builds a NOT IN(...) list with string concatenation; change it to use a bound expanding parameter so the DB driver can handle large lists and binding. In the block that builds ids_sql and calls conn.execute(text(...)), replace that approach by using a bound parameter (e.g., bindparam("ids", expanding=True) or an :ids expanding param) and pass the changed_course_ids list as the parameter to conn.execute for the UPDATE statement that sets last_sync_diff = NULL where course_id NOT IN :ids; update the text/sqlalchemy call around conn.execute and the surrounding logic in the same function so it uses the expanding bound parameter instead of inlining ids_sql.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ferry/database/sync_db_courses.py`:
- Around line 71-91: The helper course_scalar_last_sync_diffs currently only
processes diff["courses"]["changed_rows"], so new courses are omitted; update it
to also iterate diff["courses"]["inserted_rows"] and add entries for each
inserted course id (use int(row["course_id"])) building a fields dict for
non-computed columns where "old" is None and "new" comes from cn.loc[cid, col]
via _jsonable_scalar (skip columns in computed_columns["courses"]); merge these
into the existing out mapping the same way as changed rows so inserts produce a
last_sync_diff payload.
---
Duplicate comments:
In `@ferry/database/sync_db_courses.py`:
- Around line 874-882: The current UPDATE in sync_db_courses.py builds a NOT
IN(...) list with string concatenation; change it to use a bound expanding
parameter so the DB driver can handle large lists and binding. In the block that
builds ids_sql and calls conn.execute(text(...)), replace that approach by using
a bound parameter (e.g., bindparam("ids", expanding=True) or an :ids expanding
param) and pass the changed_course_ids list as the parameter to conn.execute for
the UPDATE statement that sets last_sync_diff = NULL where course_id NOT IN
:ids; update the text/sqlalchemy call around conn.execute and the surrounding
logic in the same function so it uses the expanding bound parameter instead of
inlining ids_sql.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9b81392d-f98c-4acf-af48-2da719c9b05d
📒 Files selected for processing (2)
- ferry/database/models.py
- ferry/database/sync_db_courses.py
🚧 Files skipped from review as they are similar to previous changes (1)
- ferry/database/models.py
```python
def course_scalar_last_sync_diffs(
    diff: dict[str, DiffRecord],
    tables_old: dict[str, pd.DataFrame],
    tables: dict[str, pd.DataFrame],
) -> dict[int, dict[str, dict[str, Any]]]:
    co = tables_old["courses"].set_index("course_id")
    cn = tables["courses"].set_index("course_id")
    out: dict[int, dict[str, dict[str, Any]]] = {}
    for _, row in diff["courses"]["changed_rows"].iterrows():
        cid = int(row["course_id"])
        fields: dict[str, dict[str, Any]] = {}
        for col in row["columns_changed"]:
            if col in computed_columns["courses"]:
                continue
            fields[col] = {
                "old": _jsonable_scalar(co.loc[cid, col]),
                "new": _jsonable_scalar(cn.loc[cid, col]),
            }
        if fields:
            out[cid] = fields
    return out
```
Include inserted courses in last_sync_diff.
This helper only walks diff["courses"]["changed_rows"], so newly inserted courses never get a payload and end the sync with last_sync_diff = NULL. That leaves course creations invisible to this feature.
Proposed fix

```diff
 def course_scalar_last_sync_diffs(
     diff: dict[str, DiffRecord],
     tables_old: dict[str, pd.DataFrame],
     tables: dict[str, pd.DataFrame],
 ) -> dict[int, dict[str, dict[str, Any]]]:
     co = tables_old["courses"].set_index("course_id")
     cn = tables["courses"].set_index("course_id")
     out: dict[int, dict[str, dict[str, Any]]] = {}
     for _, row in diff["courses"]["changed_rows"].iterrows():
         cid = int(row["course_id"])
         fields: dict[str, dict[str, Any]] = {}
         for col in row["columns_changed"]:
             if col in computed_columns["courses"]:
                 continue
             fields[col] = {
                 "old": _jsonable_scalar(co.loc[cid, col]),
                 "new": _jsonable_scalar(cn.loc[cid, col]),
             }
         if fields:
             out[cid] = fields
+
+    for _, row in diff["courses"]["added_rows"].iterrows():
+        cid = int(row["course_id"])
+        fields: dict[str, dict[str, Any]] = {}
+        for col in cn.columns:
+            if col == "course_id" or col in computed_columns["courses"]:
+                continue
+            fields[col] = {
+                "old": None,
+                "new": _jsonable_scalar(cn.loc[cid, col]),
+            }
+        if fields:
+            out[cid] = fields
+
     return out
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ferry/database/sync_db_courses.py` around lines 71 - 91, The helper
course_scalar_last_sync_diffs currently only processes
diff["courses"]["changed_rows"], so new courses are omitted; update it to also
iterate diff["courses"]["inserted_rows"] and add entries for each inserted
course id (use int(row["course_id"])) building a fields dict for non-computed
columns where "old" is None and "new" comes from cn.loc[cid, col] via
_jsonable_scalar (skip columns in computed_columns["courses"]); merge these into
the existing out mapping the same way as changed rows so inserts produce a
last_sync_diff payload.
Summary by CodeRabbit