Skip to content

Conversation

@cawo-odoo
Copy link
Contributor

@cawo-odoo cawo-odoo commented Nov 24, 2025

[ADD] pg.query_ids: query large numbers of ids memory-safely

This is mainly the code that has been recently added to orm.recompute_fields, here we're making it re-usasble.

[IMP] orm.recompute_fields: use new pg.query_ids

This code in recompute_fields has been made re-usable in a new util pg.query_ids. Use that.

[FIX] models.remove_model: MemoryError

Traceback (most recent call last):
[...]
  File "/tmp/tmpipxrg2eq/migrations/util/models.py", line 563, in merge_model
    remove_model(cr, source, drop_table=drop_table, ignore_m2m=ignore_m2m)
  File "/tmp/tmpipxrg2eq/migrations/util/models.py", line 138, in remove_model
    it = chunks([id for (id,) in cr.fetchall()], chunk_size, fmt=tuple)
MemoryError

[IMP] pg.named_cursor: lessons learned from pg.query_ids

  • it is not exactly right to call into the Cursor's __enter__ and __exit__ methods
  • Cursor should be explicitly closed before it is garbage collected

@robodoo
Copy link
Contributor

robodoo commented Nov 24, 2025

Pull request status dashboard

@cawo-odoo cawo-odoo force-pushed the master-imp_query_ids_server_side-cawo branch from 568311e to ac8c6d0 Compare November 24, 2025 14:47
@cawo-odoo cawo-odoo force-pushed the master-imp_query_ids_server_side-cawo branch from ac8c6d0 to ffc872a Compare November 24, 2025 15:01
@KangOl
Copy link
Contributor

KangOl commented Nov 24, 2025

upgradeci retry with always only crm hr

@cawo-odoo cawo-odoo force-pushed the master-imp_query_ids_server_side-cawo branch from 6aa7709 to 543204d Compare November 24, 2025 21:11
@cawo-odoo
Copy link
Contributor Author

so, this avoids the memoryerror in upg-3307099 just fine

@cawo-odoo cawo-odoo marked this pull request as ready for review November 25, 2025 14:29
@cawo-odoo cawo-odoo requested review from a team and asno-odoo November 25, 2025 14:29
Copy link
Contributor

@aj-fuentes aj-fuentes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could we follow the plan of "providing an object that would iterate ids and do the cleanup as early as possible". The cleanup would happen at iteration finished, __exit__, or __del__. Whatever happens first.

Technically:

  • The class is primarily an iterator, once exhausted it will do all the cleanup.
  • We guard against double iteration. (this pattern already exists in the utils)
  • We double down the cleanup at __del__ just in case.
  • We triple down the cleanup at __exit__ to leverage the context manager semantic. We don't need to __enter__ into _ncr. Current dummy __enter__ returning self is just fine.

Feel free to improve the stub suggestions below if you agree.

@cawo-odoo
Copy link
Contributor Author

cawo-odoo commented Nov 26, 2025

I pushed a version inspired by your suggestions, but I didn't use them verbatim. Please resolve conversations when you feel they have been addressed.

2 explanations up-front:

  • It does not work to raise a real exception from __iter__ when the cursor is closed, because util.chunks works that way that it needs to catch StopIteration twice before it really stops iterating.
  • afaik, returning a value from __del__ is useless, it is ignored always

@cawo-odoo cawo-odoo force-pushed the master-imp_query_ids_server_side-cawo branch 4 times, most recently from 8017ffb to 3b42f99 Compare November 27, 2025 13:50
@cawo-odoo cawo-odoo force-pushed the master-imp_query_ids_server_side-cawo branch 4 times, most recently from 610edfa to 6f069fb Compare November 27, 2025 20:56
@cawo-odoo cawo-odoo force-pushed the master-imp_query_ids_server_side-cawo branch 2 times, most recently from fbe3bc3 to 76ae21e Compare November 28, 2025 07:54
Comment on lines +1987 to +1984
if self._ncr:
if self._ncr.closed:
return
self._ncr.close()
self._cr.execute(format_query(self._cr, "DROP TABLE IF EXISTS {}", self._tmp_tbl))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we ensure both are assigned, we can reorder for simplicity?

Suggested change
if self._ncr:
if self._ncr.closed:
return
self._ncr.close()
self._cr.execute(format_query(self._cr, "DROP TABLE IF EXISTS {}", self._tmp_tbl))
self._cr.execute(format_query(self._cr, "DROP TABLE IF EXISTS {}", self._tmp_tbl))
if self._ncr and not self._ncr.closed:
self._ncr.close()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not running the DROP TABLE when we know it has already been done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's safer to run it always. It won't depend on any logic to "know that we have already done it". Leaking that table is worse than a noop DROP TABLE. But I'm OK with both options.

)
except psycopg2.IntegrityError as e:
if e.pgcode == errorcodes.UNIQUE_VIOLATION:
cr.rollback()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave this to the caller. In almost all cases we won't continue under this error and just fail the whole upgrade. The idea of wrapping the error is just to make it simpler to debug what failed.

Suggested change
cr.rollback()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e. we will not prevent further "broken transaction" exceptions from happening (e.g. from __del__? Then I really don't see the point in catching this here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(note that there is a typo, missing call to .format(query))

Then I really don't see the point in catching this here.

The point is the error message, for humans.
Compare:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
    util.query_ids(self.env.cr, "SELECT * FROM (VALUES (1), (1)) AS x(x)")
  File "/home/odoo/src/upgrade/migrations/util/pg.py", line 1979, in __init__
    raise ValueError("The query for ids is producing duplicate values: {}".format(query))
ValueError: The query for ids is producing duplicate values: SELECT * FROM (VALUES (1), (1)) AS x(x)

vs

Traceback (most recent call last):
  File "<input>", line 1, in <module>
    util.query_ids(self.env.cr, "SELECT * FROM (VALUES (1), (1)) AS x(x)")
  File "/home/odoo/src/upgrade/migrations/util/pg.py", line 1969, in __init__
    cr.execute(
  File "/home/odoo/src/odoo/16.0/odoo/sql_db.py", line 324, in execute
    res = self._obj.execute(query, params)
psycopg2.errors.UniqueViolation: could not create unique index "pk__upgrade_query_ids_d321bc27483a4d24a8a79f4e8a4ca920_id"
DETAIL:  Key (id)=(1) is duplicated.

The latter leaks internal info that's hard to understand. What's pk__upgrade_query_ids_d321bc27483a4d24a8a79f4e8a4ca920_id? What's that weird table name?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I see what you mean:

Traceback (most recent call last):
  File "/home/odoo/src/upgrade/migrations/util/pg.py", line 2018, in __del__
    self._close()
  File "/home/odoo/src/upgrade/migrations/util/pg.py", line 1990, in _close
    self._cr.execute(format_query(self._cr, "DROP TABLE IF EXISTS {}", self._tmp_tbl))
  File "/home/odoo/src/odoo/16.0/odoo/sql_db.py", line 327, in execute
    _logger.error("bad query: %s\nERROR: %s", tools.ustr(self._obj.query or query), e)
  File "/home/odoo/src/odoo/16.0/odoo/sql_db.py", line 485, in __getattr__
    raise psycopg2.InterfaceError("Cursor already closed")
psycopg2.InterfaceError: Cursor already closed

That can be solved by checking in __del__ if the cursor was closed

if not self._cr.closed:
   # drop table...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a decisive opinion against the rollback call... it looks odd but I'll let @KangOl provide further advice.

@cawo-odoo cawo-odoo force-pushed the master-imp_query_ids_server_side-cawo branch from 76ae21e to a11d862 Compare November 28, 2025 15:19
This is mainly the code that has been recently added to `orm.recompute_fields`,
here we're making it re-usasble.
This code in recompute_fields has been made re-usable in a new util
pg.query_ids. Use that.
```
Traceback (most recent call last):
[...]
  File "/tmp/tmpipxrg2eq/migrations/util/models.py", line 563, in merge_model
    remove_model(cr, source, drop_table=drop_table, ignore_m2m=ignore_m2m)
  File "/tmp/tmpipxrg2eq/migrations/util/models.py", line 138, in remove_model
    it = chunks([id for (id,) in cr.fetchall()], chunk_size, fmt=tuple)
MemoryError
```

Some IR tables can be large. Avoid `cr.fetchall()` when getting ids by use of
pg.query_ids()
@cawo-odoo cawo-odoo force-pushed the master-imp_query_ids_server_side-cawo branch from a11d862 to 4fde1e6 Compare November 28, 2025 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants