Skip to content

Conversation

@abhijat
Copy link
Contributor

@abhijat abhijat commented Nov 7, 2025

Defrag support is added for JSON compact objects for both cons and flat type encoding.

@abhijat abhijat force-pushed the abhijat/feat/defrag-json branch 5 times, most recently from a6e61b4 to dd35f06 Compare November 7, 2025 07:37
@abhijat abhijat marked this pull request as ready for review November 7, 2025 09:15
@abhijat abhijat requested review from kostasrim and romange November 7, 2025 09:15
@abhijat abhijat force-pushed the abhijat/feat/defrag-json branch from dd35f06 to e717440 Compare November 7, 2025 09:29
Copy link
Collaborator

@romange romange left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flat json is not really used, do we use jsoncons in prod. I did not understand how you defragment json as it can.contain lots of pointers due to its hierarchical structure

@abhijat
Copy link
Contributor Author

abhijat commented Nov 7, 2025

do we use jsoncons in prod.

Yes jsoncons should the default encoding based on experimental_flat_json flag which is false by default, I could see its use in facade::OpStatus dfly::SetFullJson(...) which is called via json.set etc.

I did not understand how you defragment json as it can.contain lots of pointers due to its hierarchical structure

Yes, the current implementation works better for flat encoding as it is a single chunk of data, but for cons it only copies the root of the tree so it is not effective. It should walk down the tree and move everything for cons.

If we don't use flat encoding much and use mostly cons, then this PR needs more work. I will look into defragmenting the tree from the root downwards.

@abhijat abhijat marked this pull request as draft November 7, 2025 14:10
@abhijat
Copy link
Contributor Author

abhijat commented Nov 7, 2025

I will look into defragmenting the tree from the root downwards.

Actually, it doesn't look like this is possible, as far as I can see it might not be possible to go into the object's fields and individually reallocate them.

It might be possible to do the following more expensive operations:

  • serialize the object to string
  • convert it back to a JsonType using dfly::JsonFromString to get new allocations in the page reserved for malloc
  • move it into u_.json_obj.cons.json_ptr

@abhijat abhijat force-pushed the abhijat/feat/defrag-json branch from e717440 to 7cf8541 Compare November 8, 2025 12:51
@abhijat
Copy link
Contributor Author

abhijat commented Nov 8, 2025

  • serialize the object to string
  • convert it back to a JsonType using dfly::JsonFromString to get new allocations in the page reserved for malloc
  • move it into u_.json_obj.cons.json_ptr

I tried this approach in the latest commit, will test it a bit more

Taking an existing JSON object as input, it is serialized and then
deserialized, allocating a new object which is a copy of the old one,
but now exists in the page reserved for malloc.

It can be used to defragment a JSON object.

Signed-off-by: Abhijat Malviya <[email protected]>
@abhijat abhijat force-pushed the abhijat/feat/defrag-json branch from 7cf8541 to bb7650b Compare November 10, 2025 07:49
@abhijat abhijat marked this pull request as ready for review November 10, 2025 11:36
@abhijat abhijat requested a review from romange November 11, 2025 08:18
@abhijat
Copy link
Contributor Author

abhijat commented Nov 11, 2025

@romange does the current approach look better? We basically copy the JSON object, in the process reallocating memory

}

bool CompactObj::JsonConsT::DefragIfNeeded(PageUsage* page_usage) {
if (JsonType* old = json_ptr; page_usage->IsPageForObjectUnderUtilized(old)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: the trigger condition is also problematic. the object may be part of fully utilized page but have hundreds of other heap objects. It is possible to solve this using the latest PATCH of @vyavdoshenko
here: dragonflydb/jsoncons@990b0a2

you can pass into compute_memory_size a callback that will return 0 or 1 if a pointer belongs to the under utilized page and if the final result is > 0, than there is at least one pointer that needs to be defragmented.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the patch is already present in jsoncons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used compute_memory_size but kept state (a boolean) outside to figure out if any pointer is in an underutilized page.

@abhijat abhijat force-pushed the abhijat/feat/defrag-json branch from bb7650b to 98c3fa4 Compare November 11, 2025 09:56
@abhijat abhijat requested a review from romange November 11, 2025 10:43
@abhijat abhijat enabled auto-merge (squash) November 11, 2025 10:52
@abhijat abhijat merged commit 767ed14 into main Nov 11, 2025
10 checks passed
@abhijat abhijat deleted the abhijat/feat/defrag-json branch November 11, 2025 11:02
abhijat added a commit that referenced this pull request Nov 11, 2025
* core: Add utility to reallocate a JSON object

Taking an existing JSON object as input, it is serialized and then
deserialized, allocating a new object which is a copy of the old one,
but now exists in the page reserved for malloc.

It can be used to defragment a JSON object.

* core: Add defrag support for json objects
* core: Add tests for json defrag

Signed-off-by: Abhijat Malviya <[email protected]>
(cherry picked from commit 767ed14)
abhijat added a commit that referenced this pull request Nov 11, 2025
* fix(evicition): Limit accumulation deleted bytes during eviction (#5995)

Cache state variable accumulated_deleted_bytes_during_eviction need to
have upper bound when accumulating to prevent over eviction.

Adjust deleted bytes for each shard between two heartbeats by comparing
previous shard used memory and current shard used memory.

Fixes #5945

Signed-off-by: mkaruza <[email protected]>
(cherry picked from commit 1125eb4)

* feat(core): Add defrag support for json objects (#6023)

Signed-off-by: mkaruza <[email protected]>
Signed-off-by: Abhijat Malviya <[email protected]>
Co-authored-by: mkaruza <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants