Update the RecLM-cgen, Named RecLM-uni #108

SZU-ZJW · 2026-01-23T13:00:57Z

Description

Checklist:

I have added description accordingly.
RecLM-cgen was changed to RecLM-uni
Updated the referenced of RecLM-uni
Added support for RecLM-token and the reward function in TitleRewritter，see RecLM-uni for detail.

SZU-ZJW · 2026-01-23T13:06:26Z

@microsoft-github-policy-service agree

Copilot

Pull request overview

This pull request renames RecLM-cgen to RecLM-uni and introduces significant new functionality for RecLM-token support, which uses RQ-VAE to encode items as special token sequences. The PR also adds GRPO training support and a plugin system for reward functions used in title rewrite training.

Changes:

Renamed RecLM-cgen directory to RecLM-uni and updated references
Added RecLM-token support with item code mapping and token registration
Introduced index/ module for RQ-VAE training and codebook generation
Added GRPO reinforcement learning training capabilities
Implemented plugin.py with reward functions for title rewrite training
Updated documentation and added new configuration guides

Reviewed changes

Copilot reviewed 23 out of 47 changed files in this pull request and generated 22 comments.

Show a summary per file

File	Description
RecLM-uni/trainer.py	Added item code mapping loading for RecLM-token support
RecLM-uni/train_utils/utils.py	Added `load_item_code_mapping` function and ITEM_CODE_FIELD constant
RecLM-uni/train_utils/model.py	Added `register_item_tokens` method for dynamic token registration
RecLM-uni/train_utils/dataset.py	Integrated token sequence retrieval in `get_item_index`
RecLM-uni/plugin.py	Reward functions for title rewrite training with performance optimizations
RecLM-uni/grpo_train.py	GRPO training implementation for policy optimization
RecLM-uni/GRPO/rl_dataset.py	RL dataset builder for GRPO training
RecLM-uni/index/*	Complete RQ-VAE module for item encoding to token sequences
RecLM-uni/task_test_tokenizer.py	Test script for RecLM-token evaluation
RecLM-uni/README.md	Comprehensive documentation for all three methods
RecLM-uni/PLUGIN_CONFIG.md	Configuration guide for plugin reward functions
RecLM-cgen/README.md	Removed old README (directory renamed)
README.md	Updated citation to RecLM-uni

Comments suppressed due to low confidence (1)

RecLM-uni/trainer.py:61

There's a trailing whitespace at the end of line 61. This should be removed for code cleanliness.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-24T08:14:34Z

RecLM-uni/plugin.py

+
+# --- 全局配置与客户端 ---
+# 建议将这些配置项在顶层统一定义
+import os


The import statement for 'os' should be placed with the other imports at the top of the file (lines 1-19), not after the logger initialization. This violates PEP 8 import ordering conventions.

Copilot · 2026-01-24T08:14:35Z

RecLM-uni/index/generate_indices.py

+from datasets import EmbDataset
+from models.rqvae import RQVAE


The 'datasets' and 'models.rqvae' imports are relative imports that should specify the package they're relative to, or use absolute imports. In a script file run directly, these relative imports will fail. They should be 'from index.datasets import EmbDataset' and 'from index.models.rqvae import RQVAE', or the script should be run as a module.

Suggested change

from datasets import EmbDataset

from models.rqvae import RQVAE

from index.datasets import EmbDataset

from index.models.rqvae import RQVAE

Copilot · 2026-01-24T08:14:35Z

RecLM-uni/index/main.py

+from datasets import EmbDataset
+from models.rqvae import RQVAE
+from trainer import  Trainer


Same import issue: 'datasets', 'models.rqvae', and 'trainer' should use relative imports (from .datasets, from .models.rqvae, from .trainer) since this is within the index package.

Suggested change

from datasets import EmbDataset

from models.rqvae import RQVAE

from trainer import Trainer

from .datasets import EmbDataset

from .models.rqvae import RQVAE

from .trainer import Trainer

Copilot · 2026-01-24T08:14:35Z

RecLM-uni/README.md

+
+### 4.2. Step 2: Train RQ-VAE Model
+
+Train the Residual Quantized Variational AutoEncoder(RQ-VAE) to learn item codebook mappings using the `index/` module from RecLM-LC1 or RecLM-cgen:


Documentation refers to "RecLM-LC1" which appears to be a typo or outdated reference. This should likely be "RecLM-uni" to be consistent with the PR's renaming effort.

Suggested change

Train the Residual Quantized Variational AutoEncoder(RQ-VAE) to learn item codebook mappings using the `index/` module from RecLM-LC1 or RecLM-cgen:

Train the Residual Quantized Variational AutoEncoder(RQ-VAE) to learn item codebook mappings using the `index/` module from RecLM-uni or RecLM-cgen:

Copilot · 2026-01-24T08:14:36Z

RecLM-uni/plugin.py

+    time.sleep(50)
+    return None


When all vLLM endpoints fail, the function sleeps for 50 seconds before returning None. This is a very long blocking sleep that could hang the entire application. Consider reducing this timeout or using exponential backoff with a maximum retry limit instead. Additionally, this should probably raise an exception rather than silently returning None after such a long wait.

Suggested change

time.sleep(50)

return None

raise RuntimeError(f"All vLLM chat endpoints failed for source '{source}'.")

Copilot · 2026-01-24T08:14:38Z

RecLM-uni/index/models/rqvae.py

@@ -0,0 +1,104 @@
+import numpy as np


Import of 'np' is not used.

Suggested change

import numpy as np

Copilot · 2026-01-24T08:14:39Z

RecLM-uni/task_test_tokenizer.py

@@ -0,0 +1,193 @@
+import argparse
+import copy


Import of 'copy' is not used.

Copilot · 2026-01-24T08:14:39Z

RecLM-uni/task_test_tokenizer.py

+from train_utils.utils import (
+    save_json,
+    get_ctrl_item,
+    rm_idx,
+    load_json,
+    load_pickle,
+    side_tokenizer,
+    process_train_sample,
+    load_item_code_mapping,
+    ITEM_CODE_FIELD,
+)


Import of 'load_pickle' is not used.

Copilot · 2026-01-24T08:14:39Z

RecLM-uni/index/utils.py

+    color_set = ["black", "red", "green", "yellow", "blue", "pink", "cyan", "white"]
+    try:
+        index = color_set.index(color)
+    except:


Except block directly handles BaseException.

Suggested change

except:

except ValueError:

Copilot · 2026-01-24T08:14:39Z

RecLM-uni/grpo_train.py

+        model_init = get_peft_model(base_model, lora_config)
+        try:
+            model_init.print_trainable_parameters()
+        except AttributeError:


'except' clause does nothing but pass and there is no explanatory comment.

Suggested change

except AttributeError:

except AttributeError:

# Some PEFT models do not implement `print_trainable_parameters`; ignore if unavailable.

Update the RecLM-cgen, Named RecLM-uni

490def6

Leavingseason requested a review from Copilot January 24, 2026 08:09

Copilot started reviewing on behalf of Leavingseason January 24, 2026 08:09 View session

Copilot AI reviewed Jan 24, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update the RecLM-cgen, Named RecLM-uni #108

Update the RecLM-cgen, Named RecLM-uni #108

Uh oh!

SZU-ZJW commented Jan 23, 2026

Uh oh!

SZU-ZJW commented Jan 23, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Jan 24, 2026

Uh oh!

Copilot AI Jan 24, 2026

Uh oh!

Copilot AI Jan 24, 2026

Uh oh!

Copilot AI Jan 24, 2026

Uh oh!

Copilot AI Jan 24, 2026

Uh oh!

Copilot AI Jan 24, 2026

Uh oh!

Copilot AI Jan 24, 2026

Uh oh!

Copilot AI Jan 24, 2026

Uh oh!

Copilot AI Jan 24, 2026

Uh oh!

Copilot AI Jan 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		from datasets import EmbDataset
		from models.rqvae import RQVAE


		### 4.2. Step 2: Train RQ-VAE Model

		Train the Residual Quantized Variational AutoEncoder(RQ-VAE) to learn item codebook mappings using the `index/` module from RecLM-LC1 or RecLM-cgen:

	time.sleep(50)
	return None
	raise RuntimeError(f"All vLLM chat endpoints failed for source '{source}'.")

	except AttributeError:
	except AttributeError:
	# Some PEFT models do not implement `print_trainable_parameters`; ignore if unavailable.

Update the RecLM-cgen, Named RecLM-uni #108

Are you sure you want to change the base?

Update the RecLM-cgen, Named RecLM-uni #108

Uh oh!

Conversation

SZU-ZJW commented Jan 23, 2026

Uh oh!

SZU-ZJW commented Jan 23, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant