Conversation

@hlu1 hlu1 commented Nov 17, 2025

Motivation

Following deepseek-ai/DeepSeek-V3.2-Exp@8631a81, change the indexer weights_proj precision from bf16 to fp32.

#13439 fixed the rope issue.

Modifications

  • Change the indexer weights_proj precision from bf16 to fp32.
  • Remove the fuse_wk_and_weights_proj optimization; it is no longer valid after the fix because wk remains bf16 while weights_proj is now fp32.
  • Add AIME 2025 accuracy results and commands to the dsv32 doc, along with 20-shot GSM8K results. Accuracy was measured together with [Bug] Fixes accuracy issues caused by incorrect use of rope #13439 on top of commit 6448b4c.
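To see why the precision of this projection matters, here is a minimal numpy sketch of the gating GEMM. It emulates bf16 by truncating the low 16 mantissa bits of float32 and compares it against the fp32 path; the shapes and weight scale are made up for illustration and are not the real model dimensions.

```python
import numpy as np

def to_bf16(a):
    """Emulate bfloat16 rounding by truncating the low 16 bits of float32."""
    b = a.astype(np.float32).copy().view(np.uint32)
    b &= np.uint32(0xFFFF0000)
    return b.view(np.float32)

# Hypothetical shapes; the real indexer uses model-specific dims.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 128)).astype(np.float32)
w = rng.standard_normal((128, 8)).astype(np.float32) * 0.02

gate_fp32 = x @ w                    # weights_proj computed in fp32 (new behavior)
gate_bf16 = to_bf16(x) @ to_bf16(w)  # old bf16 path, emulated

err = np.abs(gate_fp32 - gate_bf16).max()
print(f"max abs deviation, bf16 vs fp32 gate: {err:.2e}")
```

The deviation is small per element but accumulates across the reduction dimension, which is why the upstream DeepSeek change moved this particular projection to fp32.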

Accuracy Tests

See the updated doc in this PR.

Benchmarking and Profiling

Checklist

@github-actions github-actions bot added documentation Improvements or additions to documentation deepseek labels Nov 17, 2025

Summary of Changes

Hello @hlu1, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the Deepseek V3.2 model's weights_proj component by elevating its precision to FP32. This change, driven by external model updates, necessitates the removal of a previously existing fusion optimization that relied on consistent BF16 precision. Additionally, the PR enriches the model's documentation with new accuracy benchmarks for AIME 2025 and GSM8K, providing updated performance insights.

Highlights

  • Precision Update: The weights_proj component within the NSA indexer has been updated from BF16 to FP32 precision, aligning with recent Deepseek V3.2-Exp changes.
  • Optimization Removal: The fuse_wk_and_weights_proj optimization has been removed from the codebase. This is due to the precision mismatch between wk (BF16) and the newly FP32 weights_proj, rendering the fusion invalid.
  • Documentation Enhancement: The documentation for Deepseek V3.2 has been updated to include new accuracy results and commands for AIME 2025 and GSM8K (20-shot) benchmarks.
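The fusion being removed relied on stacking both weight matrices and running a single GEMM. A small numpy sketch of the idea, with hypothetical dimensions (`d`, `dk`, `h` and the names `wk`/`wp` are illustrative, not the model's actual sizes): concatenating the weights is only possible when both share one dtype, which breaks once weights_proj moves to fp32 while wk stays bf16.

```python
import numpy as np

# Hypothetical dims for illustration; not the real model sizes.
rng = np.random.default_rng(1)
d, dk, h = 64, 32, 8
x  = rng.standard_normal((4, d)).astype(np.float32)
wk = rng.standard_normal((d, dk)).astype(np.float32)  # stays bf16 in the model
wp = rng.standard_normal((d, h)).astype(np.float32)   # now fp32 in the model

# Fused path: one GEMM over concatenated weights, then split the output.
# This requires wk and wp to share a single dtype.
w_fused = np.concatenate([wk, wp], axis=1)
k_fused, g_fused = np.split(x @ w_fused, [dk], axis=1)

# Unfused path: two GEMMs, each free to run in its own precision.
k = x @ wk
g = x @ wp

assert np.allclose(k, k_fused) and np.allclose(g, g_fused)
print("fused and unfused outputs match when dtypes agree")
```

With mixed bf16/fp32 weights the concatenation above has no single-dtype representation without upcasting wk (changing numerics and cost), so dropping the fusion is the straightforward fix.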

@hlu1 hlu1 requested a review from trevor-m November 17, 2025 22:56

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request changes the precision of the weights_proj layer in the NSA indexer from bf16 to fp32. This change aligns with an update in the original DeepSeek-V3.2 repository. As a consequence, the fuse_wk_and_weights_proj optimization is no longer valid due to a dtype mismatch and has been cleanly removed from the codebase. The necessary input casting to float() for the weights_proj layer has been correctly added. Additionally, the documentation is updated with new accuracy results for GSM8K and AIME 2025. My review includes a couple of minor suggestions for the documentation to improve clarity and fix a typo.

@hlu1 hlu1 force-pushed the weight_proj_fp32 branch 2 times, most recently from 5aa0046 to b5a0096 on November 17, 2025 23:24

hlu1 commented Nov 17, 2025

cc @Paiiiiiiiiiiiiii

@Fridge003 Fridge003 removed the run-ci label Nov 17, 2025

```diff
-if not self.fuse_wk_and_weights_proj:
-    weights, _ = self.weights_proj(x)
+weights, _ = self.weights_proj(x.float())
```
A collaborator commented:

I believe this gemm was originally in the _get_logits_head_gate() function.
