@nil0x9 nil0x9 commented Nov 17, 2025

No description provided.

@nil0x9 nil0x9 changed the title Dev add model internal metrics [Enhance] add model internal metrics Nov 17, 2025
device=hidden_states.device)

- attn_output: torch.Tensor = self.attn_impl_func( # type: ignore
+ attn_output, extra_info = self.attn_impl_func( # type: ignore
Collaborator

It's not a good idea to change the attention function signature like this. Instead, we should define an AttnOutput type (using TypedDict or namedtuple) to represent the attention result.
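A minimal sketch of what such a type could look like, assuming a `TypedDict`. The field names `output` and `extra_info` and the `fake_attn` stand-in are hypothetical, chosen for illustration; they are not taken from the PR.

```python
from typing import Any, TypedDict


class AttnOutput(TypedDict, total=False):
    """Hypothetical container for an attention result (field names are assumptions)."""

    output: Any      # attention output; a torch.Tensor in the real code
    extra_info: Any  # optional extras such as the softmax log-sum-exp


def fake_attn(q: list) -> AttnOutput:
    # Stand-in for an attention implementation. It always returns one dict,
    # so adding a field later does not change how callers unpack the result.
    return AttnOutput(output=[2.0 * x for x in q])


result = fake_attn([1.0, 2.0])
attn_output = result["output"]
extra_info = result.get("extra_info")  # None when no extras were produced
```

With this shape, call sites keep reading `result["output"]` regardless of which optional fields an implementation fills in.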

ctx.cu_seqlen = cu_seqlen

- return o
+ return o, lse
Collaborator

The lse should only be returned when specific flags are passed in. Otherwise, only the attention output should be returned. This makes for a cleaner attention interface.
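The flag-gated behaviour described here could be sketched as follows, using a toy single-query attention over plain Python floats. The function name `softmax_attend` and the `return_lse` flag are hypothetical; the real kernel operates on tensors and its actual flag name is not shown in this thread.

```python
import math
from typing import List


def softmax_attend(scores: List[float], values: List[float], return_lse: bool = False):
    # Numerically stable softmax: subtract the max score before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    denom = sum(exps)
    out = sum(e * v for e, v in zip(exps, values)) / denom
    if return_lse:
        # log-sum-exp of the raw scores, returned only on request
        lse = m + math.log(denom)
        return out, lse
    return out


out_only = softmax_attend([0.0, 0.0], [1.0, 3.0])
out, lse = softmax_attend([0.0, 0.0], [1.0, 3.0], return_lse=True)
```

Callers that do not pass the flag get only the attention output, so existing call sites are unaffected.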


def register_attn_extra_info_hook(self, module, layer_name=None):
    def hook(module, input, output):
        extra_info = output[1]
Collaborator

Suggest refactoring this to use AttnOutput.
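A rough sketch of a hook that reads the extras by key rather than by tuple position, assuming the module's forward returns a dict-like result. The `extra_info` key, the `make_extra_info_hook` helper, and the `store` dictionary are hypothetical.

```python
from typing import Any, Callable, Dict


def make_extra_info_hook(store: Dict[str, Any], layer_name: str) -> Callable[..., None]:
    # Builds a forward-hook-shaped callable: (module, inputs, output) -> None.
    def hook(module: Any, inputs: Any, output: Dict[str, Any]) -> None:
        # Look the extras up by name instead of relying on output[1].
        extra_info = output.get("extra_info")
        if extra_info is not None:
            store[layer_name] = extra_info

    return hook


collected: Dict[str, Any] = {}
hook = make_extra_info_hook(collected, "layers.0.attn")
# In real code the hook would be attached with module.register_forward_hook(hook);
# here it is called directly with stand-in arguments.
hook(None, (), {"output": [1.0], "extra_info": {"softmax_lse": 0.5}})
hook(None, (), {"output": [1.0]})  # no extras: nothing is recorded or overwritten
```

Reading by key means the hook keeps working if more fields are added to the attention result later.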


# do dummy forward to get metrics
for i in range(0, len(data_batches), self.intra_layer_micro_batch):
    data_batch = data_batches[i : i + self.intra_layer_micro_batch]
Collaborator

Is it necessary to consider intra_layer_micro_batch?

@HAOCHENYE HAOCHENYE changed the base branch from main to dev November 17, 2025 13:36
@HAOCHENYE HAOCHENYE force-pushed the dev-add-model-internal-metrics branch from f4014fc to 97f1564 Compare November 17, 2025 13:37
@nil0x9 nil0x9 force-pushed the dev-add-model-internal-metrics branch 9 times, most recently from e557bee to 39f7de0 Compare November 19, 2025 11:32
@nil0x9 nil0x9 force-pushed the dev-add-model-internal-metrics branch from 39f7de0 to 0a5f008 Compare November 19, 2025 11:54
@HAOCHENYE HAOCHENYE merged commit 217dd0f into InternLM:dev Nov 19, 2025
2 of 4 checks passed