Commit 4084e7f
Passes long and short factors for phi3+ models using longrope (#3375)
In the canonical HF implementation of Phi3+ models, the longrope embedding
leverages both the long and short factors depending on sequence length.
This can be seen here: https://github.com/huggingface/transformers/blob/7b325cd573e40bbb12951b8446176c96e8b1afaa/src/transformers/modeling_rope_utils.py#L521
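
As a rough illustration of the selection described above, the following is a minimal pure-Python sketch modeled on the linked HF function, not a copy of it. The function name `longrope_inv_freq` and its parameter names are hypothetical, and the attention scaling factor that HF also computes is left out.

```python
def longrope_inv_freq(seq_len, original_max_pos, short_factor, long_factor,
                      head_dim, base=10000.0):
    """Return RoPE inverse frequencies rescaled by the longrope factors.

    Hypothetical helper: the long factors are used only once the sequence
    is longer than the original (pre-extension) context window; otherwise
    the short factors apply.
    """
    ext_factors = long_factor if seq_len > original_max_pos else short_factor
    if len(ext_factors) != head_dim // 2:
        raise ValueError("expected one factor per rotary frequency pair")
    # Standard RoPE frequency base^(-2i/d), divided by the per-dimension factor.
    return [1.0 / (f * base ** (2 * i / head_dim))
            for i, f in enumerate(ext_factors)]


# Usage: the same model yields different frequencies depending on length.
short = [1.0, 1.0]
long_ = [2.0, 4.0]
within = longrope_inv_freq(8, 16, short, long_, head_dim=4)
beyond = longrope_inv_freq(32, 16, short, long_, head_dim=4)
```

Because the choice depends on the sequence length at run time, a runtime that receives only one of the two factor lists cannot reproduce this behavior, which is why both need to reach KV cache creation.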
To achieve this in MLC, we need to pass both the long and short factors
to the KV Cache creation. The TVM side of this patch is here:
apache/tvm#18422

1 parent 7b15b19 commit 4084e7f
2 files changed, 6 additions and 2 deletions.
First file: original line 241 replaced by new lines 241-243.
Second file: original line 146 replaced by new lines 146-148.