Hi, guys! I have a little question here:
In principle, Wanda's criterion for selecting which weights to prune is general, so it should be applicable to any model. Why, then, does this code support only a limited range of models instead of being universally applicable? What is different about implementing the method for LLMs in general?
Can anyone help resolve my confusion or share their insights?
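For reference, this is how I understand the per-layer criterion: each weight is scored by its magnitude times the L2 norm of the matching input feature over the calibration tokens, and the lowest-scored weights in each output row are zeroed. Below is a minimal pure-Python sketch of that idea, not the repository's code; the function name `wanda_prune_rowwise` and the toy inputs are my own placeholders.

```python
import math

def wanda_prune_rowwise(weight, activations, sparsity):
    """Zero the lowest-scored weights in each output row.

    weight:      out_features x in_features (list of rows)
    activations: n_tokens x in_features calibration inputs to this layer
    sparsity:    fraction of weights to remove per row, e.g. 0.5
    """
    in_features = len(weight[0])
    # L2 norm of each input feature across all calibration tokens
    feat_norm = [
        math.sqrt(sum(tok[j] ** 2 for tok in activations))
        for j in range(in_features)
    ]
    n_prune = int(in_features * sparsity)
    pruned = []
    for row in weight:
        # Score = |W_ij| * ||X_j||_2
        scores = [abs(w) * feat_norm[j] for j, w in enumerate(row)]
        # Indices of the n_prune smallest scores within this row
        order = sorted(range(in_features), key=lambda j: scores[j])
        drop = set(order[:n_prune])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned

# Toy example (hypothetical numbers): one output row, two calibration tokens
W = [[1.0, -0.5, 0.2, 3.0]]
X = [[0.1, 4.0, 1.0, 0.0],
     [0.1, 3.0, 1.0, 0.0]]
print(wanda_prune_rowwise(W, X, 0.5))
```

Nothing in this function refers to a particular architecture; it only needs a weight matrix and the inputs that reach that layer, which is why I expected the method to carry over to other models.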