@YuryYakhno

The SantaCoder model uses special tokens that are very similar to, but not identical to, those of the StarCoder model. The current settings contain a template only for StarCoder, so it seems logical to simply change "bigcode/starcoder" to "bigcode/santacoder" in the "Model ID or Endpoint" setting. That is not enough, however: SantaCoder's FIM tokens start with "fim-", while StarCoder's start with "fim_". The difference is easy to miss when skimming the settings. If the wrong FIM tokens are used, SantaCoder misbehaves: the "fim_..." tokens are parsed as plain text, and the model occasionally adds them to its output.

This issue was discussed on SantaCoder's model page. To prevent it in the future without changing SantaCoder's interface, I propose adding a separate template for SantaCoder with the proper special tokens.
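
To illustrate the difference between the two token sets, here is a minimal Python sketch of per-model FIM templates. It is not the extension's actual code or settings format: `FIM_TEMPLATES` and `build_fim_prompt` are hypothetical names, and the full token spellings (with angle brackets, in prefix/suffix/middle order) are assumed from the models' published tokenizers.

```python
# Hypothetical per-model FIM templates; not the extension's real settings schema.
# Token spellings are assumed from the published StarCoder/SantaCoder tokenizers.
FIM_TEMPLATES = {
    "bigcode/starcoder": {
        "prefix": "<fim_prefix>",   # underscore variant
        "suffix": "<fim_suffix>",
        "middle": "<fim_middle>",
    },
    "bigcode/santacoder": {
        "prefix": "<fim-prefix>",   # hyphen variant
        "suffix": "<fim-suffix>",
        "middle": "<fim-middle>",
    },
}


def build_fim_prompt(model_id: str, before: str, after: str) -> str:
    """Wrap the code before and after the cursor in the model's own FIM tokens."""
    try:
        tokens = FIM_TEMPLATES[model_id]
    except KeyError:
        raise ValueError(f"no FIM template registered for {model_id!r}") from None
    return f"{tokens['prefix']}{before}{tokens['suffix']}{after}{tokens['middle']}"


if __name__ == "__main__":
    # Same cursor context, different special tokens depending on the model.
    for model in FIM_TEMPLATES:
        print(model, "->", repr(build_fim_prompt(model, "def add(a, b):\n    return ", "\n")))
```

Keying the template on the model ID means that switching the model also switches the tokens, so a user cannot end up sending "fim_" tokens to SantaCoder by changing only the "Model ID or Endpoint" setting.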
