mergekit is the tool you need to do this:

  https://github.com/cg123/mergekit

You can slice off layers and blend models with different strategies.
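
To make "slice off layers and blend" concrete, here is a toy sketch in plain Python. It is not mergekit's API or config format: the `slice_layers` and `linear_blend` helpers and the list-of-lists "models" are hypothetical stand-ins, and linear interpolation is just one simple blending strategy picked for illustration.

```python
from typing import List

Layer = List[float]   # toy "layer": a flat list of weights (assumption for illustration)
Model = List[Layer]   # toy "model": an ordered list of layers


def slice_layers(model: Model, start: int, end: int) -> Model:
    """Return a copy of layers [start, end) from a model."""
    return [list(layer) for layer in model[start:end]]


def linear_blend(a: Model, b: Model, t: float) -> Model:
    """Interpolate two same-shaped models: (1 - t) * a + t * b."""
    if len(a) != len(b):
        raise ValueError("models must have the same number of layers")
    return [
        [(1 - t) * x + t * y for x, y in zip(layer_a, layer_b)]
        for layer_a, layer_b in zip(a, b)
    ]


model_a: Model = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
model_b: Model = [[1.0, 1.0], [3.0, 3.0], [5.0, 5.0]]

# Stack the first two layers of A on top of the last two layers of B.
stacked = slice_layers(model_a, 0, 2) + slice_layers(model_b, 1, 3)

# Average the two models weight by weight.
blended = linear_blend(model_a, model_b, 0.5)
```

Either way the result is a single new set of weights, which is the point of a merge.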


Mergekit is the best thing since sliced bread, as the local LLM community already knows.

The dev's blog is great: https://goddard.blog/posts/

...But it's not what this paper is describing. They are basically alternating between models, AFAIK. I also have other nitpicks with the paper, like using extremely old/mediocre chat models as bases:

> Pygmillion 6B, Vicuna 13B, Chai Model 6B
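
For contrast with weight merging, "alternating models" could look roughly like the sketch below. It assumes one model is picked at random for each reply and sees the full conversation so far. The stub models, `make_stub`, `blended_reply`, and `chat` are all hypothetical names for illustration, not code from the paper.

```python
import random
from typing import Callable, List, Sequence

# A chat model here is any function from conversation history to a reply.
ChatModel = Callable[[List[str]], str]


def make_stub(name: str) -> ChatModel:
    """Build a placeholder model that just labels its own replies."""
    def reply(history: List[str]) -> str:
        return f"{name}: reply after {len(history)} messages"
    return reply


def blended_reply(models: Sequence[ChatModel], history: List[str],
                  rng: random.Random) -> str:
    """Pick one model at random for this turn; no weights are combined."""
    model = rng.choice(models)
    return model(history)


def chat(models: Sequence[ChatModel], user_turns: Sequence[str],
         seed: int = 0) -> List[str]:
    """Run a conversation where each reply may come from a different model."""
    rng = random.Random(seed)
    history: List[str] = []
    for message in user_turns:
        history.append(message)
        history.append(blended_reply(models, history, rng))
    return history


models = [make_stub("model_a"), make_stub("model_b"), make_stub("model_c")]
transcript = chat(models, ["hi", "tell me a story", "thanks"])
```

Each model stays intact and only the choice of responder changes per turn, so nothing here produces a merged checkpoint the way mergekit does.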



