https://github.com/cg123/mergekit
The dev's blog is great: https://goddard.blog/posts/
...But it's not what this paper is describing. They are basically alternating between models, AFAIK. Also, I have other nitpicks with the paper, like using extremely old/mediocre chat models as bases:
> Pygmillion 6B, Vicuna 13B, Chai Model 6B