
Mistral Small explicitly has the inference cost of a 12.9B model, but more than that, it's probably run with a batch size of 32 or higher. They'll worry more about offsetting training costs than about this.

Here's how it works in reality:

https://docs.mystic.ai/docs/mistral-ai-7b-vllm-fast-inferenc...
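For context, the linked page serves Mistral 7B through vLLM, which batches concurrent requests into a single forward pass so the GPU cost is shared across them. A minimal sketch of that, assuming the Hugging Face model id mistralai/Mistral-7B-Instruct-v0.2 and a batch cap of 32 (both illustrative, not from the comment above):

    # Sketch of batched serving with vLLM. Model name, batch size, and
    # sampling settings are assumptions for illustration; the point is
    # that one pass through the model serves many requests at once,
    # dividing the per-request hardware cost.
    from vllm import LLM, SamplingParams

    # Cap concurrent sequences at 32 to mirror the batch size guessed above.
    llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2", max_num_seqs=32)

    sampling = SamplingParams(temperature=0.7, max_tokens=128)

    # 32 prompts processed together: total GPU time stays close to that
    # of a single request, so per-request cost drops by roughly 32x.
    prompts = [f"Summarize request #{i} in one sentence." for i in range(32)]
    outputs = llm.generate(prompts, sampling)

    for out in outputs:
        print(out.outputs[0].text)

The economics follow directly: if a batch of 32 costs about the same wall-clock time as a batch of 1, the provider's marginal cost per request is a small fraction of the naive single-request estimate.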


