MIXTRAL 8x22B MOE LLM – ALL WE KNOW: NEW MISTRAL AI OPEN-WEIGHTS RELEASE

Published: 09 November 2024
on channel: Rithesh Sreenivasan

If you would like to support me financially (it is totally optional and voluntary), you can buy me a coffee here: https://www.buymeacoffee.com/rithesh

Mixtral 8x22B - things we know so far 🫡

* ~176B parameters (naively 8 experts × 22B)
* Performance between GPT-4 and Claude Sonnet (according to their Discord)
* Same or similar tokeniser as Mistral 7B
* 65,536-token sequence length
* 8 experts, 2 active per token (see the routing sketch after this list)
* Would require ~260 GB of VRAM in fp16, ~73 GB with bitsandbytes quantisation (see the estimate after this list)
* Uses RoPE (rotary position embeddings)
* 32,000 vocab size
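
The "2 active per token" line is the core MoE mechanic: a small router scores all 8 experts for each token, keeps the top 2, and mixes their outputs. Here is a minimal PyTorch sketch of that routing pattern; the toy hidden size, expert shape, and class names are illustrative assumptions, not Mistral's actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_EXPERTS = 8   # from the spec list above
TOP_K = 2         # 2 experts active per token
HIDDEN = 64       # toy hidden size for the sketch; the real model is far larger

class MoELayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.gate = nn.Linear(HIDDEN, NUM_EXPERTS, bias=False)  # the router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(HIDDEN, 4 * HIDDEN), nn.SiLU(),
                          nn.Linear(4 * HIDDEN, HIDDEN))
            for _ in range(NUM_EXPERTS)
        )

    def forward(self, x):  # x: (tokens, HIDDEN)
        logits = self.gate(x)                       # (tokens, NUM_EXPERTS)
        weights, idx = logits.topk(TOP_K, dim=-1)   # keep the 2 best experts per token
        weights = F.softmax(weights, dim=-1)        # renormalise over just those 2
        out = torch.zeros_like(x)
        for k in range(TOP_K):
            for e in range(NUM_EXPERTS):
                mask = idx[:, k] == e               # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * self.experts[e](x[mask])
        return out

tokens = torch.randn(5, HIDDEN)
print(MoELayer()(tokens).shape)  # torch.Size([5, 64])

The VRAM figures are back-of-envelope arithmetic: bytes per parameter times parameter count. Note that the ~260 GB fp16 figure implies roughly 130B effective parameters, below the naive 8 × 22B = 176B, since the experts share the attention weights. A tiny estimator, with the bytes-per-parameter values as assumptions:

def vram_gb(params_billion: float, bytes_per_param: float) -> float:
    # fp16 is 2 bytes/param; 4-bit bitsandbytes is ~0.5 bytes/param plus overhead
    return params_billion * bytes_per_param

print(vram_gb(130, 2.0))   # fp16  -> 260 GB, matching the figure above
print(vram_gb(130, 0.56))  # ~4-bit bnb -> ~73 GB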
https://huggingface.co/mistral-commun...
https://twitter.com/MistralAI/status/1778020589225091453
https://www.linkedin.com/posts/philip...
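
To try the tokeniser (claimed above to match Mistral 7B's, with a 32,000-token vocab), here is a hedged sketch using Hugging Face transformers. The repo id is an assumption completing the truncated mistral-community link above, so treat it as illustrative:

from transformers import AutoTokenizer

# Repo id is an assumption based on the truncated link above
tok = AutoTokenizer.from_pretrained("mistral-community/Mixtral-8x22B-v0.1")
print(tok.vocab_size)                  # expected: 32000, per the spec list
print(tok("Hello Mixtral").input_ids)  # token ids for a sample string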
If you like such content, please subscribe to the channel here:
https://www.youtube.com/c/RitheshSree...