
Mistral Large disappoints

Mistral announced Mistral Large, a new language model that achieves 81.2% accuracy on MMLU and trails GPT-4 Turbo by about 5 percentage points on benchmarks. Community reception has been mixed, with skepticism about open sourcing and claims that Mistral Small outperforms the open Mixtral 8x7B. Discussions in the TheBloke Discord covered how Mistral Large compares with GPT-4 Turbo on performance and cost-efficiency, technical problems when training with DeepSpeed and DPOTrainer, progress on deceptive behavior for roleplay characters using DreamGen Opus V1, and the complexities of merging models with linear interpolation and PEFT methods. Members also expressed enthusiasm for AI-assisted decompilation, emphasizing open-source projects as a source of training data.
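For readers unfamiliar with the merging technique mentioned above, here is a minimal sketch of linear interpolation between two checkpoints. It is an illustration only, not the method discussed in the Discord: the function name `lerp_merge`, the `alpha` parameter, and the use of plain Python lists in place of real tensors are all assumptions made for this example.

```python
def lerp_merge(state_a, state_b, alpha=0.5):
    """Blend two checkpoints parameter by parameter.

    Each merged value is (1 - alpha) * a + alpha * b, so alpha=0 returns
    model A's weights and alpha=1 returns model B's.
    Hypothetical helper: checkpoints are dicts mapping a parameter name
    to a flat list of floats (a stand-in for real tensors).
    """
    if state_a.keys() != state_b.keys():
        raise ValueError("checkpoints must have identical parameter names")

    merged = {}
    for name, weights_a in state_a.items():
        weights_b = state_b[name]
        if len(weights_a) != len(weights_b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise weighted average of the two parameter vectors.
        merged[name] = [
            (1 - alpha) * a + alpha * b
            for a, b in zip(weights_a, weights_b)
        ]
    return merged


# Toy checkpoints with a single two-element parameter.
model_a = {"layer.weight": [0.0, 2.0]}
model_b = {"layer.weight": [4.0, 6.0]}
merged = lerp_merge(model_a, model_b, alpha=0.25)
```

The sketch requires both checkpoints to share the same parameter names and shapes, which is one reason merging models with different architectures or adapter layouts is harder than a plain weighted average.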
