Test-Time Training, MobileLLM, Lilian Weng on Hallucination (Plus: Turbopuffer)
Lilian Weng released a comprehensive literature review on hallucination detection and anti-hallucination methods including techniques like FactualityPrompt, SelfCheckGPT, and WebGPT. Facebook AI Research (FAIR) published MobileLLM, a sub-billion parameter on-device language model architecture achieving performance comparable to llama-2-7b with innovations like thin and deep models and shared weights. A new RNN-based LLM architecture with expressive hidden states was introduced, replacing attention mechanisms and scaling better than Mamba and Transformer models for long-context modeling. Additionally, Tsinghua University open sourced CodeGeeX4-ALL-9B, a multilingual code generation model excelling in code assistance.