HippoRAG: First, do know(ledge) Graph
Alibaba released the open-source Qwen2 models, ranging from 0.5B to 72B parameters, with SOTA results on benchmarks like MMLU and HumanEval. Researchers used Sparse Autoencoders to interpret GPT-4's neural activity and extract more interpretable features. The HippoRAG paper proposes a hippocampus-inspired retrieval augmentation method that uses knowledge graphs and Personalized PageRank for efficient multi-hop reasoning; its *"Single-Step, Multi-Hop retrieval"* is highlighted as a key advancement in retrieval speed and cost. New techniques like Stepwise Internalization enable implicit chain-of-thought reasoning in LLMs, enhancing accuracy and speed. The Buffer of Thoughts (BoT) method improves reasoning efficiency with significant cost reduction. A scalable MatMul-free LLM architecture, competitive with SOTA Transformers at billion-parameter scale, was also presented.
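
To make the HippoRAG idea above concrete, here is a minimal sketch of Personalized PageRank over a toy knowledge graph. It is not the paper's implementation: the entity names, passages, damping factor, and the rule of scoring a passage by summing its entities' scores are all hypothetical simplifications chosen for illustration. The query is assumed to mention "Stanford" and "Alzheimer's", and no single passage mentions both.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration Personalized PageRank on an undirected graph.

    Assumes every seed appears in at least one edge, so the restart
    distribution sums to 1.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(neighbors)

    # Restart distribution: uniform over the query's seed entities.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(restart)

    for _ in range(iterations):
        # With probability (1 - damping) the walk jumps back to a seed.
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        # Otherwise it moves to a uniformly chosen neighbor.
        for n in nodes:
            share = damping * scores[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        scores = nxt
    return scores


# Hypothetical toy knowledge graph (entity-to-entity edges).
edges = [
    ("Stanford", "Prof. Thomas"),
    ("Prof. Thomas", "Alzheimer's"),
    ("Stanford", "Prof. Lee"),
    ("Prof. Lee", "robotics"),
    ("Alzheimer's", "Prof. Kim"),
    ("Prof. Kim", "MIT"),
]

# Hypothetical passages, each indexed by the entities it mentions.
passages = {
    "thomas_bio": {"Prof. Thomas", "Stanford"},
    "thomas_research": {"Prof. Thomas", "Alzheimer's"},
    "lee_bio": {"Prof. Lee", "Stanford"},
    "kim_bio": {"Prof. Kim", "MIT"},
}

# Entities assumed to be extracted from the query.
seeds = {"Stanford", "Alzheimer's"}

scores = personalized_pagerank(edges, seeds)

# Simplified passage score: sum of the PPR mass of its entities.
passage_scores = {
    pid: sum(scores[e] for e in ents) for pid, ents in passages.items()
}
ranked = sorted(passage_scores, key=passage_scores.get, reverse=True)

for pid in ranked:
    print(pid, round(passage_scores[pid], 3))
```

Because probability mass flows from both seed entities into "Prof. Thomas", the two passages about that entity rise to the top in a single graph pass, without an iterative retrieve-then-reason loop. That is the intuition behind "Single-Step, Multi-Hop retrieval" in this sketch.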