Mistral 7B: An Efficient Open-Source Large Language Model
Mistral 7B is a powerful 7B-parameter open-source language model that outperforms Llama 2 13B on most benchmarks. It combines grouped-query attention (GQA), sliding window attention (SWA), and pre-fill key-value cache optimizations to strike an exceptional balance between performance and efficiency. Mistral 7B was trained using roughly half the FLOPs of Llama 2 13B.
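To make the sliding window attention idea concrete: each token attends only to the most recent `window` tokens (plus itself), which caps per-layer KV-cache size at the window length regardless of sequence length. Below is a minimal, dependency-free sketch of the attention mask only, not a full attention implementation; the function name and parameters are illustrative, not from the Mistral codebase.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Causal sliding-window mask: position i may attend to positions j
    with i - window < j <= i, so each row has at most `window` True entries."""
    return [[(j <= i) and (j > i - window) for j in range(seq_len)]
            for i in range(seq_len)]

mask = sliding_window_mask(6, 3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# x.....
# xx....
# xxx...
# .xxx..
# ..xxx.
# ...xxx
```

Because information can still propagate one window per layer, a deep stack of such layers gives an effective attention span far larger than the window itself.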