The Llama 3 Herd of Models

The Llama 3 Model Family

📅 2024-07-23 · 👤 Meta AI · 📄 arXiv: 2407.21783

LLaMA · Open source · Foundation model · 128K context · Instruction tuning

Abstract

Llama 3 is Meta's third-generation open-source language model family, available in 8B, 70B, and 405B parameter versions. Trained on more than 15 trillion tokens, it achieves state-of-the-art performance across a broad range of tasks, including language understanding, reasoning, and coding. Architecturally, Llama 3 uses grouped-query attention (GQA) and the SwiGLU activation function, and it supports a 128K-token context window. The instruction-tuned variant, Llama 3-Instruct, performs well on dialogue and instruction following.
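
To make the link between grouped-query attention and the 128K context window concrete, here is a minimal sketch of how query heads share key/value heads under GQA and how that shrinks the key/value cache. The function names and the layer, head, and dimension counts are illustrative assumptions chosen for the arithmetic; they are not taken from this page.

```python
def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Return the index of the key/value head that a query head reads from.

    Under grouped-query attention, consecutive query heads are grouped and
    every head in a group shares one key/value head.
    """
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Size of the key/value cache for one sequence, in bytes.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_value=2 assumes 16-bit storage (an assumption).
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


if __name__ == "__main__":
    # Hypothetical configuration, for illustration only.
    n_layers, n_q_heads, n_kv_heads, head_dim = 32, 32, 8, 128
    seq_len = 128 * 1024  # a 128K-token context

    # Which key/value head each query head uses.
    mapping = [kv_head_for_query_head(h, n_q_heads, n_kv_heads)
               for h in range(n_q_heads)]
    print("query head -> kv head:", mapping)

    gqa = kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len)
    mha = kv_cache_bytes(n_layers, n_q_heads, head_dim, seq_len)
    print(f"GQA cache: {gqa / 2**30:.0f} GiB")
    print(f"Full multi-head cache: {mha / 2**30:.0f} GiB")
```

With these assumed numbers, sharing each key/value head across four query heads cuts the cache to a quarter of what one key/value head per query head would need, which is the main reason GQA matters for long-context inference.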
