Llama 2 is open source and available for commercial use: comparable to ChatGPT and pushing AI to new heights

2024-07-08

[Large Model] Llama 2 is here: commercially usable and more powerful

Introduction to LLaMA2

July 19, 2023: Meta released Llama 2, an open-source model licensed for commercial use.

Llama 2 is a collection of pre-trained and fine-tuned generative text models ranging in size from 7 billion to 70 billion parameters.

The fine-tuned LLMs, called Llama-2-Chat, are optimized for conversational use cases. According to Meta, the Llama-2-Chat models outperform open-source chat models on most of the benchmarks it tested and are on par with some popular closed-source models, such as ChatGPT and PaLM, in human evaluations of helpfulness and safety.

Llama-2-Chat is one of the very few open-source models trained with reinforcement learning from human feedback (RLHF). After five rounds of RLHF, Llama-2-Chat outperformed ChatGPT when judged by Meta's own reward models and by GPT-4.

Model List

Llama2-chat:

Llama2-chat-7B

Llama2-chat-13B

Llama2-chat-70B

For other models, see:
https://huggingface.co/meta-llama
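
As a rough sketch of how the chat models listed above might be loaded, the snippet below maps each size to a Hugging Face repository ID and loads it with the transformers library. The repository IDs and the helper names (CHAT_REPOS, chat_repo_id, load_chat_model) are assumptions that do not appear in this article; check them against the meta-llama page linked above. The repositories are gated, so access has to be requested and the license accepted before a download will succeed.

```python
# Assumed repository IDs under the meta-llama organization on Hugging Face.
# They are not given in this article; verify them on the page linked above.
CHAT_REPOS = {
    "7B": "meta-llama/Llama-2-7b-chat-hf",
    "13B": "meta-llama/Llama-2-13b-chat-hf",
    "70B": "meta-llama/Llama-2-70b-chat-hf",
}


def chat_repo_id(size: str) -> str:
    """Return the (assumed) repository ID for a chat model size such as '7B'."""
    key = size.upper()
    if key not in CHAT_REPOS:
        raise ValueError(f"unknown size {size!r}; expected one of {sorted(CHAT_REPOS)}")
    return CHAT_REPOS[key]


def load_chat_model(size: str):
    """Load tokenizer and model; requires transformers and approved access."""
    # Imported lazily so the ID lookup above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = chat_repo_id(size)
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model
```

Calling load_chat_model("7B") downloads the weights on first use, so the smallest model is the sensible one to try first.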

Training Data

Llama 2 was pre-trained on a dataset of over 2 trillion tokens.
The fine-tuning data includes publicly available instruction datasets, as well as over 1 million new human-annotated examples.
The pre-training data has a cutoff of September 2022.
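
To put the token count in perspective, it can be combined with the 4M-token global batch size reported under Training Information to estimate the number of optimizer steps in pre-training. This is only a back-of-the-envelope sketch: it assumes "4M" means exactly 4,000,000 tokens, treats 2 trillion as a lower bound, and the function name is hypothetical.

```python
def estimated_steps(total_tokens: int, batch_tokens: int) -> int:
    """Number of full batches needed to consume total_tokens once."""
    if batch_tokens <= 0:
        raise ValueError("batch_tokens must be positive")
    return total_tokens // batch_tokens


TOTAL_TOKENS = 2_000_000_000_000  # "over 2 trillion" tokens, used as a lower bound
GLOBAL_BATCH_TOKENS = 4_000_000   # assumption: 4M read as 4 * 10**6 tokens

print(estimated_steps(TOTAL_TOKENS, GLOBAL_BATCH_TOKENS))  # prints 500000
```

Under these assumptions, pre-training amounts to roughly half a million steps over the data.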

Training Information

All models are trained with a global batch size of 4M tokens.
The largest model, with 70 billion parameters, uses Grouped-Query Attention (GQA) to improve inference scalability.
Training ran from January 2023 to July 2023.
Llama 2 is a text-only model: it takes text as input and generates text as output.
Pre-training used a cumulative 3.3 million GPU hours on A100-80GB hardware.
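
The Grouped-Query Attention mentioned above can be illustrated with a minimal pure-Python sketch. In GQA, several query heads share one key/value head, so the key/value cache kept during inference shrinks by the group size. The code is an illustration of the general technique, not Meta's implementation; all function names are hypothetical, and the layer count, head counts, head dimension and sequence length in the usage lines are illustrative values that are not taken from this article.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def grouped_query_attention(queries, keys, values):
    """Causal attention where groups of query heads share one K/V head.

    queries: [n_q_heads][seq_len][head_dim]
    keys, values: [n_kv_heads][seq_len][head_dim]
    """
    n_q_heads, n_kv_heads = len(queries), len(keys)
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    head_dim = len(queries[0][0])
    scale = 1.0 / math.sqrt(head_dim)

    outputs = []
    for h, q_head in enumerate(queries):
        kv = h // group_size  # index of the K/V head shared by this group
        k_head, v_head = keys[kv], values[kv]
        head_out = []
        for t, q in enumerate(q_head):
            # Causal mask: position t attends only to positions 0..t.
            scores = [
                scale * sum(a * b for a, b in zip(q, k_head[s]))
                for s in range(t + 1)
            ]
            weights = softmax(scores)
            head_out.append([
                sum(w * v_head[s][j] for s, w in enumerate(weights))
                for j in range(head_dim)
            ])
        outputs.append(head_out)
    return outputs


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Size of the K/V cache for one sequence; the factor 2 covers keys and values."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative configuration (assumed, not from this article):
# 80 layers, head_dim 128, 4096 tokens, 16-bit values.
full = kv_cache_bytes(80, 64, 128, 4096)    # one K/V head per query head
grouped = kv_cache_bytes(80, 8, 128, 4096)  # 8 shared K/V heads
print(full // grouped)  # prints 8
```

With 64 query heads sharing 8 key/value heads, the cache is 8 times smaller than with one key/value head per query head, which is where the inference scalability benefit comes from.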