Technology Sharing

LLaMA2 model is open source for commercial use: comparable to ChatGPT in strength, exploring new heights of AI

2024-07-08

한어Русский языкEnglishFrançaisIndonesianSanskrit日本語DeutschPortuguêsΕλληνικάespañolItalianoSuomalainenLatina

[Large Model] Commercially available and more powerful LLaMA2 is here

Introduction to LLaMA2

July 19, 2023: Meta releases the open source commercial model Llama 2.

Llama 2 is a collection of pre-trained and fine-tuned generative text models ranging in size from 7 billion to 70 billion parameters.

The fine-tuned LLMs, called Llama-2-Chat, are optimized for conversational use cases. The Llama-2-Chat model outperforms open-source chat models on most benchmarks we tested and is on par with some popular closed-source models such as ChatGPT and PaLM in human evaluations of usefulness and security.

LLaMA-2-chat is almost the only open source model that has done RLHF. After 5 rounds of RLHF, LLaMA-2 showed better performance than ChatGPT under the evaluation of Meta's own reward model and GPT-4.

paper

https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/

GitHub

address:
https://github.com/facebookresearch/llama

huggingface

address:
https://huggingface.co/meta-llama

Model List

Llama2-chat:

Llama2-chat-7B

Llama2-chat-13B

Llama2-chat-70B

Other models please see:
https://huggingface.co/meta-llama

Training Data

  1. Trained on a dataset of over 2 trillion tokens.
  2. The fine-tuning data includes a publicly available instruction dataset, as well as over 1 million new human-annotated examples.
  3. The deadline for pre-training data is September 2022

Training Information

  1. All models are trained with a global batch size of 4M tokens.
  2. The larger 70 billion parameter model uses Grouped-Query Attention (GQA) to improve inference scalability.
  3. The training period is from January 2023 to July 2023.
  4. Is a plain text model.
  5. During pre-training, 330,000 GPU hours were spent on A100-80GB.

Model Information

The context length is 4K.

license

Free for commercial use

Registration application required

refer to

https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/

https://github.com/facebookresearch/llama

https://huggingface.co/meta-llama

Llama2-chat-7B

Llama2-chat-13B

Llama2-chat-70B