On 18 July 2023, Meta AI open-sourced Llama-2, the successor to their earlier LLaMA model. The internet is going gaga over it. It is a next-generation family of open-source large language models that outperforms many other open LLMs, and it is free for both research and commercial use.
There are three variants of this LLM:
- Llama-2-7B
- Llama-2-13B
- Llama-2-70B
These models are trained on massive datasets of text and code, which allows them to learn the general principles of language and generate natural-sounding text. The computing power required to train them is also significant – far beyond what most businesses and individuals can access. However, because these models are available as open source, businesses and individuals can now benefit from the power of LLMs without having to invest in training infrastructure themselves. This opens up a world of opportunities for businesses, startups, entrepreneurs, and researchers to experiment, innovate, and ultimately benefit economically and socially.
For example, businesses can use LLMs to improve customer service, develop new products, or generate creative content. Startups can use LLMs to accelerate their product development and gain a competitive edge. Entrepreneurs can use LLMs to develop new business ideas and generate leads. And researchers can use LLMs to conduct new research and generate new insights.
Architecture and Model of Llama-2
Here is an overview of how Llama-2 is trained. The pipeline consists of three steps:
- Pretraining
- Human feedback
- Fine-tuning
Pretraining
The model is first trained on a large corpus of publicly available text and code. From this data it learns the general principles of language. The pretraining data has a cutoff of September 2022, while some of the tuning data used in later stages is as recent as July 2023, which keeps the model relatively up to date.
The above plot shows the training loss of the Llama-2 models.
Human Feedback
In the second step, human annotators compare pairs of model outputs, and their preferences are used to train a reward model that further guides training.
Fine-Tuning
The model is then fine-tuned on dialogue data, which makes its responses to human prompts more natural.
The above plots show the impact of RLHF on temperature when sampling N outputs and scoring them with a reward model.
Implementation of Llama-2-13B on E2E Cloud
- Create an Ubuntu 22.04 GPU node on E2E.
- Select a 40 GB GPU machine.
- Check Enable Backup and hit Create.
- The node will now be created with the specifications you selected.
- Login to the E2E server using ssh via terminal:
- After you are logged in, cd into /mnt:
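The login step can be sketched as follows; the IP address is a placeholder for your own node's public address (shown in the E2E dashboard), and the username may differ on your node:

```shell
# SSH into the E2E GPU node (replace with your node's actual IP)
ssh root@<your-node-ip>

# Move to the /mnt volume, which has room for large model weights
cd /mnt
```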
- Clone this repository from Hugging Face:
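The article does not name the exact repository; as an assumed example, a GPTQ-quantized Llama-2-13B chat checkpoint (such as one from the TheBloke namespace on Hugging Face) is one common choice:

```shell
# Git LFS is needed because the model weights are stored as large files
git lfs install

# Hypothetical example repository -- substitute the repo the article intends
git clone https://huggingface.co/TheBloke/Llama-2-13B-chat-GPTQ
```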
- Install GPTQ:
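One way to install GPTQ support is via the AutoGPTQ package; the exact packages and versions are assumptions and may differ from what the author used:

```shell
# Install PyTorch, Transformers, and the AutoGPTQ quantization library
pip install torch transformers auto-gptq
```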
- Make a new file ‘script.py’:
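A minimal sketch of what script.py might contain, assuming the AutoGPTQ library and a quantized checkpoint cloned into /mnt. The model path, prompt, and generation parameters below are assumptions for illustration, not the article's exact script:

```python
# script.py -- minimal sketch: run a GPTQ-quantized Llama-2-13B chat model.
import torch
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Assumed path: the folder created by the git clone step above
MODEL_DIR = "/mnt/Llama-2-13B-chat-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoGPTQForCausalLM.from_quantized(MODEL_DIR, device="cuda:0")

# Llama-2 chat models expect the [INST] ... [/INST] prompt template
prompt = "<s>[INST] Explain what a large language model is. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")

# Sample a response; temperature and token budget are illustrative values
output_ids = model.generate(
    **inputs, max_new_tokens=256, do_sample=True, temperature=0.7
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Run it with `python script.py`; the first run will take a while as the quantized weights are loaded onto the GPU.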
- Keep changing the prompts and re-running the script to explore the model's responses.
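To try several prompts in one go, you can loop over a list instead of editing the script each time; the prompts below are arbitrary examples, and the template wrapping mirrors the Llama-2 chat format:

```python
# Hypothetical prompt variations -- re-run generation for each one
prompts = [
    "Write a haiku about GPUs.",
    "Explain quantization in one paragraph.",
    "Suggest three names for an AI startup.",
]

# Wrap each message in the Llama-2 chat template before generation
formatted = [f"<s>[INST] {p} [/INST]" for p in prompts]
for f in formatted:
    print(f)
```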
Conclusion
Meta trained the model on publicly available data, and this article has demonstrated how to run it. You can use it for your own private research, or for business or academic purposes. E2E Networks is a user-friendly platform for running such models and obtaining results.