Five months ago, Mistral AI published their strategic memo (see here), stating their intention to become a key player in the generative AI industry, with an emphasis on innovation, ethical deployment, and the ambition to rival established companies like OpenAI. Today, their LLM, Mistral 7B, stands among the top-ranked 7-billion-parameter models; it even outperforms Meta’s LLaMA 2 13B language model. So what’s all the hype about? Why is it so powerful? We will dive deeper in this post, where I have also illustrated a step-by-step guide on how to run it on E2E Networks’ TIR-AI platform.
E2E Networks is India’s largest hyperscaler, equipped with state-of-the-art NVIDIA GPUs. They offer virtual machines for heavy AI/ML workloads.
What Sets Mistral 7B Apart?
Mistral 7B is a lightweight large language model that punches well above its weight. It surpasses Meta’s LLaMA 2 13B on text-generation benchmarks and LLaMA 34B in mathematical and code generation, and it even approaches the coding performance of CodeLlama 7B.
A high-level overview of its architecture
It leverages Grouped Query Attention (GQA) and Sliding Window Attention (SWA). GQA increases inference speed and reduces memory requirements during decoding, allowing larger batch sizes and hence higher throughput; this helps in real-time applications such as conversational agents. SWA lets each layer attend only to a fixed window of previous tokens (4,096 in Mistral 7B), so longer sequences can be handled at a lower computational cost. In layman’s terms, these mechanisms reduce both the compute and the time needed for inference.
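Both mechanisms show up in the model’s published configuration. As a quick, minimal check (assuming the Hugging Face transformers library and access to the mistralai/Mistral-7B-v0.1 checkpoint), we can inspect the relevant fields without downloading the weights:

```python
from transformers import AutoConfig

# Fetch only the model configuration (no weights) for Mistral 7B
cfg = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")

# Grouped Query Attention: 32 query heads share a smaller set of key/value heads
print(cfg.num_attention_heads, cfg.num_key_value_heads)

# Sliding Window Attention: each token attends to at most this many previous tokens
print(cfg.sliding_window)
```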
Getting Started on E2E Networks’ TIR-AI
The TIR-AI platform on E2E Networks provides an interactive JupyterLab interface. What makes it unique is that it comes with various Python frameworks pre-installed in the notebook images, so you can pick one that matches your use case. Let’s get started:
Log in to My Account and you will be directed to the dashboard.
Click the TIR-AI Platform button to open the TIR-AI platform.
Click CREATE NOTEBOOK, and we will get the option to create a new Jupyter notebook or import one from an existing GitHub repository.
We can name the notebook anything of our choice. In this particular case the default name is ‘tir-notebook-8177’.
Then, most importantly, we select the framework depending on our use case. For running Mistral 7B, we will choose PyTorch 2.
These are the various images available on TIR-AI.
Next, we choose the GPU/CPU Plan we wish to go with.
The above are the CPU Plans. The GPU Plans are listed below:
I went with the paid version of GDC.A100-16.115GB.
We also have the option to add our local system’s SSH keys.
Once all the options are selected, we can create the notebook. When the notebook is running, click on the three dots and select Launch Notebook.
Now, the notebook is in the running state.
Then we will be directed to the JupyterLab interface:
Click on Python 3 under Notebook. With that, the JupyterLab setup is complete; now let us get into the code.
Problem Statement
We have already discussed the potential of Mistral 7B. Now let us see what kind of problem it can solve. I took a custom Sales-KRA dataset from here. Sales executives who are new to the field often struggle with questions like how to approach customers, how to generate leads, and so on. In this step-by-step guide, we will walk through how to address this by fine-tuning Mistral 7B on the custom data.
Code Snippets
Step 1: Install all the dependencies
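A minimal sketch of the installation cell, assuming the standard Hugging Face fine-tuning stack (the exact packages and versions in the notebook may differ):

```python
# Core libraries for loading, quantizing, and fine-tuning Mistral 7B (assumed package list)
!pip install -q torch transformers datasets accelerate peft bitsandbytes
```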
Press Shift + Enter
To download the Sales-KRA data, run this command:
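The command depends on where the dataset is hosted; the URL and file name below are hypothetical placeholders for the actual Sales-KRA link:

```python
# Hypothetical placeholder: substitute the real Sales-KRA dataset URL and file name
!wget -O sales_kra.txt "<sales-kra-dataset-url>"
```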
Press Shift + Enter
Step 2: Data Preparation
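The exact preparation code depends on how the raw Sales-KRA file is laid out. Here is a minimal sketch of the idea, assuming one question/answer pair per line (the delimiter, field names, and train/validation split are assumptions); it converts each record into a dictionary and saves the splits to JSON files:

```python
import json

# Assumed raw format: one "question ||| answer" pair per line in sales_kra.txt
records = []
with open("sales_kra.txt", "r", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        question, _, answer = line.partition("|||")
        records.append({"input": question.strip(), "output": answer.strip()})

# Simple 90/10 train/validation split
split = int(0.9 * len(records))
train_records, val_records = records[:split], records[split:]

# Each file holds a list of dictionaries that the training code can load later
with open("train.json", "w", encoding="utf-8") as f:
    json.dump(train_records, f, indent=2)
with open("val.json", "w", encoding="utf-8") as f:
    json.dump(val_records, f, indent=2)
```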
Press Shift + Enter
The above code snippet prepares the data as dictionaries saved to files that can be loaded later for training.
Step 3: Load Model and Dataset
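A sketch of what this step looks like with the Hugging Face stack: the JSON splits from Step 2 are loaded, and the Mistral 7B base model is loaded in 4-bit precision so it fits comfortably in single-GPU memory. The model ID, quantization settings, and file names are assumptions rather than the notebook’s exact values:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_model_id = "mistralai/Mistral-7B-v0.1"

# Load the train/validation splits prepared in Step 2
train_dataset = load_dataset("json", data_files="train.json", split="train")
val_dataset = load_dataset("json", data_files="val.json", split="train")

# 4-bit quantization keeps the 7B model within a single-GPU memory budget
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model_id, quantization_config=bnb_config, device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
tokenizer.pad_token = tokenizer.eos_token  # Mistral's tokenizer has no pad token by default
```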
Press Shift + Enter
The next cells tokenize the training and validation splits; run each one with Shift + Enter and check its output.
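A sketch of that tokenization, assuming a simple instruction-style prompt template built from the Sales-KRA fields (the template, field names, and maximum length are assumptions):

```python
max_length = 512  # assumed cap on tokenized prompt length

def generate_and_tokenize_prompt(example):
    # Assumed prompt template for the Sales-KRA question/answer pairs
    prompt = (
        "### Question:\n" + example["input"] + "\n\n"
        "### Answer:\n" + example["output"]
    )
    tokenized = tokenizer(
        prompt,
        truncation=True,
        max_length=max_length,
        padding="max_length",
    )
    # For causal LM fine-tuning, the labels are the input tokens themselves
    tokenized["labels"] = tokenized["input_ids"].copy()
    return tokenized

tokenized_train_dataset = train_dataset.map(generate_and_tokenize_prompt)
tokenized_val_dataset = val_dataset.map(generate_and_tokenize_prompt)
```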
The notebook then inspects the distribution of tokenized sequence lengths with a small helper:

plot_data_lengths(tokenized_train_dataset, tokenized_val_dataset)
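A purely illustrative sketch of such a helper, using matplotlib:

```python
import matplotlib.pyplot as plt

def plot_data_lengths(tokenized_train_dataset, tokenized_val_dataset):
    # Histogram of tokenized example lengths across both splits
    lengths = [len(x["input_ids"]) for x in tokenized_train_dataset]
    lengths += [len(x["input_ids"]) for x in tokenized_val_dataset]
    plt.hist(lengths, bins=30)
    plt.xlabel("Number of tokens per example")
    plt.ylabel("Number of examples")
    plt.title("Distribution of tokenized lengths")
    plt.show()
```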
Continue running the remaining cells in this step with Shift + Enter, checking the output of each one before moving on.
Step 4: Training the model
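On a single GPU, a 7B model is typically fine-tuned with a parameter-efficient method rather than full fine-tuning. Below is a sketch of a LoRA setup plus a Hugging Face Trainer loop; all hyperparameters, target modules, and output paths are assumptions rather than the notebook’s exact values:

```python
import transformers
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Make the 4-bit quantized base model ready for adapter training
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
)
model = get_peft_model(model, lora_config)

trainer = transformers.Trainer(
    model=model,
    train_dataset=tokenized_train_dataset,
    eval_dataset=tokenized_val_dataset,
    args=transformers.TrainingArguments(
        output_dir="./mistral-sales-kra",  # assumed output directory
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=500,
        learning_rate=2.5e-5,
        logging_steps=25,
        save_steps=100,
        optim="paged_adamw_8bit",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

model.config.use_cache = False  # avoid cache warnings while training
trainer.train()
```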
Press Shift + Enter
Run the remaining cells in this step with Shift + Enter and wait for training to complete.
Testing
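Finally, we can query the fine-tuned model with a sales-related question. A sketch of the testing step, reusing the prompt template assumed in Step 3 (the example question is illustrative):

```python
import torch

# Re-enable the KV cache for faster generation now that training is done
model.config.use_cache = True
model.eval()

prompt = "### Question:\nHow should a new sales executive approach lead generation?\n\n### Answer:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```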
Press Shift + Enter
Final Thoughts
Mistral 7B is a lightweight open-source language model with a lot of potential in the field of generative AI, be it code generation or content creation. How is it different from other models, you might ask? It requires less compute power and delivers faster inference than comparably capable larger models.