Introduction
Recently, researchers from Chennai introduced the ‘World’s First Autonomous AI University Professor’, known as the ‘Malar Teacher’. This AI professor has been built with an understanding of the entire engineering syllabus of Anna University.
The most interesting fact about ‘Malar Teacher’ is that it is accessible through WhatsApp. Since it has access to all the recommended study materials, it can teach any concept from the syllabus, and simplify topics to a 10-year-old's comprehension level.
Leveraging WhatsApp, a widely used messaging application in India, was a masterstroke: it greatly increased the accessibility of this AI and demonstrated how AI can create a positive impact in India.
In this article, we will demonstrate how to build a similar AI Professor Chatbot using purely open-source technologies and an open LLM - Mistral-7B. This can be highly relevant for future EdTech startups, universities, coaching institutes, and other businesses operating in the education sector.
Since access to the WhatsApp API is only available through third-party partners, we will use Telegram for demonstration purposes instead. However, the core methodology, especially the AI application architecture, would be very similar in the case of WhatsApp; you would just need to replace the send/receive APIs.
Let’s get started.
Chatbot Workflow
We’ll be using the Retrieval Augmented Generation (RAG) technique for our chatbot. It’s a method of “grounding” an LLM’s response by connecting it to an external knowledge source. This is extremely useful when we want our applications to read and understand documents. By sending contextually relevant excerpts from the documents to the LLM along with the user’s question, one can query the documents in simple natural language.
Below is a diagrammatic workflow for our chatbot - it will help in understanding the tech under the hood.
Components used:
- Mistral-7B LLM hosted on Ollama
- Chromadb Vector Store
- LangChain for document ingestion and retrieval
- Telegram API & Python SDK
Let’s Get to the Code
For hosting the LLM, we’ll need a GPU server to handle the AI workload. E2E Networks provides a fleet of advanced GPUs tailored specifically for this purpose.
You can check out the GPU offerings by heading over to https://myaccount.e2enetworks.com/. Click on Compute on the left-hand side, then click on Nodes, and Add New Node.
First, install all the necessary dependencies.
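The exact package list depends on your code, but for the stack described in this article (python-telegram-bot, LangChain, ChromaDB, PDF loading, sentence embeddings, and the Ollama Python client), a plausible set would be:

```shell
pip install python-telegram-bot langchain langchain-community \
    chromadb pypdf sentence-transformers ollama
```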
Now open Telegram and search for BotFather to get an API token for the Telegram bot. BotFather is a service offered by Telegram to manage and create new bots.
With the /newbot command, you can name and create your own bot. You’ll receive an API_TOKEN for that bot.
Spin up a Jupyter notebook and import all the required libraries:
Set up the API_TOKEN, the logging info, the Ollama client, and a text splitter object.
- The Telegram Bot API uses asynchronous programming to handle many requests concurrently.
- When an `await` statement is reached, control returns to the event loop, which runs other tasks that are ready while the awaited operation completes.
- The event loop continuously checks which tasks are ready to run and schedules them accordingly; each subsequent `await` hands control back to it again.
- This ensures efficient use of resources: while one handler is waiting on I/O (for example, an LLM call), other handlers make progress.
- That is what makes asynchronous programming highly effective for Telegram bots serving thousands of users simultaneously.
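The behaviour described above can be seen in a tiny standalone example: two simulated "users" each take 0.2 seconds of waiting, yet both are served in roughly 0.2 seconds total because the event loop interleaves them.

```python
import asyncio

async def handle_user(name: str, delay: float) -> str:
    # While one handler awaits (e.g. an LLM call), the event loop
    # runs handlers for other users instead of blocking.
    await asyncio.sleep(delay)
    return f"replied to {name}"

async def main():
    # Both "users" are served concurrently: total time is ~0.2s, not 0.4s.
    return await asyncio.gather(
        handle_user("alice", 0.2),
        handle_user("bob", 0.2),
    )

print(asyncio.run(main()))  # → ['replied to alice', 'replied to bob']
```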
Refer to this to start an Ollama server and pull the Mistral-7B model on your local server.
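On a Linux server, that typically amounts to installing Ollama from its official script, starting the server, and pulling the model:

```shell
curl -fsSL https://ollama.com/install.sh | sh   # install Ollama (Linux)
ollama serve &                                  # start the server in the background
ollama pull mistral                             # download Mistral-7B
```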
Load the embeddings model to convert text into vectors:
We create a function to convert a PDF into LangChain Document objects, split into chunks of 512 tokens each, as defined by the text splitter.
We create a function which will refresh the user data (vector DB) every time the command ‘/start’ is sent. This will allow users to start over fresh.
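The `/start` handler could be sketched as below. Type annotations from python-telegram-bot (`Update`, `ContextTypes.DEFAULT_TYPE`) are omitted so the logic stands out, and the per-user layout of `context.bot_data` (one vector store keyed by user ID) is an assumption about how the state is organized.

```python
async def start(update, context):
    # Drop this user's existing vector store (if any) so they start fresh.
    user_id = update.effective_user.id
    context.bot_data[user_id] = None  # assumed layout: one store per user ID
    await update.message.reply_text(
        "Hello! Send me a PDF and I'll answer questions about it."
    )
```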
Then we create a function to handle the document processing stage. Every time the user sends in a PDF document, it is indexed into the vector DB.
Given a query, this function will create a prompt for the LLM. The prompt will contain the context fetched from the Vector DB, along with the query.
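A minimal prompt-builder sketch; the function name and prompt wording are assumptions, and `docs` stands for the chunks returned by the vector store’s similarity search (each exposing a `page_content` attribute, as LangChain documents do):

```python
def build_prompt(query, docs):
    # Concatenate the retrieved chunks into a single context section.
    context_text = "\n\n".join(d.page_content for d in docs)
    return (
        "You are a helpful university professor. Answer the question using "
        "ONLY the context below. If the answer is not in the context, say so.\n\n"
        f"Context:\n{context_text}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )
```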
Now we write a function to handle user queries. Note that all the Telegram handler functions are user-specific: each user gets their own vector DB, keyed by their user ID in `context.bot_data`. This ensures that the bot can handle concurrent requests from multiple users without mixing up the documents they provide, because every user's documents are indexed separately.
Then we create the main function to deploy the bot.
Results
I sent this arXiv paper to the bot. It’s called “Don’t Think about Pink Elephants”, and it’s about the fact that certain AI models, when prompted not to think about certain objects, invariably end up thinking about them - a flaw in their design that is very similar to how human brains work.
Voila! We have created a Virtual Professor chatbot that can read and understand scientific papers and respond to your queries. Of course, this chatbot can serve many other use cases, as the LLM it uses (Mistral-7B) is a general-purpose model and can understand queries and context in a variety of domains beyond EdTech.
Final Note
You can also implement a similar chatbot on WhatsApp, but that process is a bit more complicated, as it requires signing up for a WhatsApp Business account and then selecting a third-party service provider that offers WhatsApp API integration.