Introduction to Navarasa 2.0
Navarasa 2.0 is a fine-tuned large language model (LLM) that generates text in 15 Indian languages as well as English:
- Hindi
- Telugu
- Tamil
- Kannada
- Malayalam
- Marathi
- Gujarati
- Bengali
- Punjabi
- Odia
- Urdu
- Konkani
- Assamese
- Nepali
- Sindhi
- English
Navarasa 2.0 has the potential to streamline tasks for India’s diverse linguistic population. In this article, we will focus on its capabilities in the Telugu language.
Model Capabilities
Similar to its predecessor, this model exhibits the following capabilities:
1. Instruction and input in a native language X, with the output generated in the same language X.
2. Instruction and input in English, with the response generated in native language X.
3. Instruction in native language X, input in English, and output in native language X.
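To make the three modes concrete, here is a minimal sketch of how the instruction and input could be placed into a prompt. It assumes an Alpaca-style template: the "### Input:" and "### Response:" markers appear in the sample outputs later in this article, while the "### Instruction:" header, the blank-line layout, and the helper name build_prompt are assumptions made for illustration. The Telugu translation instruction in mode 3 is likewise an illustrative example, not taken from the model's documentation.

```python
# Assumed Alpaca-style template. The "### Input:" and "### Response:" markers
# appear in the article's sample outputs; the "### Instruction:" header and the
# exact blank-line layout are assumptions.
PROMPT_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input_text}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Fill the template; the response slot is left empty for the model."""
    return PROMPT_TEMPLATE.format(instruction=instruction, input_text=input_text)


# Mode 1: instruction (and any input) in Telugu.
telugu_only = build_prompt("భారతదేశ రాజధాని ఏమిటి?")

# Mode 2: instruction and input in English.
english_only = build_prompt("calculate 1+1")

# Mode 3: instruction in Telugu, input in English.
# The instruction means "Translate the following sentence into Telugu."
mixed = build_prompt(
    "కింది వాక్యాన్ని తెలుగులోకి అనువదించండి.",
    "how many countries are there in the world",
)

print(mixed)
```

In every mode the prompt ends at the empty response slot, so whatever the model generates after it is the answer.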
In this article, we’ll showcase the step-by-step process of deploying Navarasa 2.0 locally. Following that, we’ll delve into querying and testing its technical capabilities.
Prerequisites:
- Basic knowledge of Hugging Face.
- Ability to launch a GPU node. E2E Networks lists its GPU offerings and pricing; we recommend launching a V100 GPU node.
Guide to Deploying the Model
1. Install the necessary libraries.
2. Import the necessary dependencies.
3. Load the model and tokenizer.
4. Define your question and encode the question using the tokenizer.
5. Generate the response using the model (beam search for better results).
6. Decode the generated tokens back to text and print the answer.
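The six steps above can be sketched in a single script. This is a minimal sketch under stated assumptions, not necessarily the exact code used to produce the outputs in this article: the Hugging Face model ID, the prompt template, the half-precision setting, and the generation parameters (num_beams, max_new_tokens) are all assumptions, and the helper names and the script name navarasa_demo.py are hypothetical. Check the model ID against the Hugging Face Hub and confirm the model fits in your GPU's memory before running it.

```python
# Step 1 (run in a shell): pip install torch transformers accelerate
import sys

# Assumed model ID; verify it on the Hugging Face Hub before use.
MODEL_ID = "Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa-2.0"

# Assumed Alpaca-style prompt; the input slot is left empty for plain questions.
PROMPT = "### Instruction:\n{}\n\n### Input:\n{}\n\n### Response:\n"


def format_prompt(question: str) -> str:
    """Place the question in the instruction slot of the assumed template."""
    return PROMPT.format(question, "")


def load_model(model_id: str = MODEL_ID):
    """Steps 2 and 3: import the dependencies, then load tokenizer and model."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half precision to reduce GPU memory use
        device_map="auto",          # requires the accelerate package
    )
    model.eval()
    return tokenizer, model


def generate_answer(tokenizer, model, question, max_new_tokens=64, num_beams=4):
    """Steps 4 to 6: encode the question, run beam search, decode to text."""
    import torch

    inputs = tokenizer(format_prompt(question), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            num_beams=num_beams,   # beam search instead of greedy decoding
            early_stopping=True,
        )
    # The decoded text contains the prompt followed by the generated answer.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def main(argv):
    if len(argv) < 2:
        print('usage: python navarasa_demo.py "<question>"')
        return
    tokenizer, model = load_model()
    print(generate_answer(tokenizer, model, argv[1]))


if __name__ == "__main__":
    main(sys.argv)
```

Because the full sequence is decoded, the printed text echoes the prompt before the answer. A small max_new_tokens value cuts the answer short, so raise it if responses appear truncated.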
Output:
భారతదేశ రాజధాని ఏమిటి?
### Input:
### Response:
నూతన ఢిల్లీ అని కూడా పిలువబడే న్యూ
Inference
Question 1: how many countries are there in the world
Output:
ప్రపంచంలో ఎన్ని దేశాలు ఉన్నాయి?
### Input:
### Response:
2021 నాటికి, 195 గుర్తింపు పొందిన సార్వ
Question 2: calculate 1+1
Output:
చల్చులాతే 1 + 1
### Input:
### Response:
ఒక సంఖ్య 2
Packaging It All into a Gradio Bot
We’ll define a function that takes a user query and returns the LLM-generated response.
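One possible shape for such a function is sketched here. It assumes the decoded model output echoes the prompt and marks the answer with "### Response:", as in the sample outputs earlier. The names make_responder and generate_text are hypothetical: generate_text stands for any callable that sends a prompt string to the loaded model and returns the decoded text.

```python
from typing import Callable

# Assumed Alpaca-style prompt, matching the markers seen in the sample outputs.
PROMPT = "### Instruction:\n{}\n\n### Input:\n{}\n\n### Response:\n"


def make_responder(generate_text: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a text-generation callable into a query -> answer function.

    `generate_text` is a hypothetical hook: it receives the full prompt and
    returns the decoded model output (prompt echo included).
    """

    def respond(user_query: str) -> str:
        query = user_query.strip()
        if not query:
            return ""  # nothing to ask, so skip the model call
        decoded = generate_text(PROMPT.format(query, ""))
        # Keep only the text after the last response marker.
        return decoded.split("### Response:")[-1].strip()

    return respond


# Usage with a stand-in generator that echoes the prompt plus a fixed answer.
respond = make_responder(lambda prompt: prompt + "న్యూ ఢిల్లీ")
print(respond("భారతదేశ రాజధాని ఏమిటి?"))
```

Passing the generator in as an argument keeps the function independent of how the model was loaded, so it can be exercised with a stand-in before a GPU node is available.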
The function is then wrapped in a simple Gradio chatbot UI.
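A minimal sketch of such a UI follows, assuming Gradio's ChatInterface component. The function llm_response is a hypothetical placeholder that only echoes the query; replace its body with a call into the loaded model. The window title is also illustrative.

```python
# Install first (run in a shell): pip install gradio


def llm_response(user_query: str) -> str:
    """Hypothetical placeholder: replace the body with a call into the model."""
    return f"(model output for: {user_query})"


def chat(message, history):
    """Callback in the shape gr.ChatInterface expects.

    `message` is the latest user message and `history` holds the earlier
    turns; this sketch ignores the history and answers each query on its own.
    """
    return llm_response(message)


def build_demo():
    """Create the chat UI; gradio is imported here so the helpers above
    stay usable without it."""
    import gradio as gr

    return gr.ChatInterface(fn=chat, title="Navarasa 2.0 Telugu Chatbot")
```

Calling build_demo().launch() on the GPU node starts a local web server and prints the URL of the chat page.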
Conclusion
Navarasa 2.0, fine-tuned for Indian languages, has the potential to be a game-changer. By understanding regional languages and nuances, it could bridge the digital divide, empower local communities, and foster cultural preservation. This technology holds promise in the fields of education, content creation, and communication across the Indian subcontinent. In general, Indic language LLMs can help with the following:
- Bridging the Digital Divide: A vast portion of India’s population speaks languages other than English. LLMs in these languages can make information and services accessible by translating interfaces, generating content in local languages, and enabling voice interactions in familiar tongues.
- Education for All: LLMs can personalize learning by creating educational materials in local languages. Imagine textbooks, tutorials, and even intelligent tutoring systems that adapt to a student’s native language and learning style.
- Enhanced Communication: LLMs can remove language barriers within India. They can translate documents and real-time conversations, allowing people from different regions to connect and collaborate more effectively.
- Content Creation Boom: LLMs can empower local creators by helping them generate content in their native languages. This can lead to a surge of regional literature, music, and other creative expressions reaching wider audiences.
- Government Services & Inclusion: LLMs can simplify government interactions for citizens by enabling applications, forms, and communication in local languages. This fosters inclusivity and ensures everyone has equal access to government services.
Overall, Indic language LLMs hold the potential to significantly simplify life for India’s diverse population by bridging language gaps and empowering people to participate fully in the digital age.