Gen AI is a rapidly evolving field that promises to revolutionize many industries and domains. Gen AI platforms enable users to create novel content, such as text, images, videos, music, code, and more, using advanced deep learning models. According to a report by EY India, Gen AI has the potential to add a cumulative US$1.2-1.5 trillion to India's GDP by FY 2029-30. Its potential lies in enabling people to achieve greater creativity, effectiveness, and efficiency in their work, while powering AI applications that improve customer experience and help derive insights from data. It is no surprise, then, that over the last year Gen AI has been a top-of-mind topic for every CIO.
However, not all Gen AI platforms are created equal. Some are Closed-Source, meaning that the underlying code, architecture, model weights, and training algorithms are proprietary and not accessible to the public. Others are Open-Source, meaning that anyone can view, modify, and contribute to the model weights, code, datasets, and training algorithms.
What are the advantages and disadvantages of each approach? And what are the key questions that every CIO should ask before choosing a Gen AI platform for their organization? Here are some of the most important factors to consider:
1. Risks of Vendor Lock-In
Closed-Source AI creates situations where users become dependent on a single vendor or platform for their AI needs and have little choice but to accept its terms and conditions, pricing, and quality of service. In most enterprise use cases, this means the company has no control over the quality of the AI model itself. Fine-tuning the model for a specific use case is either unavailable or tightly constrained, and there is little scope to adapt its output based on user feedback. The company is also held hostage to the future trajectory of the proprietary platform, even when that trajectory does not align with the best interests of the enterprise.
Open-Source AI, on the other hand, offers tremendous flexibility: the organization can host it on its own cloud infrastructure, fine-tune it for its specific needs, and build AI pipelines or applications that leverage its capabilities. Over time, it can refine the models through user feedback or swap them out altogether for newer, more advanced versions. According to recent benchmarks, the leading Open-Source AI models are on par in performance and accuracy with Closed-Source ones. Most importantly, they don't lead to vendor lock-in.
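To make this concrete, here is a minimal sketch of what self-hosting an Open-Source model can look like using the Hugging Face Transformers library. The model name and prompt are illustrative assumptions; in production you would typically serve the model behind an inference server running on your own cloud GPUs.

```python
# A minimal sketch of self-hosted inference with an Open-Source model using
# Hugging Face Transformers. The model name is illustrative; any open model
# that fits your GPUs (a Llama 2 or Mistral variant, for example) can be used.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",  # assumed example model
    device_map="auto",                           # place the model on available GPUs
)

response = generator(
    "Summarize our refund policy for a customer email.",  # assumed example prompt
    max_new_tokens=200,
)
print(response[0]["generated_text"])
```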
In other words, for companies that simply want to test and try out AI models, Closed-Source AI platforms do the job. However, the moment you need to implement AI for any serious use case, it is advisable to invest the effort in building the AI pipeline in-house: the returns are far higher in the long run and the risks are much lower. On their own, Closed-Source AI models are rarely a good fit for serious, real-world enterprise use cases.
2. Ability to Handle Company- or Domain-Specific Use Cases
Another fundamental question in the Closed-Source vs Open-Source AI debate is whether it is even possible to truly harness Closed-Source AI for real-life enterprise use cases. Closed-Source AI models are trained on publicly available datasets and lack knowledge of the enterprise and its private data. This means that the only way to truly leverage them is to build AI pipelines that bring the company's knowledge base into the model's context.
Language and tone are also crucial for any enterprise, and they differ from company to company: each organization has its own distinct, specialized vocabulary. Attaining the right language and tone typically requires fine-tuning the LLM, which is impossible or tightly restricted with Closed-Source AI.
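As an illustration, a common way to do this with Open-Source models is parameter-efficient fine-tuning. The sketch below uses Hugging Face's peft library with LoRA; the model name, target modules, and hyperparameters are assumptions for illustration, not recommendations.

```python
# A hedged sketch of parameter-efficient fine-tuning (LoRA) to adapt an
# Open-Source model to a company's language and tone. Model name and
# hyperparameters are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

# Wrap the base model with small trainable LoRA adapters instead of updating
# all weights -- cheaper to train and easy to host alongside the base model.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections, typical for Llama-style models
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# From here, train the adapters on curated in-house documents or support
# transcripts (for example with the Hugging Face Trainer) so the model
# picks up the organization's terminology and tone.
```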
Secondly, in many cases, developers build AI pipelines where the company's private data is provided as context to the model (a technique called retrieval-augmented generation, or RAG). If a developer is already putting in the effort to create a RAG pipeline, it takes only a few additional steps to integrate it with an Open-Source AI model.
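Here is a minimal RAG sketch to make the idea concrete. In practice you would use a vector database and a production-grade embedding model, so treat the documents, model name, and helper function below as illustrative assumptions.

```python
# A minimal, illustrative RAG sketch: embed company documents, retrieve the
# most relevant ones for a question, and pass them as context to any LLM.
# Documents, model name, and the final generation step are placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Refunds are processed within 7 business days of approval.",
    "Enterprise customers are assigned a dedicated account manager.",
    "All customer data is stored in the Mumbai region.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small open embedding model
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the question (cosine similarity)."""
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

question = "How long do refunds take?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# The prompt can now be sent to a self-hosted Open-Source model or a
# Closed-Source API -- the retrieval step is identical either way.
```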
In the world of LLMs, frameworks like LangChain and LlamaIndex have made it a breeze to work with any language model by standardizing the interface. This has simplified the process of creating AI applications, whether they use Closed-Source or Open-Source AI.
This means that even if you built your first AI application using OpenAI APIs, you can easily switch to Llama 2 or Mistral variants. Among Open-Source models, Llama 2 and Mixtral 8x7B are currently among the best performing, and in some scenarios even smaller models might suffice. The Hugging Face leaderboards are a great resource for checking which AI models are trending for particular applications.
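The sketch below shows what such a switch can look like with LangChain. Import paths vary across LangChain releases and the model names are examples, not recommendations; the point is that both options sit behind the same interface.

```python
# A hedged sketch of how LangChain's standardized interface lets you swap a
# Closed-Source API for a self-hosted Open-Source model with minimal changes.
# Import paths differ between LangChain versions; model names are examples.
from langchain_openai import ChatOpenAI
from langchain_community.llms import HuggingFacePipeline

prompt = "Draft a two-line status update for the billing migration project."

# Option A: Closed-Source API
closed_llm = ChatOpenAI(model="gpt-4")  # assumed example model
print(closed_llm.invoke(prompt).content)

# Option B: self-hosted Open-Source model behind the same .invoke() interface
open_llm = HuggingFacePipeline.from_model_id(
    model_id="mistralai/Mistral-7B-Instruct-v0.2",  # assumed example model
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 128},
)
print(open_llm.invoke(prompt))
```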
3. Risks from Lack of Transparency
As we have mentioned before, Closed-Source AI obscures how the software works, what data it uses, and what decisions it makes. This creates problems of trust, accountability, and fairness, as users may not know how the software affects their privacy, security, or rights. They may not be able to challenge or correct errors or biases in the software. This is a major problem for enterprise use cases, as incorrect responses or responses with a bias can have far-reaching implications.
Open-Source AI allows developers to inspect, audit, and verify the model's behavior and outcomes. This makes it possible for enterprise users to reduce bias in the models, catch misbehavior early, and put guardrails in place against risky or faulty behavior.
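When you control the pipeline end to end, even simple auditing and guardrail layers become straightforward to add. The sketch below shows one such layer; the checks and log format are assumptions for illustration, not a complete safety solution.

```python
# An illustrative sketch of a simple post-generation guardrail and audit log,
# easy to add when you control both the model and the serving pipeline.
import json
import re
import time

PII_PATTERNS = [
    re.compile(r"\b\d{12}\b"),               # 12-digit identifiers (assumed check)
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),  # email addresses
]

def audit_and_filter(prompt: str, output: str, log_path: str = "ai_audit.log") -> str:
    """Log every prompt/response pair and redact obvious PII before returning."""
    flagged = any(p.search(output) for p in PII_PATTERNS)
    with open(log_path, "a") as f:
        f.write(json.dumps({
            "ts": time.time(),
            "prompt": prompt,
            "output": output,
            "flagged": flagged,
        }) + "\n")
    if flagged:
        for p in PII_PATTERNS:
            output = p.sub("[REDACTED]", output)
    return output
```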
4. Long-Term Cost Implications
Open-Source AI has several advantages over Closed-Source AI in terms of cost-effectiveness in the long run. Open-Source AI models are typically free to use, so the main upfront investment is the cloud GPU infrastructure required to serve the model.
In the case of Closed-Source AI, it is cheaper to get started, as it works on a per-token pricing model. However, in the long run, and especially in real-world scenarios of everyday use, the cost quickly adds up.
With Open-Source AI, pricing remains predictable because the model is hosted on the company's infrastructure, which typically has a fixed monthly cost. Since the same deployment can serve many users and the marginal cost of heavy usage is negligible, Open-Source AI often turns out to be far more cost-effective in real-world scenarios.
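A simple back-of-the-envelope calculation illustrates the break-even dynamic. Every figure below is an assumption for illustration, not a quoted price; plug in your own vendor quotes and usage estimates.

```python
# A back-of-the-envelope comparison of per-token API pricing vs. a fixed
# monthly GPU bill. All numbers are illustrative assumptions.
tokens_per_request = 1_500      # prompt + completion, assumed
requests_per_day = 50_000       # assumed enterprise workload
price_per_1k_tokens = 0.002     # USD, assumed Closed-Source API rate
gpu_monthly_cost = 2_500.0      # USD, assumed cloud GPU node for self-hosting

monthly_tokens = tokens_per_request * requests_per_day * 30
api_monthly_cost = monthly_tokens / 1_000 * price_per_1k_tokens

print(f"Closed-Source API: ${api_monthly_cost:,.0f}/month")          # ~$4,500 with these assumptions
print(f"Self-hosted Open-Source: ${gpu_monthly_cost:,.0f}/month")    # flat, regardless of request volume
# The API bill scales linearly with usage, while the self-hosted cost stays
# roughly flat -- which is why heavy, everyday workloads tend to favor self-hosting.
```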
5. Data Localization and Compliance
Data localization and compliance are two major concerns for any enterprise customer that deals with sensitive or proprietary information, such as customer data, financial records, trade secrets, intellectual property, etc.
Data localization refers to the legal requirement imposed by certain countries for companies to store and process data on servers physically located within the country's borders. Governments enforce this to protect personal information from foreign surveillance, enhance national security, and ensure that citizens' data remains subject to local privacy laws.
In most cases, data localization is of significant importance due to compliance requirements mandated by the government. If companies violate this, they stand to incur substantial business risk.
Data breaches can have serious consequences, such as legal liabilities, reputational damage, loss of competitive advantage, and customer trust erosion. Data security, therefore, is one of the most crucial factors to consider for any CIO looking to leverage Gen AI. It is essential to choose an AI platform that can protect company data from unauthorized access, misuse, or leakage.
This is one of the biggest advantages of Open-Source AI. Since the company hosts the AI model on its own cloud infrastructure, and either builds a RAG pipeline or fine-tunes the model with its own data, the data always remains fully within the company's control and infrastructure. With proprietary Closed-Source AI, by contrast, the platform expects you to upload your sensitive data, and this poses several risks. You must trust the vendor without knowing what is going on under the hood, and you have no way of knowing whether the software has bugs or vulnerabilities that could expose your data to hackers or competitors, or whether the vendor itself can access it. Moreover, you have no control over how the vendor uses your data for its own purposes, such as training its AI models, improving its products, or selling it to third parties.
Conclusion
These are some of the key questions that every CIO should ask before choosing a Generative AI platform. Ultimately, as you would discover sooner or later, Open-Source AI works far better as an option for enterprise and business users who are looking to deploy AI in their organizations.
If your organization or business needs assistance with AI strategy, we provide a free one-hour consultation. Please do not hesitate to contact us at sales@e2enetworks.com, with the subject line ‘AI Strategy Consultation’.