Machine Learning Lifecycle Management

April 2, 2025

E2E is one of the largest cloud service providers in India. Aside from its stellar clientele, the company is also known for its efforts to advance India’s technology landscape. One such effort is AITalks, an online seminar series that aims to nurture an Indian AI community and give it a platform for exchanging ideas and discussing the latest global trends in the field. This article summarises the AITalks webinar on machine learning lifecycle management by Mr Ramjee Ganti.

About the Speaker
Mr Ramjee Ganti is a founder and CEO of dblue.ai, a company that helps manage machine learning lifecycles. Mr Ganti is a strong supporter of open source technologies and has been involved in building and scaling technology products with multiple startups for over a decade. He has served as lead engineer for organisations like JustEat and BigDecisions; the former was later acquired by FoodPanda and the latter by NewCorp.

What Does Machine Learning Lifecycle Mean?
The speaker defines a machine learning lifecycle as “an iterative process that spans cross-functional teams which define, build, deploy, monitor, operate and improve ML systems”. Now, what does that really mean? To understand what a machine learning life cycle is, one first needs to understand what really goes into building machine learning systems.

Contrary to the common notion, ML is not just about clever algorithms and coding. In actuality, ML algorithms make up only about 15-20% of any machine learning system. A lot more effort is required to create a setup that makes these algorithms useful. This setup comprises various sub-systems and processes like Configuration, Data Collection, Serving Infrastructure, Feature Extraction, Data Validation, Monitoring, Analysis tools, Process Management tools and Resource Management. All these processes need to work together in a specific order for the ML system to function, and this explicit order is called a machine learning lifecycle.

Roles and Responsibilities Involved in the Machine Learning Lifecycle
As explained in the previous section, the machine learning lifecycle involves a wide array of processes. Naturally, one needs multiple highly trained professionals to maintain and run such a comprehensive system. Listed below are some of the individual roles that, according to the speaker, are critical for machine learning lifecycle management, along with their responsibilities:

● Data Engineer - Data engineers are primarily responsible for data collection. They gather all the available data from various channels and departments of the organisation. As for the external data requirements, data engineers employ different APIs for the job. Once all the data is pooled together, data engineers clean this data before passing it on to data scientists for further processing and analysis.

● Data Scientist/Researcher - In any ML development lifecycle, the researchers and data scientists are the actual model-makers. They design the algorithms and write the code in frameworks like TensorFlow to create various models. Once the models are built, the data scientists train them on the available data and test the outcomes generated. If satisfied, they pass the tested models on to ML engineers for deployment.

● ML Engineers and Developers - Responsible for the production and deployment of models, the ML engineers are the last filter of this process. Deployment activities can sometimes be as simple as making the models available at a given location. At times, however, a lot of engineering work needs to be put in before a model can be deployed.

Why Does It Need to Be Properly Managed?
A general machine learning development lifecycle consists of the following steps:
● Defining a task
● Collecting data
● Model Exploration
● Model Refinement
● Testing and Evaluation
● Deployment and Integration
● Monitoring and Maintenance

The lifecycle of a machine learning system is a highly iterative process, meaning that the steps mentioned above are repeated over and over again in a cyclic manner. Not only does each step demand meticulous execution, but the steps also need to be repeated constantly so that the system stays relevant and useful.
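The cyclic nature of these steps can be illustrated with a minimal Python sketch. The stage names come from the list above; everything else (the `LifecycleRun` record, the `run_lifecycle` function and the `needs_another_pass` callback) is a hypothetical illustration and not something presented in the webinar.

```python
from dataclasses import dataclass, field

# Stage names taken from the list above.
STAGES = [
    "Defining a task",
    "Collecting data",
    "Model Exploration",
    "Model Refinement",
    "Testing and Evaluation",
    "Deployment and Integration",
    "Monitoring and Maintenance",
]


@dataclass
class LifecycleRun:
    """Hypothetical record of one full pass through the stages."""
    iteration: int
    completed: list = field(default_factory=list)


def run_lifecycle(max_iterations, needs_another_pass):
    """Walk the stages in order, repeating the whole cycle for as long as
    needs_another_pass(iteration) returns True (assumed to be decided
    during monitoring), up to max_iterations passes."""
    history = []
    for iteration in range(1, max_iterations + 1):
        run = LifecycleRun(iteration)
        for stage in STAGES:
            # Placeholder: a real system would do the stage's work here.
            run.completed.append(stage)
        history.append(run)
        if not needs_another_pass(iteration):
            break
    return history


# Hypothetical trigger: monitoring asks for a new pass after the
# first two iterations, then reports that the system is healthy.
history = run_lifecycle(max_iterations=5, needs_another_pass=lambda i: i < 3)
```

In this sketch the decision to start a new pass is taken after the monitoring stage, which is one simple way to model why the process loops back to the start rather than ending at deployment.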

Conclusion
Managing the lifecycle of a machine learning system is difficult, even more so than building one. As Mr Ganti puts it, “It is hard, accept it”. Once a machine learning model is deployed, its complexity increases considerably, and if it is not calibrated and attended to carefully, the system can lead to highly undesirable outcomes. Even those with experience in managing software development cycles might find this task quite challenging, especially in terms of version control, automated tests and deployment. Yet following a software development approach is the best bet as of now. The important thing to remember is that with a machine learning system, versioning of models and data is a must. Last but not least, one should think of ML systems as a platform for the entire organisation to rely upon rather than an activity to be completed separately by different departments. As Mr Ganti says about dealing with ML lifecycles, “Think platforms”.
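As one illustration of the point about keeping track of versions of models and data, the sketch below records a content fingerprint for each. It is a minimal example under stated assumptions: the `fingerprint` and `record_version` helpers and the in-memory `registry` list are hypothetical, and a real setup would more likely rely on a dedicated versioning or model registry tool.

```python
import hashlib


def fingerprint(payload: bytes) -> str:
    """Return a short, content-derived identifier for a blob of bytes."""
    return hashlib.sha256(payload).hexdigest()[:12]


def record_version(registry, code_version, data_bytes, model_bytes):
    """Append one entry tying a code version label to the exact data
    and model artefacts it was produced with (hypothetical helper)."""
    entry = {
        "code": code_version,
        "data": fingerprint(data_bytes),
        "model": fingerprint(model_bytes),
    }
    registry.append(entry)
    return entry


# Usage with made-up artefacts: the same model bytes, two data snapshots.
registry = []
first = record_version(registry, "v1", b"rows: 1,2,3", b"weights-a")
second = record_version(registry, "v1", b"rows: 1,2,3,4", b"weights-a")
```

Because the identifiers are derived from content, a change in the training data shows up as a new data fingerprint even when the code version label and the model bytes are unchanged.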