A startup born out of a PhD thesis of two Massachusetts Institute of Technology (MIT) electrical engineering and computer science students is developing AI models that respect privacy without compromising personal data. Their innovative approach aims to train powerful AI models without collecting user information, potentially meeting the requirements of strict privacy laws like GDPR, CCPA, and HIPAA.

Governments around the world are crafting regulations to ensure artificial intelligence models do not leak sensitive user data, be it someone’s bank details or health records. But what if these powerful AI models could be trained without collecting data at all?

Can AI be safe and compliant with some of the world’s toughest privacy laws, such as the EU’s General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA) or the U.S. Health Insurance Portability and Accountability Act (HIPAA)?

A startup born out of the PhD research of two Massachusetts Institute of Technology (MIT) electrical engineering and computer science students, Vaikkunth Mugunthan and Christian Lau, is working to make that happen by building AI models that respect privacy without compromising a person’s identity.

Founded in 2021, DynamoFL uses a new way of training AI models called Federated Learning. On YoungTurks’ “Voices from the Valley” special, co-founder Vaikkunth Mugunthan offered a beginner’s guide to Federated Learning.

"For example, let's take our smartphones. Imagine you want to deploy an app that predicts the next word when you are texting your friends. On Google, Apple or any other smartphone, the data never has to leave. The data should remain under your control,” said Mugunthan.

“AI models can now be trained on individual data sources or smartphones. Only the models or learnings are transferred to the cloud, not the data. You are aggregating the different models from each smartphone, come up with a more powerful model and deploy it back on the phone,” he said.

“This way, you don't have to transfer sensitive data in order to improve the efficacy of these AI models. So, in a privacy-preserving way, how do you improve the AI models? That’s Federated Learning,” he added.
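
The train-locally, share-only-the-model loop Mugunthan describes can be illustrated with a small sketch. This is not DynamoFL's implementation; it assumes a commonly used aggregation scheme, federated averaging, in which the server averages the model weights returned by each device, weighted by how many examples that device holds. The toy linear model, the synthetic data and every function name below are hypothetical and chosen only for illustration.

```python
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one device's private data.

    Only the updated (w, b) pair is returned; `data` never leaves this function.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(client_weights, client_sizes):
    """Server side: average the client models, weighted by local dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)

def run_federated_training(clients, rounds=50):
    """Repeat: send the global model out, train locally, aggregate the results."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights

def make_client(rng, n):
    """Synthetic stand-in for one phone's data: y = 3x + 1 plus small noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(0)
clients = [make_client(rng, n) for n in (20, 35, 50)]  # three simulated devices
w, b = run_federated_training(clients)
print(f"global model after training: w={w:.3f}, b={b:.3f}")
```

In this sketch the server only ever sees the `(w, b)` pairs, never the raw examples. A production system would typically add protections such as secure aggregation or differential privacy, since model updates themselves can reveal information about the underlying data.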

Highlighting the hallucinations suffered by OpenAI’s ChatGPT models, he concluded, “It is important for AI developers to not only build high-quality AI models but also models that are transparent, fair, bias-free, privacy-preserving and regulation-compliant.”

DynamoFL is not the only platform offering Federated Learning. Last year, Intel partnered with Penn Medicine to develop a brain tumour–classifying system using federated learning. Major pharma companies, including Novartis and Merck, have built similar models to accelerate drug discovery. Tech giants such as Nvidia offer federated learning as a service. But DynamoFL says its offering is safer, more flexible and cheaper.

Part of the Y Combinator 2022 Winter batch, DynamoFL raised $4.5 million in seed funding led by Nexus Venture Partners in 2021. Since crafting its go-to-market strategy, Mugunthan says, the company has counted some of the largest Fortune 500 companies across the financial services, insurance and automotive industries among its clients.

"I was in a big round table in New York recently with big CIOs of the financial services industry. The main thing everyone was worried about is model risk management. So, AI has a lot of advantages, but if we don't properly incorporate these solutions, there are a myriad of potential issues," said Mugunthan.

Concurring with Mugunthan’s view on the importance of Federated Learning, industry veteran and founder of Nexus-backed open-source machine learning platform H2O.ai, Sri Satish Ambati said, “There are billions of cell phones that go to bed every night. If you can use the power of federated architecture to break down the problem of large language models into smaller models, you can start building GPT which is learning you as a person, personalized and contextualised. You can use it as a power tool.”

He further said, "Historically, the search was only a Google affair. LLMs and GPT were the purviews of the tech giants. Now, everybody can build a search."

Talking about a typical day in AI being like a month or a quarter, Ambati said, “We discovered that the largest model GPT-4 is a group of 220 billion models, not a large trillion parameter model. You can actually combine smaller models to combine the same level of efficiency, the same level of accuracy while being capital efficient with a smaller footprint.”

Mugunthan agreed, saying, “The goal is to make sure AI is effectively deployed in a cost-efficient manner. How can you come up with more personalized AI models that can run on your smartphones? So, you won't even need massive graphics processing units (GPUs) or spend billions of dollars in training these AI models. How can you get that cost massively down is something which we are effectively achieving for a lot of Fortune 500 companies.”