The Business Times' Column | Due Diligence: Defending the moat around the generative AI castle

Genping LIU | 17 Jul 2023

This column was first published on The Business Times

“WE HAVE no moat, and neither does OpenAI” – so the headline of a purportedly leaked Google internal memo said. While the memo was referring to Google’s closed-source artificial intelligence (AI) models, it raises a larger question – where exactly is the moat for generative AI companies? What do they hold that is defensible and inaccessible to competitors?

Generative AI is undoubtedly ushering in a new wave of disruption across all sectors. ChatGPT reached 100 million users in less than two months, 15 times faster than Instagram, which took 30 months. Early-stage startups are raising rounds of more than US$100 million. Three of the five largest funding rounds in the first quarter of this year went to generative AI companies, according to market intelligence firm CB Insights.

But before joining the hype, let us take a step back and evaluate whether generative AI companies are investable, with defensible technology, data, and ecosystem moats.

Technological lead is important but potentially transient

Presently, closed-source AI models, compared to open-source models, maintain a technological moat through proprietary algorithms and pre-trained models. These algorithms enable the building of larger models, providing more accurate, richer responses, or lowering costs. For example, OpenAI’s GPT-4 has multi-modal capabilities to handle both text and images, differentiating it from most other foundation models which focus either on text or images alone.

However, in the long run, we anticipate that open-source models may overtake the technological capabilities of closed-source models. Communities supporting open-source models can potentially iterate more quickly, and at lower cost, to improve algorithms.

Lower costs would mean lower barriers for more members of the open-source community to iterate on the models, further spurring innovation and new winners.

That said, raw technological prowess alone may not be sufficient to act as a moat. While having superior technology could confer an edge, it is not a guarantee of commercial success.

One example that most can relate to would be the iPhone. Despite not having the best camera or the fastest charging speeds, the iPhone still generates strong sales and commands a price premium. This commercial success could be attributed to non-technological factors, such as its seamless user experience.

While technological edge is an important moat, companies should not overlook other factors. Considering the fast-evolving nature of new technology, companies should leverage their technological lead as a first-mover advantage to kickstart a flywheel effect and build more sustainable data and ecosystem moats.

Proprietary data is great, but the value of processing public data is not to be overlooked

The most talked-about AI moat within venture capital circles is that of proprietary data. The reason is clear – data is the lifeblood of AI and the enabler of foundation models.

With OpenAI increasing the number of model parameters (the values that control model behaviour) from 175 billion in GPT-3 to an estimated one trillion in GPT-4, the arms race for larger models is on.

Larger models will require increasingly large amounts of data to train well, but AI forecasters expect the supply of new high-quality training data to be exhausted by 2026. This could act as a ceiling at which all foundation models eventually achieve data parity.

To gain an edge, foundation models need access to proprietary data to expand their datasets beyond the data ceiling. Greater user activity generates more data to train better models, which would then attract more users that feed further user activity. Such proprietary data would thus form the moat.

The catch is that users of foundation models – especially the specialised industry-specific models – might not be willing to share their data with the foundation models.

While the spotlight has primarily shone on proprietary data, it is crucial not to disregard the significance of leveraging publicly available data. Specialised features or data-specific algorithms optimised to process the inputs and outputs of public data could unlock hidden value and generate superior outcomes.

For example, one of our portfolio companies, Patsnap, has integrated generative AI capabilities into its global patent search and analytics service through EurekaAI. EurekaAI lets users query Patsnap’s database in conversational language, and returns a list of relevant patents along with easily digestible summaries.

There is no denying that proprietary data presents a significant moat for generative AI companies. However, companies without proprietary data could still compete through purpose-built features and algorithms that unlock hidden value from publicly available data to better meet user needs.

Ecosystems and networks form strong moats, but it is not a winner-takes-all scenario

As a fund that has been through many booms and busts, we often see that strong ecosystems powered by developers, plug-ins, and software integrations are key to the success of nascent technologies and would likely be paramount to the success of generative AI companies.

An interesting parallel would be the success of Nvidia graphics processing unit (GPU) chips (extensively used for training machine-learning models) versus AI-specialised chips by startups such as Graphcore.

While Nvidia’s success is largely attributed to its hardware technological edge, there is another key success driver – Nvidia’s Cuda software.

Cuda is the programming platform for implementing AI applications on Nvidia chips, and enhances Nvidia chip performance. Cuda has attracted an ecosystem of four million developers who use it for AI-application development. Even if AI-specialised chip companies were to develop better hardware than Nvidia, they might still struggle to displace Nvidia as the dominant AI chip manufacturer.

Similarly, foundation models that attract a large number of developers to build plug-ins and integrations would have a strong ecosystem. The race is on for foundation model companies – OpenAI has integrated into Microsoft’s suite of applications, Cohere is integrated with Oracle, and Anthropic with Zoom.

While the race to build the largest ecosystem is intense, it is not necessary to be the largest to succeed. Regional foundation models could capitalise on their regional expertise to integrate with local ecosystems before large foundation models enter their markets.

Once regional models are entrenched in the local ecosystem, other foundation models – potentially with better technology and data – may struggle to penetrate their regions. In such cases, tailwinds from de-globalisation, especially regarding local data protection, can serve as a short-term moat against global competitors. In the long run, the key to building a vibrant ecosystem moat would still lie in strong execution.

How much defence does the castle need?

There is hope for generative AI companies to build moats – with technology, proprietary data, and ecosystems.

How many of these moats would generative AI “castles” need, though?

Some castles would have the ability to build all three moats and require them all. For instance, OpenAI currently has access to superior technology, proprietary user input data, and a blossoming ecosystem.

Other castles could be secured with fewer moats. While not a generative AI company, the AI company DeepMind became a valuable business primarily through exceptional technological capabilities and was acquired by Google for more than US$400 million.

For generative AI companies, it comes down to identifying the necessary moats and executing well to deepen them. Ultimately, generative AI companies have to build out a valuable solution addressing a clear, commercially viable use case that is worth defending.

As an early-stage investor, we are keeping our eyes peeled for promising startups that can execute on these moats. The battle horn has sounded; it is time to start digging!

Gan Kah Kheng is an investment analyst and Genping Liu is a venture partner at Vertex Ventures Southeast Asia and India

We publish monthly on The Business Times "Due Diligence" column and we invite you to read our previous articles here.

Edited by Elise Tan, Director, Vertex Ventures Southeast Asia & India.
