The AI large-model competition escalates: from the Transformer to the engineering revolution of the "hundred model war"


Last month, a "war of animals" broke out in the AI community. On one side is Meta's Llama series of models, favored by developers for its open-source nature. On the other is Falcon, a large model developed by the Technology Innovation Institute in the UAE. The two have been taking turns topping the open-source LLM leaderboards.

Interestingly, the UAE's goal in participating in the AI competition is to "disrupt the core players." Shortly thereafter, the UAE's Minister of Artificial Intelligence was selected as one of the "100 Most Influential People in AI" by Time magazine.

Today, the field of AI has entered a "hundred schools of thought contending" stage: many countries and companies are building their own large language models. In the Gulf region alone, more than one player is involved. The phenomenon has led some industry insiders to lament that even this hard-technology sector has descended into a "hundred model battle."

Transformer Devours the World

The rapid development of large models can be attributed to the paper "Attention Is All You Need" published in 2017. The Transformer algorithm proposed in this paper has become the catalyst for this wave of AI enthusiasm.

Before the emergence of Transformers, "teaching machines to read" was a recognized academic challenge: early neural networks struggled to grasp contextual meaning. The rise of recurrent neural network (RNN) approaches to sequence modeling around 2014 solved this problem to some extent, but because an RNN must process tokens one step at a time, its sequential computation limited its ability to handle large-scale data.
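The sequential bottleneck is easy to see in code. Below is a minimal toy sketch (hypothetical sizes, random weights, not any production model): each hidden state depends on the previous one, so the time steps cannot be computed in parallel.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                       # hidden size (illustrative only)
W_x = rng.normal(size=(d, d))   # input-to-hidden weights
W_h = rng.normal(size=(d, d))   # hidden-to-hidden weights

def rnn_forward(tokens):
    """Run a toy RNN over a sequence of token vectors, one step at a time."""
    h = np.zeros(d)
    for x in tokens:        # strictly sequential: step t needs h from step t-1
        h = np.tanh(W_x @ x + W_h @ h)
    return h

tokens = rng.normal(size=(10, d))   # a sequence of 10 token embeddings
final_state = rnn_forward(tokens)
print(final_state.shape)            # (4,)
```

No matter how many processors are available, the loop above runs in sequence length time, which is exactly the limitation the Transformer removed.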

Transformers, through innovations such as positional encoding and parallel computation, have both improved training efficiency and enhanced the ability to understand context. This has shifted AI from theoretical research to engineering practice, paving the way for the era of large models.
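The two innovations mentioned above can be sketched in a few lines. This is a toy illustration (hypothetical sizes, no learned projections or multi-head structure): sinusoidal positional encoding injects token order, and scaled dot-product attention lets every token attend to every other token in a single matrix multiply, so the whole sequence is processed in parallel.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding, as in "Attention Is All You Need"."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def attention(Q, K, V):
    """Scaled dot-product attention over the whole sequence at once."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V

seq_len, d_model = 6, 8                  # illustrative sizes
rng = np.random.default_rng(0)
X = rng.normal(size=(seq_len, d_model)) + positional_encoding(seq_len, d_model)
out = attention(X, X, X)                 # self-attention: Q = K = V = X
print(out.shape)                         # (6, 8)
```

Because the attention step is one dense matrix product rather than a step-by-step loop, it maps directly onto GPU hardware, which is what shifted the bottleneck from algorithms to engineering and compute.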

With the popularity of Transformers, the pace of innovation in underlying algorithms has slowed down, and engineering factors such as data engineering and computing power scale have become key to the AI competition. This also means that any company with a certain level of technical strength can attempt to develop large models.

Moat Built on Glass

Currently, the "hundred model battle" has become a reality. According to reports, as of July this year the number of large models in China had reached 130, surpassing the 114 in the United States. Beyond China and the United States, countries such as Japan, India, and South Korea have also launched their own local large models.

However, easy entry does not mean that everyone can become a giant in the AI era. Taking the competition between Falcon and Llama as an example, although Falcon leads in certain rankings, it is hard to say how much impact it has had on Meta. For open-source large models, an active developer community is the core competitiveness. Meta, with its social media genes and open-source strategy, holds an advantage in this regard.

In addition, most large models still have a significant performance gap compared to GPT-4. In the recent AgentBench test, GPT-4 led with a score of 4.41, while the second place, Claude, scored only 2.77, and most open-source models scored around 1 point.

This gap stems from the top-tier scientific teams at the leading AI companies and their long-accumulated experience. The core competitiveness of large models may therefore lie either in ecosystem building (the open-source route) or in pure reasoning ability (the closed-source route).

Anchor Points of Value

Despite the surge of AI, there are currently not many companies that can profit from it. The high cost of computing power has become a major obstacle to industry development. It is estimated that global tech companies may spend up to $200 billion annually on large model infrastructure, while the revenue generated by large models is at most $75 billion, resulting in a significant gap.

Even industry leaders like Microsoft and Adobe face challenges in pricing AI services and controlling costs. And for most large models, despite their enormous parameter counts, the primary application scenario remains chat.

As homogeneous competition intensifies and the popularity of open-source models grows, the business model that solely relies on providing large model services may face greater pressure. In the future, the true value of AI technology may be more reflected in its ability to be applied in specific scenarios and to solve practical problems.
