
How DeAI Competes with Centralized AI: Advantages, Applications, and Funding

Author: 0xJeff, crypto KOL

Compiled by: Felix, PANews

Nowadays, everyone is selling something, whether it's food, housing, encyclopedias, electronic products, applications, or the latest AI.

In the past, what was sold were practical items that satisfied the lower levels of Maslow's hierarchy of needs. Today, what is being sold are dreams and hopes wrapped in a shiny exterior, especially in the crypto AI field.

Crypto AI products and infrastructure are often hard to understand, so teams fall back on heavy jargon that fails to engage users.

In addition, launching a real AI laboratory (not just a thin wrapper) requires significant funding for talent, contributors, compute, and other resources.

Running an advanced enterprise-grade AI laboratory can cost millions of dollars a year, and researching, training, and optimizing cutting-edge AI models can push the bill to hundreds of millions. H100 GPUs sell for roughly $25,000 to $40,000 each, while the newer Blackwell B200 and GB200 range from $30,000 to $70,000, and training a frontier model may require thousands of such GPUs.
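
As a rough back-of-the-envelope illustration, the hardware bill alone already lands in the tens to hundreds of millions of dollars. The 2,000-GPU cluster size below is an assumption standing in for "thousands of such GPUs"; only the per-GPU price ranges come from the figures above.

```python
# Rough hardware cost for a frontier-scale training cluster, using the per-GPU
# price ranges cited above. The cluster size (2,000 GPUs) is an illustrative
# assumption, not sourced data, and compute/power/staff costs are excluded.

H100_PRICE = (25_000, 40_000)      # USD per H100 (low, high)
B200_PRICE = (30_000, 70_000)      # USD per Blackwell B200 / GB200 (low, high)
NUM_GPUS = 2_000                   # assumed cluster size

def cluster_cost(price_range, count):
    low, high = price_range
    return low * count, high * count

for name, prices in {"H100": H100_PRICE, "B200/GB200": B200_PRICE}.items():
    low, high = cluster_cost(prices, NUM_GPUS)
    print(f"{name}: ${low:,} - ${high:,} in GPUs alone")

# H100:       $50,000,000 - $80,000,000 in GPUs alone
# B200/GB200: $60,000,000 - $140,000,000 in GPUs alone
```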

Advantages of Decentralized AI (DeAI): Small Models + Reinforcement Learning

A decentralized system that coordinates compute globally to train a single model can, in theory, cut GPU costs dramatically (saving 30% to 90%) by tapping the world's idle GPUs. In practice, however, coordinating those GPUs and keeping all of them performing at high quality is very hard, and no decentralized AI laboratory has yet fully overcome the challenges of decentralized training.

Still, there is hope: a few laboratories have achieved encouraging results in decentralized reinforcement learning. It is precisely this process of self-play and self-learning that can make a small model extremely intelligent.

Not all situations require large language models (LLMs). Training domain-specific models and using reinforcement learning (RL) to refine and enhance their skills is the most cost-effective way to provide enterprise-level AI solutions, because ultimately, what clients want are results (compliance, safety, cost-effectiveness, and increased productivity).

As early as 2019, OpenAI Five defeated OG, the then reigning world champions, in Dota 2. This was not a fluke but a clean sweep: it beat OG twice in a row.

You may be wondering how that was done.

Dota 2 is an extremely complex multiplayer online battle arena game in which two teams of five players compete to accomplish objectives and destroy the opposing base.

To get the AI to compete with top players, OpenAI followed these steps:

  • Start with self-play from scratch: learn the basics and play millions of games against itself. A win signals good actions, a loss signals bad ones (i.e., large-scale trial and error).
  • Set up a reward system (points) that encourages behaviors with positive expected value (EV), such as destroying towers and killing heroes, and deducts points for behaviors with negative EV.
  • Train with a reinforcement learning algorithm called PPO. The AI tries actions during the game and PPO treats the outcomes as feedback: if a result is good, do more of it; if it is poor, do less. This gradually steers the AI in the right direction (see the sketch after this list).
  • Run the training on hundreds of GPUs for nearly a year, with the AI continuously learning and adapting to game version updates and changes.
  • After a while, it began to discover complex strategies on its own (sacrificing a lane, playing conservatively or aggressively at the right moment, timing large-scale pushes, etc.), and it started beating human players.
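
The sketch below illustrates the two core ideas from this list: a shaped reward that scores positive- and negative-EV events, and a PPO-style clipped update that does more of what worked and less of what did not. The event names, point values, and tiny softmax policy are hypothetical; this is not OpenAI Five's actual code.

```python
import numpy as np

# (1) Shaped reward: points for positive-EV events, deductions for negative-EV ones.
REWARD_SHAPING = {          # illustrative point values, not OpenAI's
    "destroy_tower": +2.0,
    "kill_hero": +1.0,
    "win_game": +5.0,
    "lose_hero": -1.0,
    "lose_game": -5.0,
}

def episode_return(events):
    """Sum shaped rewards over the events observed in one self-play game."""
    return sum(REWARD_SHAPING.get(e, 0.0) for e in events)

# (2) PPO clipped-surrogate step for a toy tabular softmax policy.
def ppo_clip_update(theta, state, action, advantage, old_prob, lr=0.1, eps=0.2):
    logits = theta[state]
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    ratio = probs[action] / old_prob
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    if unclipped > clipped:
        return theta            # ratio was clipped: one game cannot push the policy too far
    grad = -probs.copy()
    grad[action] += 1.0         # d log pi(a|s) / d logits for a softmax policy
    theta[state] += lr * advantage * ratio * grad
    return theta

# One imagined self-play game: a good outcome yields a positive advantage, so the
# policy does "more of that"; a bad outcome would push it the other way.
theta = np.zeros((1, 3))                    # 1 state, 3 candidate actions
ret = episode_return(["kill_hero", "destroy_tower", "win_game"])
theta = ppo_clip_update(theta, state=0, action=2, advantage=ret, old_prob=1 / 3)
print(theta)
```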

Although OpenAI Five has since been retired, it showed that small models can be extremely effective at domain-specific tasks (OpenAI Five's parameters take up only 58MB).

Large AI laboratories like OpenAI can do this because they have the funding and resources to train reinforcement learning models. A company that wants its own OpenAI Five for fraud detection, factory robotics, self-driving cars, or financial market trading needs substantial funding to get there.

Decentralized reinforcement learning addresses this problem, which is why decentralized AI laboratories such as Nous Research, Pluralis, Gensyn, Prime Intellect, and Gradient are building global GPU networks to collaboratively train reinforcement learning models, providing the infrastructure for enterprise-specific AI.

Some labs are researching ways to cut costs further, for example using RTX 5090/4090 cards instead of H100s to train reinforcement learning models. Others focus on using reinforcement learning to raise the intelligence of large foundation models.

Whatever the specific research focus, this will become one of the most promising directions for decentralized AI. If decentralized reinforcement learning can be commercialized at scale, enterprise clients will pour substantial funds into AI, and more decentralized AI teams will reach annual revenues in the 8-to-9-figure range.

Funding DeAI and Scaling Through Coordination Layers

Before they reach 8-to-9-figure annual revenues, however, these labs need to keep researching, implementing, and moving toward commercially viable reinforcement learning solutions, and that requires significant funding.

One of the best ways to raise funds is through coordination layers like Bittensor. Millions of dollars in TAO incentives are distributed to subnets (startups and AI labs) every day, while contributors (AI talent) work on the subnets they are interested in to earn a share of those incentives.
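
A simplified illustration of that incentive flow is sketched below. The daily emission figure, subnet weights, and contributor scores are all hypothetical, and this is not Bittensor's actual consensus or emission mechanism; it only shows the shape of "emission split across subnets, then across contributors by score".

```python
# Hypothetical incentive flow: a fixed daily emission is split across subnets by
# weight, and each subnet's slice is split across contributors by contribution
# score. All numbers and names below are illustrative assumptions.

DAILY_EMISSION_TAO = 7_200.0                      # assumed daily emission, in TAO

subnet_weights = {"decentralized_training": 0.40,  # hypothetical subnet weights
                  "ai_agents": 0.35,
                  "prediction": 0.25}

contributor_scores = {                             # hypothetical per-subnet scores
    "decentralized_training": {"lab_a": 3.0, "lab_b": 1.0},
    "ai_agents": {"team_c": 2.0, "team_d": 2.0},
    "prediction": {"solo_e": 1.0},
}

def distribute(emission, weights, scores):
    payouts = {}
    for subnet, w in weights.items():
        subnet_emission = emission * w
        total_score = sum(scores[subnet].values())
        for contributor, s in scores[subnet].items():
            payouts[(subnet, contributor)] = subnet_emission * s / total_score
    return payouts

for (subnet, contributor), tao in distribute(
        DAILY_EMISSION_TAO, subnet_weights, contributor_scores).items():
    print(f"{subnet:>24} / {contributor:<8} {tao:10.2f} TAO")
```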

Bittensor enables contributors to participate in the development of AI and allows investors to invest in AI laboratories that contribute to DeAI technology.

Currently, several key DeAI segments stand out in the Bittensor ecosystem, including quantum computing, decentralized training, AI agents, and prediction systems (reinforcement learning is not one of them yet, but more than three subnets are actively focused on decentralized reinforcement learning).

Where does decentralized reinforcement learning stand today?

Reinforcement learning has been proven to work at scale, but it has not yet been industrialized. The good news is that enterprise demand for AI agents that can learn from real feedback is growing fast: agents that learn from real-world environments, sales, and customer-service calls, trading models that adapt to market changes, and so on. These self-learning systems can make or save companies millions of dollars.

Privacy technology is also advancing. Trusted Execution Environments (TEEs), encryption inside TEEs, and differential privacy are helping to protect private information in the feedback loop, allowing sensitive industries such as healthcare, finance, and law to stay compliant while running powerful domain-specific self-learning AI agents.
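
As one minimal example of how such a privacy layer might sit in the feedback loop, the sketch below applies the classic clip-and-add-Laplace-noise pattern of differential privacy to reward signals before they reach the learning system. The clipping bound and epsilon are illustrative; real deployments tune these per use case.

```python
import random

# Before a customer's feedback value leaves their environment and enters the
# agent's learning loop, clip it and add Laplace noise calibrated to the
# sensitivity and privacy budget. Illustrative parameters only.

CLIP = 1.0          # bound each individual feedback value to [-CLIP, CLIP]
EPSILON = 0.5       # privacy budget: smaller = more noise = stronger privacy

def privatize_feedback(value: float) -> float:
    clipped = max(-CLIP, min(CLIP, value))
    sensitivity = 2 * CLIP                      # max change one user can cause
    # Difference of two Exp(1) draws is a standard Laplace(0, 1) sample.
    noise = random.expovariate(1.0) - random.expovariate(1.0)
    return clipped + noise * (sensitivity / EPSILON)

# The RL loop then trains on the privatized rewards instead of the raw ones.
raw_rewards = [0.9, -0.4, 1.7, 0.2]
print([round(privatize_feedback(r), 3) for r in raw_rewards])
```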

What will happen next?

Reinforcement learning is the best lever for making AI smarter: it turns AI from a generative system into an active, intelligent agent.

Combining privacy technology with reinforcement learning will drive real enterprise adoption and make it possible to offer customers compliant solutions.

Reinforcement learning makes the “agent economy” possible, where agents purchase computing resources, negotiate with each other, and provide services.

Due to cost-effectiveness, decentralized reinforcement learning will become the default method for scaling reinforcement learning training.

Federated reinforcement learning (federated RL) will emerge, letting multiple parties learn collaboratively without sharing local sensitive data. It combines privacy protection with self-learning, greatly raising intelligence levels while meeting regulatory requirements.
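
A minimal sketch of that pattern, assuming a simple federated-averaging round in which only parameters (never raw data) leave each party, might look like this. The party names and the local update rule are toy placeholders for a real local RL training step.

```python
import numpy as np

# Each party improves the shared policy locally on its own private feedback,
# and only the resulting parameters are sent back and averaged.

def local_update(global_params, private_rewards, lr=0.05):
    """Toy local step: nudge parameters using only this party's private rewards."""
    return global_params + lr * np.mean(private_rewards) * np.ones_like(global_params)

def federated_round(global_params, parties):
    """One round of federated averaging over all parties' local updates."""
    updates = [local_update(global_params, rewards) for rewards in parties.values()]
    return np.mean(updates, axis=0)

# Hypothetical parties whose raw data never leaves their premises.
parties = {
    "hospital": np.array([0.2, 0.5, -0.1]),
    "bank":     np.array([0.8, 0.3, 0.4]),
    "law_firm": np.array([-0.2, 0.1, 0.0]),
}

params = np.zeros(4)                 # shared policy parameters
for _ in range(3):                   # a few federated rounds
    params = federated_round(params, parties)
print(params)
```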

Related reading: Crypto AI reshuffle: Virtuals falls out of favor, DeFAI and predictive AI seize the opportunity.
