JD.com turns in its large-model answer sheet: natively one step closer to industry

Original source: Qubit

Image source: Generated by Unbounded AI

Plenty of companies are building large models, but this is the first time I have seen one lay out a clear timeline for industrial deployment.

Just now, at the 2023 JD Global Technology Explorer Conference and JD Cloud Summit, JD.com launched the Yanxi Large Model and the Yanxi Large Model Open Computing Platform, and demonstrated interim practical results in scenarios across multiple industries, including retail, health, logistics, marketing, finance, and customer service.

The event opened with a song-and-dance performance by a digital human driven by a large model.

In the e-commerce scenario, AIGC product content generation is supported.

There is also an AI growth marketing platform, which can put together a marketing plan and a promotional web page from just a few sentences of input.

In medical scenarios, the cause of the user’s back pain can be determined through multiple rounds of dialogue.

Beyond the live demonstrations, the three-step deployment roadmap was also especially eye-catching:

In the second half of this year, JD.com will repeatedly polish and harden the model in its own high-complexity internal scenarios, while providing external services to benchmark customers in key scenarios. The main purpose is to uncover problems that look “inconspicuous” but are critical in industrial applications.

In the first half of 2024, large-model capabilities will be fully opened to the outside world for serious business scenarios.

The key point is that opening up here does not just mean providing APIs. It means packaging industrial applications together so that enterprises can use them out of the box.

JD.com will not bring half-cooked dishes to the table.

JD.com has the confidence to set out such a route because there is real substance behind it.

**What kind of large model do industry partners need?**

Half a year after general-purpose large models took off, the whole industry has turned its attention to the next stage: industry large models.

Now that JD.com, the first to propose an industry large model, has turned in its answer sheet, the question of “what kind of large model do industry partners need” can be answered step by step.

With the arrival of the large-model era, consumer-facing applications keep emerging. The general public has an intuitive feel for them and has also experienced first-hand their problems, such as “talking nonsense”.

But businesses differ by industry and have their own commercial considerations. For them there is still no good answer to “what to use, and how to use it?”, and most of them still do not know how to use large models.

When people talk about industry large models, the first reaction is to focus on a single industry and build a small model.

But He Xiaodong told Qubit that this may be a misunderstanding. Building an industry model still requires general-domain data. That data is also critical for industrial applications because it supplies common-sense background knowledge. Without it, when a user suddenly asks something unrelated to the field and the topic jumps, a small model trained only on that field will be at a loss.

Therefore, the large industrial model required by enterprises must also be based on general capabilities.

For enterprises, large-scale models are mostly used to reduce costs and increase efficiency. For example, it can automatically process data and tasks, analyze large amounts of data to give more accurate decisions, expand to new business areas, and so on.

To achieve these uses, a large model must satisfy two principles: it must be credible and usable.

  • Credible means that the model’s predictions are reliable and can be trusted by enterprises;
  • Usable means that the model can play a role in actual business and create value for the enterprise.

**These two points are not only the criteria by which enterprises choose a large model; they are also the two basic characteristics needed to industrialize large models.**

First, credibility.

There is still no complete solution on the market that specifically targets the trustworthiness of large models.

The hallucination problem has persisted ever since large models were unveiled and put to use in recent months. Yet as early as 2020, JD.com’s Transformer-based K-PLUG model reached 95% accuracy on entity attribute extraction, a commercially usable level.

The reason is that JD.com blazed its own path: **knowledge injection**.

It was the first of its kind in the industry at the time.

It greatly improved text diversity and discourse coherence in long-text generation. It also improved the uniqueness of selling points and the consistency of attributes, both of which need special attention when generating product selling-point copy, giving products “real praise”.

In the end, across a series of NLP tasks and metrics, such as accuracy on entity attribute extraction, ROUGE-L on generative multi-turn dialogue, and knowledge retrieval rate on multi-turn question answering, it significantly outperformed the baseline models.
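For readers unfamiliar with the ROUGE-L metric mentioned here, the following is a minimal sketch of how a sentence-level ROUGE-L score can be computed from the longest common subsequence (LCS) of a generated reply and a reference reply. It is not JD.com’s evaluation code: the whitespace tokenization, the `beta=1.0` default, and the function names are all assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """Sentence-level ROUGE-L F-score (assumed whitespace tokenization)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens on the LCS
    recall = lcs / len(ref)      # share of reference tokens on the LCS
    b2 = beta * beta
    return (1 + b2) * precision * recall / (recall + b2 * precision)


# Hypothetical example: 5 of 6 tokens lie on the common subsequence.
score = rouge_l("the cat sat on the mat", "the cat is on the mat")
```

A higher score means the generated reply preserves more of the reference’s word order, which is why the metric is often used for dialogue and summarization.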

The “hallucination” problem could be discovered and tackled so early because the team has always worked around industry practice.

The accuracy of content generated by general-purpose generative language models on the market is roughly 83% to 85%. Consumer users generally find that acceptable, even with one answer in ten wrong, but it is unacceptable for commercial use.

The same thinking carries over to usability.

From the perspective of a company providing large models, the question becomes how to make large models create inclusive value. Making any technology inclusive means lowering the technical threshold and the cost of use as far as possible.

Algorithm generalization + vector database + SaaS: this is the combination punch JD.com delivers.

The first two need little explanation. On the one hand, an algorithm’s ability to generalize lets the model handle a variety of tasks and solve the complex long-tail scenarios found in industry.

In 2022, to address models’ lack of generalization, JD.com proposed the Vega model at the tens-of-billions scale. In 2023 it followed with Vega v2, a general language-understanding foundation model that is larger, stronger, and more transferable.

On the other hand, a vector database can update the knowledge base in real time, supplement the large model’s long-term memory, and reduce training cost, so it serves several purposes at once.
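To make the idea concrete, here is a minimal in-memory sketch of how a vector store can act as updatable long-term memory: knowledge is stored as vectors, the closest entry to a query is retrieved by cosine similarity, and an entry can be overwritten without retraining any model. This is not Vearch’s API; the class `TinyVectorStore`, its methods, the hand-written three-dimensional vectors, and the sample texts are all hypothetical, and a real system would obtain the vectors from an embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


class TinyVectorStore:
    """Hypothetical toy store: brute-force search over a dict of vectors."""

    def __init__(self):
        self._items = {}

    def upsert(self, doc_id, vector, text):
        # Inserting or overwriting an entry is the "real-time update".
        self._items[doc_id] = (list(vector), text)

    def search(self, query, k=1):
        scored = [
            (cosine(query, vec), doc_id, text)
            for doc_id, (vec, text) in self._items.items()
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:k]


store = TinyVectorStore()
store.upsert("return-policy", [1.0, 0.0, 0.0], "Returns are accepted within 7 days.")
store.upsert("shipping", [0.0, 1.0, 0.0], "Orders ship within 24 hours.")

# The policy changes: update the stored knowledge, no model retraining needed.
store.upsert("return-policy", [1.0, 0.0, 0.0], "Returns are accepted within 15 days.")

# A query vector close to the "return-policy" entry retrieves the updated text,
# which could then be passed to the large model as context.
top = store.search([0.9, 0.1, 0.0], k=1)
```

Production vector databases replace the brute-force scan with approximate nearest-neighbor indexes, but the retrieval contract is the same.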

As for the vector database, JD.com took the lead in developing Vearch in 2019, and it currently serves more than 100 large and medium-sized enterprise users. The data shows that, compared with traditional methods, using JD.com’s vector database for large model pre-training optimizes model inference efficiency and reduces inference cost by 80%.

At present, the mainstream approach to applying large models is to call an API. He Xiaodong said in an interview that for some customers in traditional industries, this still poses a technical threshold.

What these customers care about is service efficiency; all they need is a product that works.

So the team decided to decouple the technology directly into the “Yanxi AI Development Computing Platform” and to polish its product modules through internal practice. Customers do not even need deep AI expertise: they can skip the intermediate steps and adopt mature large-model capabilities directly.

In the past this work required a team of more than 10 scientists. Now only 1-2 algorithm engineers are needed to complete the whole process through the platform, from data preparation and model training to model deployment. Training efficiency is increased by 2 times and inference efficiency by 6.2 times, with savings of nearly 90%.

From the two perspectives of credibility and usability, the solution provided by JD.com has made a mark in the entire industry.

**From language model to multimodal digital human interaction**

This way of discovering and solving problems through industrial practice shows up not only in large language models but in every stage of JD.com’s industrial AI development.

In the pre-deep learning era, few people expected AI to become a function or even a product independently like today. The goals at that time were mainly focused on reducing costs, increasing efficiency, and optimizing experience.

In 2012, JD.com began deploying its intelligent customer service center, using technology to assist customer service agents, improve staff efficiency, and optimize customer experience. Looking back, it explored three specific technical directions:

ASR speech recognition, NLP semantic analysis, and data mining.

Today, these three types of technologies have a profound impact on the training of large models.

With the arrival of the deep learning era, AI began to play a role in more comprehensive and complex scenarios.

In 2015, JD.com’s intelligent customer service officially adopted deep neural network technology. In 2018, its unmanned customer service was upgraded for the first time to combine humans and machines. In the course of this “deep” practice, the Yanxi team gradually came to recognize the following problem:

A customer service dialogue is a task-oriented dialogue. Its ultimate aim is to solve a problem in a real scenario, which makes it essentially different from ordinary chat. Moreover, users expect different things from a dialogue in different scenarios, and pre-sales and after-sales differ greatly.

So, back when AlphaGo set off the previous AI boom, He Xiaodong proposed that “the essence of dialogue is reasoning and decision-making,” and he has since often explained this view on various occasions by noting that “Go is also called hand talk.”

Guided by this idea, He Xiaodong led the team to integrate multimodal features such as acoustics, semantics, and time, and to develop a series of “turn-taking” technical routes for human-computer interaction scenarios.

Recently, Google merged its Google Brain and DeepMind teams and proposed combining AlphaGo technology with the Transformer to develop its next-generation large model, Gemini, a sign that the value of this route is finally being recognized by more people.

Then came the era of Transformer-based large models, where JD.com also positioned itself early.

In terms of AI engineering, JD.com serves hundreds of millions of active users every day. Take smart customer service as an example: it generates tens of billions of dynamic interaction records every year. The data scale is large, with 10 million smart customer service sessions every day and 2 million hours of voice calls every month. Under all kinds of high-load tests, JD.com has accumulated best practices.

Coupled with the human-computer interaction in other fields of JD.com, the data volume has reached tens of billions.

In 2022, JD.com decoupled its internally validated technology into Yanxi 2.0, an artificial intelligence application platform, and began offering it to the outside world.

Today’s Yanxi AI development computing platform also continues this idea.

He Xiaodong said that, starting this year, the Yanxi large model is being deployed in depth across JD.com’s scenarios, building on strong engineering and technical capabilities.

Take the health field as an example. Relying on the Yanxi large model’s capabilities in multi-turn interaction, tool calling, summarization, and multimodal image-text processing, the team has built health assistant and auxiliary diagnosis and treatment applications.

At present, these applications draw on more than 30 million high-quality clinical dialogues and a million-scale medical knowledge graph, cover professional services for more than a thousand diseases, and apply 20 evaluation standards to ensure medical safety.

In logistics, with the support of the Yanxi large model, JD Logistics Ultrain provides real-time interaction, root cause analysis, and intelligent decision-making, and has been iterated to the point where it can automatically generate globally optimal supply chain solutions in real time.

In marketing, JD.com’s marketing and platform operations team built an AI growth marketing platform that uses large models to tackle problems such as key tasks, dynamic adaptability, and user experience. This has greatly streamlined the marketing and operations process: plan production efficiency has risen a hundredfold; a process that used to involve more than 5 functions (product, R&D, algorithm, design, analyst) now needs one person; and a new single-entry interaction mode has cut the number of human-computer interactions from 2,000 to as few as 50, raising operating efficiency by more than 40 times.

**Next comes the fast-approaching era of artificial general intelligence, and He Xiaodong believes it must move in the direction of multimodality.**

Once AI’s general capabilities reach a certain level, it can do more than stay behind the scenes providing technical support. It can also take the form of products that face human users directly, and even interact with humans at a deeper level, like the intelligent agents of the future.

To this end, JD Cloud has integrated a number of multimodal digital human interaction capabilities on top of the Yanxi large model, and has gained experience from digital human customer service, live streaming, and other scenarios.

For example, semantic-driven body movement editing has been realized, and by combining with a large model, the movement of a digital human can match the semantics when speaking, making the interaction more natural.

Another example is dynamic local high-definition technology for digital humans. It exploits the unevenness of human visual perception: viewers are particularly sensitive to the face, especially the area around the eyes. Resolution is raised in these key areas and appropriately lowered elsewhere, which reduces deployment costs.

At JDD, the team said that Yanxi will further lower the threshold and difficulty of operation, so that more small and medium-sized businesses and individuals can afford and use digital human services.
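The cost argument can be illustrated with simple pixel arithmetic. The sketch below is not JD.com’s implementation; it only estimates how many pixels must be rendered when a key region stays at full resolution and the rest of the frame is rendered at a reduced linear scale. The function name, the frame size, the region size, and the 0.5 scale are all hypothetical values chosen for illustration.

```python
def mixed_resolution_pixels(width, height, roi_width, roi_height, scale):
    """Pixels rendered when a key region (e.g. the face) keeps full
    resolution and the remaining area is rendered at `scale` of the
    linear resolution, where 0 < scale <= 1."""
    if not 0 < scale <= 1:
        raise ValueError("scale must be in (0, 1]")
    if roi_width > width or roi_height > height:
        raise ValueError("key region must fit inside the frame")
    full = width * height
    roi = roi_width * roi_height
    # Halving the linear resolution quarters the pixel count, hence scale**2.
    background = (full - roi) * scale * scale
    return roi + background


# Hypothetical numbers: a 1920x1080 frame with a 480x270 face region,
# everything else rendered at half linear resolution.
full_pixels = 1920 * 1080
mixed_pixels = mixed_resolution_pixels(1920, 1080, 480, 270, 0.5)
ratio = mixed_pixels / full_pixels
```

Under these assumed numbers the mixed-resolution frame needs roughly 30% of the pixels of a full-resolution frame, which shows where the deployment savings could come from.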

……

From this point of view, the reason the Yanxi large model’s deployment roadmap insists on a “half-year polishing period” becomes obvious:

**You can’t just provide an API to tell the outside world that an AI capability exists. The ultimate goal is to give partners directly usable product modules with end-to-end value.**

**JD.com’s route in the era of large models**

A few months ago, companies were rushing to release large models. By the time of the World Artificial Intelligence Conference, the event had turned into a “home field for large models”, with each company presenting its own industry solutions.

It all looks dazzling, but on closer inspection these solutions inevitably share problems: their business scenarios converge, and their implementation is still in its infancy.

This reflects how hard it is to industrialize large-model technology, especially the last-mile problem, which often decides whether a model can be used at all. It is a whole-system engineering problem that the three elements of traditional AI cannot solve.

JD.com, the first to put forward a large model focused on industry, has already sensed this change and given the “three elements” a new meaning:

  • Scenarios: static data of the past cannot adapt to dynamic interactions; only live scenario data from industry applications can.
  • Products: a single-point algorithm is not enough to support a new large model. Only the final product form carries core competitiveness and can drive innovation and breakthroughs in system-level algorithms.
  • Computing power: progress in single AI chips is slower than the explosive growth in large models’ demand for computing power, so computing clusters become the better solution.

So far, JD.com’s route in the era of large models has become clear:

**Industry-native.** Born from industry, and serving industry.

So at this summit, JD.com decoupled its underlying capabilities, and released a package of technical products and solutions from the basic layer, model layer, to MaaS and SaaS to industry partners.

It even laid out a clear “three-step” strategy:

  • The first step: based on core industry data, build a large base model internally;
  • The second step: apply it in JD.com’s internal core businesses such as retail, finance, health, and logistics;
  • The third step: fully open up large-model capabilities for key industrial scenarios outside JD.com’s own domain, such as finance, government affairs, and health, and deliver controllable, credible, and affordable customized models to industry.

Such a landing path is also JD.com’s reaffirmation of its technical pursuit to the industry:

Cost, efficiency, experience, credibility, inclusiveness, breakthrough.

As early as 2017, JD.com put forward the slogan of “Technology, Technology, Technology”. These three technologies represent three levels:

The first level is to serve the needs of one’s own business; the second level is to serve the technology of the industry; the third level is to explore future technology.

These three levels are coupled with one another to form a closed loop between technology and industry: make cutting-edge breakthroughs grounded in industry, then polish and consolidate them internally until they are “credible”, and finally serve industry to create inclusive value.

It is precisely because of this industry-based thinking that JD.com disclosed little further progress after first announcing its industry large model in February this year, and is only now presenting its accumulated technology for the first time.

After all, judging from the current development situation, the difficulty of landing large models does not lie in technological catch-up, but in industrial breakthroughs.
