Selling Faces to Train AI but Becoming a Victim of Deepfakes: The Dark Truth of the Global AI Gray Industry

MarketWhisper

AI gray industry

An in-depth investigation by the British newspaper The Guardian reveals a rapidly growing global gray market: thousands of ordinary people in South Africa, India, and the United States are selling their voices, faces, gait videos, and private call records in exchange for AI training fees. As AI companies' demand for high-quality human data has outgrown what is available on the public internet, paid data collection platforms such as Kled AI, Silencio, and Neon Mobile have emerged.

Two Real Cases: Who is Selling Themselves and Why

This global AI data gold rush is particularly pronounced in developing countries.

27-year-old Jacobus Louw from Cape Town, South Africa, completed a “city navigation” task on Kled AI, earning $14 for a walking video, which is about 10 times the local minimum wage. He admits he knows the cost to his privacy, but a neurological disease has kept him from finding employment for years, and by selling daily videos he has saved up $500 to enroll in a massage therapy training course. “As a South African, receiving dollars is worth more than others might imagine,” Louw said.

22-year-old student Sahil Tigga from Ranchi, India, earns over $100 a month selling recordings of environmental noise through Silencio. 18-year-old welding apprentice Ramelio Hill from Chicago, USA, sold about 11 hours of private call records to Neon Mobile at $0.50 per minute, earning approximately $200. Hill's logic is simple: since tech companies already hold a lot of his personal data, he might as well get a share of the money.

How the AI Data Drought Has Given Rise to This Gray Market

Improvements in generative AI such as ChatGPT and Gemini rely on vast amounts of high-quality human data, but mainstream open datasets such as C4, RefinedWeb, and Dolma have begun to restrict commercial licensing. Researchers estimate that AI companies will run out of fresh high-quality text as early as 2026. Training on AI-generated synthetic data has been shown to make models produce erroneous “garbage” and eventually fail, which only deepens the shortage of real human data.

This has led to the emergence of paid data collection platforms, creating a new global digital gig economy:

Kled AI: Acquires daily photos and videos on a task basis

Silencio: Crowdsources the collection of environmental audio, settling in cryptocurrency

Neon Mobile: Acquires conversation and call recordings at $0.50 per minute

Luel AI (backed by Y Combinator): Collects multilingual conversations at about $0.15 per minute

ElevenLabs: Allows users to digitally clone their voices, with a base rate of $0.02 per minute
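To put the per-minute rates listed above on a common footing, the short Python sketch below converts them into a payout per recorded hour. It is an illustration only: the function names are hypothetical, and it assumes every recorded minute is paid at the flat quoted rate, which the platforms' actual terms may not guarantee. The Luel AI figure is quoted as "about" $0.15 and the ElevenLabs figure is a base rate, so both should be read as approximate.

```python
# Per-minute rates in USD, as quoted in the platform list above.
# Assumption: these are flat rates paid for every recorded minute.
RATES_PER_MINUTE = {
    "Neon Mobile": 0.50,
    "Luel AI": 0.15,      # quoted as "about" $0.15
    "ElevenLabs": 0.02,   # quoted as a base rate
}


def payout(rate_per_minute: float, minutes: float) -> float:
    """Gross payout in USD for a number of recorded minutes at a flat rate."""
    return round(rate_per_minute * minutes, 2)


def hourly_equivalents(rates: dict) -> dict:
    """Payout for 60 recorded minutes on each platform."""
    return {name: payout(rate, 60) for name, rate in rates.items()}


if __name__ == "__main__":
    for name, usd in hourly_equivalents(RATES_PER_MINUTE).items():
        print(f"{name}: ${usd:.2f} per recorded hour")
```

These are gross figures per hour of accepted recording, not per hour of work; time spent finding tasks, re-recording rejected material, or converting payouts is not counted.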

Bouke Klein Teeselink, an economics professor at King’s College London, points out that AI training gigs represent a new and rapidly growing category of work, and that AI companies are actively paying for data collection to avoid the copyright disputes that could arise from relying entirely on web scraping.

Deepfakes and Irrevocable Licensing: The Real Costs of the Gray Market

The legal risks associated with these platforms are largely unknown to users. Enrico Bonadio, a law professor at City St George’s, University of London, notes that licensing agreements often grant platforms “global, exclusive, irrevocable, transferable, and royalty-free” rights, allowing them to sell, display, store, and create derivative works from the data, while suppliers have almost no real means to withdraw consent or renegotiate.

The experience of New York actor Adam Coy is one of the most representative cases. He licensed his likeness to the AI video editing software Captions for $1,000, with the agreement explicitly stating it could not be used for political propaganda or pornographic content, and the licensing period was one year. However, shortly after, his friend discovered a video on Instagram that had millions of views, in which “he” claimed to be a “vaginal doctor,” promoting unverified medical supplements for pregnant women. “The comments were strange because they were judging my appearance, but that wasn’t me at all,” Coy said. Since then, he has not taken any AI data gigs.

Professor Mark Graham from Oxford University summarized that this work is structurally “unstable, with no upward mobility; it is essentially a dead end,” with the only long-term winners being “platforms in the Northern Hemisphere, which capture all the enduring value.”

Frequently Asked Questions

What is the AI Training Gray Market and why is it called “gray”?

The AI training gray market refers to the trade conducted through paid collection platforms that acquire voices, faces, videos, and call records from ordinary users, in exchange for compensation, to train AI models. It is called “gray” because the transactions appear legal, yet the ultimate use of the data is opaque, the licensing terms are highly asymmetric, and the data can be misused, for example in deepfakes. The market therefore straddles the line between compliance and exploitation.

What specific legal risks do individuals face when selling personal data to train AI?

Suppliers often grant platforms irrevocable rights to their biometric data without fully understanding the terms. Stanford researcher Jennifer King points out that consumers face the risk of their data being reused in “ways they dislike, do not understand, or did not anticipate, and there are almost no remedies available.” A security breach at Neon Mobile showed that platforms may not even notify affected users after a data leak.

What is the connection between this gray market and the cryptocurrency ecosystem?

Some AI training platforms (such as Silencio) settle payments in cryptocurrency, using decentralized payments to lower the barriers to cross-border income so that users in developing countries can receive earnings directly in stablecoins or native tokens. This makes the AI data market a notable real-world application of cryptocurrency, while also raising questions about token valuation, liquidity, and data ethics.

Disclaimer: The information on this page may come from third parties and does not represent the views or opinions of Gate. The content displayed on this page is for reference only and does not constitute any financial, investment, or legal advice. Gate does not guarantee the accuracy or completeness of the information and shall not be liable for any losses arising from the use of this information. Virtual asset investments carry high risks and are subject to significant price volatility. You may lose all of your invested principal. Please fully understand the relevant risks and make prudent decisions based on your own financial situation and risk tolerance.