Major Tech Firms Use 15 Million YouTube Videos, Including 88,000 from Fox News, for AI Training

Major tech firms, including Microsoft and Meta, have leveraged 15 million YouTube videos for AI training, raising serious copyright concerns.

In a recent investigation by The Atlantic, staff writer Alex Reisner uncovered that **major technology companies** have utilized at least **15 million YouTube videos** as training data for their **AI video generation** products. This extensive use of content has raised significant concerns regarding intellectual property rights and ethical practices in the rapidly evolving AI landscape.

The investigation highlights over a dozen prominent training datasets compiled and employed by companies such as **Microsoft**, **Meta**, **Snap**, **Tencent**, **Runway**, and **ByteDance**. These datasets have been instrumental in enhancing the quality of AI-generated videos, showcasing how unauthorized usage of **YouTube** content has fueled advancements in this sector. Reisner draws an analogy, stating, “Much as **ChatGPT** couldn’t write like **Shakespeare** without first ‘reading’ Shakespeare, a video generator couldn’t construct a fake newscast without ‘watching’ tons of recorded broadcasts.”

Scope of Unauthorized Data Usage

The Atlantic’s reporting briefly mentions that among the training data, over **30,000** videos from the **BBC** were included, alongside hundreds of thousands from renowned news publishers and creators such as **The New York Times**, **The Washington Post**, **The Guardian**, **Al Jazeera**, and **The Wall Street Journal**. Specifically, more than **88,000** videos were sourced from **Fox News**, roughly **70,000** from **ABC News**, and over **55,000** from **Bloomberg** channels.

Some of this content comes from brands owned by **Vox Media**, including **Vox**, **Eater**, and **The Dodo**, which collectively account for over **30,000** videos. The **New York Times** alone accounts for at least **11,604** videos across different datasets, a significant portion of them in the dataset for **Runway's Gen-3** model, which launched in June **2024** and received acclaim for its capabilities.

Despite the extensive use of these videos, **YouTube** CEO **Neal Mohan** has reiterated that it is against the platform’s terms of service for third parties to download content for training purposes. **Lauren Starke**, a spokesperson for Vox Media, emphasized, “In order to survive, AI platforms know they need (and their consumers want) quality, credible content like ours that gives their products relevance and purpose.” Starke also noted that these companies have spent heavily on AI infrastructure but comparatively little on the content that fuels their models.

The Legal Landscape and Implications

The investigation raises profound questions about copyright and licensing, especially as many news organizations have not authorized the use of their videos for AI training. The **New York Times** has stated that it has not authorized the use of its **YouTube** content for AI training, and it asserts its legal right to determine how and where its content is used.

Additionally, partnerships between news outlets and AI companies are becoming more common, as seen with **Vox Media’s** deal with **OpenAI** that allows the latter to use its content for products like **ChatGPT**. Starke indicated that Vox Media is considering further partnerships while also preparing to protect its intellectual property through legal channels when necessary.

Furthermore, internal documents from **Runway**, published by **404 Media**, reveal that the company strategically targeted videos from high-quality channels for its datasets. A spreadsheet in those documents tagged videos by their specific features, indicating an organized method of selecting content to improve AI training.

As the AI industry continues to advance, firms like **Runway** have already integrated their products into traditional media workflows, with companies such as **Netflix** and **Walt Disney Co.** utilizing Runway’s tools for content production. However, the absence of reported licensing agreements between **Runway** and the news publishers whose content was included remains a troubling aspect of this story.

As AI technology evolves rapidly, the need for clear guidelines and ethical frameworks around content sourcing becomes increasingly pressing. The implications of using copyrighted material without consent could have long-lasting effects on the landscape of journalism and the integrity of AI-generated content.

Written by AiPressa Staff

© 2025 AIPressa · Part of Buzzora Media · All rights reserved.