Reddit sues Perplexity over alleged data theft for AI • Tecnoblog

Summary

Reddit has sued Perplexity and three other companies for alleged data theft and for violating copyright law and its terms of service.
A “ghost” post created by Reddit appeared on Perplexity, which the company cites as proof of content misuse.
The company requests compensation and a permanent ban on the use of its data by the defendants.

Reddit has filed a lawsuit against well-known artificial intelligence startup Perplexity for alleged data theft on an industrial scale. The suit also targets SerpApi, Oxylabs and AWMProxy; the last two are based in Lithuania and Russia, respectively.

According to the complaint, as detailed by The New York Times, these three companies illegally collected Reddit data not directly from the site, but by scraping content from Google search results. The material was then allegedly resold to power AI models from companies like Perplexity.

The allegation is that the companies violated copyright law and the platform’s terms of service by scraping data without authorization.

Reddit set a trap

To prove the violation, Reddit prepared a “ghost” test post, visible only to Google’s indexing robots and inaccessible in any other way. Within hours, the content of this post allegedly appeared in Perplexity’s search results.

According to the lawsuit, this is proof that the AI startup is using Google as a shortcut to access and display Reddit content without permission.

The lawsuit further alleges that Reddit had already sent a formal notice to Perplexity in the past, demanding that the startup stop scraping the data. The company reportedly agreed, but Reddit claims that citations to its content in the AI tool have “jumped fortyfold” since then.

In a statement sent to Bloomberg, Reddit chief legal officer Ben Lee said that AI companies “are locked in an arms race for quality human content,” and that this pressure has “fueled an industrial-scale ‘data laundering’ economy.”

It is worth remembering that this is not the first time Perplexity has been involved in controversy over misuse of data. In August of this year, Cloudflare accused the company of ignoring instructions in the robots.txt file and illegally extracting content from websites. In 2024, Amazon had to intervene after accusations that the startup was using its servers to plagiarize articles from outlets such as Wired and Forbes.

Reddit wants compensation

Reddit, which already has million-dollar licensing agreements with Google and OpenAI, is asking the court for financial compensation and a court order that permanently prohibits the companies from using its data.

When contacted by the press, Perplexity stated that it had not yet been served with the lawsuit, but that it “will always fight vigorously for users’ rights to freely access public knowledge”. Reddit also sued Anthropic, another AI giant, in June for similar reasons.

Google, which is not being sued, told the NYT that it respects website guidelines, but that, unfortunately, there are many stealthy scrapers that do not. The other two companies sued, SerpApi and Oxylabs, did not respond to requests for comment from the U.S. press, while AWMProxy, based in Russia, could not be located.

Source: https://tecnoblog.net/noticias/reddit-processa-perplexity-por-roubo-de-dados-para-ia/
