Reddit blocks Internet Archive to end sneaky AI scraping

Ars Technica - All content, August 12

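Reddit is blocking the Internet Archive from indexing its popular forums because AI companies scraped the archived content in violation of its policies. As a result, IA's archiving of Reddit is restricted to saving screenshots of the Reddit homepage.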

Reddit is now blocking the Internet Archive (IA) from indexing popular Reddit threads after allegedly catching sneaky AI firms—restricted from scraping Reddit—instead simply scraping data from IA's archived content.

Where before IA's Wayback Machine dependably archived Reddit pages, profiles, and comments—as part of its mission to archive the Internet—moving forward, only screenshots of the Reddit homepage will be archived. As The Verge noted, this means the archive will only be useful as a snapshot of popular posts and news headlines each day, rather than providing a backup documenting deleted posts or a window into various Reddit subcultures or any given user's activity.

Reddit has not confirmed which AI firms were scraping its data from the Wayback Machine. The company's spokesperson, Tim Rathschmidt, would only confirm to Ars that Reddit has become "aware of instances where AI companies violate platform policies, including ours, and scrape data from the Wayback Machine."
