uBlock filter list to hide all YouTube Shorts

Repository: i5heu/ublock-hide-yt-shorts on GitHub (forked from gijsdev/ublock-hide-yt-shorts), MIT license.

A maintained uBlock Origin filter list to hide all traces of YouTube Shorts videos.

Copy the link below, go to uBlock Origin > Dashboard > Filter lists, scroll to the bottom, and paste the link underneath the ‘Import…’ heading:

https://raw.githubusercontent.com/i5heu/ublock-hide-yt-shorts/master/list.txt

Bonus: hide YouTube Comments

https://raw.githubusercontent.com/i5heu/ublock-hide-yt-shorts/master/comments.txt

Maintenance

After @gijsdev, the original creator of this list, had been inactive for half a year, I (i5heu) took over maintaining it.

No affiliation to Alphabet, YouTube or Google

This project is an independent, open-source initiative and is not affiliated with, endorsed by, sponsored by, or associated with Alphabet Inc., Google LLC, or YouTube.

Contributing

See CONTRIBUTING.md

License

See LICENSE.md

Source: Hacker News | Original Link

News publishers limit Internet Archive access due to AI scraping concerns

Nieman Journalism Lab, Jan. 28, 2026, 3:09 p.m. By Andrew Deck and Hanaa’ Tameez.

Outlets like The Guardian and The New York Times are scrutinizing digital archives as potential backdoors for AI crawlers.

As part of its mission to preserve the web, the Internet Archive operates crawlers that capture webpage snapshots. Many of these snapshots are accessible through its public-facing tool, the Wayback Machine. But as AI bots scavenge the web for training data to feed their models, the Internet Archive’s commitment to free information access has turned its digital library into a potential liability for some news publishers.

When The Guardian took a look at who was trying to extract its content, access logs revealed that the Internet Archive was a frequent crawler, said Robert Hahn, head of business affairs and licensing. The publisher decided to limit the Internet Archive’s access to published articles, minimizing the chance that AI companies might scrape its content via the nonprofit’s repository of over one trillion webpage snapshots.

Specifically, Hahn said The Guardian has taken steps to exclude itself from the Internet Archive’s APIs and filter out its article pages from the Wayback Machine’s URLs interface. The Guardian’s regional homepages, topic pages, and other landing pages will continue to appear in the Wayback Machine.

In particular, Hahn expressed concern about the Internet Archive’s APIs. “A lot of these AI businesses are looking for readily available, structured databases of content,” he said. “The Internet Archive’s API would have been an obvious place to plug their own machines into and suck out the IP.” (He admits the Wayback Machine itself is “less risky,” since the data is not as well-structured.)

As news publishers try to safeguard their contents from AI companies, the Internet Archive is also getting caught in the crosshairs. The Financial Times, for example, blocks any bot that tries to scrape its paywalled content, including bots from OpenAI, Anthropic, Perplexity, and the Internet Archive. The majority of FT stories are paywalled, according to director of global public policy and platform strategy Matt Rogerson. As a result, usually only unpaywalled FT sto

Source: Hacker News | Original Link

My smart sleep mask broadcasts users’ brainwaves to an open MQTT broker

aimilios, 12 Feb, 2026

I recently got a smart sleep mask from Kickstarter. I was not expecting to end up with the ability to read strangers’ brainwaves and send them electric impulses in their sleep. But here we are.

The mask was from a small Chinese research company, with very cool hardware: EEG brain monitoring, electrical muscle stimulation around the eyes, vibration, heating, audio. The app was still rough around the edges, though, and the mask kept disconnecting, so I asked Claude to try to reverse-engineer the Bluetooth protocol and build me a simple web control panel instead.

Bluetooth

The first thing Claude did was scan for BLE (Bluetooth Low Energy) devices nearby. It found mine among 35 devices in range, connected, and mapped the interface: two data channels, one for sending commands and one for streaming data.

Then it tried talking to it. It sent maybe a hundred different command patterns: Modbus frames, JSON, raw bytes, common headers. Unfortunately, the device said nothing back; the protocol was not a standard one.

The app

So Claude went after the app instead. It grabbed the Android APK and decompiled it with jadx. It turns out the app is built with Flutter, which is a bit of a problem for reverse engineering. Flutter compiles Dart source code into native ARM64 machine code, so you can’t just read it back as you can with normal Java Android apps. The actual business logic lives in a 9MB binary blob.

But even compiled binaries have strings in them: error messages, URLs, debug logs. Claude ran strings on the binary, and this was the most productive step of the whole session. Among the thousands of lines of Flutter framework noise, it found:

- Hardcoded credentials for the company’s message broker (shared by every copy of the app)
- Cloud API endpoints
- All fifteen command builder function names (e.g. to set vibration, heating, electric stimulation, etc.)
- Protocol debug messages that revealed the packet structure: header, direction byte, command type, payload, footer

We had the shape of the protocol. We still didn’t have the actual byte values, though.

Claude then used blutter, a tool specifically for decompiling Flutter’s compiled Dart snapshots. It reconstructs the functions with readable annotations. Claude figured out the encoding and just read off every command byte from every function. Fifteen commands, fully mapped.

It works

Claude sent a six-byte query packet. The device came back with 153 bytes: model number, firmware version, serial number, and all eight sensor channel configurations (EEG at 250Hz, respiration, 3-axis accelerometer, 3-axis gyroscope). Battery at 83%.

Vibration control worked. Heating worked. EMS worked. Music worked. Claude built me a little web dashboard with sliders for everything. I was pretty happy with it.

That could have been the end of the story.

The server

Remember the hardcoded credentials from earl
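As an aside on the strings step described above: the post does not show the exact command or output, so the following is only a minimal Python sketch of what a strings-style pass over a compiled binary does. It scans for runs of printable ASCII and then keeps the runs that contain a keyword of interest. The sample blob, the broker URL, the function name in it, and the keyword list are all made up for illustration and are not taken from the actual app.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII at least min_len bytes long,
    similar in spirit to the Unix `strings` tool's default behaviour."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def find_interesting(strings: list[str], keywords: tuple[str, ...]) -> list[str]:
    """Keep only strings containing any keyword (case-insensitive)."""
    lowered = tuple(k.lower() for k in keywords)
    return [s for s in strings if any(k in s.lower() for k in lowered)]


# Hypothetical stand-in for a compiled blob: readable text surrounded by
# non-printable bytes. None of these values come from the real app.
SAMPLE_BLOB = (
    b"\x00\x01mqtt://broker.example.invalid:1883\x00\xff\x02ab\x00"
    b"setVibration\x00"
)

if __name__ == "__main__":
    found = extract_strings(SAMPLE_BLOB)
    for line in find_interesting(found, ("mqtt", "http", "password")):
        print(line)
```

On a real binary you would read the file with `open(path, "rb").read()` and pass the bytes to `extract_strings`; the keyword filter is what cuts through the framework noise the author mentions.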

Source: Hacker News | Original Link

uBlock filter list to hide all YouTube Shorts

GitHub – i5heu/ublock-hide-yt-shorts: Maintained – uBlock Origin filter list to hide YouTube Shorts Skip to content You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session. Dismiss alert i5heu / ublock-hide-yt-shorts Public forked from gijsdev/ublock-hide-yt-shorts Notifications You must be signed in to change notification settings Fork 0 Star 71 Maintained – uBlock Origin filter list to hide YouTube Shorts License MIT license 71 stars 124 forks Branches Tags Activity Star Notifications You must be signed in to change notification settings i5heu/ublock-hide-yt-shorts master Branches Tags Go to file Code Open more actions menu Folders and files Name Name Last commit message Last commit date Latest commit History 76 Commits 76 Commits .github .github .gitignore .gitignore CONTRIBUTING.md CONTRIBUTING.md LICENSE.md LICENSE.md README.md README.md comments.txt comments.txt list.txt list.txt View all files Repository files navigation uBlock filter list to hide all YouTube Shorts A maintained uBlock Origin filter list to hide all traces of YouTube shorts videos. Copy the link below, go to uBlock Origin > Dashboard > Filter lists, scroll to the bottom, and paste the link underneath the ‘Import…’ heading: https://raw.githubusercontent.com/i5heu/ublock-hide-yt-shorts/master/list.txt > uBlock Origin subscribe link < (does not work on GitHub) Bonus: hide YouTube Comments https://raw.githubusercontent.com/i5heu/ublock-hide-yt-shorts/master/comments.txt > uBlock Origin subscribe link < (does not work on GitHub) Maintancance After the initial createor of this list @gijsdev is now vanished for half a year, i ( i5heu ) took it on me to maintain this list. 
No affiliation to Alphabet, YouTube or Google This project is an independent, open-source initiative and is not affiliated with, endorsed by, sponsored by, or associated with Alphabet Inc., Google LLC, or YouTube. Contributing See CONTRIBUTING.md License See LICENSE.md About Maintained - uBlock Origin filter list to hide YouTube Shorts Resources Readme License MIT license Contributing Contributing Uh oh! There was an error while loading. Please reload this page . Activity Stars 71 stars Watchers 1 watching Forks 0 forks Report repository Releases No releases published Packages 0 No packages published You can’t perform that action at this time.

Source: Hacker News | Original Link

News publishers limit Internet Archive access due to AI scraping concerns

News publishers limit Internet Archive access due to AI scraping concerns | Nieman Journalism Lab HOME About Subscribe Archives Foundation Reports Storyboard LATEST STORY Washington Post layoffs disproportionately affected union members of color, preliminary Guild data shows Business Models Mobile & Apps Audience & Social Aggregation & Discovery Reporting & Production ABOUT SUBSCRIBE Business Models Mobile & Apps Audience & Social Aggregation & Discovery Reporting & Production Translations Jan. 28, 2026, 3:09 p.m. Aggregation & Discovery Business Models News publishers limit Internet Archive access due to AI scraping concerns Outlets like The Guardian and The New York Times are scrutinizing digital archives as potential backdoors for AI crawlers. By Andrew Deck and Hanaa’ Tameez Jan. 28, 2026, 3:09 p.m. Jan. 28, 2026, 3:09 p.m. As part of its mission to preserve the web, the Internet Archive operates crawlers that capture webpage snapshots. Many of these snapshots are accessible through its public-facing tool, the Wayback Machine . But as AI bots scavenge the web for training data to feed their models, the Internet Archive’s commitment to free information access has turned its digital library into a potential liability for some news publishers. When The Guardian took a look at who was trying to extract its content, access logs revealed that the Internet Archive was a frequent crawler, said Robert Hahn , head of business affairs and licensing. The publisher decided to limit the Internet Archive’s access to published articles, minimizing the chance that AI companies might scrape its content via the nonprofit’s repository of over one trillion webpage snapshots. 
RELATED ARTICLE The Wayback Machine’s snapshots of news homepages plummet after a “breakdown” in archiving projects Andrew Deck October 21, 2025 Specifically, Hahn said The Guardian has taken steps to exclude itself from the Internet Archive’s APIs and filter out its article pages from the Wayback Machine’s URLs interface. The Guardian’s regional homepages, topic pages, and other landing pages will continue to appear in the Wayback Machine. In particular, Hahn expressed concern about the Internet Archive’s APIs . “A lot of these AI businesses are looking for readily available, structured databases of content,” he said. “The Internet Archive’s API would have been an obvious place to plug their own machines into and suck out the IP.” (He admits the Wayback Machine itself is “less risky,” since the data is not as well-structured.) As news publishers try to safeguard their contents from AI companies, the Internet Archive is also getting caught in the crosshairs. The Financial Times, for example, blocks any bot that tries to scrape its paywalled content, including bots from OpenAI, Anthropic, Perplexity, and the Internet Archive. The majority of FT stories are paywalled, according to director of global public policy and platform strategy Matt Rogerson . As a result, usually only unpaywalled FT sto

Source: Hacker News | Original Link

My smart sleep mask broadcasts users’ brainwaves to an open MQTT broker

My smart sleep mask broadcasts users’ brainwaves to an open MQTT broker | aimilios My smart sleep mask broadcasts users’ brainwaves to an open MQTT broker 12 Feb, 2026 I recently got a smart sleep mask from Kickstarter. I was not expecting to end up with the ability to read strangers’ brainwaves and send them electric impulses in their sleep. But here we are. The mask was from a small Chinese research company, very cool hardware — EEG brain monitoring, electrical muscle stimulation around the eyes, vibration, heating, audio. The app was still rough around the edges though and the mask kept disconnecting, so I asked Claude to try reverse-engineer the Bluetooth protocol and build me a simple web control panel instead. Bluetooth The first thing Claude did was scan for BLE (Bluetooth Low Energy) devices nearby. It found mine among 35 devices in range, connected, and mapped the interface — two data channels. One for sending commands, one for streaming data. Then it tried talking to it. Sent maybe a hundred different command patterns. Modbus frames, JSON, raw bytes, common headers. Unfortunately, the device said nothing back, the protocol was not a standard one. The app So Claude went after the app instead. Grabbed the Android APK, decompiled it with jadx. Turns out the app is built with Flutter, which is a bit of a problem for reverse engineering. Flutter compiles Dart source code into native ARM64 machine code — you can’t just read it back like normal Java Android apps. The actual business logic lives in a 9MB binary blob. But even compiled binaries have strings in them. Error messages, URLs, debug logs. Claude ran strings on the binary and this was the most productive step of the whole session. Among the thousands of lines of Flutter framework noise, it found: Hardcoded credentials for the company’s message broker (shared by every copy of the app) Cloud API endpoints All fifteen command builder function names (e.g. 
to set vibration, heating, electric stimulation, etc.) Protocol debug messages that revealed the packet structure — header, direction byte, command type, payload, footer We had the shape of the protocol. Still didn’t have the actual byte values though. Claude then used blutter , a tool specifically for decompiling Flutter’s compiled Dart snapshots. It reconstructs the functions with readable annotations. Claude figured out the encoding, and just read off every command byte from every function. Fifteen commands, fully mapped. It works Claude sent a six-byte query packet. The device came back with 153 bytes — model number, firmware version, serial number, all eight sensor channel configurations (EEG at 250Hz, respiration, 3-axis accelerometer, 3-axis gyroscope). Battery at 83%. Vibration control worked. Heating worked. EMS worked. Music worked. Claude built me a little web dashboard with sliders for everything. I was pretty happy with it. That could have been the end of the story. The server Remember the hardcoded credentials from earl

Source: Hacker News | Original Link

uBlock filter list to hide all YouTube Shorts

GitHub – i5heu/ublock-hide-yt-shorts: Maintained – uBlock Origin filter list to hide YouTube Shorts Skip to content You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session. Dismiss alert i5heu / ublock-hide-yt-shorts Public forked from gijsdev/ublock-hide-yt-shorts Notifications You must be signed in to change notification settings Fork 0 Star 71 Maintained – uBlock Origin filter list to hide YouTube Shorts License MIT license 71 stars 124 forks Branches Tags Activity Star Notifications You must be signed in to change notification settings i5heu/ublock-hide-yt-shorts master Branches Tags Go to file Code Open more actions menu Folders and files Name Name Last commit message Last commit date Latest commit History 76 Commits 76 Commits .github .github .gitignore .gitignore CONTRIBUTING.md CONTRIBUTING.md LICENSE.md LICENSE.md README.md README.md comments.txt comments.txt list.txt list.txt View all files Repository files navigation uBlock filter list to hide all YouTube Shorts A maintained uBlock Origin filter list to hide all traces of YouTube shorts videos. Copy the link below, go to uBlock Origin > Dashboard > Filter lists, scroll to the bottom, and paste the link underneath the ‘Import…’ heading: https://raw.githubusercontent.com/i5heu/ublock-hide-yt-shorts/master/list.txt > uBlock Origin subscribe link < (does not work on GitHub) Bonus: hide YouTube Comments https://raw.githubusercontent.com/i5heu/ublock-hide-yt-shorts/master/comments.txt > uBlock Origin subscribe link < (does not work on GitHub) Maintancance After the initial createor of this list @gijsdev is now vanished for half a year, i ( i5heu ) took it on me to maintain this list. 
No affiliation to Alphabet, YouTube or Google This project is an independent, open-source initiative and is not affiliated with, endorsed by, sponsored by, or associated with Alphabet Inc., Google LLC, or YouTube. Contributing See CONTRIBUTING.md License See LICENSE.md About Maintained - uBlock Origin filter list to hide YouTube Shorts Resources Readme License MIT license Contributing Contributing Uh oh! There was an error while loading. Please reload this page . Activity Stars 71 stars Watchers 1 watching Forks 0 forks Report repository Releases No releases published Packages 0 No packages published You can’t perform that action at this time.

Source: Hacker News | Original Link

News publishers limit Internet Archive access due to AI scraping concerns

News publishers limit Internet Archive access due to AI scraping concerns | Nieman Journalism Lab HOME About Subscribe Archives Foundation Reports Storyboard LATEST STORY Washington Post layoffs disproportionately affected union members of color, preliminary Guild data shows Business Models Mobile & Apps Audience & Social Aggregation & Discovery Reporting & Production ABOUT SUBSCRIBE Business Models Mobile & Apps Audience & Social Aggregation & Discovery Reporting & Production Translations Jan. 28, 2026, 3:09 p.m. Aggregation & Discovery Business Models News publishers limit Internet Archive access due to AI scraping concerns Outlets like The Guardian and The New York Times are scrutinizing digital archives as potential backdoors for AI crawlers. By Andrew Deck and Hanaa’ Tameez Jan. 28, 2026, 3:09 p.m. Jan. 28, 2026, 3:09 p.m. As part of its mission to preserve the web, the Internet Archive operates crawlers that capture webpage snapshots. Many of these snapshots are accessible through its public-facing tool, the Wayback Machine . But as AI bots scavenge the web for training data to feed their models, the Internet Archive’s commitment to free information access has turned its digital library into a potential liability for some news publishers. When The Guardian took a look at who was trying to extract its content, access logs revealed that the Internet Archive was a frequent crawler, said Robert Hahn , head of business affairs and licensing. The publisher decided to limit the Internet Archive’s access to published articles, minimizing the chance that AI companies might scrape its content via the nonprofit’s repository of over one trillion webpage snapshots. 
RELATED ARTICLE The Wayback Machine’s snapshots of news homepages plummet after a “breakdown” in archiving projects Andrew Deck October 21, 2025 Specifically, Hahn said The Guardian has taken steps to exclude itself from the Internet Archive’s APIs and filter out its article pages from the Wayback Machine’s URLs interface. The Guardian’s regional homepages, topic pages, and other landing pages will continue to appear in the Wayback Machine. In particular, Hahn expressed concern about the Internet Archive’s APIs . “A lot of these AI businesses are looking for readily available, structured databases of content,” he said. “The Internet Archive’s API would have been an obvious place to plug their own machines into and suck out the IP.” (He admits the Wayback Machine itself is “less risky,” since the data is not as well-structured.) As news publishers try to safeguard their contents from AI companies, the Internet Archive is also getting caught in the crosshairs. The Financial Times, for example, blocks any bot that tries to scrape its paywalled content, including bots from OpenAI, Anthropic, Perplexity, and the Internet Archive. The majority of FT stories are paywalled, according to director of global public policy and platform strategy Matt Rogerson . As a result, usually only unpaywalled FT sto

Source: Hacker News | Original Link

My smart sleep mask broadcasts users’ brainwaves to an open MQTT broker

My smart sleep mask broadcasts users’ brainwaves to an open MQTT broker | aimilios My smart sleep mask broadcasts users’ brainwaves to an open MQTT broker 12 Feb, 2026 I recently got a smart sleep mask from Kickstarter. I was not expecting to end up with the ability to read strangers’ brainwaves and send them electric impulses in their sleep. But here we are. The mask was from a small Chinese research company, very cool hardware — EEG brain monitoring, electrical muscle stimulation around the eyes, vibration, heating, audio. The app was still rough around the edges though and the mask kept disconnecting, so I asked Claude to try reverse-engineer the Bluetooth protocol and build me a simple web control panel instead. Bluetooth The first thing Claude did was scan for BLE (Bluetooth Low Energy) devices nearby. It found mine among 35 devices in range, connected, and mapped the interface — two data channels. One for sending commands, one for streaming data. Then it tried talking to it. Sent maybe a hundred different command patterns. Modbus frames, JSON, raw bytes, common headers. Unfortunately, the device said nothing back, the protocol was not a standard one. The app So Claude went after the app instead. Grabbed the Android APK, decompiled it with jadx. Turns out the app is built with Flutter, which is a bit of a problem for reverse engineering. Flutter compiles Dart source code into native ARM64 machine code — you can’t just read it back like normal Java Android apps. The actual business logic lives in a 9MB binary blob. But even compiled binaries have strings in them. Error messages, URLs, debug logs. Claude ran strings on the binary and this was the most productive step of the whole session. Among the thousands of lines of Flutter framework noise, it found: Hardcoded credentials for the company’s message broker (shared by every copy of the app) Cloud API endpoints All fifteen command builder function names (e.g. 
to set vibration, heating, electric stimulation, etc.) Protocol debug messages that revealed the packet structure — header, direction byte, command type, payload, footer We had the shape of the protocol. Still didn’t have the actual byte values though. Claude then used blutter , a tool specifically for decompiling Flutter’s compiled Dart snapshots. It reconstructs the functions with readable annotations. Claude figured out the encoding, and just read off every command byte from every function. Fifteen commands, fully mapped. It works Claude sent a six-byte query packet. The device came back with 153 bytes — model number, firmware version, serial number, all eight sensor channel configurations (EEG at 250Hz, respiration, 3-axis accelerometer, 3-axis gyroscope). Battery at 83%. Vibration control worked. Heating worked. EMS worked. Music worked. Claude built me a little web dashboard with sliders for everything. I was pretty happy with it. That could have been the end of the story. The server Remember the hardcoded credentials from earl

Source: Hacker News | Original Link

uBlock filter list to hide all YouTube Shorts

GitHub – i5heu/ublock-hide-yt-shorts: Maintained – uBlock Origin filter list to hide YouTube Shorts Skip to content You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session. Dismiss alert i5heu / ublock-hide-yt-shorts Public forked from gijsdev/ublock-hide-yt-shorts Notifications You must be signed in to change notification settings Fork 0 Star 71 Maintained – uBlock Origin filter list to hide YouTube Shorts License MIT license 71 stars 124 forks Branches Tags Activity Star Notifications You must be signed in to change notification settings i5heu/ublock-hide-yt-shorts master Branches Tags Go to file Code Open more actions menu Folders and files Name Name Last commit message Last commit date Latest commit History 76 Commits 76 Commits .github .github .gitignore .gitignore CONTRIBUTING.md CONTRIBUTING.md LICENSE.md LICENSE.md README.md README.md comments.txt comments.txt list.txt list.txt View all files Repository files navigation uBlock filter list to hide all YouTube Shorts A maintained uBlock Origin filter list to hide all traces of YouTube shorts videos. Copy the link below, go to uBlock Origin > Dashboard > Filter lists, scroll to the bottom, and paste the link underneath the ‘Import…’ heading: https://raw.githubusercontent.com/i5heu/ublock-hide-yt-shorts/master/list.txt > uBlock Origin subscribe link < (does not work on GitHub) Bonus: hide YouTube Comments https://raw.githubusercontent.com/i5heu/ublock-hide-yt-shorts/master/comments.txt > uBlock Origin subscribe link < (does not work on GitHub) Maintancance After the initial createor of this list @gijsdev is now vanished for half a year, i ( i5heu ) took it on me to maintain this list. 
No affiliation to Alphabet, YouTube or Google This project is an independent, open-source initiative and is not affiliated with, endorsed by, sponsored by, or associated with Alphabet Inc., Google LLC, or YouTube. Contributing See CONTRIBUTING.md License See LICENSE.md About Maintained - uBlock Origin filter list to hide YouTube Shorts Resources Readme License MIT license Contributing Contributing Uh oh! There was an error while loading. Please reload this page . Activity Stars 71 stars Watchers 1 watching Forks 0 forks Report repository Releases No releases published Packages 0 No packages published You can’t perform that action at this time.

Source: Hacker News | Original Link

News publishers limit Internet Archive access due to AI scraping concerns

News publishers limit Internet Archive access due to AI scraping concerns | Nieman Journalism Lab HOME About Subscribe Archives Foundation Reports Storyboard LATEST STORY Washington Post layoffs disproportionately affected union members of color, preliminary Guild data shows Business Models Mobile & Apps Audience & Social Aggregation & Discovery Reporting & Production ABOUT SUBSCRIBE Business Models Mobile & Apps Audience & Social Aggregation & Discovery Reporting & Production Translations Jan. 28, 2026, 3:09 p.m. Aggregation & Discovery Business Models News publishers limit Internet Archive access due to AI scraping concerns Outlets like The Guardian and The New York Times are scrutinizing digital archives as potential backdoors for AI crawlers. By Andrew Deck and Hanaa’ Tameez Jan. 28, 2026, 3:09 p.m. Jan. 28, 2026, 3:09 p.m. As part of its mission to preserve the web, the Internet Archive operates crawlers that capture webpage snapshots. Many of these snapshots are accessible through its public-facing tool, the Wayback Machine . But as AI bots scavenge the web for training data to feed their models, the Internet Archive’s commitment to free information access has turned its digital library into a potential liability for some news publishers. When The Guardian took a look at who was trying to extract its content, access logs revealed that the Internet Archive was a frequent crawler, said Robert Hahn , head of business affairs and licensing. The publisher decided to limit the Internet Archive’s access to published articles, minimizing the chance that AI companies might scrape its content via the nonprofit’s repository of over one trillion webpage snapshots. 
RELATED ARTICLE The Wayback Machine’s snapshots of news homepages plummet after a “breakdown” in archiving projects Andrew Deck October 21, 2025 Specifically, Hahn said The Guardian has taken steps to exclude itself from the Internet Archive’s APIs and filter out its article pages from the Wayback Machine’s URLs interface. The Guardian’s regional homepages, topic pages, and other landing pages will continue to appear in the Wayback Machine. In particular, Hahn expressed concern about the Internet Archive’s APIs . “A lot of these AI businesses are looking for readily available, structured databases of content,” he said. “The Internet Archive’s API would have been an obvious place to plug their own machines into and suck out the IP.” (He admits the Wayback Machine itself is “less risky,” since the data is not as well-structured.) As news publishers try to safeguard their contents from AI companies, the Internet Archive is also getting caught in the crosshairs. The Financial Times, for example, blocks any bot that tries to scrape its paywalled content, including bots from OpenAI, Anthropic, Perplexity, and the Internet Archive. The majority of FT stories are paywalled, according to director of global public policy and platform strategy Matt Rogerson . As a result, usually only unpaywalled FT sto

Source: Hacker News | Original Link

My smart sleep mask broadcasts users’ brainwaves to an open MQTT broker

aimilios, 12 Feb, 2026

I recently got a smart sleep mask from Kickstarter. I was not expecting to end up with the ability to read strangers’ brainwaves and send them electric impulses in their sleep. But here we are.

The mask was from a small Chinese research company, with very cool hardware: EEG brain monitoring, electrical muscle stimulation around the eyes, vibration, heating, audio. The app was still rough around the edges, though, and the mask kept disconnecting, so I asked Claude to try to reverse-engineer the Bluetooth protocol and build me a simple web control panel instead.

Bluetooth

The first thing Claude did was scan for BLE (Bluetooth Low Energy) devices nearby. It found mine among 35 devices in range, connected, and mapped the interface: two data channels, one for sending commands and one for streaming data.

Then it tried talking to it. It sent maybe a hundred different command patterns: Modbus frames, JSON, raw bytes, common headers. Unfortunately, the device said nothing back; the protocol was not a standard one.

The app

So Claude went after the app instead. It grabbed the Android APK and decompiled it with jadx. It turns out the app is built with Flutter, which is a bit of a problem for reverse engineering. Flutter compiles Dart source code into native ARM64 machine code, so you can’t just read it back like a normal Java Android app. The actual business logic lives in a 9MB binary blob.

But even compiled binaries have strings in them: error messages, URLs, debug logs. Claude ran strings on the binary, and this was the most productive step of the whole session. Among the thousands of lines of Flutter framework noise, it found:

- Hardcoded credentials for the company’s message broker (shared by every copy of the app)
- Cloud API endpoints
- All fifteen command builder function names (e.g. to set vibration, heating, electric stimulation, etc.)
- Protocol debug messages that revealed the packet structure: header, direction byte, command type, payload, footer

We had the shape of the protocol. We still didn’t have the actual byte values, though.

Claude then used blutter, a tool specifically for decompiling Flutter’s compiled Dart snapshots. It reconstructs the functions with readable annotations. Claude figured out the encoding and just read off every command byte from every function. Fifteen commands, fully mapped.

It works

Claude sent a six-byte query packet. The device came back with 153 bytes: model number, firmware version, serial number, and all eight sensor channel configurations (EEG at 250Hz, respiration, 3-axis accelerometer, 3-axis gyroscope). Battery at 83%.

Vibration control worked. Heating worked. EMS worked. Music worked. Claude built me a little web dashboard with sliders for everything. I was pretty happy with it. That could have been the end of the story.

The server

Remember the hardcoded credentials from earl
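
To illustrate the strings step described in the post, here is a minimal Python sketch of what a strings-style scan does: it pulls runs of printable ASCII out of a binary blob and filters them by keyword. This is not the tooling used in the post. The sample bytes, the broker URL, the function name `buildSetVibrationCommand`, and the keyword list are all made up for illustration.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of printable ASCII that is at least min_len bytes long."""
    # 0x20-0x7e is the printable ASCII range (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def find_interesting(strings: list[str], keywords: tuple[str, ...]) -> list[str]:
    """Keep only strings containing one of the keywords (case-insensitive)."""
    lowered = tuple(k.lower() for k in keywords)
    return [s for s in strings if any(k in s.lower() for k in lowered)]


if __name__ == "__main__":
    # Hypothetical stand-in for a compiled app binary; not real data.
    blob = (
        b"\x00\x01mqtt://broker.example.invalid:1883\x00\xff\x10abc\x00"
        b"buildSetVibrationCommand\x00"
    )
    found = extract_strings(blob)
    print(found)
    print(find_interesting(found, ("mqtt", "http", "password")))
```

On a real binary the first function returns thousands of lines, so the keyword filter (or a plain text search over the output) does most of the useful work.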

Source: Hacker News | Original Link
