AI web crawlers are specialized bots that scan and copy vast amounts of online content so it can be used to train AI models. Over the past few years, as these models have grown larger and hungrier for data, crawling activity has surged, especially from bots belonging to major AI companies.
For many smaller websites, this has meant sudden traffic spikes they never asked for: slower load times, blown bandwidth budgets and unexpected infrastructure strain. To cope, site owners have had to upgrade hosting and security, pushing up costs just to stay online.
Beneath the technical stress lies a deeper concern. Much of this crawling happens without meaningful consent, compensation or transparency for the people and organizations who created the content. Publishers, creators and platforms are increasingly asking why their work should be freely harvested to improve commercial AI systems.
In response, more sites are starting to block or limit AI crawlers, and infrastructure providers are experimenting with tools that let owners meter, charge for or deny access. The open web is being quietly reshaped by this tug‑of‑war, and it is unlikely to function the way it once did.
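The simplest of these blocking measures is robots.txt. A sketch of what that looks like, using the user-agent tokens the major AI companies have published for their training crawlers; note that robots.txt is a request, not an enforcement mechanism, so non-compliant bots must be stopped at the server or CDN level instead:

```
# robots.txt — ask common AI training crawlers to skip this site.
# Compliance is voluntary; well-behaved bots honor it, others ignore it.

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /
```

`Google-Extended` is notable here: it controls use of a site's content for AI training without affecting ordinary Google Search indexing, which is exactly the kind of fine-grained opt-out publishers have been asking for.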