One Day the Webpages You Rely on Will Disappear

It’s a matter of ‘link rot and digital decay,’ says Pew Research.

Webpages might seem like old hat in the greater context of cloud services, mobile apps, artificial intelligence, and everything else that comes off the tech conveyor belt.

But the web is still home to the information services that people and organizations regularly rely on: articles, books, graphic data displays, videos, music, databases, and much more. And entire pages are disappearing at an alarming rate.

It might seem that digitization would preserve the world’s knowledge, but digital information can be lost in a variety of ways, including digital obsolescence. Storage formats fall out of use, and then the software and hardware needed to read them can be next to impossible to find. There are only so many human, technical, and financial resources available to save the massive amounts of data that the world generates.

And then there’s the lack of ongoing maintenance, which is a problem with the web, according to Pew Research. “A quarter of all webpages that existed at one point between 2013 and 2023 are no longer accessible, as of October 2023,” they write. “In most cases, this is because an individual page was deleted or removed on an otherwise functional website. For older content, this trend is even starker. Some 38% of webpages that existed in 2013 are not available today, compared with 8% of pages that existed in 2023.”

According to Pew, 23% of news webpages have at least one broken link. For government sites, it’s 21%. No matter how popular the sites or pages, the statistics are about the same. For Wikipedia pages, more than half have at least one broken link in their references section.

Social media is no better and may be worse. On X/Twitter, almost 20% of tweets disappear within months. “In 60% of these cases, the account that originally posted the tweet was made private, suspended or deleted entirely,” Pew says. “In the other 40%, the account holder deleted the individual tweet, but the account itself still existed.”

This has a number of implications for companies, including those in CRE. They may be unable to find reference material that they expected to be available. Sometimes that can be worked around by going to Archive.org, searching its archive for the URL, and picking an older version of the page to see whether the information was captured.
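For teams that would rather script the Archive.org lookup than do it by hand, here is a minimal sketch in Python. It assumes the Internet Archive's public availability endpoint at https://archive.org/wayback/available and the JSON layout it returns (an "archived_snapshots" object with a "closest" entry). The function names are hypothetical and chosen only for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the Internet Archive's public availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(url, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(payload):
    """Return the archived URL from a parsed response, or None if there is none."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archived_copy(url, timestamp=None, timeout=10):
    """Ask the archive for the snapshot closest to the given date (network call)."""
    with urllib.request.urlopen(build_query(url, timestamp), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))
```

Calling find_archived_copy("example.com", "20130101") would ask for the capture nearest to January 1, 2013, and return None when nothing was archived. A missing result only means the archive has no copy, not that the page never existed.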

It doesn’t help when a company fails to keep up with its own websites by checking for broken links and replacing them with other sources where possible. Monitoring your own sites for these breakages should be an ongoing project.
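As a rough illustration of what that monitoring could look like, the following Python sketch pulls the links out of a page's HTML and flags the ones that fail to load. It is only a starting point under stated assumptions: all names are hypothetical, it uses the standard library alone, and it treats any HTTP status of 400 or higher (or no response at all) as broken. Some servers reject HEAD requests, so flagged links are worth confirming by hand.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return absolute http(s) links found in the HTML, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [link for link in collector.links
            if link.startswith(("http://", "https://"))]


def http_status(url, timeout=10):
    """Return the HTTP status for a HEAD request, or None if unreachable."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a deleted page
    except OSError:
        return None  # DNS failure, timeout, refused connection


def find_broken_links(html, base_url, status_of=http_status):
    """Return (link, status) pairs for links that look broken."""
    broken = []
    for link in extract_links(html, base_url):
        status = status_of(link)
        if status is None or status >= 400:
            broken.append((link, status))
    return broken
```

The status_of parameter exists so the check can be exercised without touching the network, for example by passing a lookup over known results. Run on a schedule against each page of a site, the output becomes a to-do list of links to replace.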