British Library sets out to archive the Web

The British Library will use an automated Web harvester to archive about a billion pages, some of them daily.

Lefteris Pitarakis/Associated Press

LONDON — For centuries, the British Library has kept a copy of every book, pamphlet, magazine, and newspaper published in Britain. Starting Saturday, it will also be bound to record every British website, e-book, online newsletter, and blog in a bid to preserve the nation’s ‘‘digital memory.’’ The library also has to make this digital archive available to future researchers.

It says the work is urgent; firsthand accounts of everything from the 2005 London transit bombings to Britain’s 2010 election campaign have already vanished.

‘‘Stuff out there on the Web is ephemeral,’’ said Lucie Burgess, head of content strategy. ‘‘The average life of a Web page is only 75 days.’’

Like reference collections worldwide, the British Library has been trying to archive the Web for years in a piecemeal way, having to get permission from website owners before taking snapshots of their pages. That began to change with a law passed in 2003, but it has taken a decade of legislative and technological preparation to begin a vast trawling of all sites that end with the suffix .uk.

An automated Web harvester will scan and record 1 billion Web pages. Most will be captured once a year, but hundreds of thousands of fast-changing sites such as those of newspapers and magazines will be archived as often as once a day. The library plans to make the content publicly available by year’s end.
