LONDON — For centuries, the British Library has kept a copy of every book, pamphlet, magazine, and newspaper published in Britain. Starting Saturday, it will also be bound to record every British website, e-book, online newsletter, and blog in a bid to preserve the nation’s “digital memory.” The library also has to make this digital archive available to future researchers.
It says the work is urgent; firsthand accounts of everything from the 2005 London transit bombings to Britain’s 2010 election campaign have already vanished.
“Stuff out there on the Web is ephemeral,” said Lucie Burgess, head of content strategy. “The average life of a Web page is only 75 days.”
Like reference collections worldwide, the British Library has been trying to archive the Web for years in a piecemeal way, having to get permission from website owners before taking snapshots of their pages. That began to change with a law passed in 2003, but it has taken a decade of legislative and technological preparation to begin a vast trawling of all sites that end with the suffix .uk.
An automated Web harvester will scan and record 1 billion Web pages. Most will be captured once a year, but hundreds of thousands of fast-changing sites such as those of newspapers and magazines will be archived as often as once a day. The library plans to make the content publicly available by year’s end.