Do Libraries Dream of Electric Sheep?


Please distribute this article. I would be glad to hear any comments readers might have.

By Lisbet Rausing

Abstract: Imagine a New Alexandria. Imagine a library that contains the natural and social sciences of the West, peer-reviewed publications, archives, and collections.  It is electronic and in the public domain: stable, linked and searchable. New Alexandria demands an ethos of digital conservation, scholarship and public access.  It needs to be a long-term public good, hosted by reputable non-profit institutions, in stable jurisdictions. But it is technically possible. After all, Google Books aims for c. 16 million books,[2] and the non-profit Internet Archive has c. 1 million volumes.

We live in the age of electronic reproduction. The technological future is certain, and it is created by scholars, entrepreneurs and amateurs, in a Schumpeterian process of creative destruction. Even our most traditional products of learning (monographs, academic articles, etc.) are disintermediating. As marginal costs of replicas near zero, what constitutes an archive, a publisher, a bookseller, or a book, is all put in question. Things fly apart at the seams.

The question is not whether change is coming. It is who will be the change-makers.  This is not a challenge of technology or finances. It is not, even, a question of law, though copyright legislation is fiendishly destructive of the democratisation of scholarly knowledge. This challenges state-builders and gatekeepers: academics, librarians, foundation staff, politicians and civil servants. Can we build structures that will preserve and order, but also share and disseminate, the world’s learning and cultures? It is a challenge of will and imagination. Here, I discuss three such challenges: the cult of the artefact, the problem of abundance, and the question of the audience.

Do Libraries Dream of Electric Sheep?[1]

The cult of the artefact is a story of our imaginary horizons. Our iconic library stories are romances of destruction, decay and amnesia. We still mourn Alexandria. We revere St Catherine’s Monastery, the Vatican archives, and the Dead Sea Scrolls. We grieve over the Christians closing the academy of Athens, and over the fall of Constantinople, where in desperation the last Grecian scholars lit the cannons with their manuscripts. Boethius, the monks of Iona, the fleeing Byzantine humanists – these are our heroes and role models.

Perhaps this is why the great libraries of the West concern themselves so centrally with the single and exceptional object,  while hiding the purchases of scholarly databases in their yearly reports’ sub-clauses and footnotes. But should we rejoice when dwindling acquisition budgets are spent on “rare books,”  “rare,” admittedly, but not in a meaningful sense threatened or endangered? And if so, why? Throughout history, libraries have struggled against destruction. They still operate within an imaginary economy of scarcity – attempting to “save” rare books.  But what does that code of “rescue” denote, except a Benjaminian cult of the physical artefact?

In an era of electronic abundance, how can libraries archive the dreams and experiences of humankind? What do we discard? If a library is no longer a warehouse of treasures, what is it? Harvard’s c. 16 million volumes rival those of Google Books. One collection took nearly four hundred years to achieve: the other, less than a decade. Harvard’s ambitions in 1638 were universal: to hold all knowledge. But how can this be achieved today?

Bibliographies, dictionaries, encyclopaedias, library catalogues, scholarly journals and so on are all dematerializing, as they move into “the cloud.”[3] But what about processes rather than products of knowledge, such as lab books, lectures, conference proceedings, data sets, and course work? The papers of Newton, Darwin, Einstein, and Bohr were finite. But what about “big science”? The Large Hadron Collider at CERN takes 90 million measurements 600 million times a second, analysed by c. 6,000 physicists. How is that to be archived? Worldwide, scientific data files are approaching a petabyte.[4] Every year, they double. Even artisanal lab skills are now recorded, on wikis.[5]

Our Alexandria was not burnt, our Byzantium still stands, and our Athenian academies are blossoming. And next to our scholarly endeavours, and our rare, well-studied, cultic artefacts, we want to preserve ephemera—esoteric traditions, dying languages, oral memories. We now know that we understand only slowly what will last through the ages.  What if our next “peasant poet,” as John Clare was known, twitters,[6] writes a blog or shojo manga, or publishes via a desktop? Is that legal deposit? What if a Nigerian novelette (typically addressing a young heroine’s agonized choice between a village boy and a “big man”) is written by a Jane Austen?

What is the library when we can safeguard memory and images themselves, and completely: if we record / remember everything, then what will a library (selective, ordered) look like? What happens when people think the “canon” means an online strategy war game, or a shojo manga? What is the library in the era of the internet (1974), the web (1991), or Google (1998)? What is the library in its Second Life?[7]

In 2008, Tim Berners-Lee noted that the web can be modelled by biological concepts: plasticity, population dynamics, food chains, and ecosystems.[8] Does understanding the web mean grasping its quasi-biological whole? Do libraries dream of electric sheep? Do electric sheep dream of libraries?  Will Second Life take on life? And if so, what will be its – and our – library?

¤ ¤ ¤ ¤

As the open web movement has it, an old tradition and a new technology together enable an unprecedented public good.[9] The “new technology” means that the near-zero marginal costs of electronic replicas allow disintermediation. The “old tradition” means that scholars publish without pay, for peer recognition and social utility. Universities, recognizing this, say they “produce, preserve, and propagate knowledge.” But look closely at their libraries. They serve their faculty and students, and, when feasible, scholars at peer institutions. They do not serve the public.

Fifty years ago, that may not have mattered. But today, people are educated and engaged. As disinterested scholars, they participate in the collective projects of knowledge the technological rupture has enabled, such as Wikipedia, GalaxyZoo, ESP, Africa@home, Herbaria@home or SETI.[10] This mass voluntary participation at times is “grunt work,” related to image recognition.[11] But scholars also engage with the “hive mind,” or the public, in complex or interpretative work. For example, the Rothschild family and others are putting the Dead Sea Scroll fragments into the public domain, engaging with religious communities that have unparalleled language skills.

But on the whole, scholars exclude the public from their “core” research materials, such as House of Commons Parliamentary Papers, Historical Statistics of the United States Online, BMJ Clinical Evidence, Early English Literature Online, eHRAF Collection of Ethnography, Index of Christian Art, Index Islamicus, Oxford Music Online, and ARTstor. Many commercially owned databases demand eye-watering fees, and/or only allow institutional subscribers. Even university-controlled collections are expensive.

It is a scandal that academic databases and research tools are unavailable to the public. After all, the public has paid for them – through research grants, tax breaks, and donations.  Why should only scholars affiliated to universities have access to PhDs, MAs, and JSTOR?[12]

Academic databases are at least digital. Public access – the right to roam – is a press-of-the-button away. But academic monographs, while produced digitally, are then – in an act of collective academic madness – turned into non-searchable paper. And all academic writing, digital or not, is kept from the public domain for the authors’ lifetime plus seventy years.

Academic materials, being a public good, should obviously not fall under commercial copyright.  Nor should “orphan” works (out-of-print books, without known copyright holders).  But restrictive fair use rules mean that libraries do not dare digitalize their orphan works.  In the age of electronic reproduction, many books remain as rare as Gutenberg Bibles.

Today, scholars – working for public institutions, paid by taxpayers – sign over their copyright to for-profit presses and journals. At best, they illicitly put their research on their websites. A “don’t ask don’t tell” stand-off means that free public access to scholarship exists only in fragments, in violation of copyright, and by means of unstable self-archiving.

To copyright legislation, add market failure. The inflation rate for scholarly monographs is high, and prices are hyper-inflating for commercial academic journals, where three firms control over 80% of the market. The price per page for commercial journals is up to 12 times more than for non-profit ones, and not because they are better. In the field of economics, the cost per citation is 16 times higher in commercial journals than in those published by scholarly societies.[13] More recently, Google and the publishing industry have created “an effective cartel,” with “significant barriers to entry.” As the FT rightly has noted, an “effective monopoly provider” always eventually charges monopolistic and discriminatory prices: “just as happened with academic journals in the past.”[14]

Let’s rehearse once more how university research is disseminated today: publicly funded institutions first give away, and then buy back, their own research. Adding insult to injury, the scholars who sign over their copyrights for free to for-profit journals also donate their labour for free, as volunteer peer reviewers and editors. It is, shall we say, an unusual business model: the producer gives away a product he then buys back, having helped the intermediary package it.

There are worthwhile initiatives to make scholarship public. Some 10% of Anglophone academic journals are now open access. The “gold” ones are edited and peer-reviewed, and with prestige-factors equalized, citation rates are significantly higher for open access articles.[15] Yet as long as journals and university press brands are a proxy for quality to tenure committees, the stranglehold of commerce will remain. This stranglehold is not only ravaging university budgets. It also blocks the emergence of a wise and learned commonwealth, by disallowing free access to good, peer-reviewed data. Arguably, this is also a legal, freedom of information matter.

We thus urgently need better laws and wiser funding of university research. Our nudges need to go the other way. Why not presume open access, along the lines of presuming organ donor intent?  Why not make copyright something that needs to be asserted and renewed?  Could copyright automatically lapse, when it stops generating income?  Should not university presses release their tax-financed backlists into the public domain?  Could university libraries make alumni members?  Should university library catalogues be turned into blogs, allowing university members – or the public – to add commentaries and hyperlinks?

Institutions need to take a stance. Think only of the British Library’s feeble response to the 2006 Gowers Review of Intellectual Property: it pleaded for unpublished works to have “only” a copyright of life plus 70 years,[16] and it humbly asked to be allowed to make single copies of old sound and film recordings, since the then proposed extension of the 50-year music copyright to 95 years meant the certain destruction of most of the British Library Sound Archive.[17] Moreover, what it allows the British public to access for free, it sometimes sells to commercial interests abroad.

Only those scholarly fields which few professors, let alone the public, understand, are public domain. High-energy physics and molecular biology are open to all. But 20th century scholarship in the humanities and social sciences remains locked away by “The Sonny Bono Copyright Term Extension Act of 1998” (also known as the “Mickey Mouse Protection Act”).[18] Look at the academic journal collection, JSTOR (if you can). Here you find the foundational work of the social sciences and the humanities – all closed to the public. The opportunity costs for society are self-evident. The public is rewriting knowledge through Wikipedia and the like. Should these sites not be hyperlinked with JSTOR?[19] By excluding the public from scholarly literature, copyright laws prevent amateurs from using sound research methodologies.

But what about the opportunity cost for scholars and the reputational risk for universities? The web tech community is working on how to verify information on the web, to “engineer layers of trust and provenance.”  The question is not whether the web will become scholarly in some meaningful sense. It is whether twentieth-century scholarship will be integrated into the scholarly world of the web.[20] Will universities become bystanders in the world of open access knowledge?[21]

If scholars hide away and lock up their knowledge, do they not risk their own irrelevance? Today’s academics fail to engage with their immediate constituency (and former students): journalists, business leaders, professionals, politicians and civil servants. Yet these people house and feed professors. Is it not in universities’ interests to let the educated bourgeoisie, and indeed the public at large, even look at, say, the Index of Christian Art?

¤ ¤ ¤ ¤

Half a millennium ago, German town folk were dazzled by the thought that, thanks to their new-fangled printing presses, God’s word would now be put in the hands of the laity. There would be no need for intermediaries. God’s word would speak, not through the clergy, but to each humble soul.

Of course the intermediaries struck back – the Counter-Reformation was arguably just that, a rebellion of intermediaries.  But the technological rupture of the printing press was such that disintermediation was inevitable over the longue durée. We became – and look closely at the word – Protestants.

Today, at the dawn of the age of electronic reproduction, the intermediaries are again striking back. The publishers are the most blatant and crude. But academics are also intermediaries, and they too are striking back: university libraries are closed shops, JSTOR remains blocked, theses are inaccessible, and academic monographs are available only on paper and at prohibitive prices.

The obstacles to a true and electronic Reformation are real, but to be found also in “business as usual,” a reluctance to imaginatively re-draw practices, and tear down organisational and legal “silos”. Remember Henry Ford’s comment: “If I had asked my customers what they wanted, they would have asked for a better horse carriage.”

Obstacles can delay, but not stop, a technological rupture of this magnitude.  Our children – always on, multi-tasking, mobile – will not engage with a body of scholarship their elders have incomprehensibly surrounded by barbed wire. But they will remain engaged in learning. The question is not whether there will be future scholars. It is how these future scholars will remember and integrate previous scholarship. And in pondering that, which means pondering the scholarly legacy of our age, it is worth remembering that “the generational war is the one war whose outcome is certain.”

Dr Lisbet Rausing

Lisbet Rausing is a Senior Research Fellow at Imperial College’s Centre for the History of Science, Technology and Medicine.

She was educated at the University of California Berkeley and Harvard University, where she also taught for eight years. She has written two academic monographs as well as numerous scholarly articles.

Together with her husband, Peter Baldwin, Lisbet Rausing founded Arcadia in 2001.  Arcadia protects endangered treasures of culture and nature. This includes near extinct languages, rare historical archives and museum quality artifacts, as well as threatened landscapes.  Partners include Harvard, Yale, SOAS, the British Library, the Ashmolean, the Linnean Society, Imperial College, Cambridge University, Oxford University and Fauna & Flora International.  More information can be found at

Lisbet Rausing chairs Nyland, a Rausing family office, which together with other entities supports her family and its wealth management, and works with operational family companies such as Ecolean, a liquid food packaging company, and Ingleby, a global farming company. She serves on various boards and committees including the Harvard Board of Overseers, and Yad Hanadiv.

She holds honorary doctorates from Uppsala University, Imperial College and SOAS, and is an elected member of St Catherine’s College, Cambridge. She is an Honorary Fellow of the British Academy, a Fellow of the Linnean Society and the Royal Historical Society.


[1] I derive the title from Philip K Dick’s futuristic novel Do Androids Dream of Electric Sheep? (New York: Doubleday, 1968), which also formed the basis of the 1982 film Blade Runner.

[2] Harvard has nearly 16 million items but about half of those are periodicals. About 7 million are books, and of those, three-quarters come from outside the US, although serious collecting abroad only started from the 1860s.

[3] “The cloud,” a dematerialized and outsourced network, consists of huge data centres with software applications used by millions of people at the same time. Yahoo, Wikipedia, YouTube, Twitter, Amazon, and so on are all built on such centres. Indeed, Amazon is transforming itself from a bookseller to a cloud-space renter, in Amazon Web Services, which already uses more bandwidth than its retail side. Its Simple Storage Service has c. 52 billion virtual objects. In manufacturing, a parallel to “the cloud” might be “outsourcing”. A more homely example might be how your music experience moves from CDs, to MP3s on your hard drive, to Pandora, which is situated in a cloud. Feature length films are of course next. What household would not appreciate instantaneous rental films from “the long tail” (the entire backlist) of Hollywood, or for that matter all the other film industries of the world? Herald Tribune, 15 June 2009.

[4] A petabyte is 10 to the fifteenth power bytes, or a quadrillion bytes.

[5] Take for example the 2005 wiki OpenWetWare, started by biological engineers at MIT, which unexpectedly morphed into a vast manual of lab techniques, alongside its original function as a collection of laboratory notebooks. Mitchell Waldrop, “Science 2.0,” Scientific American, May 2008, pp. 47-51.

[6] Presently, Google worries about how their Twitter searches are indexed when they are a few minutes old, rather than in real time. Herald Tribune, 15 June 2009.

[7] Second Life refers to an online virtual world where scientists have begun conducting real research projects, essentially on Darwinist theory, while taking on digital alter egos. It was founded in 2003 by Linden Lab as an open-ended platform where users (avatars) can create their own environment. It is, if you will, an open-ended SIM world, and it had c. 13.5 million users in mid-2008. By that date, its SciLands had grown into a mini-continent of some 45 islands – of, admittedly, nearly a thousand in all – inhabited not only by individual scientists but also by more than 300 universities and museums as well as by organisations such as NASA. Science News, 24 May 2008, pp. 20-23.

[8] Nigel Shadbolt and Tim Berners-Lee, “Web Science Emerges,” Scientific American, October 2008.

[9] Freedom for IP, “Budapest Open Access Initiative,” 19 November 2007:

[10] SETI (the Search for Extra Terrestrial Intelligence) has three million people donating spare computer time to search for narrow bandwidth radio signals in space. In Folding@home, some 40,000 PlayStation 3 volunteers help Stanford scientists fold proteins. In reCAPTCHA, amateurs digitize the New York Times’s back catalogue. In the ESP project, the public has labelled c. 50 million photographs (to help computers think). In GalaxyZoo, c. 160,000 people help Johns Hopkins astronomers to classify galaxies, and in Africa@home, volunteers help the University of Geneva create Africa maps through satellite images. Conservation biology depends on amateur surveys, and at Herbaria@home, volunteers decipher herbaria in British museums.

[11] Crowdsourcing is also of course a tool for political activists.  It is used to demonstrate corruption (by tracing the flights of Tunisia’s presidential jet), to find war criminals (in Darfur), or to advocate changes in the Catholic Church. The Economist Technology Quarterly, 6 September 2008, 8-10.

[12] JSTOR is said to have hundreds of millions of referrals from Google a year, the vast majority of which are refusals.  There is considerable internet rage over JSTOR being closed.

[13] “Improving Scholarly Publishing Practice at Harvard: Report on the Provost’s Committee on Scholarly Publishing,” Harvard, p. 5. The report notes that commercial publishers’ profits for scholarly journals are estimated at around 40%, an astonishingly high figure for any industry.

[14] Financial Times (London edition), 19 June 2009.

[15] A now-classic Nature article of 2001, “Online or invisible?” (Vol 411, nr 6837) analysed c. 120,000 articles in computer science from 1989 to 2000. It found that, standardized for age-cohort, public domain articles had 4.5 times more citations. The correlation also held for top-end articles, from prestigious conferences.

[16] A 2006 European Union directive stipulates copyright protection for life plus 70 years for authors of literary, artistic, cinematographic, and audiovisual works. EUR-Lex, “Directive 2006/116/EC of the European Parliament and of the Council of 12 December 2006 on the term of protection of copyright and certain related rights (codified version)”: Literary works mean more than just fiction. The EU directive refers to “the rights of an author of a literary or artistic work within the meaning of Article 2 of the Berne Convention”. Article 2 of the Berne Convention states, “The expression ‘literary and artistic works’ shall include every production in the literary, scientific and artistic domain, whatever may be the mode or form of its expression, such as books, pamphlets and other writings; lectures, addresses, sermons and other works of the same nature; dramatic or dramatico-musical works; choreographic works and entertainments in dumb show; musical compositions with or without words; cinematographic works to which are assimilated works expressed by a process analogous to cinematography; works of drawing, painting, architecture, sculpture, engraving and lithography; photographic works to which are assimilated works expressed by a process analogous to photography; works of applied art; illustrations, maps, plans, sketches and three-dimensional works relative to geography, topography, architecture or science.” World Intellectual Property Organization, “Berne Convention for the Protection of Literary and Artistic Works”: EU copyright protection is automatic and does not need to be formally registered. BUYUSA.GOV, U.S. Commercial Service, “Copyright Protection in the European Union:”

[17] Undated pamphlet: picked up at the British Library, “Intellectual Property: A Balance: The British Library Manifesto”.

[18] To be compared to the first British copyright statute, the Statute of Anne in 1710, which set copyright at fourteen years, renewable only once. The need to renew copyright was removed in the US in 1992, and additionally copyright has become an assumed (rather than to be asserted) right.

[19] Look up, for example, the eminent historian Natalie Zemon Davis in Wikipedia. The bibliography is good. But few of the entries are blue (linkable). Then look up, say, typhus, or any other major illness. You can hyperlink to recent medical research – but only rarely to the history of medicine references.

[20] The Research Blogging website, started by Seed Media Group, aggregates and indexes posts on peer-reviewed data, and allows them to be tagged with metadata enabling priority of publication (The Economist, 20 September 2008, p. 96). The Transparent Accountable Datamining Initiative is at MIT and has a wide remit. The DBpedia project was started at the Free University (Berlin) and Leipzig University. It semantically queries the infobox templates embedded in Wikipedia’s (English) articles (2.3 million of them, as of late 2008). Nigel Shadbolt and Tim Berners-Lee, “Web Science Emerges,” Scientific American, October 2008, p. 65 and passim.

[21] I borrow the concept of cultural agoraphobia from James Boyle at Duke University, and from a lecture he gave at Cambridge University Library on 12 March 2009, entitled “Cultural Agoraphobia and the Future of the Library”.

