How web crawlers are putting booksellers out of work


Perhaps we shouldn’t expect the Times Literary Supplement, to give the TLS its full original name, to keep us informed about digital publishing. But the TLS clearly feels it is well-placed to comment, and the latest review of several in the past few years on digital publishing is no different to the rest.


It is a review of a book entitled Words Onscreen: the fate of reading in a digital world, by Naomi Baron (in the TLS of July 3 2015). The review is by Leah Price, who teaches English at Harvard University, so she should know what she is talking about.


The thesis of the book is not surprising, given its title. Ms Baron’s book cites research that shows users reading with equal attention on screen and in print form, which would suggest the format we use is irrelevant. But she goes on to speculate that “we are right to predict that we’ll skim more hastily over electronic text”. I agree our expectations for digital and print formats may initially be different – but that’s not what the research found.


At this point the review takes a nosedive. The reviewer states confidently that EPUB is “an open file format”, and such formats “allow books to be read on any device” – which is of course not true. You can’t read EPUB on a Kindle, even though it is simple to convert from EPUB to the Kindle format.


But the real problem is the reviewer’s strange idea of what digital publishing implies. According to her, the real change is “from libraries through which an optimist could hope to read his way, to data on too vast a scale to allow any operation except search. As a result, text increasingly addresses machines, not humans.”


Since when could any user, optimist or otherwise, hope to read his or her way through a university print library? The myth that a print library is somehow manageable is one of the most persistent in academia. Certainly digital publishing has expanded the total quantity of titles enormously; one scholarly article estimated there were 1.35m scholarly articles published in 2006, and a researcher recently estimated there were a total of 50m scholarly papers published to the end of 2009.


Just to find content in all this clearly requires a machine. Yet Price complains about the “web crawlers and algorithms that, having thrown so many human booksellers out of work, may now do the same to their customers”.


What does this astonishing sentence mean? How could such a sentence even appear in the TLS, one of those rare periodicals where people pay attention to what words mean? You could argue that a company such as Amazon might be responsible for putting booksellers out of business, but a web crawler? An algorithm? Do web crawlers put bookshop customers out of work? I can only interpret it as an uninformed rage against digital content, a longing to see a world return where libraries held only books, and newspapers existed only on paper.