Pedro Domingos likes big ideas. He sets out to describe how computers can write their own programs. Take the well-established case of handwriting recognition: a form of machine learning in which the computer is provided with sufficient examples (a training set) to enable it to learn to do something. If you show the machine the number “9” written enough ways, it eventually becomes as good as, or even better than, a human at recognising a handwritten “9”.
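The idea of learning from examples can be made concrete with a toy sketch (my own, not Domingos’): a nearest-neighbour classifier that labels a crude 3×3 bitmap by finding the most similar example in its training set. The bitmaps and labels below are invented for illustration.

```python
# Toy illustration of learning from examples: a nearest-neighbour
# classifier for crude 3x3 digit bitmaps (invented data).

def hamming(a, b):
    """Count positions where two equal-length bitmaps differ."""
    return sum(x != y for x, y in zip(a, b))

# Training set: variants of "9" and "1" as 3x3 bitmaps,
# flattened row by row (1 = ink, 0 = blank).
training_set = [
    ((1, 1, 1, 1, 1, 1, 0, 0, 1), "9"),  # filled loop with a tail
    ((1, 1, 1, 1, 0, 1, 1, 1, 1), "9"),  # open loop variant
    ((0, 1, 0, 0, 1, 0, 0, 1, 0), "1"),  # vertical stroke
    ((0, 0, 1, 0, 0, 1, 0, 0, 1), "1"),  # stroke shifted right
]

def classify(bitmap):
    """Label a new bitmap with the label of its nearest training example."""
    return min(training_set, key=lambda ex: hamming(bitmap, ex[0]))[1]
```

The more variants of “9” the training set contains, the more ways of writing a “9” the classifier will tolerate; that is the sense in which showing the machine enough examples lets it “write its own program”.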
Unfortunately, he alternates between very sensible, clear descriptions like this and sweeping optimistic generalisations. Mr Domingos is in no doubt about who the new masters of the world are going to be. In his potted description of commerce, he describes how “the progression from computers to the Internet to machine learning was inevitable ... once the inevitable happens and learning algorithms become the middlemen, power becomes concentrated in them.” In fact, there is no future for any company without machine learning: “a company without machine learning can’t keep up with one that uses it ... businesses embrace it because they have no choice.” That’s a very stern conclusion!
TrendMD is (as its website states) “a content recommendation engine for scholarly publishers, which powers personalized recommendations for thousands of sites”. An interesting blog post by Matt Cockerill of TrendMD (published February 2016) claims “TrendMD’s collaborative filtering engine improves clickthrough rates 272% compared to a standard ‘similar article’ algorithm in an A/B trial”. That sounds pretty impressive.
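It is worth being clear what a “272%” figure could mean. Assuming it describes a relative uplift in clickthrough rate between the two arms of the A/B trial, the arithmetic looks like this (the click and impression counts below are invented; TrendMD’s actual trial data are not given in the post):

```python
# What a "272% improvement" in clickthrough rate could mean, using
# invented A/B-trial numbers chosen to produce that uplift.

def ctr(clicks, impressions):
    """Clickthrough rate: clicks as a fraction of impressions."""
    return clicks / impressions

baseline = ctr(50, 10_000)    # 'similar article' arm (invented figures)
variant = ctr(186, 10_000)    # collaborative-filtering arm (invented figures)

uplift = (variant - baseline) / baseline * 100
print(f"{uplift:.0f}% relative improvement")
```

On that reading, a 0.5% clickthrough rate rising to 1.86% would indeed be a 272% improvement, which shows how a striking percentage can sit on top of quite small absolute rates.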
According to publishers’ own figures (reported in The Guardian in February 2016), ebook sales declined in 2015 compared to the preceding year. Although the decline was small (a drop of only 2.4%), the figure was noted with alarm by publishers and by many commentators who had predicted the end of the physical book when ebooks were first introduced – after all, 2015 was the first year in which ebook sales had not increased. Why are ebook sales declining?
The Journal Impact Factor has been discussed, and criticized, for years. A recent Scholarly Kitchen article looks at another proposal for improving the impact factor (Optical Illusions, 21 July 2016). This is by no means the first suggested improvement to the impact factor metric – a search on Scholarly Kitchen itself reveals there are several posts on this topic each year.
Perhaps the biggest problem with the Journal Impact Factor is this. Most journals, from Nature down to the smallest, show a similar distribution when citations are counted article by article: a few articles are cited heavily, followed by a very long tail of articles that receive few or even zero citations. We all know this, yet we persist in believing that a Journal Impact Factor is in some way representative of each article in that journal.
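The mismatch is easy to demonstrate with an invented citation distribution for a hypothetical journal: the average that an impact-factor-style metric reflects can sit a long way from what the typical article in that journal actually receives.

```python
from statistics import mean, median

# Invented citation counts for 20 articles in a hypothetical journal:
# a few heavily cited papers and a long tail of rarely cited ones.
citations = [120, 85, 40, 3, 2, 2, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

journal_average = mean(citations)    # what an impact-factor-style metric tracks
typical_article = median(citations)  # what most articles actually experience

print(journal_average, typical_article)
```

Here the journal-level average is nearly thirteen citations per article, while the median article is cited once; three highly cited papers do almost all the work. That is the long tail the Journal Impact Factor quietly averages away.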
It is a sign of the maturity of open access that good, reliable figures are available. The latest stats from OASPA, the Open Access Scholarly Publishers Association, reveal that there were 160,995 open-access articles published in 2015. What do they mean by OA? OASPA counts those articles that are published with a CC-BY licence.
Any author will ask questions such as the ones above, and academic authors are no exception. In one sense, we have better answers than were possible just 20 years ago. Although thousands of copies of a print book might be sold each year, in those days there was little evidence coming back to the publisher that those books were actually read. In fact, one joke among publishers was that encyclopedias and bibles had one thing in common: they were more bought than read. A typical publisher would receive just a handful of comments from readers each year. As publishers, we knew the books were sold; but we didn’t know if they had ever been read. So if an author had asked us whether anyone read their book, we couldn’t say.
Do you understand this graphic? It is an example of a sparkline, by Edward Tufte. Tufte was, if not the originator of sparklines, one of their earliest advocates. He wrote about them in his book Beautiful Evidence (2006), where he defines sparklines as “small, intense, wordlike graphics, embedded in the context of words and numbers”. Tufte’s ideas were very influential and were taken up by Microsoft in their 2010 release of Excel. But I don't agree with him about sparklines.
The "huge leap forward" were the words of The Bookseller's BookBrunch newsletter, reporting (21 June 2016) a new version (1.2) of the Thema ebook classification scheme (released in May 2016). Normally a point release of a standard isn't a huge leap forward, so I was curious to know more. Because BookBrunch is open only to subscribers, I can’t see who is responsible for that "huge leap forward" comment. But the Thema classification scheme itself is open and accessible to anyone who wants to find out more details of this new initiative for book classification.