
Nine things you might not know about search

I was fortunate to be invited to talk at Enterprise Search Europe in London this week, which gave me an excellent two-day overview of the state of search. It was an impressive event, combining practitioners, academics, and visionaries. There were detailed, blow-by-blow presentations from large corporations on how they switched their enterprise search system to a new platform (Reed Business Information) or how they implemented search (Airbus Industries), alongside descriptions of small-scale investigations of how search works or doesn’t work in SMEs or institutions.

The London Book Fair 2014: Upstairs, Downstairs

There seemed to be two different events at this year’s Book Fair. Downstairs, in the main hall, there were new books galore, events with authors and celebrities. In the rights section, agents were working flat out to sell rights in hundreds of different territories. It was business as usual, as it has been for the last 38 years. But elsewhere upstairs it was a different story.

How difficult is text analytics?

A recent ISKO meeting ("Taming the News Beast", 1 April, London) presented the current state of the art for text analytics relating to news publishing. Although the presentations were excellent, and the organisations represented were leading edge, I found the day valuable not for what was described in the presentations, but for some issues that were revealed during the talks. Perhaps text analytics isn't as advanced as some people might think.

Content modelling - new bottles, old wine?

Content modelling is on everyone’s lips these days, yet it’s a term that seems not to have existed just a few years ago. Is it some entirely new concept? As usual, a quick look on the Web reveals several definitions, some of which concur, and most of which differ in emphasis.

So, for Cleve Gibbon, a content model is a representation of the types of content and their inter-relationships. For example, a car dealership may have content types for vehicle, dealer and manufacturer. This, of course, is where you start when modelling a relational database.
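
As a rough illustration of the car dealership example, the three content types and their inter-relationships could be sketched in Python as below. This is only a minimal sketch: the field names (`name`, `model`, `stock`) and the sample values are assumptions made for illustration and do not come from Cleve Gibbon's definition.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Manufacturer:
    # Hypothetical field: the name of the company that builds vehicles.
    name: str


@dataclass
class Vehicle:
    # Each vehicle refers to exactly one manufacturer.
    model: str
    manufacturer: Manufacturer


@dataclass
class Dealer:
    # A dealer holds a list of vehicles (a one-to-many relationship).
    name: str
    stock: List[Vehicle] = field(default_factory=list)


# Made-up sample content, purely for illustration.
maker = Manufacturer(name="Example Motors")
car = Vehicle(model="Example Hatchback", manufacturer=maker)
dealer = Dealer(name="Example Dealership", stock=[car])

# Follow the relationships: which manufacturers does this dealer stock?
makers_in_stock = sorted({v.manufacturer.name for v in dealer.stock})
```

Each class plays the role of a content type, and the references between them play the role of the inter-relationships, much as tables and foreign keys would in a relational schema.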

Creating illustrated books is an Agile process

Years ago I worked at Dorling Kindersley producing illustrated four-colour books - that is to say, with pictures on every page, and the text often wrapped around the pictures. Whatever the subject, cookery, health and wellbeing, or DIY, Dorling Kindersley’s working methods were the same. Designers and editors sat at desks facing each other. To communicate or to explain a concept, the designer would create a quick pencil rough, and the editor would comment on it. Or the editor would explain his or her idea and the designer would visualise it. Some of the ideas inevitably led nowhere.

Digital humanities: more than just text mining

A recent book by Matthew Jockers, Macroanalysis, outlines an approach to digital humanities based on what is usually referred to as text mining. I can't help feeling that the "macroanalysis" approach, which is very similar to Franco Moretti's "distant reading" (from his book of the same title), looks at only one aspect of digital humanities, which seems to get all the attention while another important aspect is ignored.


A genuine XML workflow for journals

The terms "XML workflow" and "XML first" are used so frequently that it is as if the simple repetition of the terms provides sufficient proof that what is claimed to be happening is actually taking place. Many workflows that claim to be XML first do not provide full round-tripping of the content, and certainly not while also remaining fully compliant with the industry-standard DTD. At a recent presentation by Rave Technologies (London, 19 November) a genuine 100% XML workflow was demonstrated for journals, and it was impressive in several ways.

AccountingWEB: a model community

AccountingWEB has been running for some 15 years, which makes it one of the longest-established communities on the Web. I'm not even sure there was a Web 15 years ago. And the slick website oozes confidence. Simply reading the numbers of reads or comments for content on the home page makes you realise this is a thriving site: over 8,000 reads for "the worst mistake accountants make" (strange that such a story should be so popular with accountants), and 7,700 reads for a new article about an accountant fined by his professional association for abusing HM Revenue & Customs officials.