Topic Collections: A Better Way of Organizing & Integrating STM Information?
Or, What Netflix, Zappos, and Pandora Can Teach Us About Giving Customers What They Want
Elizabeth Willingham, COO, Silverchair
In the final quarter of 2009, many of our current and prospective customers asked us how semantics can be used to generate topic collections. Even though topic collections are not what I usually bring up first when trying to foment enthusiasm for semantic meta-data, they are the two-by-fours of a semantic architecture. Once they’re hammered into place, you can quickly see how they frame the house.
Examples from outside of scholarly publishing provide inspiration:
- Zappos, where you can perform a faceted browse to find shoes you want in exactly the right style ("toe-ring sandals") by the right designer in the right color and size. How does it work? There’s a taxonomy of shoes, and Zappos’ inventory is tagged with its concepts.
- Netflix, where you can set your taste preferences for video viewing ("feel good," "gritty") from a taxonomy of these qualities so that the site can recommend movies you probably will want to watch.
- Pandora, where you can create a custom radio station that plays music similar to music you indicate an appreciation for, based on hundreds of descriptive qualities with which individual songs have been tagged.
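The mechanism behind all three examples is the same: items are tagged with concepts from a taxonomy, and the site filters on those tags. As a minimal sketch of a faceted browse (the `Item` class, the facet names, and the sample inventory are hypothetical illustrations, not a description of how Zappos actually implements it):

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Item:
    """One inventory item tagged with taxonomy concepts, grouped by facet."""
    name: str
    tags: dict = field(default_factory=dict)  # facet name -> set of concept labels


def faceted_browse(items, **selected):
    """Keep only items that carry the selected concept for every chosen facet."""
    return [
        item for item in items
        if all(concept in item.tags.get(facet, set())
               for facet, concept in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many items carry each concept of one facet (for the browse menu)."""
    counts = Counter()
    for item in items:
        counts.update(item.tags.get(facet, ()))
    return counts


# Hypothetical inventory, tagged with concepts from a small shoe taxonomy.
inventory = [
    Item("Sandal A", {"style": {"toe-ring sandals"}, "color": {"brown"}, "size": {"8", "9"}}),
    Item("Sandal B", {"style": {"toe-ring sandals"}, "color": {"black"}, "size": {"8"}}),
    Item("Boot C", {"style": {"ankle boots"}, "color": {"brown"}, "size": {"9"}}),
]

sandals_in_8 = faceted_browse(inventory, style="toe-ring sandals", size="8")
brown_sandals_in_8 = faceted_browse(sandals_in_8, color="brown")
print([item.name for item in brown_sandals_in_8])
```

Each additional facet simply narrows the previous result, which is why a shopper can keep refining without ever running a keyword search.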
What are these content-rich sites doing well? As venture capitalist Peter Rip said in this blog post, "understanding the demand side of the equation is the difference between flourishing and evaporating. … They understand that Media now begins with Me."
Key Realizations for Publishers
We interpret scholarly publishers’ growing interest in semantically driven topic collections as a sign of two key realizations:
- Publishers realize their content needs a new organizing principle. Traditional print-based collecting and organizing devices (books, journals, chapters, sections, articles) can still serve a purpose in web publishing, but let’s face it: the "aught decade" of 2000–2009 is over. And when I say "aught," for publishers that meant "ought," as in "We ought to put our books and journals on the web." That intention has come to fruition for most publishers, but clearly the lightning-fast pace of technology development for information delivery, particularly in the past 5 years, has kept many publishers panting and gasping for breath as they try to keep up. Publishers are also a bit in shock that "putting it on the web" (often no easy, or inexpensive, feat) turns out not to be enough. The traditional ways of organizing information worked well for ink on paper, but in the digital environment they often become clunky silos that don’t come anywhere close to matching the human brain’s drive and ability to make connections among similar and related information.
With 20-20 hindsight, I wish I’d pushed our first customers of Silverchair Content Manager (in 1999–2005) to "think outside the book" and break away from book structure in presenting their content. How far ahead of the curve they would be now! I digress, but it’s interesting to reflect on how crazy and inconceivable that would’ve been 10 years ago versus how necessary it seems now. (A great Scholarly Kitchen post promotes the importance of leaping before looking.)
- Publishers realize the importance of listening to their "real" customers. Many publishers have long done a fine job of listening to the customers who actually buy their books, journals, and electronic products (that is, librarians, bookstores, distributors, etc.), but too many have failed to engage with the actual consumers of their information. In fairness, that was hard to do when these folks were readers of printed books and journals; today’s online users conveniently leave a trail of evidence about their information consumption and can contact you directly and easily. Whenever we at Silverchair lead focus groups to inform product development or perform usability testing on our applications (both of which put us in direct contact with end users), we usually note that users care much less about who wrote the content and who published it than any of us wants to admit. (For example, we’ve asked bright, hardworking, and eager-to-succeed medical residents how often they consult Respected and Successful Textbook X in the course of a day seeing patients. Answer: "Is that the one with the blue cover? I use it all the time.") It is probably best not to stop branding and acquiring high-quality authors based on this feedback, but web publishers have no excuse for continuing to ignore the more important and consistent message we hear from end users: "I want everything in one place." And when they say "everything," what they mean is "Everything that I am interested in / Everything I need to know / Everything that answers my question on _____topic." (Side note about listening: Even if you don’t have the time or resources to run focus groups or other formal feedback-gathering sessions with customers, the searches they run on your electronic products are telling you what they want from your product. If you’re not regularly reviewing search logs, you’re not listening to your customers. And if you have topical browsing on your site and you’re not tracking their paths through these doorways, you’re not listening to your customers.)
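Reviewing search logs does not require much tooling. The sketch below assumes a hypothetical log of `(query, result_count)` pairs, which is not a format any particular platform is known to export. It reports the most frequent queries and the queries that returned nothing, which are two reasonable places to start listening.

```python
from collections import Counter


def summarize_search_log(entries, top_n=3):
    """Return (most frequent queries, queries that found nothing).

    `entries` is an iterable of (query, result_count) pairs; this shape is
    an assumption made for illustration only.
    """
    # Normalize case and whitespace so trivially different queries count together.
    normalized = [(" ".join(query.lower().split()), count) for query, count in entries]
    top_queries = Counter(query for query, _ in normalized).most_common(top_n)
    zero_result_queries = sorted({query for query, count in normalized if count == 0})
    return top_queries, zero_result_queries


# Hypothetical log entries.
log = [
    ("Wrong-site surgery", 12),
    ("wrong-site  surgery", 12),
    ("PEG tube placement", 0),
    ("asthma follow-up", 4),
    ("wrong-site surgery", 12),
    ("asthma follow-up", 4),
]

top_queries, zero_result_queries = summarize_search_log(log, top_n=2)
print(top_queries)
print(zero_result_queries)
```

Frequent queries suggest topics that deserve a collection of their own; queries with no results point to gaps in either the content or the tagging.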
Growing Demand for Automation in Curation
Topic collections are not new to professional and scholarly electronic products (whether book-based, journal-based, or both), but growing demand for smarter deployments of them is driving publishers to look for more flexible and sustainable methods for development. Most topic collections we see today are not even as robust as a book table of contents. I think many are frozen in a primitive state because within publishing organizations, the content people (who understand the information and how it should or could be organized topically) and the technology/implementation people (who have to build or maintain the application to deploy the topic collections) are at loggerheads. The content people want to curate the collections in an ongoing manual process that would probably do quite well to serve the content needs of users but that is not scalable and is at odds with a machine-driven process. (Read more about the "Curation Economy".)
Topic collections that are derived from a layer of semantic meta-data feature the best of both worlds—expert curation with automated tagging and collection updating as new content is published. Curation is carried out through 1) the selection of the taxonomy used to tag the content that will be collected (see "How to..." column in this newsletter for more information about selecting a taxonomy) and 2) the defining of rules for what qualifies for collection and what doesn’t. Automation is carried out through the tagging process (i.e., the matching of the taxonomy tags with the content using an intelligent auto-tagger) and routines that update the site by drawing appropriately tagged new content into an existing collection. In our experience, this balance between curation and automation meets with approval from content experts, application developers, and, most important, end users.
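To make the division of labor concrete, here is a minimal sketch of the two halves. It is an illustration under stated assumptions, not Silverchair Content Manager's implementation: the tiny `TAXONOMY`, the phrase-counting `auto_tag` function (a naive stand-in for an intelligent auto-tagger), and the `TopicCollection` rule fields are all hypothetical.

```python
import re
from dataclasses import dataclass, field

# Curation, part 1: a (hypothetical) taxonomy mapping each concept to its synonyms.
TAXONOMY = {
    "wrong-site surgery": ["wrong-site surgery", "wrong site surgery", "wrong-side surgery"],
    "asthma": ["asthma", "asthmatic"],
}


def auto_tag(text, taxonomy=TAXONOMY):
    """Naive stand-in for an auto-tagger: count whole-phrase matches per concept."""
    lowered = text.lower()
    tags = {}
    for concept, phrases in taxonomy.items():
        hits = sum(
            len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
            for phrase in phrases
        )
        if hits:
            tags[concept] = hits
    return tags


@dataclass
class TopicCollection:
    """Curation, part 2: rules for what qualifies for the collection."""
    concept: str
    min_mentions: int = 1
    allowed_types: frozenset = frozenset({"chapter", "article", "video", "image"})
    members: list = field(default_factory=list)

    def qualifies(self, doc_type, tags):
        return doc_type in self.allowed_types and tags.get(self.concept, 0) >= self.min_mentions


def publish(doc_id, doc_type, text, collections):
    """Automation: tag new content and draw it into every collection it qualifies for."""
    tags = auto_tag(text)
    for collection in collections:
        if collection.qualifies(doc_type, tags):
            collection.members.append(doc_id)
    return tags


collection = TopicCollection("wrong-site surgery", min_mentions=2,
                             allowed_types=frozenset({"article"}))
publish("a1", "article",
        "Wrong-site surgery remains rare. Preventing wrong site surgery requires checklists.",
        [collection])
publish("a2", "article", "A case of asthma; wrong-site surgery is mentioned once.", [collection])
publish("v1", "video", "Wrong-site surgery, wrong-site surgery.", [collection])
print(collection.members)  # only "a1" meets both rules
```

The content expert's judgment lives in the taxonomy and the rule values; nobody has to touch the collection by hand when new content is published.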
Topic Collection Examples
On AccessSurgery, a site we developed and maintain for McGraw-Hill, users can access content collections on all procedures required to be performed by surgical residents during their training—and at a granular level, such as "percutaneous endoscopic gastrostomy" (Figure 1). These surgical content collections are cross-media, featuring textbook content describing the indications, pre-op, and post-op of the procedure; videos showing the procedure being performed; and images from atlases detailing the procedures step-by-step.
FIGURE 1. Topic collections at McGraw-Hill’s AccessSurgery. Note topical organization across different media types.
On Patient Safety Network, which we developed on behalf of the Agency for Healthcare Research and Quality in collaboration with Bob Wachter and his team of patient safety experts at UCSF, faceted browsing allows the user to view a collection of all articles published (for example) on wrong-site surgery. That collection grows automatically any time another related article is added to the site simply through the process of tagging.
Other examples of topic collections on sites developed on the semantic Silverchair Content Manager platform include the following:
- Image collections on AccessMedicine showcase thousands of images all organized and viewable by topic, from broad topics like "pain" to narrow topics like "major histocompatibility complex."
- Topic "cards" on JAMAevidence feature collections of diverse information on specific concepts in evidence-based medicine, such as "number needed to treat," which features textbook content, glossary definitions, and calculators relevant to that topic.
- Topical collections of innovations in health care on AHRQ’s Health Care Innovations Exchange (Figure 2), where users can perform a faceted browse to create collections as granular as "innovations about improving follow-up care of pre-schoolers treated for asthma in public health clinics."
FIGURE 2. Topic collections at AHRQ’s Health Care Innovations Exchange. Users can also perform a faceted browse by selecting multiple categories to create a personalized topic collection.
Collecting Users
Users, like content, can be organized by topic. When users express topical interests through methods either active (checking off boxes in registration) or passive (running searches, browsing, and basic content retrieval and navigation), they lay the groundwork for meaningful social networking and community building around your content. Silverchair is deploying this approach at CardioExchange, Massachusetts Medical Society’s recently launched cardiology practice community, where members identify their practice and academic interests so that other members with the same interests can be suggested to them as friends, and so that they can be alerted to the publication of content directly related to their interests. See this issue’s "Solutions" column highlighting CardioExchange for more information about "semantic social networking" on Silverchair Content Manager.
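Once members and content share the same topical vocabulary, both kinds of matching reduce to comparing sets of tags. The sketch below is a hypothetical illustration (the member names, interest labels, and function names are invented, and it is not CardioExchange's actual logic): it suggests members with overlapping interests and picks out who should be alerted about a newly tagged piece of content.

```python
def suggest_connections(member, interests_by_member, min_shared=1):
    """Suggest other members who share at least `min_shared` interests, best match first."""
    mine = interests_by_member[member]
    suggestions = []
    for other, theirs in interests_by_member.items():
        if other == member:
            continue
        shared = mine & theirs
        if len(shared) >= min_shared:
            suggestions.append((other, shared))
    # Most shared interests first; ties broken alphabetically for stable output.
    suggestions.sort(key=lambda pair: (-len(pair[1]), pair[0]))
    return suggestions


def members_to_alert(content_tags, interests_by_member):
    """Return members whose stated interests overlap the tags on new content."""
    return sorted(
        member for member, interests in interests_by_member.items()
        if interests & content_tags
    )


# Hypothetical members and the interests they selected at registration.
interests_by_member = {
    "ana": {"heart failure", "imaging"},
    "ben": {"heart failure", "imaging", "prevention"},
    "cy": {"imaging"},
    "dee": {"electrophysiology"},
}

suggestions = suggest_connections("ana", interests_by_member)
alerts = members_to_alert({"prevention", "electrophysiology"}, interests_by_member)
print([name for name, _ in suggestions])
print(alerts)
```

Passive signals such as searches and browsing paths could feed the same interest sets, provided they are mapped to the same taxonomy concepts first.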
To learn more about topic collections for your publications, please contact Silverchair.