Anne O’Tate


INTRODUCTION
Anne O'Tate is an alternative interface for searching PubMed developed by the University of Illinois at Chicago (UIC). It was created as part of the Arrowsmith project, which builds informatics tools for advanced text mining of the biomedical literature. The tool is hosted on the Arrowsmith website on UIC servers and is freely available to the public [1]. It is designed to mine results data for relevant keywords, Medical Subject Headings (MeSH) terms, and bibliometric data to help users refine and develop their search strategies.

DESCRIPTION
Anne O'Tate interfaces with PubMed through an application programming interface (API) developed by the National Center for Biotechnology Information (NCBI). Users enter search terms in a text field, and those terms are passed to PubMed through the API, which then returns the results set. A "Limits" tab provides a subset of PubMed search filters, and a "Details" tab displays the detailed search string that is passed to PubMed. Anne O'Tate's intended audience appears to be librarians and researchers conducting a general literature search, particularly those having a difficult time refining their search strategies.
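To make the hand-off to PubMed concrete, the following is a minimal sketch of how a client might build a search request. It assumes the NCBI E-utilities ESearch endpoint; the review does not say which NCBI API Anne O'Tate actually calls, and the function name `build_esearch_url` and its defaults are hypothetical. The code only constructs the request URL and does not contact NCBI.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumption: the public NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, retmax=20, retstart=0):
    """Return an ESearch URL for a PubMed query (hypothetical helper).

    term     -- the search string, exactly as a user would type it in PubMed
    retmax   -- number of PMIDs to return (20 mirrors one results page)
    retstart -- zero-based offset of the first PMID to return
    """
    params = {
        "db": "pubmed",
        "term": term,
        "retmax": retmax,
        "retstart": retstart,
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


url = build_esearch_url("asthma AND children")
print(url)
```

Field tags and Boolean operators travel inside the `term` parameter unchanged, which is consistent with the search text being treated the same as if it were entered in PubMed.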

Basic search
The front page provides the user with a search box with Limits and Details tabs, a bulleted list of instructions, help text with links to various PubMed help pages, and a description of the tool with a link to an article outlining the tool and the algorithms used for its functions. Unlike PubMed, the search box does not provide any predictive suggestions, but otherwise the search text is treated the same as if the user entered it in PubMed. Natural language is mapped, and field tags and Boolean operators are recognized.
The Limits tab includes dropdown menus for Field, Publication Type, Age, Language, Humans and Animals, Gender, Publication Date, and PubMed subsets. There is also a checkbox to limit results to those that have abstracts. The options available in the filters represent a subset of the filters available in PubMed and cannot be customized. The Details tab displays the full search string used by PubMed and is the equivalent of the Search Details box in the PubMed interface. Selected Limits are displayed in bold at the top of both the Limits and Details tabs, but they do not display on the search results page.

Search results
Search results are presented in reverse chronological order, twenty items per page, displaying a citation for each record along with its PMID and a link to related articles. The title links to the PubMed record, and each author name links to an Arrowsmith index that displays the full names of all authors that match the name, their years of publication, their affiliation information, and topics that they frequently publish about, with links to that author's publications in both PubMed and Anne O'Tate. The Related Titles link pulls up the full list of similar articles from PubMed.

Search refinement
Anne O'Tate does not perform any data mining on the initial results set. Instead, it provides a list of data mining tools in the left sidebar, each of which performs a specific function. The first tool in the list, Important Words, analyzes the text in the title and abstract of each result for words that "show high enrichment" and "should have high coverage" [1]. To the best of my understanding, these concepts relate to the uniqueness of the word within the results set as compared to all articles and the frequency of the word within the results set, respectively. The enrichment calculation relies on an index of all words in the titles and abstracts in MEDLINE, which is updated annually.
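The two concepts can be illustrated with a small sketch. This is not the algorithm described in [1]; it is one plausible reading of the interpretation given above, in which coverage is the fraction of results containing a word and enrichment is that fraction divided by the word's fraction across all of MEDLINE. The function name `word_stats` and the `background_fraction` lookup (standing in for the annually updated MEDLINE word index) are hypothetical.

```python
def word_stats(result_docs, background_fraction):
    """Return {word: (coverage, enrichment)} for a results set.

    result_docs         -- list of strings (title plus abstract per record)
    background_fraction -- hypothetical lookup: word -> fraction of all
                           MEDLINE records that contain the word
    """
    n = len(result_docs)
    # One set of lowercase words per record, so each record counts once.
    doc_words = [set(doc.lower().split()) for doc in result_docs]
    stats = {}
    for word in set().union(*doc_words):
        coverage = sum(word in words for words in doc_words) / n
        background = background_fraction.get(word)
        if background:  # skip words missing from the background index
            stats[word] = (coverage, coverage / background)
    return stats


docs = ["statin therapy trial", "statin dose study", "the trial of statin"]
background = {"statin": 0.01, "the": 0.9, "trial": 0.05}
for word, (coverage, enrichment) in sorted(word_stats(docs, background).items()):
    print(word, round(coverage, 2), round(enrichment, 2))
```

In this toy data, "statin" appears in every result but rarely in the background, so it scores high on both measures, while a common word such as "the" has an enrichment below 1.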
The Important Phrases tool uses TopMine, which performs "phrase mining based on raw [...]
Selecting a data mining tool will open a new browser tab containing the results of the corresponding function. The initial results set will remain in its own tab. The results from these data mining tools can be used to refine the results, and using the tools against the new results set will yield new results.

EVALUATION
The Anne O'Tate interface is very spare and unadorned, clearly prioritizing function over form. Simple tables are used to present the output of most of the data mining tools, and all visualizations, such as histograms, are displayed using ASCII text. Although initial search results are returned reasonably quickly, the data mining functions are very slow and degrade in performance in proportion to the size of the results set. The phrase and gap analysis tools are considerably slower than other data mining tools. The interface and performance make Anne O'Tate clunky to use.
On the other hand, the quality of the information yielded by the tools in Anne O'Tate should not be understated. The Important Words and Important Phrases tools provide excellent suggestions for relevant keywords in a search set. The MeSH Pairs tool reveals tight relationships between MeSH terms in the results set. The Authors and Journals tools can be used to identify topic experts and potential avenues for publication. Also, the Year tool can be used to show publication trends. These tools can be used iteratively as the results set is refined, helping the user quickly home in on a small set of highly relevant articles. The Affiliation tool is not as useful as the rest, perhaps because the affiliation information that publishers provide tends to vary widely. Affiliations may take the form of a country, city, or department name, making it extremely difficult to derive any meaningful information from the Affiliation tool results.
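As a rough illustration of what a pairwise MeSH analysis involves, the sketch below counts how often two MeSH terms are assigned to the same record in a results set. The review does not describe how the MeSH Pairs tool ranks its output, so simple co-occurrence counting is an assumption here, and the function name `mesh_pair_counts` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def mesh_pair_counts(records):
    """Count co-occurring MeSH term pairs across records.

    records -- list of lists, each holding the MeSH terms of one record
    """
    counts = Counter()
    for terms in records:
        # Sort and deduplicate so each pair has one canonical ordering
        # and is counted at most once per record.
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return counts


records = [
    ["Asthma", "Child", "Humans"],
    ["Asthma", "Child"],
    ["Humans", "Asthma"],
]
for pair, count in mesh_pair_counts(records).most_common():
    print(pair, count)
```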
Anne O'Tate's results are highly accurate and verifiable. The Details tab allows the user to see the exact search used in PubMed. To confirm the results, the user can combine this search string in PubMed, using the Boolean AND, with a field-specific refinement term (such as an author or journal title). The one caveat is that the internal index of MEDLINE words used in the "enrichment" calculations for the importance ranking is only updated annually, which can affect the ranking for recent, heavily published topics. It is unclear, however, just how much inaccuracy this produces, given that the "coverage," also used in the importance ranking algorithm, is based on frequency calculations from the live results set.
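The verification step amounts to simple string composition. The sketch below shows one way to build the PubMed query from the Details string and a refinement term; the helper name `refine_query` is hypothetical, and the example search string and author name are made up for illustration. `[Author]` and `[Journal]` are standard PubMed field tags.

```python
def refine_query(details, term, field):
    """Combine a Details search string with a field-specific term.

    details -- the full search string shown on the Details tab
    term    -- the refinement value, e.g. an author name or journal title
    field   -- a PubMed field tag name, e.g. "Author" or "Journal"
    """
    # Parentheses keep the original search intact as one Boolean operand.
    return f'({details}) AND "{term}"[{field}]'


query = refine_query("asthma[MeSH Terms] OR asthma[All Fields]", "Smith NR", "Author")
print(query)
```

Pasting the resulting string into PubMed and comparing the result count with the count reported by the corresponding Anne O'Tate tool is the check described above.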

SIMILAR TOOLS
Alternative interfaces to PubMed are becoming increasingly difficult to find. Many of the tools listed on the PubMed Alternative Interfaces page of the Health Librarians wiki, HLWIKI International (such as GoPubMed, eTBLAST, and PubFocus), either no longer exist or do not seem to be maintained. Quertle, now Quetzal-info, requires a paid subscription to access. The two remaining tools on HLWIKI's "Best of Breed" list are BibliMed and PubGet [3]. PubGet, a free PubMed interface, provides only a very rudimentary search interface and offers no way to refine a search, and thus is too dissimilar to Anne O'Tate for an adequate comparison.
BibliMed is a free PubMed interface (registration required) that provides a more robust set of features than PubGet. The initial search field provides some predictive MeSH suggestions, although these are not required. The BibliMed results page provides a relevance-ranked list of MeSH terms along the left sidebar and a list of relevant books (retrieved from Amazon) on the right sidebar. Above the search results are links to numerous related resources, such as PubMed Health, Trip EBM search, Google Scholar, and Clini[...]
Although it is currently unable to connect to PubMed and does not seem to be actively maintained, GoPubMed was the most similar tool to Anne O'Tate. Its ontology-based search methods provided semantic analysis of the title and abstract text in the results set [4]. GoPubMed provided relevance-ranked MeSH topics as well as in-text annotation of mapped MeSH terms. GoPubMed also provided very similar author, journal, and publication date rankings to Anne O'Tate and, through its Statistics page, provided gorgeous and useful visualizations including histograms, a world map, a publication timeline, and an author network map. Its speed, clean interface, and visual tools would have given it a huge advantage over Anne O'Tate. It is worth looking at if only to see what a great PubMed tool could be.

CONCLUSION
In spite of its obsolete look and feel and its slow performance, Anne O'Tate provides an excellent tool set for searching PubMed. Its data mining tools provide a variety of dynamic content analysis that can be of great use in identifying relevant search terms and bibliometrics. Its performance may prevent it from being a go-to tool for day-to-day searching, but for difficult or complex searches, Anne O'Tate may be just the supplementary tool for your PubMed toolbox.