January-February 2002
Volume 1 - Issue 2

PUBMED PARTICULARS

Automatic Term Mapping

There were several postings recently on Medlib-l about how PubMed translates what's typed into the query box into an actual search. Understanding how PubMed does this is crucial to constructing efficient and effective search strategies.

PubMed's process is called Automatic Term Mapping. For those who are familiar with it, here's a refresher. For those who aren't, here's your introduction to the secret inner workings of PubMed.

Step 1: MeSH Table

Automatic Term Mapping begins once one or more words are typed into the query box and a search is initiated. PubMed checks the words against the MeSH Table first. If it finds a match, it slaps the MeSH term into the search, adds the appropriate text words (to catch those "in-process" or "supplied by publisher" citations that are not yet indexed with MeSH terms), and runs the search. It figures its work is done.

Step 2: Journal Table

If it does not find a match in the MeSH Table, the next place PubMed looks is the Journal Table. If it finds a match there, it runs the search. The order matters because some journals indexed in Medline have titles that are also MeSH terms. If the title is, say, Blood or Cell or Science, Automatic Term Mapping matches the word to its corresponding MeSH term, not the journal. Once PubMed finds the MeSH match, it stops. It never even looks in the Journal Table. How can you force PubMed to look in the Journal Table? By constructing a search using the [journal] tag, for example Cell[journal]. However, this will turn off the Automatic Term Mapping function.
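Here's a minimal Python sketch of why the lookup order matters. It is an illustration only, not PubMed's actual code: the table entries, the function name translate, and the simplified output strings are all assumptions made up for this example.

```python
# Toy lookup tables. The entries are illustrative assumptions, not PubMed data.
MESH_TABLE = {"blood": "Blood", "cell": "Cells", "science": "Science"}
JOURNAL_TABLE = {"blood": "Blood", "cell": "Cell", "science": "Science",
                 "lancet": "Lancet"}


def translate(query: str) -> str:
    """Return a simplified translation of a single-term query (hypothetical)."""
    text = query.strip()
    lowered = text.lower()

    # An explicit [journal] tag skips term mapping and goes straight
    # to the Journal Table.
    if lowered.endswith("[journal]"):
        title = text[: -len("[journal]")].strip()
        return f'"{JOURNAL_TABLE.get(title.lower(), title)}"[Journal]'

    # Step 1: the MeSH Table is checked first, and a match ends the process.
    if lowered in MESH_TABLE:
        return f'"{MESH_TABLE[lowered]}"[MeSH Terms] OR {lowered}[Text Word]'

    # Step 2: the Journal Table is reached only when Step 1 finds nothing.
    if lowered in JOURNAL_TABLE:
        return f'"{JOURNAL_TABLE[lowered]}"[Journal]'

    return f"{lowered}[All Fields]"


print(translate("Cell"))           # mapped to the MeSH term, not the journal
print(translate("Cell[journal]"))  # the tag forces the journal lookup
```

In this toy version, a bare Cell never reaches the journal lookup because the MeSH check returns first, while Cell[journal] skips the MeSH check entirely.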

Step 3: Phrase List

If there is no match to be found in the Journal Table, PubMed next consults the Phrase List. This is a prefab list of multi-word strings that PubMed "reads" together as a phrase. If it finds a match here, Automatic Term Mapping ends and the search is executed.

A note about phrase searching and PubMed: PubMed can only read phrases it's been taught to read. It doesn't do adjacency searching. It doesn't do proximity searching. It simply compares the words in the search box to the Phrase List. That's all.

You can force PubMed to try to read a multi-word string as a phrase by putting quotation marks around the search string. This will turn off the Automatic Term Mapping.

If there is no match to the Phrase List, PubMed proceeds to the final step.

Step 4: Author Index

Finally, PubMed will compare what's in the search box to the Author Index, but only IF what's in the search box conforms to the author format: a word followed by one or two letters (a last name plus initials, such as Smith JA). If there's a match, then PubMed will run the search.

What if there's still no match? If there is more than one word entered in the search box, PubMed will chop off the word on the far right and repeat the process with the remaining words. Eventually, if there are no matches, PubMed will do an [All Fields] search for each term and AND the terms together.
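The whole process, fallback included, can be sketched in a few lines of Python. Treat this as a rough model under stated assumptions, not as PubMed's implementation: the four tables hold made-up entries, the function names are hypothetical, and the way leftover words are handled after a match (they simply go through the same steps again) is an assumption, since only the chopping itself is described above.

```python
import re

# Toy tables. Every entry is an illustrative assumption, not real PubMed data.
MESH_TABLE = {"heart attack": "Myocardial Infarction"}
JOURNAL_TABLE = {"lancet": "Lancet"}
PHRASE_LIST = {"cold compress"}
AUTHOR_INDEX = {"smith ja"}

# Author format as described above: a word followed by one or two letters.
AUTHOR_FORMAT = re.compile(r"^\S+ [a-z]{1,2}$")


def match_tables(text):
    """Try Steps 1 to 4 in order; return a translation, or None if no match."""
    if text in MESH_TABLE:
        return f'("{MESH_TABLE[text]}"[MeSH Terms] OR {text}[Text Word])'
    if text in JOURNAL_TABLE:
        return f'"{JOURNAL_TABLE[text]}"[Journal]'
    if text in PHRASE_LIST:
        return f'"{text}"[All Fields]'
    if AUTHOR_FORMAT.match(text) and text in AUTHOR_INDEX:
        return f"{text}[Author]"
    return None


def automatic_term_mapping(query):
    """Map a query, chopping words off the right until something matches."""
    words = query.lower().split()
    parts = []
    while words:
        # Start with all remaining words, then drop the rightmost one each pass.
        for end in range(len(words), 0, -1):
            hit = match_tables(" ".join(words[:end]))
            if hit is not None:
                parts.append(hit)
                words = words[end:]  # assumption: leftovers are mapped next
                break
        else:
            # Not even the first word alone matched: search it in All Fields.
            parts.append(f"{words[0]}[All Fields]")
            words = words[1:]
    return " AND ".join(parts)


print(automatic_term_mapping("heart attack aspirin"))
print(automatic_term_mapping("foo bar"))
```

With these toy tables, "heart attack aspirin" fails as a three-word string, matches the MeSH Table once "aspirin" is chopped off, and "aspirin" ends up as an [All Fields] term ANDed onto the result.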

PubMed repeats this process for every single search UNLESS we do something that turns off the Automatic Term Mapping, such as using quotation marks around multi-word terms (""), using truncation (*), or using command searching ([tags]).
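If you want to check a search string for these three off-switches before you run it, a small Python helper along these lines would do. It is a sketch only: the function name bypass_reasons and the regular expressions are assumptions for illustration, and they look only at the surface of the string, not at how PubMed itself parses a query.

```python
import re

# A pair of double quotes with something between them.
QUOTED_PHRASE = re.compile(r'"[^"]+"')
# Anything in square brackets, such as [journal] or [au].
FIELD_TAG = re.compile(r"\[[^\[\]]+\]")


def bypass_reasons(query):
    """List which of the three off-switches appear in the query (hypothetical)."""
    reasons = []
    if QUOTED_PHRASE.search(query):
        reasons.append("quotation marks")
    if "*" in query:
        reasons.append("truncation")
    if FIELD_TAG.search(query):
        reasons.append("field tag")
    return reasons


for example in ['"heart attack"', "immun*", "cell[journal]", "heart attack"]:
    print(example, "->", bypass_reasons(example))
```

An empty list means nothing in the string switches the mapping off, so the four steps described above would apply.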

And that's the PubMed fandango called Automatic Term Mapping.

Interested in additional inside secrets of PubMed or NLM's Gateway? Feel free to contact me for training classes. Call the NER office at 1-800-338-7657 or email me at Donna.Berryman@umassmed.edu

Donna Berryman, Outreach Coordinator




Comments to:
nnlm-ner@umassmed.edu
University of Massachusetts Medical School
222 Maple Avenue, Shrewsbury, MA 01545
Phone:  800-338-7657
508-856-5979
Fax:  508-856-5977