Research paper
Stanford Medical Informatics and Stanford Center for Clinical Informatics, Stanford University School of Medicine, Stanford, California
* Correspondence and reprints: Yang Huang, PhD, Stanford Medical Informatics, MSOB X215, 251 Campus Drive, Stanford, CA 94305-5479 (Email: huangy{at}stanford.edu).
Received for publication: 09/18/06; accepted for publication: 01/29/07.
Abstract
Design: Negations are classified based upon the syntactical categories of negation signals, and negation patterns, using regular expression matching. Negated terms are then located in parse trees using corresponding negation grammar.
Measurements: A classification of negations, with their corresponding syntactical and lexical patterns, was developed through manual inspection of 30 radiology reports and validated on a set of 470 radiology reports. Another 120 radiology reports were randomly selected as the test set on which a modified Delphi design was used by four physicians to construct the gold standard.
Results: In the test set of 120 reports, there were a total of 2,976 noun phrases, of which 287 were correctly identified as negated (true positives), along with 23 undetected true negations (false negatives) and 4 mistaken negations (false positives). The hybrid approach identified negated phrases with sensitivity of 92.6% (95% CI 90.9–93.4%), positive predictive value of 98.6% (95% CI 96.9–99.4%), and specificity of 99.87% (95% CI 99.7–99.9%).
Conclusion: This novel hybrid approach can accurately locate negated concepts in clinical radiology reports not only when in close proximity to, but also at a distance from, negation signals.
Introduction
Though word-based indexing using a vector space model5 is simple and powerful, concept-based indexing using biomedical terminologies can improve the performance of biomedical IR over that of the vector space model.6 The National Library of Medicine's (NLM) Unified Medical Language System (UMLS)7 provides comprehensive coverage of biomedical concepts, and many researchers have studied a variety of approaches to concept-based indexing of clinical documents using the UMLS.8–12
Negation is commonly seen in clinical documents13 and may be an important source of low precision in automated indexing systems.14 Because of the different semantics associated with affirmative biomedical concepts vs. when the same concepts are negated, it is essential to detect negations accurately to facilitate high performance document retrieval in clinical documents.
Generally speaking, negation is complex in natural languages, such as English. It has been an active research topic for decades. Researchers have approached this topic from both linguistic and philosophical perspectives.15,16 In most cases, negation involves a negation signal, a negated phrase containing one or more concept(s), and optionally some supporting feature (pattern), which helps us locate the negated phrase. In the following example,
There is no evidence of cervical lymph node enlargement.
"no" is the negation signal used to denote that a following concept is negated; "cervical lymph node enlargement" is the negated phrase; while "evidence of" is the supporting phrase feature.
Background
Mutalik et al. showed with Negfinder14 that a One-token Look-Ahead Left-to-right Rightmost-derivation (LALR(1)) parser could reliably detect negations in surgical notes and discharge summaries, achieving a sensitivity of 95.7% and a specificity of 91.8% without extracting the syntactical structures of sentences and phrases as in full NLP parsing. Replacing words in the text with UMLS concept IDs before negation tagging helped reduce the input complexity for the LALR(1) parser. Such a concept replacement process may affect the overall performance of negation detection; however, its performance was not reported. As pointed out by the authors, the limitations of such a single-token look-ahead parser prevented it from detecting negated concepts correctly if the negation signal was more than a few words away from the negated concepts.
NegEx, a regular expression-based algorithm developed by Chapman et al.,18 though simple to implement, has been shown to be powerful in detecting negations in discharge summaries18 with a sensitivity (recall) of 77.8%, a positive predictive value (PPV, precision) of 84.5%, and a specificity of 94.5%. NegEx could identify a term as negated after it was mapped to a UMLS concept. The results were calculated using successfully mapped UMLS terms only. An improved NegEx (version 2) was reported to have generally lower performance in pathology reports without any customizations for the new document type.24 This later study included text phrases not mapped to UMLS concepts and identified UMLS concept mapping as one major source of error.
More recently, Elkin et al. studied this problem using a negation ontology containing operators and their associated rules.19 Operators were two sets of terms, with one set starting negations and another set stopping the propagation of negations. Each sentence was first broken into text and operators, with text mapped to SNOMED CT concepts by the Mayo Vocabulary Server using automated term composition (ATC).25 Those concepts were assigned one of the three possible assertion attributes according to the negation ontology, with "negative assertion" being studied. This study expanded the previous studies by formally evaluating the coverage of the concept mapping preceding negation detection. The human reviewer in the study identified that 205 of 2,028 negative concepts were not mapped by SNOMED CT, revealing the terminology's coverage of 88.7% of the negative concepts. The 205 unmapped negative concepts were not included in the gold standard in calculating the performance of negation assignment: a sensitivity (recall) of 97.2%, a PPV (precision) of 91.2%, and a specificity of 98.8%.
The published evaluations of negation detection used mainly lexical approaches, without using the syntactical structural information of a sentence embedded in its parse tree generated through full NLP parsing. They were shown to be effective and reliable; however, determining the scope of a negation was noted as a challenge by several authors.14,18 The above approaches could determine the scope of negations reliably when a negated concept is close to a negation signal, but unsatisfactorily when the two are separated by multiple words not mapped to a controlled terminology. We have devised a novel approach including a classification scheme based on syntactical categories of negation signals and their corresponding natural language phrase patterns to support locating negated concepts both in close proximity to and at a distance from negation signals.
ChartIndex is an automated concept indexing system using a contextual indexing strategy26 and an NLP approach for noun phrase identification27 before concept mapping to improve indexing precision. It uses an open-source high-performance statistical parser, the Stanford Parser,28 to generate a full parse tree of each sentence, which provides the syntactical information on sentence structure used by our negation detection approach. A classification of negations was first developed according to the syntactical categories of negation signals, and the phrase patterns required to locate negated phrases. One then uses a hybrid approach, combining regular expression matching and a grammatical approach, to locate negated phrases within a parse tree. The classifier first detects possible negation in a sentence, and classifies the negation into one of 11 categories by regular expression matching. The computer then extracts the negated phrases from the parse tree, according to grammar rules developed for that negation type. Regular expression matching is fast and sensitive in identifying the type of negations, while the grammatical approach helps locate negated phrases accurately within or outside the proximity of the negation signal.
In most previous studies, negative concepts were tallied only when they were in a controlled terminology, such as UMLS or SNOMED CT. This was reasonable, as the purpose of those studies was to assess a negation detection module placed after the concept mapping process. To evaluate the impact of concept mapping on negation detection, Mitchell et al.24 allowed human reviewers to mark phrases not mapped to a controlled terminology and included unmapped phrases in the performance calculations. Similarly, we used phrases in the text without a concept mapping process in this study to evaluate the performance of negation assignment, independent of concept mapping.
Methods
For the purpose of this study, only complete negations (versus partial negations such as "probably not") within a sentence are considered. Phrases were considered negated if they were indicated as "completely absent" in the clinical document. Based on discussions with physicians, normal findings and test results were not considered negated. In the following example given by Mutalik et al., "several blood cultures, six in all, had been negative," the phrase "several blood cultures" was not considered negated because it is a noun phrase representing a test, the results of which were normal. Negations within a word, as in the case of a negative prefix or suffix, were not considered because they are often semantically ambiguous; moreover, the best way to represent these words may depend on the controlled terminologies used for concept encoding. For example, people would agree that "nontender" is a negation meaning "not tender." However, "colorless" is itself a concept in SNOMED CT (263716002) describing a color attribute: transparent. People may not agree on whether it is a true negation or on how to represent it, either as "colorless" (263716002) or as a negation of "colors" (263714004). Mutalik et al. gave more examples in their paper:14 "A final issue is that many UMLS concepts themselves represent antonymous forms of other concepts, e.g., words beginning with "anti-," "an-," "un-," and "non-." Such forms are not necessarily negations. (Thus, an anti-epileptic drug is used when epilepsy is present; "non-smoker," however, is a true negation.)" We decided to mark up negated biomedical noun phrases instead of UMLS concepts in order to focus on evaluating negation detection in this experiment. The concepts represented by these negated phrases, whether in a controlled terminology or not, are negated concepts.
Deriving Negation Grammar
The document collection used in our study was 1,000 de-identified radiology reports of six common imaging modalities from Stanford University Medical Center. After deriving a grammar-based classification scheme from a limited set of 30 reports (see below), the coverage of the classification was validated using another 470 reports. These 500 reports served as the training set. A negation detection module (NDM) was implemented using the above grammar, which was later tested on a set of 132 reports randomly selected from the remaining 500 reports.
To construct a negation grammar for radiology reports, we manually identified sentences with negations in 30 reports covering all six modalities and marked up negation signals, negated phrases, and negation patterns. We also studied the published literature to construct a more extensive list of possible negation patterns from a linguistics perspective. Instances and patterns were reviewed by a physician from a clinical application perspective to construct a preliminary classification of negations. Negations were classified based on the syntactical categories of negation signals and the phrase patterns. The following is an example of one such grammar rule for an adjective-like negation type, where the negation signal is a determiner such as "no," a preposition such as "without," or an adjective such as "absent," followed by a noun phrase to negate phrases. Determiners and prepositions are different from adjectives. However, the noun phrases, such as "evidence of" following "no," "without," or "absent," determine the scope of the negated phrases in the same way. Therefore we categorize these three together to reduce the total number of grammar rules.
Grammar rule (adjective-like negation):
N [JJ] NN | NNS {of | for | to suggest} NegdPhr
N: {no | without | absent}
[JJ]: {mammographic | significant}
NN: {evidence | feature | area | pattern | history | sign}
NNS: {features | areas | patterns | signs}
Corresponding structural patterns:
PP → IN0 {without} NP, where NP → NP PP and the inner NP → [JJ1] NN | NNS
NP → NP PP, where the inner NP → DT {no} | JJ0 {absent}, followed by [JJ1] NN | NNS
PP → IN {of | for} NP, where this NP contains NegdPhr
Example: There is no evidence of cervical lymph node enlargement.
Note: N = Negation Signal, [JJ] = optional Adjective, NN = Noun singular, NNS = Noun plural, IN = Preposition, NegdPhr = Negated Phrase, NP = Noun Phrase, VP = Verb Phrase, PP = Prepositional Phrase, A | B = either A or B.
Figure 1 shows a parse tree of the above sentence generated by the Stanford Parser, with each token of the sentence tagged with a Part-Of-Speech (POS) tag. A POS tag identifies the syntactic category of a sentence component, such as JJ for an adjective and NN for a noun. As shown in Figure 1, each grammar rule like the above was further translated into a structural rule to extract negated phrases within a parse tree: 1. Locate the noun phrase (NP) with a head from a small set of nouns such as "evidence" and modified by the word "no," "without," or "absent"; 2. Locate the prepositional phrase (PP) headed by "of" or "for" following the above NP; 3. Extract the NP under the above PP, which contains the negated phrase (NegdPhr).
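The three-step structural rule can be sketched in a few lines of Python. This is an illustrative sketch only, not the NDM implementation: the bracketed tree below is hand-written to approximate a parse of the example sentence (it is not actual Stanford Parser output), the function names are hypothetical, the head of an NP is approximated as its last word, and only the case where the signal sits inside the NP ("no," "absent") is covered; "without" would need an extra check on the enclosing PP.

```python
import re

# Word sets taken from the grammar rule; "without" is omitted in this sketch.
SIGNALS = {"no", "absent"}
HEAD_NOUNS = {"evidence", "feature", "area", "pattern", "history", "sign",
              "features", "areas", "patterns", "signs"}
PREPOSITIONS = {"of", "for"}

# Hand-written approximation of a parse tree (assumption, not parser output).
EXAMPLE = ("(S (NP (EX There)) (VP (VBZ is) (NP (NP (DT no) (NN evidence)) "
           "(PP (IN of) (NP (JJ cervical) (NN lymph) (NN node) "
           "(NN enlargement))))) (. .))")

def parse(text):
    """Parse a Penn-style bracketed string into (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])   # a word (leaf)
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return node()

def leaves(tree):
    """Return the words under a node, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1] for word in leaves(child)]

def subtrees(tree):
    """Yield every non-leaf node in the tree."""
    if isinstance(tree, str):
        return
    yield tree
    for child in tree[1]:
        yield from subtrees(child)

def negated_phrases(tree):
    """Apply steps 1-3: signal-modified NP, then PP(of|for), then its NP."""
    found = []
    for label, kids in subtrees(tree):
        if label != "NP":
            continue
        for left, right in zip(kids, kids[1:]):
            if isinstance(left, str) or isinstance(right, str):
                continue
            if left[0] != "NP" or right[0] != "PP":
                continue
            words = [w.lower() for w in leaves(left)]
            # Step 1: NP starts with a signal and ends with a head noun.
            if words[0] not in SIGNALS or words[-1] not in HEAD_NOUNS:
                continue
            # Step 2: the following PP is headed by "of" or "for".
            prep, *rest = right[1]
            if leaves(prep)[0].lower() not in PREPOSITIONS:
                continue
            # Step 3: the NP under that PP holds the negated phrase.
            found.extend(" ".join(leaves(k)) for k in rest
                         if not isinstance(k, str) and k[0] == "NP")
    return found

print(negated_phrases(parse(EXAMPLE)))
```

Because the rule matches on tree structure rather than word distance, an optional adjective such as "significant" between the signal and the head noun does not change the result.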
This process yielded a comprehensive syntactical classification of negations in radiology reports, shown in Table 1. Negations are first classified by the syntactical category of the negation signal: adjective-like (such as "no" and "absent," and prepositions such as "without"), adverb (such as "not"), verb (such as "deny"), and noun (such as "absence"). To locate negated phrases, negations are further classified by phrase pattern, including the words critical in defining negation patterns and the negation patterns themselves. The first two columns of the table contain the syntactical categories of negation signals and phrase patterns, which support locating negated phrases not in close proximity to negation signals from the output of an NLP parser. The third column contains examples for each category, where negation signals are bolded, negation patterns are italicized, and negated phrases are underlined.
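As a rough illustration of the first classification level, the sketch below uses regular expressions to assign a negation signal to one of the four syntactical categories named above. It is a minimal sketch under stated assumptions: only the signals quoted in the text are included, the inflected forms "denies" and "denied" are an assumption added for illustration, the function name is hypothetical, and the finer phrase-pattern categories of Table 1 are not modeled.

```python
import re

# Signals are the ones quoted in the text; inflections of "deny" are assumed.
SIGNAL_CATEGORIES = [
    ("adjective-like", r"\b(?:no|absent|without)\b"),
    ("adverb", r"\bnot\b"),
    ("verb", r"\bden(?:y|ies|ied)\b"),
    ("noun", r"\babsence\b"),
]
COMPILED = [(name, re.compile(pattern, re.IGNORECASE))
            for name, pattern in SIGNAL_CATEGORIES]

def classify_signals(sentence):
    """Return (signal, category) pairs in order of appearance."""
    hits = []
    for name, regex in COMPILED:
        for match in regex.finditer(sentence):
            hits.append((match.start(), match.group(0).lower(), name))
    return [(signal, category) for _, signal, category in sorted(hits)]

print(classify_signals("There is no evidence of cervical lymph node enlargement."))
```

The word-boundary anchors keep "no" from matching inside words such as "node" or "noted". A sentence with no hit can skip the more expensive parse-tree step entirely.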
An example of a parse tree output from the Stanford Parser is shown in Figure 3. This sentence may pose a difficulty for most negation detection algorithms not using the structural information in the sentence's parse tree, because the negated phrase "The previously identified isoechoic nodule on the right" and the negation signal "not" are separated by the phrase "larger than 7 mm." However, with the help of the parse tree, it is clear that, syntactically, the verb phrase (VP) "is not seen ..." negated its subject, a noun phrase (NP) composed of two noun phrases, "The previously identified isoechoic nodule on the right" and "larger than 7 mm." Thus, we were able to tag both of them as a negated composite noun phrase, with the help of the parse tree in Figure 3 and the grammar developed in the previous step.
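The idea that a VP containing "not" negates its subject NP, however far the subject's head is from the signal, can be sketched as follows. The tree is a hand-written approximation built from the fragments quoted above and truncated at "seen"; its inner labels are guesses and may differ from Figure 3. The function names are hypothetical and the rule is deliberately simplified.

```python
# Nodes are tuples (label, child, ...); leaves are plain strings.
# Hand-written approximation (assumption), not actual parser output.
TREE = ("S",
        ("NP",
         ("NP", ("DT", "The"),
          ("ADJP", ("RB", "previously"), ("VBN", "identified")),
          ("JJ", "isoechoic"), ("NN", "nodule"),
          ("PP", ("IN", "on"), ("NP", ("DT", "the"), ("NN", "right")))),
         ("ADJP", ("JJR", "larger"),
          ("PP", ("IN", "than"), ("NP", ("CD", "7"), ("NN", "mm"))))),
        ("VP", ("VBZ", "is"), ("RB", "not"), ("VP", ("VBN", "seen"))))

def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words

def first_child(node, label):
    """Return the first direct child with the given label, or None."""
    for child in node[1:]:
        if not isinstance(child, str) and child[0] == label:
            return child
    return None

def negated_subjects(node):
    """Yield the subject NP text of each clause whose VP directly holds 'not'."""
    if isinstance(node, str):
        return
    if node[0] == "S":
        subject = first_child(node, "NP")
        verb_phrase = first_child(node, "VP")
        if subject is not None and verb_phrase is not None:
            has_not = any(not isinstance(c, str) and c[0] == "RB"
                          and leaves(c) == ["not"] for c in verb_phrase[1:])
            if has_not:
                yield " ".join(leaves(subject))
    for child in node[1:]:
        yield from negated_subjects(child)

print(list(negated_subjects(TREE)))
```

The whole composite subject is returned because the match is on the clause structure, not on how many words separate "nodule" from "not".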
Using the gold standard, we measured the recall and precision of negation detection, together with the inter-rater agreement ratio and priming bias.
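For reference, the standard definitions behind these measures can be written out directly. The sketch below applies them to the counts reported in the abstract (287 true positives, 23 false negatives, 4 false positives); the function names are illustrative. The true-negative count is not stated explicitly in the text, so specificity is given only as a function.

```python
def sensitivity(tp, fn):
    """Recall: share of truly negated phrases that were detected."""
    return tp / (tp + fn)

def positive_predictive_value(tp, fp):
    """Precision: share of phrases tagged as negated that truly are."""
    return tp / (tp + fp)

def specificity(tn, fp):
    """Share of non-negated phrases that were correctly left untagged."""
    return tn / (tn + fp)

# Counts reported for the 120-report test set.
tp, fn, fp = 287, 23, 4
print(f"sensitivity = {sensitivity(tp, fn):.1%}")              # 92.6%
print(f"PPV         = {positive_predictive_value(tp, fp):.1%}")  # 98.6%
```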
Results
Four physicians were assigned as four pairs with each pair inspecting 30 pre-tagged reports in the test set of 120 clinical radiology reports. Their agreement on negated phrases is shown in Table 2.
Discussion
In this approach, a negated phrase is usually well-scoped because the syntactical role of the negation signal is well-defined by the parse tree. As shown in Figure 4, the NDM is able to tag "para aortic soft tissue stranding or leak" as negated but not "the aneurysm" using the grammar for negation type 4 in Appendix A (available as an online data supplement at www.jamia.org). This is because the negative signal "no" clearly has its scope defined by the noun phrase (NP) node over "no para aortic soft tissue stranding or leak," while "the aneurysm" is an NP sitting under a VP which is outside the scope of "no."
Error Analysis
Table 4 shows the classification of errors made by the computer program on 132 radiology reports.
More commonly, the grammar did not contain the phrase pattern needed to extract the negated phrases for a known negation signal. Here is an example:
No focal lucency or endosteal scalloping is noted to suggest multiple myeloma.
Here "multiple myeloma" is considered negated as well as "focal lucency or endosteal scalloping," through the phrase pattern "no ... noted to suggest."
Nine FNs (34.6%) were caused by incomplete phrase patterns.
There were also errors caused by irregular use of the language and other reasons. These accounted for two FNs (7.7%) and one FP (25%). For example,
"No perinephric collections are identified or renal masses."
"Renal masses" should be tagged as negated in the above sentence; however, the sentence is barely grammatical.
Limitations
The first limitation was the comprehensiveness of such a manually derived negation grammar. Even though we had manually validated our grammar on a larger set of 470 reports and initial results during the grammar development showed very good coverage, the syntactical and lexical negation grammar was shown to be not as comprehensive as expected during the test. The sensitivity (recall) of 92.6% (95% CI 90.9–93.4%) is significantly lower than the PPV (precision) of 98.6% (95% CI 96.9–99.4%). One important reason was that reviewers have slightly different definitions of negations even though they agree with each other most of the time. For a total of 29 false negatives initially brought up in the first round on the set of 120 reports, reviewers did not reach agreement on 12 of them after seeing the markups by the other physician. Because none of the reviewers were involved in the grammar development, the grammar used in the NDM was thus not as "comprehensive" as the total of four reviewers. However, we should note that even though this limitation is reflected in the coverage of the negation grammar in this work, it would also apply to other approaches to the evaluation of negation detection, as differences in the understanding of negations among developers and reviewers do exist and they negatively impact the measured performance of the system independent of the negation detection approach used.
More importantly, the performance of extracting negated phrases from a full parse tree was limited by the parsing performance of the NLP parser. As documented in our previous study,27 noun phrase identification is harder for longer maximal noun phrases due to the ambiguity of English. For example, the parser may not be able to attach a modifying prepositional phrase at the right level of a parse, resulting in a noun phrase identification error. In such cases, different words of a biomedical phrase may be separated far apart in a parse tree, and thus missed by the NDM. Our radiology report corpus had many incomplete sentences in sections including the Impression section. These fragmented sentences were mostly well-formed long NPs with prepositional phrase attachments or other structures. The presence of incomplete sentences did negatively affect the parsing performance, though to a much lesser degree than in the surgical pathology reports we had. We had expected that the parsing performance would cap the performance of negation detection at a lower level. However, the parser was able to get most low-level structures right in a parse tree, even when it did not get the sentence completely right, and negation instances were much more often located in a relatively small part of the parse tree using simple syntactical structures. We envision that improving the parsing performance of NLP parsers will be a long-term, gradual process.
The NDM does not handle affixed negations or negations across sentence boundaries. As discussed previously, the definition and utility of detecting negations within a word may depend on applications and the controlled terminologies used for encoding. It is possible to develop a lexical scanner that scans each word token for negative affixes and works with the concept mapping process. Tokens stripped of negative affixes usually represent more basic concepts and thus may be more likely to be mapped into controlled terminologies. Therefore, studying affixed negations could be an important extension.
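A lexical scanner of this kind might look like the sketch below. It was not part of the NDM; the affix lists are assumptions drawn from the examples quoted in this paper ("anti-," "an-," "un-," "non-," and the "-less" of "colorless"), and the function name and minimum stem length are hypothetical. The scanner only proposes candidates: as noted earlier, forms such as "anti-epileptic" are not true negations, so each candidate stem would still need to be checked during concept mapping.

```python
# Affixes taken from examples in the text; longer prefixes are tried first.
NEGATIVE_PREFIXES = ("anti", "non", "un", "an")
NEGATIVE_SUFFIXES = ("less",)
MIN_STEM_LENGTH = 3   # assumed threshold to avoid very short stems

def strip_negative_affix(token):
    """Return (affix, stem) if the token carries a candidate negative affix."""
    word = token.lower()
    for prefix in NEGATIVE_PREFIXES:
        if word.startswith(prefix):
            stem = word[len(prefix):].lstrip("-")
            if len(stem) >= MIN_STEM_LENGTH:
                return (prefix + "-", stem)
    for suffix in NEGATIVE_SUFFIXES:
        if word.endswith(suffix):
            stem = word[:-len(suffix)]
            if len(stem) >= MIN_STEM_LENGTH:
                return ("-" + suffix, stem)
    return None

for token in ["nontender", "non-smoker", "colorless", "anti-epileptic", "nodule"]:
    print(token, strip_negative_affix(token))
```

A purely string-based scanner over-generates (for example, "aneurysm" begins with "an"), which is one reason the stripped stem should be accepted only if it maps to a concept in the target terminology.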
Another important limitation was that our study was done on radiology reports only. Chapman et al. documented that radiology reports contained only two thirds of frequently used negation phrases found in non-radiology reports.30 Therefore, the approach described in this study should be further validated using other types of narrative clinical documents.
Conjoined noun phrases can be very difficult to parse due to ambiguity. We took a practical approach: our grammar handles conjoined noun phrases based on the output of the Stanford Parser, and we made no effort to parse out each smaller noun phrase.
We are integrating the NDM developed in this work into the ChartIndex concept indexing system, which will provide further data in evaluating its performance. Moreover, we plan to expand the NDM to work on other types of clinical documents.