Why Technical English

Thesaurus – what’s up with it

April 28, 2009

By Galina Vitkova

A thesaurus (according to Wikipedia) is a catalogue that contains synonyms and sometimes antonyms, although it is not meant to be a complete list of all the synonyms or antonyms for a particular word. Its entries are also intended to draw distinctions between similar words, which helps in choosing exactly the right word.

In Information Science, Library Science, and Information Technology, specialized thesauri are designed for information retrieval. They are a type of controlled vocabulary used for indexing or tagging. Such a specialized thesaurus can be used as the basis of an index for online material. Specialized thesauri typically focus on one discipline, subject, or field of study.

Thesauri for information retrieval have their own unique terminology, which specifies different kinds of terms and relationships between them.

Terms are the basic semantic units for conveying concepts (a concept being an abstract idea represented in a language by a word). Terms are defined within various fields of human activity. They are usually single-word nouns because nouns are the most concrete part of speech. Verbs can be converted to nouns – “cleans” to “cleaning”, “reads” to “reading”, and so on. Adjectives and adverbs, however, seldom express any meaning useful for indexing. When a term is ambiguous (i.e. can be interpreted in more than one way), a “scope note” from the given field of knowledge can be added to indicate how to interpret the term. The scope note places the term in context, which allows a user to distinguish, for example, between “bureau” the office and “bureau” the furniture. Not every term needs a scope note, but where one is present it is of considerable help in using a thesaurus correctly.
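The idea of a term with an optional scope note can be illustrated with a small Python sketch. It is only an illustration: the class name `Term`, its fields, and the wording of the two scope notes are assumptions made for this example and do not come from any real thesaurus or standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    """A thesaurus term (hypothetical model); the scope note is optional."""
    label: str
    scope_note: Optional[str] = None

    def display(self) -> str:
        # Show the scope note in parentheses only when one exists.
        if self.scope_note:
            return f"{self.label} ({self.scope_note})"
        return self.label


# Two entries share the label "bureau"; the scope notes tell them apart.
# The wording of the notes is invented for this illustration.
bureau_office = Term("bureau", "an office or agency")
bureau_furniture = Term("bureau", "a piece of furniture")
computer = Term("computer")  # treated as unambiguous here, so no scope note

for term in (bureau_office, bureau_furniture, computer):
    print(term.display())
```

Because the scope note is part of the entry, the two “bureau” terms remain distinct even though their labels are identical, which is exactly what an indexer needs when tagging documents.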

“Term relationships” are links between terms. These relationships can be divided into three types: hierarchical, equivalency, and associative.

  • Hierarchical relationships are used to indicate terms that are narrower or broader in scope. A “Broader Term” (BT) is a more general term, e.g. “Apparatus” is a generalization of “Computers”. Conversely, a “Narrower Term” (NT) is a more specific term, e.g. “Digital Computer” is a specialization of “Computer”. BT and NT are reciprocals: a broader term necessarily implies at least one other term that is narrower.
  • The equivalency relationship is used primarily to connect synonyms and near-synonyms.
  • Associative relationships are used to connect two related terms whose relationship is neither hierarchical nor equivalent. This relationship is described by the indicator “Related Term” (RT). Associative relationships should be applied with caution because excessive use of RTs will reduce specificity in searches. For example, if the typical user is searching with the term “A”, would he or she also want resources tagged with the term “B”? If the answer is no, then an associative relationship should not be established.
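The three relationship types can be sketched as a small data structure in Python. This is a minimal illustration under stated assumptions, not a description of how any real thesaurus software works: the class `Thesaurus`, its method names, and the sample terms “Calculating machines” and “Software” are hypothetical, and only “Apparatus”, “Computers” and the digital computer example are taken from the text.

```python
from collections import defaultdict


class Thesaurus:
    """Hypothetical in-memory thesaurus with BT/NT, equivalency and RT links."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its broader terms (BT)
        self.narrower = defaultdict(set)  # term -> its narrower terms (NT)
        self.related = defaultdict(set)   # term -> its related terms (RT)
        self.use = {}                     # synonym -> preferred term

    def add_hierarchy(self, broader, narrower):
        # BT and NT are reciprocals, so both directions are stored together.
        self.broader[narrower].add(broader)
        self.narrower[broader].add(narrower)

    def add_equivalence(self, preferred, synonym):
        # A synonym or near-synonym points to the preferred term.
        self.use[synonym] = preferred

    def add_related(self, a, b):
        # RT links are symmetric: each term lists the other.
        self.related[a].add(b)
        self.related[b].add(a)

    def preferred(self, term):
        # Resolve a synonym; a term without an entry is its own preferred form.
        return self.use.get(term, term)

    def expand(self, term):
        """Return the preferred term plus all of its narrower terms."""
        start = self.preferred(term)
        seen = {start}
        stack = [start]
        while stack:
            current = stack.pop()
            for nt in self.narrower.get(current, ()):
                if nt not in seen:
                    seen.add(nt)
                    stack.append(nt)
        return seen


t = Thesaurus()
t.add_hierarchy("Apparatus", "Computers")
t.add_hierarchy("Computers", "Digital computers")
t.add_equivalence("Computers", "Calculating machines")  # invented synonym
t.add_related("Computers", "Software")                  # invented RT pair

print(sorted(t.expand("Apparatus")))
print(t.preferred("Calculating machines"))
```

In this sketch a search is expanded only along narrower terms and synonyms. Related terms are stored but deliberately left out of `expand`, which reflects the caution above: following every RT automatically would make searches less specific.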

In information technology, moreover, a thesaurus is a database or a list of semantically orthogonal topical search keys (i.e. search keys that are semantically independent of each other). In the field of Artificial Intelligence, a thesaurus may sometimes be referred to as an ontology. In this case, the ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to define the domain.


Examples of Specialized Thesauri for Information Retrieval

Categories: English knowledge · English studying · education · technical English