Reference resolution in a speech recognition environment

title: Reference resolution in a speech recognition environment
author: Dimitri Woei-A-Jin
published in: 2001
appeared as: Master of Science thesis
Delft University of Technology
pages: 306

Abstract

This thesis describes the methods implemented to resolve anaphora in the speech recognition environment of the SPICE-EPG demonstration prototype, an electronic programming guide of Philips Research. The SPICE-EPG uses shallow parsing, which returns only the relevant phrases and provides no information about sentence structure. Syntactic information is very important for resolving anaphora, and without it the task becomes very difficult. To overcome the lack of syntactic information, a reference resolution model is used that determines the preference for referents without needing syntactic information, and a set of filters is applied to determine some of the dependencies between phrases that are needed to resolve anaphora successfully. The dependencies between phrases are determined in three ways: by looking at the properties of the phrases and determining the dependency from the match between these properties, by assigning a subphrase to a phrase to indicate the dependency, and by assigning a superphrase to a phrase to indicate the dependency. The first method is applied when the two phrases do not necessarily appear next to each other and unrelated phrases can occur between them. The second method is suitable when the two phrases always occur next to each other and one of them provides extra information about the other. The third method is employed when a so-called compound reference occurs: a phrase refers to a property of another phrase, which is itself a reference. These methods are tested on a small corpus built from examples of references given by co-workers, reflecting their ideas about the types of reference the electronic programming guide should ideally be able to handle. Offline tests show that the chosen method is adequate for resolving references that fall within the scope of the project. Online tests, however, show that additional measures must be taken to solve certain problems with speech recognition errors.
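
The three ways of linking phrases named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the `Phrase` class, its fields, the function names, and the "ordinal" modifier kind are all assumptions made for illustration, and the abstract does not specify how the SPICE-EPG represents phrases.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Phrase:
    # Hypothetical phrase record; every field name here is an assumption.
    kind: str
    text: str
    properties: dict = field(default_factory=dict)
    subphrase: Optional["Phrase"] = None    # filled in by the second method
    superphrase: Optional["Phrase"] = None  # filled in by the third method


def match_by_properties(phrase, candidates):
    """First method: keep candidates that agree with every property of the
    phrase, regardless of where they appear in the utterance."""
    return [c for c in candidates
            if all(c.properties.get(k) == v
                   for k, v in phrase.properties.items())]


def attach_subphrases(phrases, modifier_kinds=("ordinal",)):
    """Second method: a modifier directly before another phrase becomes
    that phrase's subphrase."""
    result, pending = [], None
    for p in phrases:
        if p.kind in modifier_kinds:
            if pending is not None:
                result.append(pending)  # earlier modifier had nothing to attach to
            pending = p
            continue
        if pending is not None:
            p.subphrase, pending = pending, None
        result.append(p)
    if pending is not None:
        result.append(pending)          # trailing modifier stays on its own
    return result


def resolve_compound(property_phrase, reference_phrase, resolve):
    """Third method: link the property phrase to the reference it depends on,
    resolve that reference first, then read the property from its referent."""
    property_phrase.superphrase = reference_phrase
    referent = resolve(reference_phrase)  # resolver is supplied by the caller
    return referent.get(property_phrase.text)
```

In this sketch a compound reference such as "the director of that movie" would be handled by passing the "director" phrase, the "that movie" phrase, and a caller-supplied resolver to `resolve_compound`; how the actual system resolves the inner reference is not described in the abstract.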

 