The use of plan recognition for understanding dialogues was first described by Allen and Perrault [1]. Their system took an utterance and performed planning in reverse: it matched the consequence of each planning rule against the utterance and, from the different available antecedents, introduced plan hypotheses called alternatives. The plan rules were then used to search for a continuation of each alternative. The rules were partitioned according to the capabilities of the system and the capabilities of the user, so that "obstacles" could be identified. Obstacles are parts of the plan that cannot be achieved by the user but can be achieved by the system. The system's response was then determined by the obstacles for which it could provide solutions.
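The reverse-planning and obstacle-detection steps can be illustrated with a minimal sketch. It is not Allen and Perrault's implementation: the `PlanRule` structure, the train-enquiry rule names, and the two capability sets are hypothetical, and each rule is reduced to a single antecedent, a single consequence, and a flat list of continuation steps.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class PlanRule:
    antecedent: str          # hypothesised plan that would explain the utterance
    consequence: str         # observable act the plan gives rise to
    steps: Tuple[str, ...]   # sub-goals that continue the plan


# Hypothetical rules for a train-enquiry domain (assumption, not from the source).
RULES = [
    PlanRule("board_train", "ask_departure_time",
             ("know_departure_time", "know_platform", "walk_to_platform")),
    PlanRule("meet_train", "ask_arrival_time",
             ("know_arrival_time", "know_platform", "walk_to_platform")),
]

# Assumed partition of capabilities: sub-goals each party can bring about.
USER_CAN: Set[str] = {"walk_to_platform"}
SYSTEM_CAN: Set[str] = {"know_departure_time", "know_arrival_time", "know_platform"}


def alternatives(utterance: str) -> List[PlanRule]:
    """Plan in reverse: keep every rule whose consequence matches the utterance."""
    return [rule for rule in RULES if rule.consequence == utterance]


def obstacles(rule: PlanRule) -> List[str]:
    """Steps the user cannot achieve but the system can."""
    return [s for s in rule.steps if s not in USER_CAN and s in SYSTEM_CAN]


def respond(utterance: str) -> Dict[str, List[str]]:
    """Map each plan hypothesis to the obstacles the system can resolve."""
    return {rule.antecedent: obstacles(rule) for rule in alternatives(utterance)}
```

Under these assumptions, `respond("ask_departure_time")` yields one alternative, `board_train`, whose obstacles are the departure time and the platform, so a cooperative response would supply both even though only the time was requested.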
Allen and Perrault give several examples of how a cooperative response can be provided. In the first, an agent asks a question and is given an answer; in addition, the hearer applies the planning rules to recognise the speaker's plan, identifies a further obstacle involving a lack of information, and volunteers that information without being asked. In the second, a speaker asks a yes/no question about a property of a referent. Finding that the answer is no, the hearer infers that the speaker may have been looking for a referent that satisfies the property, and so the cooperative response is to provide a suitable referent instead. The third example explains the planning of an indirect speech act. Here a speaker asks a question that is seemingly irrelevant to any plan, such as "can you pass the salt?", flouting the maxim of relevance, until it is inferred that the question can be answered as a side-effect of another plan that is relevant, namely one in which the speaker wants the salt passed. While Allen and Perrault described the use of plan recognition in understanding an utterance, plan recognition must also be used from the speaker's perspective in choosing an utterance. For example, an agent might choose between the questions "What is the time and platform number of the Windsor train?" and "What is the time of the Windsor train?". In choosing the latter, the speaker must reason about the hearer's plan recognition process: he expects that the hearer will recognise his plan and provide the platform number without being asked.
Several devices of ambiguity in dialogue depend on plan recognition for their success. For example, a referring expression like "the ball" might refer to the red ball or to the blue ball. Suppose a speaker has just picked up the red ball. He can then say "Shall I pass the ball?" and expect the hearer to recognise his plan and find that the only ball that can be passed is the red one. Grosz and Sidner [28] explain this in terms of the agent's attentional state, which is the collection of actions and objects associated with the focus point in the plan structure. If an agent is focussed, its contribution must attach to the focus point, and must therefore refer to the attentional state. Anaphora can be planned in the same way. Where a pronoun may refer to more than one agent, the hearer is expected to select whichever agent allows a coherent and focussed continuation of the plan to be constructed [67]. Carberry [9] shows that ellipsis can be handled similarly: an expectation is generated from the plan, and the hearer recovers the speaker's intent from a sentence fragment by matching the fragment against that expectation.
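The ball example can be sketched as a filter over candidate referents. This is a deliberate simplification that does not model Grosz and Sidner's attentional state: the object records, the `held_by_speaker` flag, and the single precondition on passing are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional

Obj = Dict[str, object]

# Hypothetical domain state: two balls, one of which the speaker has picked up.
OBJECTS: List[Obj] = [
    {"name": "red ball", "kind": "ball", "held_by_speaker": True},
    {"name": "blue ball", "kind": "ball", "held_by_speaker": False},
]

# Assumed precondition of each action in the recognised plan.
PRECONDITIONS: Dict[str, Callable[[Obj], bool]] = {
    "pass": lambda obj: bool(obj["held_by_speaker"]),
}


def resolve(kind: str, action: str, objects: List[Obj] = OBJECTS) -> Optional[str]:
    """Return the unique referent of `kind` that the planned action can apply to."""
    candidates = [o for o in objects if o["kind"] == kind]
    feasible = [o for o in candidates if PRECONDITIONS[action](o)]
    # The expression succeeds only if the plan leaves exactly one candidate.
    return str(feasible[0]["name"]) if len(feasible) == 1 else None
```

With the state above, `resolve("ball", "pass")` selects the red ball. If neither ball were held, no candidate would satisfy the plan and the expression would remain unresolved.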
Grosz and Sidner [29] have formalised a logical model of the cooperative planning and plan recognition process. They describe the conditions under which an agent can infer that a plan is intended in which both the speaker and the hearer will act. This is called a shared plan. Assuming that an agent who recognises a speaker's intention will adopt that intention as well, the conditions are that it is mutually believed that the speaker holds the intention, and that it is mutually believed that the actions in the plan are correct. If these conditions hold, then the mutual belief that each agent intends its part of the plan can be established. Shared plans are formalised as a set of inference rules on the mental state of the speaker. They are, however, in some ways too strong to be useful. There are many situations in which mutual beliefs about intentions do not hold, because the agents differ in their beliefs about the domain state and the plan rules, yet the speaker can still form a useful plan. For instance, a dialogue planner may ask for milk in his coffee, but the hearer, not believing that milk is available, may well form a plan to ask him whether cream is all right instead. In this example, a shared plan does not exist for the first utterance because the agents do not have mutual beliefs about the correctness of the plan. Such plans will feature regularly in this thesis, since the planner to be described is designed to deal with uncertainty and differences in beliefs.
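The two shared plan conditions, and the way the milk example fails them, can be made concrete with a rough sketch. It is not Grosz and Sidner's formalisation: mutual belief is approximated here as a proposition held by every agent, rather than the infinitely nested belief of the logical model, and the proposition names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Prop = Tuple[str, ...]
Beliefs = Dict[str, Set[Prop]]


def mutually_believed(prop: Prop, beliefs: Beliefs) -> bool:
    """Flat approximation of mutual belief: every agent holds the proposition."""
    return all(prop in held for held in beliefs.values())


def shared_plan_holds(goal: str, actions: List[str], beliefs: Beliefs) -> bool:
    """Check the two conditions: the intention and the plan's correctness."""
    intention_ok = mutually_believed(("speaker_intends", goal), beliefs)
    actions_ok = all(mutually_believed(("correct", a), beliefs) for a in actions)
    return intention_ok and actions_ok


# Assumed belief states for the coffee example: the hearer does not believe
# that adding milk is a correct action, because milk is thought unavailable.
BELIEFS: Beliefs = {
    "speaker": {("speaker_intends", "milk_in_coffee"), ("correct", "add_milk")},
    "hearer": {("speaker_intends", "milk_in_coffee")},
}
```

Here `shared_plan_holds("milk_in_coffee", ["add_milk"], BELIEFS)` is false, although the speaker's request is still a useful plan: it prompts the hearer to offer cream.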