Natural Language Processing

Language is meant for communicating about the world. The field of AI attempts to make machines understand human language in both its spoken and written forms. The largest part of human linguistic communication occurs as speech. Natural language processing (NLP) is the computer analysis of input provided in a human (natural) language and the conversion of this input into a useful form of representation.
The entire language processing problem can be divided into two tasks:

  1. Processing written text: using lexical, syntactic, and semantic knowledge of the language, as well as the required real-world information.
  2. Processing spoken language: using all the information needed in (1), plus additional knowledge about phonology and enough added information to handle the further ambiguities that arise in speech (e.g. noise).

Natural Language Understanding (NLU)

  • NLU includes understanding the language and generating meaningful sentences to respond to queries, as well as other tasks such as multilingual translation.

Problems Related to NLP
There are many problems. The major ones are:

  1. Natural languages give an incomplete description of the information they convey.
     E.g. Some dogs are outside. This could describe either of:
     (i) Some dogs are on the lawn.
     (ii) Three dogs are on the lawn.
  2. The same expression means different things in different contexts.
     E.g. Where’s the water? (In a chemistry lab, it must be pure.)
     Where’s the water? (When you are thirsty, it must be potable.)
  3. No natural language program can be complete, because new words, expressions, and meanings can be generated quite freely.
     E.g. I’ll fax it to you.
  4. There are lots of ways to say the same thing.
     E.g. Mary was born on October 11.
     Mary’s birthday is October 11.
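
The last problem can be made concrete with a small Python sketch that maps both sentences about Mary to the same internal representation. The patterns, the `to_fact` function, and the `("birthday", person, date)` tuple are assumptions made for illustration only; a real system would need far broader coverage than two hand-written patterns.

```python
import re

# Hypothetical patterns: each surface form is mapped to the same fact.
PATTERNS = [
    re.compile(r"^(?P<person>\w+) was born on (?P<date>\w+ \d{1,2})\.?$"),
    re.compile(r"^(?P<person>\w+)['’]s birthday is (?P<date>\w+ \d{1,2})\.?$"),
]

def to_fact(sentence):
    """Return ("birthday", person, date), or None if no pattern matches."""
    for pattern in PATTERNS:
        match = pattern.match(sentence.strip())
        if match:
            return ("birthday", match.group("person"), match.group("date"))
    return None

print(to_fact("Mary was born on October 11."))
print(to_fact("Mary’s birthday is October 11."))
```

Both calls produce the same tuple, which is the point of converting text into a representation: different wordings of one fact should end up in one form.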
Levels of Analysis for Natural Language
Language is a complicated phenomenon. To manage this complexity, linguists have defined different levels of analysis for natural language.

  1. Prosody:
  • Deals with the rhythm and intonation of language. This level of analysis is difficult to formalize and is often neglected.

2. Phonology:

  • Examines the sounds that are combined to form language; this branch of linguistics is important for computerized speech recognition and generation.

3. Morphology:

  • Concerned with the components (morphemes) that make up words.
  • These include the rules governing the formation of words, such as the effect of prefixes (un-, non-, anti-, etc.) and suffixes (-ing, -ly, etc.) that modify the meaning of root words.
  • Morphological analysis is important in determining the role of a word in a sentence, including its tense, number, and part of speech.

4. Syntax:

  • Studies the rules for combining words into legal phrases and sentences, and the use of those rules to parse and generate sentences.

5. Semantics:

  • Considers the meaning of words, phrases and sentences and the ways in which meaning is conveyed in natural language expressions.

6. Pragmatics:

  • Is the study of the ways in which language is used and of its effects on the listener.

7. World Knowledge:

  • Includes knowledge of the physical world, i.e. the domain of concern.
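
As a rough illustration of the morphology level described above, the following Python sketch splits a word into prefix, root, and suffix. The affix lists are taken from the examples in the text; the function name `split_affixes` and the minimum-root-length check are assumptions. This is not a real morphological analyzer: without a lexicon it will wrongly split words such as "uncle".

```python
# Affixes from the examples in the text; a real analyzer would use many more.
PREFIXES = ("anti", "non", "un")
SUFFIXES = ("ing", "ly")

def split_affixes(word):
    """Split a word into (prefix, root, suffix) using the small lists above."""
    word = word.lower()
    # Only strip an affix if at least three characters remain (an assumption).
    prefix = next((p for p in PREFIXES
                   if word.startswith(p) and len(word) > len(p) + 2), "")
    rest = word[len(prefix):]
    suffix = next((s for s in SUFFIXES
                   if rest.endswith(s) and len(rest) > len(s) + 2), "")
    root = rest[:len(rest) - len(suffix)] if suffix else rest
    return prefix, root, suffix

for example in ("unkindly", "walking", "nonstop"):
    print(example, split_affixes(example))
```

Knowing that "walking" carries the suffix "-ing" is one clue to its tense and part of speech, which is the kind of information later levels rely on.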

Steps In Natural Language Processing

  1. Morphological Analysis:
  • Individual words are analyzed, and non-word tokens such as punctuation are separated from the words.

2. Syntactic Analysis:

  • A linear sequence of words is transformed into structures that show how the words relate to each other. Some word sequences may be rejected if they violate the language’s rules for how words may be combined.

3. Semantic Analysis:

  • The structures created by the syntactic analyzer are assigned meanings.
  • A syntactically valid sentence may be semantically invalid, e.g. Tigers eat grass.

4. Discourse Integration:

  • The meaning of an individual sentence may depend on the sentences that precede it and may influence the meanings of the sentences that follow it. E.g. John wanted it. (What "it" refers to depends on the preceding sentences.)

5. Pragmatic Analysis:

  • The structure representing what was said is reinterpreted to determine what was actually meant. E.g. "Do you know what time it is?" should be interpreted as a request to be told the time.
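
The first two steps above can be sketched in Python. The sketch separates punctuation from words, then tries to build a structure for the word sequence using a toy grammar (S → NP V [NP], NP → [Det] N). The lexicon, the grammar, and the function names are all assumptions chosen for illustration, and the semantic, discourse, and pragmatic steps are not covered.

```python
import re

# Hypothetical toy lexicon: word -> part of speech.
LEXICON = {"the": "Det", "a": "Det", "tigers": "N", "grass": "N",
           "dogs": "N", "lawn": "N", "eat": "V", "chase": "V"}

def tokenize(text):
    """Step 1: separate word tokens from non-word tokens (punctuation)."""
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    words = [t for t in tokens if re.fullmatch(r"\w+", t)]
    punctuation = [t for t in tokens if not re.fullmatch(r"\w+", t)]
    return words, punctuation

def parse_np(tags, i):
    """NP -> Det N | N. Return (tree, next_index), or None on failure."""
    if i < len(tags) and tags[i][1] == "Det":
        if i + 1 < len(tags) and tags[i + 1][1] == "N":
            return ("NP", tags[i], tags[i + 1]), i + 2
        return None
    if i < len(tags) and tags[i][1] == "N":
        return ("NP", tags[i]), i + 1
    return None

def parse_sentence(words):
    """Step 2: S -> NP V [NP]. Return a tree, or None if rejected."""
    if any(w not in LEXICON for w in words):
        return None
    tags = [(w, LEXICON[w]) for w in words]
    subject = parse_np(tags, 0)
    if subject is None:
        return None
    np_tree, i = subject
    if i >= len(tags) or tags[i][1] != "V":
        return None
    verb = tags[i]
    i += 1
    if i == len(tags):                      # intransitive use: no object
        return ("S", np_tree, ("VP", verb))
    obj = parse_np(tags, i)
    if obj is None or obj[1] != len(tags):  # leftover words: reject
        return None
    return ("S", np_tree, ("VP", verb, obj[0]))

words, punctuation = tokenize("Tigers eat grass.")
print(words, punctuation)
print(parse_sentence(words))
print(parse_sentence(["eat", "tigers", "grass"]))
```

Note that "Tigers eat grass" parses successfully while "eat tigers grass" is rejected. The parser checks only how words combine, so judging whether an accepted sentence makes sense is left to semantic analysis.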

Applications of NLP

  • Machine Translation – translation between two natural languages
  • Information retrieval – web search
  • Query Answering/Dialogue – natural language interface with a database system or a dialogue system.
  • Report generation – Generation of reports such as weather reports
  • Smaller applications – grammar checking, spell checking, etc.

 
