Improve accuracy of Parts of Speech tagger for Gujarati language


  • Sirajuddin Y. Hala Department of Computer Engineering, VVP Engineering College, Rajkot
  • Sagar H. Virani Asst. Prof., Computer Engineering Department, VVP Engineering College, Rajkot


Keywords: Parts of speech tagging, Natural Language Processing, Gujarati


Parts-of-Speech (POS) tagging is the task of annotating every word in an input sentence with its lexical
category. It is widely used for linguistic analysis of text and is an essential preprocessing step for natural
language processing tasks. A POS tagger takes a sentence from the input data and assigns a single part-of-speech
tag to each lexical item in the sentence. POS taggers have been developed for many languages as part of Natural
Language Processing, and taggers are now available for Indian languages such as Hindi, Bengali, Tamil, Telugu,
Malayalam, and Panjabi. Gujarati is a resource-poor Indo-Aryan language on which very little language processing
work has been done, and very few tools are available for processing it. The main concerns in developing a POS
tagger for any language are improving tagging accuracy and resolving the ambiguity that arises in sentences from
the structure of the language. This work focuses on developing a Parts-of-Speech tagger for the Gujarati language.
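
As a rough illustration of what "assigning one tag to each lexical item" means, the sketch below tags a short Gujarati sentence by dictionary lookup. It is not the tagger developed in this work: the lexicon entries, the tag names (PRON, NOUN, VERB, AUX, UNK), and the function name tag_sentence are all assumptions made for the example. A plain lookup like this cannot resolve ambiguous words, which is the problem a real tagger has to address.

```python
# Toy lexicon: hypothetical entries and tag names chosen for illustration,
# not the tag set or data used in the paper.
TOY_LEXICON = {
    "હું": "PRON",   # "I"
    "ઘરે": "NOUN",   # "home" (locative form)
    "જાઉં": "VERB",  # "go"
    "છું": "AUX",    # "am"
}


def tag_sentence(sentence, lexicon=TOY_LEXICON, unknown_tag="UNK"):
    """Return (token, tag) pairs, one tag per whitespace-separated token.

    Tokens missing from the lexicon receive `unknown_tag`.
    """
    return [(token, lexicon.get(token, unknown_tag)) for token in sentence.split()]


if __name__ == "__main__":
    for token, tag in tag_sentence("હું ઘરે જાઉં છું"):
        print(token, tag)
```

Each token receives exactly one tag, and out-of-lexicon tokens fall back to UNK, which shows why a small lexicon alone is not enough for a resource-poor language.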



