Unsupervised learning of probabilistic context-free grammar using iterative biclustering

Kewei Tu, Vasant Honavar

Research output: Chapter in Book/Report/Conference proceedingConference contribution

9 Scopus citations

Abstract

This paper presents PCFG-BCL, an unsupervised algorithm that learns a probabilistic context-free grammar (PCFG) from positive samples. The algorithm acquires rules of an unknown PCFG through iterative biclustering of bigrams in the training corpus. Our analysis shows that this procedure uses a greedy approach to adding rules such that each set of rules that is added to the grammar results in the largest increase in the posterior of the grammar given the training corpus. Results of our experiments on several benchmark datasets show that PCFG-BCL is competitive with existing methods for unsupervised CFG learning.

Original languageEnglish (US)
Title of host publicationGrammatical Inference
Subtitle of host publicationAlgorithms and Applications - 9th International Colloquium, ICGI 2008, Proceedings
Pages224-237
Number of pages14
DOIs
StatePublished - 2008
Event9th International Colloquium on Grammatical Inference, ICGI 2008 - Saint-Malo, France
Duration: Sep 22 2008Sep 24 2008

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume5278 LNAI
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other9th International Colloquium on Grammatical Inference, ICGI 2008
Country/TerritoryFrance
CitySaint-Malo
Period9/22/089/24/08

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'Unsupervised learning of probabilistic context-free grammar using iterative biclustering'. Together they form a unique fingerprint.

Cite this