Graph-based NLP

From Language and Information Technologies


The goal of this research project is to investigate efficient graph-based representations of text and to explore how ranking models built on such graph structures can be applied to natural language processing tasks. We bring together methods from computational linguistics and graph theory and combine them into a suite of approaches intended to improve on existing solutions to difficult problems in natural language processing. Specifically, we are currently working on the application of graph centrality algorithms to problems such as word sense disambiguation, text summarization, and keyword extraction.
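
As a rough illustration of what applying a graph centrality algorithm to keyword extraction can look like, the sketch below builds an undirected word co-occurrence graph and ranks its nodes with a PageRank-style iteration. It is a minimal sketch under stated assumptions, not the project's code: the function names, the window size of 2, the damping factor of 0.85, the fixed iteration count, and the pre-tokenized input are all illustrative choices.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Link two distinct words if they occur within `window` positions of each other.

    Hypothetical helper: the window size is an assumption made for illustration.
    """
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Iterate score(v) = (1 - d) + d * sum(score(u) / degree(u)) over neighbors u of v."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new_scores = {}
        for node in graph:
            # Every node in the graph has at least one neighbor by construction,
            # so the division by len(graph[nb]) is always defined.
            rank = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            new_scores[node] = (1 - damping) + damping * rank
        scores = new_scores
    return scores


def top_keywords(tokens, k=3, window=2):
    """Return the k highest-scoring words, breaking ties alphabetically."""
    scores = pagerank(build_cooccurrence_graph(tokens, window))
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


tokens = "graph ranking text graph centrality text graph summarization".split()
print(top_keywords(tokens, k=2))
```

Words that co-occur with many other words accumulate higher scores, which is the intuition behind using centrality as a proxy for importance. A fuller system would typically add steps this sketch omits, such as part-of-speech filtering and a convergence threshold.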


Funded by the Texas Advanced Research Program


People

Recent Publications

  • Dragomir Radev and Rada Mihalcea, Networks and Natural Language Processing, submitted to the Artificial Intelligence Magazine, 2008.
  • Samer Hassan, Rada Mihalcea and Carmen Banea, Random-Walk Term Weighting for Improved Text Classification, in International Journal of Semantic Computing, December 2007.
  • Samer Hassan, Rada Mihalcea and Carmen Banea, Random-Walk Term Weighting for Improved Text Classification, in Proceedings of the IEEE International Conference on Semantic Computing (ICSC 2007), Irvine, CA, September 2007. pdf (best student paper award)
  • Ravi Sinha and Rada Mihalcea, Unsupervised Graph-based Word Sense Disambiguation Using Measures of Word Semantic Similarity, in Proceedings of the IEEE International Conference on Semantic Computing (ICSC 2007), Irvine, CA, September 2007. pdf
  • Rada Mihalcea and Hakan Ceylan, Explorations in Automatic Book Summarization, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2007), Prague, June 2007. pdf
  • Samer Hassan, Andras Csomai, Carmen Banea, Ravi Sinha and Rada Mihalcea, UNT: SubFinder: Combining Knowledge Sources for Automatic Lexical Substitution, in Proceedings of the 4th International Workshop on the Semantic Evaluations (SemEval 2007), Prague, Czech Republic, June 2007. pdf
  • Chris Biemann, Irina Matveeva, Rada Mihalcea and Dragomir Radev, Proceedings of the NAACL Workshop on Graph-based Algorithms for Natural Language Processing (TextGraphs 2007), Rochester, April 2007. pdf
  • Rada Mihalcea and Dragomir Radev, Proceedings of the NAACL Workshop on Graph-based Algorithms for Natural Language Processing, New York City, June 2006. pdf
  • Rada Mihalcea and Samer Hassan, Text summarization for improved text classification, book chapter in Current Issues in Linguistic Theory: Recent Advances in Natural Language Processing, Editors Nicolas Nicolov and Ruslan Mitkov, John Benjamins Publishers, 2006.

Downloads

  • GWSD (Unsupervised Graph-based Word Sense Disambiguation) is a system for unsupervised, graph-based, all-words word sense disambiguation.

Previous Publications

  • Rada Mihalcea, Unsupervised Large-Vocabulary Word Sense Disambiguation with Graph-based Algorithms for Sequence Data Labeling, in Proceedings of the Joint Conference on Human Language Technology / Empirical Methods in Natural Language Processing (HLT/EMNLP), Vancouver, October, 2005. pdf
  • Rada Mihalcea and Paul Tarau, An Algorithm for Language Independent Single and Multiple Document Summarization, in Proceedings of the International Joint Conference on Natural Language Processing (IJCNLP), Korea, October 2005. pdf
  • Rada Mihalcea and Samer Hassan, Using the Essence of Texts to Improve Document Classification, in Proceedings of the Conference on Recent Advances in Natural Language Processing (RANLP), Borovetz, Bulgaria, September 2005. pdf
  • Rada Mihalcea, Language Independent Extractive Summarization, in Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, companion volume (ACL 2005), Ann Arbor, MI, June 2005 (demo).
  • Paul Tarau, Rada Mihalcea and Elizabeth Figa, Semantic Document Engineering with WordNet and PageRank, in Proceedings of the ACM Conference on Applied Computing (ACM-SAC 2005), New Mexico, March 2005.
  • Rada Mihalcea, Paul Tarau and Elizabeth Figa, PageRank on Semantic Networks, with application to Word Sense Disambiguation, in Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, Switzerland, August 2004. pdf
  • Rada Mihalcea and Paul Tarau, TextRank: Bringing Order into Texts, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2004), Barcelona, Spain, July 2004. pdf
  • Rada Mihalcea, Graph-based Ranking Algorithms for Sentence Extraction, Applied to Text Summarization, in Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, companion volume (ACL 2004), Barcelona, Spain, July 2004. pdf