Finding important information in unstructured text

From Language and Information Technologies


The vast majority of the information we deal with in everyday life is raw, unstructured text, in which the most important facts and concepts are not readily available but are buried among the many details that accompany them. Handling the sheer volume of information we are exposed to requires more sophisticated procedures that surface the important parts of a text and let us process more information in less time. The goal of this project is to develop robust and accurate techniques for automatically extracting important information from unstructured text, in the form of either keyphrases (keyphrase extraction) or entire sentences (extractive summarization).


Funded by Google

People

Students

Publications

  • Andras Csomai and Rada Mihalcea, Linking Educational Materials to Encyclopedic Knowledge, in Proceedings of the International Conference on Artificial Intelligence in Education (AIED 2007), Los Angeles, CA, July 2007. pdf
  • Rada Mihalcea and Hakan Ceylan, Explorations in Automatic Book Summarization, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2007), Prague, June 2007. pdf
  • Andras Csomai and Rada Mihalcea, Investigations in Unsupervised Back-of-the-Book Indexing, in Proceedings of the Florida Artificial Intelligence Research Society (FLAIRS 2007), Key West, May 2007. Best paper in track. pdf
  • Andras Csomai and Rada Mihalcea, Creating a Testbed for the Evaluation of Automatically Generated Back-of-the-book Indexes, in Proceedings of the Conference on Computational Linguistics and Intelligent Text Processing (CICLing), LNCS, Mexico City, February 2006. pdf data