The Benchmark of Paragraph and Sentence Extraction Summaries on Outlier Document Filtering Applied Multi-Document Summarizer

Authors

  • M. Turan, Yıldız Technical University
  • C. Sönmez, İstanbul Technical University
  • M. C. Ganiz, Doğuş University

DOI:

https://doi.org/10.5755/j01.itc.43.4.7010

Keywords:

Document Processing, Extractive Summarization, Outlier Detection, Similarity Measure

Abstract

We studied outlier document filtering (ODF) for extractive sentence summarization. On the DUC 2006 data set, our results exceed the average of the participating systems. We then added extractive paragraph summarization to the same system. Surprisingly, the ROUGE scores of the two approaches are nearly the same: extractive paragraph summarization achieves better precision, whereas extractive sentence summarization performs slightly better on recall and on F-score, the harmonic mean of recall and precision. ODF is effective for both extractive sentence and extractive paragraph summarization. The similarity metric suggested in the article (match percent) prevents longer sentences or paragraphs from dominating shorter ones during selection. As a result, ODF offers the flexibility to extract paragraphs instead of sentences, for simplicity, readability, and a lower workload.
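
To make the precision, recall, and F-score comparison concrete, the following is a minimal sketch of a unigram (ROUGE-1 style) overlap score. It is an illustration only and not the official ROUGE toolkit used for DUC evaluation: the function name `rouge_1`, the lowercase whitespace tokenization, and the example sentences are all assumptions made here, and the article's own match percent metric is not reproduced.

```python
from collections import Counter
from typing import Tuple


def rouge_1(candidate: str, reference: str) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) of unigram overlap.

    Simplified sketch: tokens are lowercase whitespace-separated words,
    with no stemming or stop-word removal (assumptions, not the paper's setup).
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it occurs in both texts.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0

    # F-score is the harmonic mean of precision and recall.
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


if __name__ == "__main__":
    p, r, f = rouge_1("the cat sat", "the cat sat on the mat")
    print(f"precision={p:.3f} recall={r:.3f} f={f:.3f}")
```

In this toy case every candidate word appears in the reference, so precision is 1.0, while the candidate covers only three of the six reference words, so recall is 0.5. The harmonic mean sits closer to the lower of the two values, which is why a gain in precision alone does not guarantee a higher F-score.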

Published

2014-12-16

Section

Articles