The Benchmark of Paragraph and Sentence Extraction Summaries on Outlier Document Filtering Applied Multi-Document Summarizer
Keywords: Document Processing, Extractive Summarization, Outlier Detection, Similarity Measure
Abstract: We studied outlier document filtering (ODF) for extractive sentence summarization. On the DUC 2006 data, our results exceed the average of the participant systems. We then added extractive paragraph summarization to the same system. Surprisingly, the ROUGE results are nearly the same. Extractive paragraph summarization performs better on precision, whereas extractive sentence summarization performs slightly better on recall and on the F-score, which is the harmonic mean of recall and precision. ODF is effective for both extractive sentence and paragraph summarization. The similarity metric (match percent) proposed in the article prevents longer sentences or paragraphs from dominating shorter ones during selection. As a result, ODF offers the flexibility of extracting paragraphs instead of sentences, for simplicity, readability, and a lower workload.
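To make the relationship between precision, recall, and the F-score concrete, the following is a minimal sketch of a unigram-overlap score in the spirit of ROUGE-1. It is a simplified illustration only, not the official ROUGE toolkit and not the article's match percent metric; the function name `overlap_scores` and the whitespace tokenization are assumptions made for this example.

```python
from collections import Counter


def overlap_scores(candidate: str, reference: str) -> tuple[float, float, float]:
    """Return (precision, recall, f_score) from clipped unigram overlap.

    Hypothetical helper for illustration: tokens are lowercased and split on
    whitespace, which is far cruder than real ROUGE preprocessing.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Multiset intersection keeps the minimum count of each shared token.
    overlap = sum((cand_counts & ref_counts).values())

    precision = overlap / sum(cand_counts.values()) if cand_counts else 0.0
    recall = overlap / sum(ref_counts.values()) if ref_counts else 0.0

    # F-score: harmonic mean of precision and recall (0 when both are 0).
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


if __name__ == "__main__":
    p, r, f = overlap_scores("the cat sat", "the cat sat on the mat")
    print(f"precision={p:.3f} recall={r:.3f} f={f:.3f}")
```

In this toy case the short candidate contains only words found in the reference, so its precision is high, but it covers only half of the reference, so its recall is lower, and the harmonic mean falls between the two, closer to the smaller value.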
Copyright terms are indicated in the Republic of Lithuania Law on Copyright and Related Rights, Articles 4-37.