The Role of Discourse Units in Near-Extractive Summarization

Publication Type: Conference Paper
Year of Publication: 2016
Authors: Li, J. Jessy, K. Thadani, and A. Stent
Conference Name: Proceedings of SIGDIAL

Although human-written summaries of documents tend to involve significant edits to the source text, most automated summarizers are extractive and select sentences verbatim. In this work we examine how elementary discourse units (EDUs) from Rhetorical Structure Theory can be used to extend extractive summarizers to produce a wider range of human-like summaries. Our analysis demonstrates that EDU segmentation is effective in preserving human-labeled summarization concepts within sentences and also aligns with near-extractive summaries constructed by news editors. Finally, we show that using EDUs as units of content selection instead of sentences leads to stronger summarization performance in near-extractive scenarios, especially under tight budgets.

