Automatic headline generation is a sub-task of document summarization with many reported applications. In this study we present a sequence-prediction technique for learning how editors title their news stories. The technique models the problem as a discrete optimization task in a feature-rich space, in which the global optimum can be found in polynomial time by means of dynamic programming. We train and test our model on an extensive corpus of financial news and compare it against a number of baselines, using standard metrics from the document summarization domain as well as some new ones proposed in this work. We also assess the readability and informativeness of the generated titles through human evaluation. The results are encouraging and support the soundness of the approach.
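To make the dynamic-programming claim concrete, here is a minimal sketch of how a globally optimal word sequence can be found in polynomial time. It is not the model from the paper: the function `best_headline`, the scoring functions `unigram` and `bigram`, and the toy weights are all hypothetical, and the paper's abstract feature space is replaced here by simple per-word and word-pair scores. The sketch selects an order-preserving subsequence of the document tokens of a fixed length that maximizes the total score.

```python
from typing import Callable, List, Tuple


def best_headline(
    tokens: List[str],
    length: int,
    unigram: Callable[[str], float],
    bigram: Callable[[str, str], float],
) -> Tuple[float, List[str]]:
    """Return the highest-scoring subsequence of `tokens` with `length` words.

    Score = sum of unigram scores of the chosen words
          + sum of bigram scores of adjacent chosen words.
    Both scoring functions are illustrative assumptions, not the paper's features.
    Runs in O(length * n^2) time for n tokens.
    """
    n = len(tokens)
    if length <= 0 or length > n:
        raise ValueError("length must be between 1 and len(tokens)")

    neg_inf = float("-inf")
    # best[j][i]: best score of a (j + 1)-word subsequence ending at token i.
    best = [[neg_inf] * n for _ in range(length)]
    # back[j][i]: index of the previous chosen token on that best path.
    back = [[-1] * n for _ in range(length)]

    for i in range(n):
        best[0][i] = unigram(tokens[i])

    for j in range(1, length):
        for i in range(j, n):
            for p in range(j - 1, i):
                cand = best[j - 1][p] + bigram(tokens[p], tokens[i]) + unigram(tokens[i])
                if cand > best[j][i]:
                    best[j][i] = cand
                    back[j][i] = p

    # Pick the best final position, then follow back-pointers to recover the words.
    end = max(range(length - 1, n), key=lambda i: best[length - 1][i])
    score = best[length - 1][end]
    path = []
    i = end
    for j in range(length - 1, -1, -1):
        path.append(i)
        i = back[j][i]
    path.reverse()
    return score, [tokens[k] for k in path]


# Toy example with made-up weights (purely illustrative).
WORD_SCORE = {"stocks": 2.0, "fell": 1.5, "sharply": 0.5, "on": -1.0, "monday": 1.0}
PAIR_SCORE = {("stocks", "fell"): 1.0, ("fell", "monday"): 0.5}

score, headline = best_headline(
    ["stocks", "fell", "sharply", "on", "monday"],
    3,
    lambda w: WORD_SCORE.get(w, 0.0),
    lambda a, b: PAIR_SCORE.get((a, b), 0.0),
)
print(score, headline)
```

Because the score decomposes over adjacent pairs of chosen words, the table only needs to remember the last chosen position, which is what keeps the search polynomial rather than exponential in the document length.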
HEADS: Headline Generation as Sequence Prediction Using an Abstract Feature-Rich Space / Colmenares, Carlos A.; Litvak, Marina; Mantrach, Amin; Silvestri, Fabrizio. - (2015), pp. 133-142. (Paper presented at the NAACL 2015 conference, held in Denver, Colorado) [10.3115/v1/N15-1014].
HEADS: Headline Generation as Sequence Prediction Using an Abstract Feature-Rich Space
Fabrizio Silvestri
2015