Domain-oriented two-stage aggregation: generating baseball play-by-play narratives
Published date
2015
Language
eng
File type
application/pdf
Extent
6 pages
Citation
7th International Conference on Knowledge and Smart Technology (KST 2015), 42-47
Abstract
This paper presents an end-to-end natural language
generation system that performs aggregation in two stages. The
first stage uses the information implicit in the source
knowledge base to aggregate event components into
complex sentences. The second stage examines the developing
context of the text to aggregate similar adjacent events
into more fluent text. The source knowledge base is the
Retrosheet collection of play-by-play baseball scoresheets
encoded in machine-readable form. The output consists of reasonably
fluent, natural, human-readable play-by-play narratives of
historical baseball games. The system was tested on all
regular-season major league games played from 1950 to 1969,
taking less than a second to produce three to five pages of text for
each game. The aggregation yielded a substantial
improvement in native-speaker judgments of fluency and
readability.
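
The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Play` and `Clause` records, the function names, the merging rule, and the player names are all hypothetical, and the input is a simplified stand-in rather than the actual Retrosheet encoding. Stage one folds the components of a single play (the batter's outcome plus any runner advances) into one complex clause; stage two looks at the text built so far and merges adjacent clauses that share the same simple predicate.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Play:
    # Hypothetical, simplified record; NOT the Retrosheet event format.
    batter: str
    outcome: str  # e.g. "struck out", "singled to left"
    advances: List[Tuple[str, str]] = field(default_factory=list)  # (runner, result)


@dataclass
class Clause:
    subjects: List[str]
    predicate: str
    mergeable: bool  # True only when the play had no extra components


def join_phrases(items: List[str]) -> str:
    """Join phrases as 'A', 'A and B', or 'A, B, and C'."""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def stage_one(play: Play) -> Clause:
    """Stage 1 (assumed rule): fold one play's components into a single clause."""
    if not play.advances:
        return Clause([play.batter], play.outcome, mergeable=True)
    moves = join_phrases([f"{runner} {result}" for runner, result in play.advances])
    return Clause([play.batter], f"{play.outcome} as {moves}", mergeable=False)


def stage_two(clauses: List[Clause]) -> List[Clause]:
    """Stage 2 (assumed rule): merge adjacent simple clauses with the same predicate."""
    merged: List[Clause] = []
    for clause in clauses:
        previous = merged[-1] if merged else None
        if (previous is not None and previous.mergeable and clause.mergeable
                and previous.predicate == clause.predicate):
            previous.subjects.extend(clause.subjects)
        else:
            # Copy the subject list so the input clauses are never mutated.
            merged.append(Clause(list(clause.subjects), clause.predicate, clause.mergeable))
    return merged


def narrate(plays: List[Play]) -> str:
    """Run both stages and realize each clause as a sentence."""
    clauses = stage_two([stage_one(play) for play in plays])
    return " ".join(f"{join_phrases(c.subjects)} {c.predicate}." for c in clauses)


# Invented sample data for illustration only.
sample = [
    Play("Adams", "struck out"),
    Play("Baker", "struck out"),
    Play("Clark", "singled to left", [("Davis", "scored")]),
]
print(narrate(sample))
# Adams and Baker struck out. Clark singled to left as Davis scored.
```

Without stage two, the same input would read "Adams struck out. Baker struck out. ...", which is the kind of repetitive text that aggregating adjacent events is meant to avoid.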