Domain-oriented two-stage aggregation: generating baseball play-by-play narratives
Baldwin, James; Songsak Channarukul
2018-06-28T02:44:27Z 2018-06-28T02:44:27Z 2015
dc.description.abstract This paper presents an end-to-end natural language generation system that performs aggregation in two stages. The first stage uses the information implicit in the source knowledge base to aggregate event components into complex sentences. The second stage examines the developing context of the text to aggregate similar adjacent events into more fluent text. The source knowledge base is the Retrosheet collection of play-by-play baseball scoresheets, encoded in machine-readable form. The output is a reasonably fluent and natural human-readable play-by-play narrative of each historical baseball game. The system was tested against all regular-season major league games played from 1950 to 1969, taking less than a second to produce three to five pages of text for each game. The aggregation produced a substantial improvement in native-speaker judgments of fluency and readability. en_US
dc.format.extent 6 pages en_US
dc.format.mimetype application/pdf en_US
dc.identifier.citation 7th International Conference on Knowledge and Smart Technology (KST 2015), 42-47 en_US
dc.language.iso eng en_US
dc.rights.holder Baldwin, James en_US
dc.rights.holder Songsak Channarukul en_US
dc.subject Natural language generation en_US
dc.subject Aggregation en_US
dc.title Domain-oriented two-stage aggregation: generating baseball play-by-play narratives en_US
dc.type Text en_US
mods.genre Conference Paper en_US
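The two-stage aggregation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The `Play` record, its fields, the function names, and the player names are all hypothetical and invented for illustration, and the record is a simplification that does not follow the actual Retrosheet encoding. Stage one folds the components of a single play into one clause, and stage two merges adjacent clauses that share the same predicate.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Play:
    """Hypothetical, simplified play record (not the Retrosheet encoding)."""
    batter: str
    outcome: str                                        # e.g. "singled"
    advances: List[str] = field(default_factory=list)   # e.g. "Smith scored"


def join_phrases(items: List[str]) -> str:
    """Join phrases as 'A', 'A and B', or 'A, B, and C'."""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def stage_one(play: Play) -> Tuple[List[str], str]:
    """Stage 1: combine the components of one play into a single clause,
    returned as (subjects, predicate)."""
    predicate = play.outcome
    if play.advances:
        predicate += " as " + join_phrases(play.advances)
    return [play.batter], predicate


def stage_two(clauses: List[Tuple[List[str], str]]) -> List[str]:
    """Stage 2: merge adjacent clauses with the same predicate
    into one sentence with a compound subject."""
    merged: List[Tuple[List[str], str]] = []
    for subjects, predicate in clauses:
        if merged and merged[-1][1] == predicate:
            merged[-1][0].extend(subjects)
        else:
            merged.append((list(subjects), predicate))
    return [f"{join_phrases(subjects)} {predicate}." for subjects, predicate in merged]


def narrate(plays: List[Play]) -> List[str]:
    """Run both stages over a sequence of plays."""
    return stage_two([stage_one(play) for play in plays])
```

Under these assumptions, two consecutive strikeouts would be reported as one sentence with a compound subject, while a hit with a runner advance would stay a single complex sentence built in stage one.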
Excerpt bundle
712.01 KB
Adobe Portable Document Format