A dataset containing sentences from public domain sources.
The Four Battles book was proofread by Niko Partanen.
myv
Data frame with an Erzya book
- doc_id: name of the text in the original corpus
- sentence_id: sentence ID, unique within a text
- sentence: sentence text
...
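As a rough illustration of the column layout described above, the sketch below builds a few rows by hand and checks that sentence_id is unique within each doc_id. The row values are hypothetical placeholders, not actual corpus content, and the helper name `duplicated_ids` is an assumption made for this example rather than part of the dataset or any package.

```python
from collections import Counter

# Hypothetical placeholder rows that mimic the documented columns.
# These are NOT real sentences or document names from the corpus.
rows = [
    {"doc_id": "text_a", "sentence_id": 1, "sentence": "<sentence 1>"},
    {"doc_id": "text_a", "sentence_id": 2, "sentence": "<sentence 2>"},
    {"doc_id": "text_b", "sentence_id": 1, "sentence": "<sentence 1>"},
]


def duplicated_ids(rows):
    """Return sorted (doc_id, sentence_id) pairs that occur more than once."""
    counts = Counter((r["doc_id"], r["sentence_id"]) for r in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# An empty list means sentence_id is unique within every text.
print(duplicated_ids(rows))
```

The same sentence_id may appear in different texts, so the check uses the pair (doc_id, sentence_id) as the key rather than sentence_id alone.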
Source
http://urn.fi/URN:NBN:fi-fe2014082633380