Resolve "Naive Rule-based Sentence Segmenter"

Appledora requested to merge 4-create-naive-ssegmenter into main
  • Compiled a list of global sentence terminators, generated from Unicode properties
  • Implemented a simple, naive regex-based sentence splitter
  • Added golden-rule-based benchmarking for English, Arabic, German, and Spanish, with these scores:
EN (English) score: 47.92%
GR (German) score: 25.00%
SP (Spanish) score: 80.00%
AR (Arabic) score: 33.33%
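
For reviewers, here is a minimal sketch of what a naive regex-based splitter and a golden-rule score could look like. It is not the code in this merge request: the names `TERMINATORS`, `split_sentences`, and `golden_rule_score` are hypothetical, the terminator set is a small hand-picked sample of characters with the Unicode `Sentence_Terminal` property (the Python standard library does not expose that property, so the full list is not derived here), and the score is assumed to be the percentage of test cases whose output matches the expected segmentation exactly.

```python
import re

# Assumption: a hand-picked sample of Sentence_Terminal characters, not the
# full list this merge request generates from Unicode properties.
TERMINATORS = (
    ".!?"       # ASCII full stop, exclamation mark, question mark
    "\u061F"    # Arabic question mark
    "\u06D4"    # Arabic full stop
    "\u0964"    # Devanagari danda
    "\u3002"    # ideographic full stop
    "\uFF01"    # fullwidth exclamation mark
    "\uFF1F"    # fullwidth question mark
)

# A boundary is any run of whitespace that directly follows a terminator.
_BOUNDARY = re.compile(r"(?<=[" + re.escape(TERMINATORS) + r"])\s+")


def split_sentences(text):
    """Split text after every terminator that is followed by whitespace."""
    return [part for part in _BOUNDARY.split(text.strip()) if part]


def golden_rule_score(cases):
    """Return the percentage of (text, expected) cases segmented exactly."""
    passed = sum(1 for text, expected in cases if split_sentences(text) == expected)
    return 100.0 * passed / len(cases)


if __name__ == "__main__":
    cases = [
        ("Hello world. How are you?", ["Hello world.", "How are you?"]),
        # Abbreviations defeat the naive rule: it splits after "Dr."
        ("Dr. Smith arrived.", ["Dr. Smith arrived."]),
    ]
    print(split_sentences("Hello world. How are you? Fine!"))
    print(f"score: {golden_rule_score(cases):.2f}%")
```

The second case shows the kind of input (an abbreviation followed by a space) that a splitter with no exception list gets wrong, which is consistent with the modest scores reported above.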

Closes #4 (closed)

Edited by Appledora
