Resolve "Evaluation: update and modify the sentence evaluation code" (!12) · Merge requests · repos / research / Wiki NLP Tools

Appledora requested to merge 27-eval-bench into main Apr 04, 2023

Updated the old sentence benchmarking code to make it compatible with the current tokenizer implementation. Given a JSON file in the following format:

{
"en": ["sentence 1", "sentence 2", ..., "sentence 100"],
"de": ["sentence 1", "sentence 2", ..., "sentence 100"],
"bn": ["sentence 1", "sentence 2", ..., "sentence 100"],
...
}

the benchmarking code outputs a CSV file with the following columns:

<correct> <partially correct> <incorrect> <missing> <accuracy>
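As a rough illustration of the input and output shapes described above, here is a minimal sketch rather than the code in this MR. The helper names `load_sentences` and `write_results`, the extra `language` column, and the accuracy formula (correct divided by the total of the four counts) are all assumptions made for the example.

```python
import csv
import io
import json
from typing import Dict, List, TextIO

# Column names taken from the MR description.
COUNT_COLUMNS = ["correct", "partially correct", "incorrect", "missing"]
COLUMNS = COUNT_COLUMNS + ["accuracy"]


def load_sentences(text: str) -> Dict[str, List[str]]:
    """Parse the benchmark input: a JSON object mapping language code to a list of sentences."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object keyed by language code")
    for lang, sentences in data.items():
        if not isinstance(sentences, list) or not all(isinstance(s, str) for s in sentences):
            raise ValueError(f"expected a list of strings for language {lang!r}")
    return data


def write_results(counts_by_lang: Dict[str, Dict[str, int]], out: TextIO) -> None:
    """Write one CSV row per language (the 'language' column is an assumption)."""
    writer = csv.DictWriter(out, fieldnames=["language"] + COLUMNS)
    writer.writeheader()
    for lang, counts in counts_by_lang.items():
        row = {name: counts.get(name, 0) for name in COUNT_COLUMNS}
        total = sum(row.values())
        # Assumed definition: accuracy = correct / all evaluated cases.
        accuracy = row["correct"] / total if total else 0.0
        writer.writerow({"language": lang, **row, "accuracy": f"{accuracy:.2f}"})
```

For example, `write_results({"en": {"correct": 3, "incorrect": 1}}, sys.stdout)` would emit a header line followed by one row for `en`.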

The code also generates a benchmarking log as a CSV file.

We can identify four types of errors:

type 1 (2-no-match): splits into two sentences, but neither matches the input sentences
type 2 (>2-one-match): splits into more than two sentences, with at least one of the sentences matching an input sentence
type 3 (>2-no-match): splits into more than two sentences, with none of the sentences matching an input sentence
type 4 (no-split): does not split into two sentences
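The four labels could be assigned along these lines. This is a hedged sketch, not the MR's implementation: it assumes each test case is a pair of input sentences that were joined and passed to the tokenizer, and that a "match" means an output segment equals one of the two inputs after stripping whitespace. The function name `classify_split` is hypothetical.

```python
from typing import Optional, Sequence


def classify_split(expected: Sequence[str], predicted: Sequence[str]) -> Optional[str]:
    """Return one of the four error labels, or None if the case is not one of them."""
    expected_set = {s.strip() for s in expected}
    # Ignore empty segments the tokenizer may emit (an assumption).
    segments = [s.strip() for s in predicted if s.strip()]
    matches = sum(1 for s in segments if s in expected_set)

    if len(segments) < 2:
        return "no-split"  # type 4
    if len(segments) == 2:
        # Type 1 only covers the case where neither segment matches;
        # anything else with two segments is outside the four error types.
        return "2-no-match" if matches == 0 else None
    # More than two segments: type 2 or type 3.
    return ">2-one-match" if matches >= 1 else ">2-no-match"
```

A call such as `classify_split(["A b.", "C d."], ["A b.", "C", "d."])` falls under type 2, since three segments come back and one of them equals an input sentence.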

Closes #27 (closed)

Edited Apr 10, 2023 by Appledora
