richardxoldman/detokenized-gec-datasets

Detokenized-GEC-Datasets

This repository contains the detokenized grammatical error correction (GEC) datasets described in the paper Adapting LLMs for Minimal-Edit Grammatical Error Correction.

We are not the authors of the original datasets. The detokenized versions are licensed under the terms of the original datasets, which originate from the following research works:

A New Dataset and Method for Automatically Grading ESOL Texts (Yannakoudakis et al., ACL 2011)

The CoNLL-2014 Shared Task on Grammatical Error Correction (Ng et al., CoNLL 2014)

JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction (Napoles et al., EACL 2017)

The BEA-2019 Shared Task on Grammatical Error Correction (Bryant et al., BEA 2019)
