
mp-access/Categorization-Experiments-Re

Overview

  • This repository contains the code used to evaluate which categorization approach performs best at categorizing student implementations.
  • To assess categorization quality, an ideal categorization of the "shirt-size" task was crafted. It is stored in the "sample-data/shirt-size/optimal-categories" folder.
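
As an illustration of how a produced categorization could be compared with an ideal one, here is a minimal sketch that computes the Rand index: the share of submission pairs on which both categorizations agree about "same category" versus "different category". The metric, the function name rand_index, and the dictionary layout (submission id mapped to category label) are assumptions made for this example; the repository may evaluate differently.

```python
from itertools import combinations


def rand_index(predicted: dict[str, str], ideal: dict[str, str]) -> float:
    """Fraction of submission pairs on which both categorizations agree.

    Both arguments map a (hypothetical) submission id to a category label.
    Only ids present in both mappings are compared.
    """
    ids = sorted(predicted.keys() & ideal.keys())
    pairs = list(combinations(ids, 2))
    if not pairs:
        return 1.0  # nothing to disagree about
    agreements = sum(
        (predicted[a] == predicted[b]) == (ideal[a] == ideal[b])
        for a, b in pairs
    )
    return agreements / len(pairs)
```

Category labels themselves do not need to match between the two mappings; only the grouping matters, so a categorization that uses different names for the same groups still scores 1.0.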

Code Categorization Approaches

  • The "approaches" folder lists all tested approaches. They are separated by approach type, currently "jaccard", "llm", and "tsed".
  • For each approach, there is one file that performs the offline clustering experiments and one that performs the online clustering experiments.
    • Within those files, the relevant environment variables are set, for example which task's data is used. Because evaluation is only possible for the "shirt-size" task (the only task with a ground-truth categorization), "shirt-size" is currently set everywhere.
    • Running these files performs the clustering and outputs the result (as well as the evaluation if the "shirt-size" task data was used).
      • The results are persisted in a file and stored in the "results" folder of the approach.
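
To make the difference between similarity-based offline and online clustering more concrete, the following is a minimal sketch of an online step built on Jaccard similarity over token sets. It is not the code from the "approaches" folder: the tokenizer, the threshold of 0.6, the choice of the first cluster member as representative, and the names tokens, jaccard, and assign_online are all assumptions for illustration.

```python
import re


def tokens(code: str) -> set[str]:
    """Split source code into a set of identifier, number, and symbol tokens."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def assign_online(clusters: list[list[str]], submission: str,
                  threshold: float = 0.6) -> int:
    """Place one new submission into the most similar existing cluster.

    Each cluster is represented by its first member (an assumption).
    If no representative reaches the threshold, a new cluster is opened.
    Returns the index of the cluster the submission ended up in.
    """
    new_tokens = tokens(submission)
    best_index, best_score = -1, threshold
    for index, members in enumerate(clusters):
        score = jaccard(new_tokens, tokens(members[0]))
        if score >= best_score:
            best_index, best_score = index, score
    if best_index == -1:
        clusters.append([submission])
        return len(clusters) - 1
    clusters[best_index].append(submission)
    return best_index
```

In this sketch, the online setting handles submissions one at a time as they arrive, whereas an offline run would have all submissions available up front and could compare every pair before forming clusters.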

Sample Data

  • This repository contains anonymized data from three old ACCESS tasks ("arithmetic-expression", "invert-dictionary", and "shirt-size"). The data is in the "sample-data" folder and can be used in the experiments.
  • For each task, one .json file contains all the submissions that students made, and another .json file contains only the first submissions students made.
