DaLA
Danish linguistic acceptability dataset of corrupted and non-corrupted sentences, published as roughly 8.68k examples with train, validation, and test splits plus a full-train split.
Open datasets released or maintained by OdenseNLP.
Danish-culture benchmark dataset based on the Danish Culture Canon, with 746 closed question-answer pairs for evaluating LLM cultural understanding.