PsychoSafe

Python

Pipeline for building and evaluating psychologically informed refusal behavior in large language models, covering data creation, prompting, fine-tuning, and evaluation.

Repository

PropMe

Python

Framework for evaluating memorization and propensity-aware memorization of training data in large language models.

Repository

DaLA

Python, Jupyter

Source code for creating the Danish Corpus of Linguistic Acceptability, a corpus for evaluating linguistic acceptability in Danish based on real-world errors.

Repository

bitlinear

Python

Production-ready implementation of 1.58-bit layers for quantization-aware training and efficient inference.

Repository

brainsurgery

Python

Swiss-army-knife-style toolkit for scripted tensor surgery on model checkpoints, with YAML plans, CLI workflows, and a Web UI.

Repository

DeToNATION

Python, Distributed Training

A communication framework that optimizes distributed AI training by reducing latency bottlenecks and handling heterogeneous clusters.

Repository

JustEval

Python

A streamlined framework for evaluating language generation quality in two steps: generation and automatic evaluation.

Repository

mldataforge

Python

Swiss-army-knife toolkit for transforming and processing machine learning datasets, including large-scale format conversion and splitting.

Repository

mlgroom

Python

Utility for grooming and managing machine learning job queues.

Repository

projectives

Python

AI-assisted analyser for projective techniques in qualitative health research.

Repository

SDU-Daisy

Python

Benchmark and tooling for evaluating LLM understanding of Danish culture using closed question-answer pairs grounded in the Danish Culture Canon.

Repository