Portfolio

A selection of projects that I've worked on over the years.

My full GitHub page: mjs227
Homepage

λ-GRPO

Code for the paper GRPO is Secretly a Process Reward Model. Implements λ-GRPO: a custom, PRM-sensitive GRPO variant informed by the theoretical results in the paper.

RandomWorld

Code for the paper Procedural Environment Generation for Tool-Use Agents. RandomWorld is a pipeline for the procedural generation of interactive tools and compositional tool-use environments for RL (and SFT) fine-tuning.

GFoLDS

Code for the paper Exploring Graph Representations of Logical Forms for Language Modeling (and also a large chunk of my dissertation). GFoLDS is a custom-built transformer architecture that takes graph representations of logical forms as input, which allows it to outperform comparably sized standard transformer models while using 6.5x less data.

AdversarialNLI

Code for the experiments I ran in the paper It is not True that Transformers are Inductive Learners: Probing NLI Models with External Negation (which was also my Master's capstone project).

Modified Adsorption (Python Implementation)

A Python implementation of the Modified Adsorption algorithm from Talukdar and Crammer (2009). Just for fun :)