M.I
About

Posts

  • May 27, 2025

    Sparse Autoencoders in Protein Engineering Campaigns: Steering and Model Diffing

  • Feb 15, 2025

    AISC: Probing for Deception Detection

  • Nov 22, 2024

    Interpretability Hackathon

  • Oct 30, 2024

    AI Policy Hackathon

  • Oct 3, 2024

    Mechanistic Exploration Gemma 2 List Generation

  • Sep 29, 2024

    AI Safety Fundamental Final Project

  • Mar 19, 2024

    AI Control with Mechanistic Interpretability

  • Mar 3, 2024

    Towards a Probabilistic Disentanglement of Transformer Activations Part 2

  • Feb 28, 2024

    Towards a Probabilistic Disentanglement of Transformer Activations Part 1

  • Oct 15, 2023

    Word Embeddings: A Comprehensive Guide Part 2

  • Sep 22, 2023

    Word Embeddings: A Comprehensive Guide Part 1

  • Aug 23, 2023

    Balanced Sentence Part 1

  • May 19, 2023

    Introduction to Mechanistic Interpretability

  • May 19, 2023

    My first post

  • gerard.boxo@estudiantat.upc.edu
  • gboxo
  • uadeoif

This blog is dedicated to discussing Mechanistic Interpretability.