
    AI Control with Mechanistic Interpretability

    I’m excited to share our recent paper on AI Control through Mechanistic Interpretability approaches.

    Gerard Boxó

    Researcher in AI Safety & AIxBio. MSc in Bioinformatics.

    • Barcelona

    Updated: March 19, 2024
