Publications Details

Publications / Conference Presentation

Arbitrary Autoencoder Injection for Interpretability Experimentation

Campos, Marco V.; Cauthen, Katherine R.; Krofcheck, Daniel J.; Naugle, Asmeret; Simpson, Sarah E.; Doyle, Casey L.; Sweitzer, Matthew D.; Xi, Michael

This presentation describes a simple Python software library for applying sparse autoencoders to a broader range of models than language models alone.
