Our research aims to provide insights into the neural sociology of LLMs by examining how different social preferences factorize within the model’s latent space. We seek to identify interpretable latent units…