Publications / LDRD Report

Assessing the nature of large language models: A caution against anthropocentrism

Generative AI models garnered a large amount of public attention and speculation with the release of OpenAI’s chatbot, ChatGPT, in November 2022. At least two opinion camps exist: one that is excited about the possibilities these models offer for fundamental changes to human tasks, and another that is highly concerned about the power these models seem to have, especially since the release of GPT-4, which was trained on multimodal data and reportedly has ~1.7 trillion (T) parameters. We evaluated some concerns regarding these models’ power by assessing GPT-3.5 using standard, normed, and validated cognitive and personality measures. These measures come from the tradition of psychometrics in experimental psychology and have a long history of providing valuable insights and predictive distinctions in humans. For this seedling project, we developed a battery of tests that allowed us to estimate the boundaries of some of these models’ capabilities, how stable those capabilities are over a short period of time, and how they compare to humans.