Publication Details
Assessing the nature of large language models: A caution against anthropocentrism
Generative AI models garnered considerable public attention and speculation with the release of OpenAI’s chatbot, ChatGPT, in November 2022. At least two opinion camps exist: one excited about the possibilities these models offer for fundamental changes to human tasks, and another highly concerned about the power these models seem to have, especially since the release of GPT-4, which was trained on multimodal data and reportedly has ~1.7 trillion parameters. We evaluated some concerns regarding these models’ power by assessing GPT-3.5 with standard, normed, and validated cognitive and personality measures. These measures come from the psychometric tradition in experimental psychology and have a long history of providing valuable insights and predictive distinctions in humans. For this seedling project, we developed a battery of tests that allowed us to estimate the boundaries of some of these models’ capabilities, how stable those capabilities are over a short period of time, and how they compare to humans.