Comparing the Cognitive Vulnerabilities of Human Pen-Testers to an LLM

Originally published by the High Confidence Software and Systems Conference

High Confidence Software and Systems Conference, Annapolis, MD, May 11-12, 2026

Author:

Dr. Hajime Inoue and Dr. Zak Fry

Poster:

Abstract:

Our work addresses HCSS’s focus area on “Security and Resilience Engineering”. AI-based tools are increasingly augmenting human efforts in red-teaming, vulnerability discovery, and threat modeling, as
they can work faster, continuously, and at less expense than human teams. Innovators in this space include performers in DARPA’s recent AIxCC (AI Cyber Challenge), which focused on vulnerability discovery
and mitigation, as well as commercial organizations.

Both human and AI-based cyber-attackers have weaknesses that system defenders can leverage. There has been considerable interest in exploiting the cognitive vulnerabilities of human attackers. Cognitive vulnerabilities are incorrect beliefs or biases that result in irrational or suboptimal behavior. For example, after an attacker has successfully accessed several independent systems in a row, they may believe they are more likely to succeed on the next system (i.e., “gambler’s fallacy”). IARPA has promoted research in this area – GrammaTech was a performer on the recently concluded ReSCIND program (Reimagining Security with Cyberpsychology-Informed Network Defenses) that studied these effects.

As AI-based cyber attackers become more common and adept, it is important to investigate whether they are also prone to cognitive vulnerabilities. This information impacts how cyberpsychology-informed defenses should be designed and deployed. However, this issue has been little studied, to date.

We present our study our study of the cognitive vulnerabilities of human penetration testers and compare our results to an AI-based agent for the same task. We explore several cognitive vulnerabilities: base rate neglect, law of small numbers, gambler’s fallacy, hot hand, framing effect, endowment effect, sunk cost fallacy, near miss effect, hot stove effect, cognitive load effect, anchoring effect, default/distinctiveness effect, and mere exposure effect. We find differences between humans and the LLM-based agent in terms of bias susceptibility as well as hacking success. For example, the AI agent appeared much more susceptible to base rate neglect than the human participants.

We conclude that using the same methods to evaluate humans and automated tools may not present the full picture when it comes to real-world efficacy. Future work should investigate LLM-specific concerns, as these technologies are increasingly used in offensive and defensive deployments. For example, the AI agent did not perceive web-based activities as human pen-testers did – it parsed the entire source code of a page, including JavaScript, as the page was loaded. Thus, defenses that relied on monitoring access to “” elements worked well against humans but not against the AI agent. Further, the AI agent has no concept of time and runs on time scales different than humans. A “hot stove” element (e.g., a harshly worded popup triggered on a specific attacker action intended to divert their current attack vector), was ineffective against the agent both because it parsed the warning when a page was loaded instead of when triggered, and it operated at speeds much faster than human pen-testers. These differences suggest that defenses tuned to human attackers and their cognitive traits will be less effective against AI agents. Defenses based on exploiting cognitive vulnerabilities will have to be harmonized or developed separately for humans and AI-based agents.

Contact Us

Get a personally guided tour of our solution offerings. 

Contact US