AI in defense shifts from tools to human-AI teaming; interaction-centered design improves trust, decisions, and security outcomes in complex environments.
This section traces the evolution of AI concepts from early 'unmanned systems' and 'cybernetics' to J.C.R. Licklider’s vision of 'man-machine symbiosis.' It argues that the true strategic impact of modern AI lies not in its artificiality but in its ability to redefine intelligence at both the individual and organizational levels, especially when the underlying algorithmic models are opaque. It highlights a critical 'responsibility gap' in which human operators are held accountable for decisions generated by AI systems over which they have diminishing control. The core challenge, as presented, is not to prevent full automation but to foster trustworthy collaboration between humans and AI, particularly in high-stakes domains like national security where human oversight remains ethically and legally imperative.
This part distinguishes between merely generating 'trust' and ensuring true 'trustworthiness' in AI-enabled systems. Historically, designing for trustworthiness meant building reliable automated systems and training users to rely on them appropriately, with initiatives like DARPA’s Explainable AI (XAI) and Google’s People + AI Guidebook aiming to increase user insight. However, the article cites evidence that transparency alone can boost user trust without necessarily improving decision quality. In critical defense applications this can produce dangerous 'unwarranted trust,' exemplified by incidents like the USS Vincennes targeting of Iran Air Flight 655 and the Patriot missile fratricide. The article emphasizes that trustworthiness is now understood as a collective property of the human-AI unit, which explains the prominence of 'human-AI teaming,' and it advocates treating the interaction itself as the fundamental unit for designing and evaluating such systems.
This section proposes an 'interaction-centered' approach that integrates human-centered and AI-centered design paradigms, focusing on how design decisions play out in practical operational contexts.
**Interaction-Centered Design:** The article highlights that users interpret AI transparency features (e.g., visualizations, confidence scores) in context, according to their expertise and task demands, so the same feature can lead to different decision outcomes. Examples from airport security and clinical triage illustrate how a '90% confident' AI output can be perceived as too low or implausibly high, depending on the stakes and user workload. The article advocates drawing on 'participatory design' traditions from other safety-critical fields, such as healthcare with 'smart' hospital infusion pumps, so that AI system designs fit existing human workflows and do not impede frontline personnel, a crucial principle for trustworthy national security applications.
**Interaction-Centered Evaluation:** The article stresses that evaluation must also center on interaction. Isolated algorithmic benchmarks offer limited insight into real-world human-AI decision-making, especially where human cognitive biases and compounding errors under pressure are concerned. The 2021 U.S. drone strike in Kabul is cited as a tragic example of how the interpretation of information under time pressure, not just data inaccuracy, can lead to catastrophic failure. The article acknowledges the high cost of realistic human-in-the-loop experiments but proposes developing evaluation methods that incorporate observed interaction data, such as metrics for human-AI communication, error detection, and intervention effectiveness. Programs like DARPA’s ASIST and EMHAT, and the U.S. Air Force’s DASH events, are presented as models for scalable, simulation-based evaluations that can strategically guide AI development by bridging foundational research with operationally relevant advances.
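To make the idea of interaction-level metrics concrete, the following is a minimal sketch of how error detection and intervention effectiveness might be computed from logged human-AI trials. It is an illustration only: the `Trial` record, its field names, and the metric definitions are assumptions made here, not measures defined by the article or by the DARPA and Air Force programs it mentions, and the sample data is invented.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Trial:
    """One logged human-AI decision (hypothetical schema)."""
    ai_correct: bool       # was the AI recommendation correct?
    human_overrode: bool   # did the operator reject or change it?
    final_correct: bool    # was the final team decision correct?


def _rate(hits: int, total: int) -> Optional[float]:
    # Return None instead of dividing by zero when a category is empty.
    return hits / total if total else None


def interaction_metrics(trials: Iterable[Trial]) -> Dict[str, Optional[float]]:
    trials = list(trials)
    ai_wrong = [t for t in trials if not t.ai_correct]
    ai_right = [t for t in trials if t.ai_correct]
    overrides = [t for t in trials if t.human_overrode]
    return {
        # Benchmark-style view: the model in isolation.
        "ai_accuracy": _rate(len(ai_right), len(trials)),
        # Interaction-level view: the human-AI unit as a whole.
        "team_accuracy": _rate(sum(t.final_correct for t in trials), len(trials)),
        # Share of AI errors the operator intervened on.
        "error_detection_rate": _rate(sum(t.human_overrode for t in ai_wrong), len(ai_wrong)),
        # Share of correct AI outputs the operator overrode anyway.
        "false_override_rate": _rate(sum(t.human_overrode for t in ai_right), len(ai_right)),
        # Share of interventions that ended in a correct final decision.
        "intervention_effectiveness": _rate(sum(t.final_correct for t in overrides), len(overrides)),
    }


# Invented sample log, for illustration only.
sample = [
    Trial(ai_correct=True, human_overrode=False, final_correct=True),
    Trial(ai_correct=True, human_overrode=False, final_correct=True),
    Trial(ai_correct=True, human_overrode=True, final_correct=False),
    Trial(ai_correct=False, human_overrode=True, final_correct=True),
    Trial(ai_correct=False, human_overrode=False, final_correct=False),
    Trial(ai_correct=False, human_overrode=True, final_correct=False),
]

for name, value in interaction_metrics(sample).items():
    print(name, value)
```

Comparing `team_accuracy` with `ai_accuracy` is one simple way to check whether the pairing improves on the model alone, while a low `error_detection_rate` would be one possible signal of the unwarranted trust described earlier.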