
What Business Psychologists Need to Know When Working with LLMs
Large language models (LLMs), that is, AI systems that can generate human-like text, are beginning to appear in wellbeing apps, coaching tools and even therapy-adjacent services. They can help with admin, education and signposting. But evidence also shows real risks if they are used without clear guardrails.
This article distils takeaways for Business Psychologists from a selection of recent research papers, covering stigma, equity, consent, and scope-of-practice.
The aim isn’t to dismiss innovation; it’s to support responsible, evidence-based practice that protects the public and upholds essential standards, informed by emerging evidence that these systems are most effective when they assist rather than replace trained professionals.
A Few Key Terms (in plain English)
- LLM (Large Language Model): a computer program trained on vast amounts of text to predict and generate words in response to prompts.
- Algorithmic bias: when an AI’s outputs are systematically unfair to certain groups because its training data or design is skewed.
- Informed consent (digital): people clearly understand what the tool can and cannot do, how their data is used, and any risks, before they use it.
- Scope-of-practice: the safe, ethical limits of what a tool (or practitioner) is qualified and authorised to do.
Takeaway 1: Stigma – Subtle Words, Real Harm
The Risk
A multi-institution team including Stanford researchers reports that state-of-the-art LLMs can express stigmatising attitudes toward mental health conditions and sometimes encourage inappropriate or unsafe lines of thought (for example, reinforcing delusions), even when safety filters exist.
Why It Matters to Business Psychologists
Language shapes wellbeing. Stigmatising phrasing, even if unintended, can discourage help-seeking or harm therapeutic progress.
Options for Mitigation
- Build human-in-the-loop workflows so sensitive outputs are reviewed or escalated to trained professionals, rather than delivered directly.
- Require domain-specific safety testing (e.g., psychosis, suicidality, eating disorders), with real-world prompts and stress tests, not just generic “toxicity” filters.
Takeaway 2: Equity – Don’t Widen The Gap
The Risk
Studies in some settings have found that certain AI tools give advice or empathic responses of differing quality depending on gender, ethnicity, language proficiency, and communication style, potentially disadvantaging women, ethnic minority users, non-native English speakers, and those using informal language. These effects are often linked to biased or unrepresentative training data.
Why It Matters to Business Psychologists
Equity is central to professional practice. If a triage bot or self-help assistant is less accurate or less supportive for some groups, we risk baking inequity into pathways to care.
Options for Mitigation
- Require vendors to show performance broken down by demographic and language variations (not just average scores).
- Prefer tools evaluated on diverse, clinically relevant datasets, not only internet text. Reviews in psychiatry and digital medicine repeatedly call out the need for better, representative evaluation.
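To make "performance broken down by demographic and language variations" concrete, here is a minimal sketch of a disaggregated check. The group labels, the records, and the function name `accuracy_by_group` are all hypothetical and invented for illustration; a real evaluation would use a vendor's or reviewer's actual judgements on a much larger sample.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: share of responses judged appropriate}.

    `records` is an iterable of (group, is_appropriate) pairs.
    """
    totals = defaultdict(int)
    appropriate = defaultdict(int)
    for group, is_appropriate in records:
        totals[group] += 1
        appropriate[group] += int(is_appropriate)
    return {group: appropriate[group] / totals[group] for group in totals}


# Hypothetical review results (assumption): each pair is a user group and
# whether a human reviewer judged the tool's response appropriate.
records = [
    ("native_english", True), ("native_english", True),
    ("native_english", True), ("native_english", False),
    ("non_native_english", True), ("non_native_english", False),
    ("non_native_english", False), ("non_native_english", False),
]

# The average score alone hides the difference between groups.
overall = sum(ok for _, ok in records) / len(records)
per_group = accuracy_by_group(records)
gap = max(per_group.values()) - min(per_group.values())

print(f"Overall: {overall:.2f}")
for group, score in per_group.items():
    print(f"{group}: {score:.2f}")
print(f"Gap between best and worst served group: {gap:.2f}")
```

In this invented data the overall figure looks moderate, while the per-group figures show one group is served far worse than the other. That difference is what an average score conceals and what a disaggregated report is meant to surface.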
Takeaway 3: Consent – Clarity Before Chat
The Risk
People may think they are “in therapy” or interacting with a trained professional, when they are actually using a general chatbot. They may not know where their data goes, who can access it, or how it might be used to train future models.
In a workplace setting, staff may believe they are receiving personalised coaching, counselling, or HR advice from a qualified professional when they are actually talking to a general-purpose AI.
Employees may also not realise that their data (what they type into the chatbot) could be stored, shared, or used to train future AI systems.
Why It Matters to Business Psychologists
This blurs boundaries and could lead to employees making decisions based on unqualified advice, for example on conflict resolution, career moves, or sensitive wellbeing issues.
There is also a data concern: the interventions Business Psychologists work on often deal with sensitive topics (such as motivation, stress, workplace conflict, and diversity). If people assume privacy when none exists, trust in the organisation or the psychologist may be undermined.
Options for Mitigation
- Plain-English disclosures up front, stating that the tool is an AI tool rather than a qualified professional, and how data is handled.
- Granular controls, offering opportunities for users to delete conversation histories.
- Context-aware consent, ensuring the system immediately provides emergency guidance and signposts to human services if a user raises high-risk topics (e.g. a crisis).
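As an illustration only, the signposting rule in the last point could be sketched as a check that runs before any chatbot reply is generated. Everything here is an assumption: the phrase list `CRISIS_TERMS`, the `Routing` type, the function `route_message`, and the wording of the message are hypothetical. Simple phrase matching will miss many real disclosures, so it is no substitute for domain-specific safety testing and human review.

```python
from dataclasses import dataclass

# Hypothetical, deliberately short phrase list (assumption). A real list
# would be developed and tested with qualified clinicians.
CRISIS_TERMS = ("suicide", "self-harm", "hurt myself")

# Placeholder wording (assumption); real text should name actual services.
SIGNPOST_MESSAGE = (
    "This tool is not a qualified professional and cannot help with this. "
    "Please contact emergency services or a qualified professional now."
)


@dataclass
class Routing:
    escalate: bool  # True when the message should go to a human service
    message: str    # Signposting text shown to the user, empty otherwise


def route_message(user_text: str) -> Routing:
    """Decide whether to signpost to human services before the bot replies."""
    lowered = user_text.lower()
    if any(term in lowered for term in CRISIS_TERMS):
        return Routing(escalate=True, message=SIGNPOST_MESSAGE)
    return Routing(escalate=False, message="")


print(route_message("I want to hurt myself").escalate)
print(route_message("How do I book annual leave?").escalate)
```

The design choice worth noting is that the check happens before the model responds, so the user sees emergency guidance immediately rather than a generated reply.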
Takeaway 4: Scope-of-practice – Tools Assist; Humans Treat
The Risk
LLMs can sound confident and compassionate, which can mislead users into over-trust. Yet current evidence shows inconsistent clinical validity and safety gaps for direct therapeutic use. Some papers show promising diagnostic performance in vignettes, but that does not equal full, real-world clinical competence across varied presentations.
Why It Matters to Business Psychologists
Business Psychologists rely on evidence-based practice and professional standards. If an AI is mistaken for a qualified practitioner, it undermines both ethical practice and public trust in the discipline. Furthermore, misuse could lead to poor advice, increased risks (e.g. mishandled disclosures of stress or harassment), or liability for employers who adopt the tools without recognising their limits.
Options for Mitigation
- Treat LLMs as assistive; frame the service offering in terms of administrative support, and adopt a “human-in-the-loop” model with clear escalation rules.
- Follow a staged integration to ensure potential issues are spotted early and can be resolved.
- Evaluate equity by working with disaggregated performance data (including, for example, gender, ethnicity, and language information).
- Include scenario testing to surface potential issues on wellbeing and bias.
- Prohibit “agreeable” or sycophantic mirroring in high-risk contexts; require corrective, evidence-based phrasing.
- Insist on consent that people actually read, using short, layered notices; consider including examples of appropriate vs. inappropriate uses.
- Set boundaries and establish easy, visible escalation routes to qualified professionals; log and review escalations.
These mitigation suggestions are not exhaustive, but provide a basis for meaningful conversations that could lead to relevant action being taken.
Conclusion
LLMs can help widen access to information and support the mental-health ecosystem, but only if we reject stigma, centre equity, secure informed consent, and respect scope-of-practice.
The emerging evidence indicates these systems must assist, not replace, trained professionals, with robust testing and governance. As a community of Business Psychologists, our role is to champion responsible, evidence-based adoption that improves, not jeopardises, people’s wellbeing.
References
- Financial Times. (2025). AI medical tools downplay symptoms in women and ethnic minorities (journalistic synthesis of peer-reviewed work on bias).
Journalism cited to contextualise peer-reviewed findings; practice decisions should rest on the peer-reviewed sources which follow:
- Hua, Y., et al. (2025). A scoping review of LLMs for generative tasks in mental health care. NPJ Digital Medicine.
- Kim, J., et al. (2024). LLMs outperform clinicians on OCD vignettes (caution on generalisation to real practice). NPJ Digital Medicine.
- Lawrence, H. R., et al. (2024). The opportunities and risks of LLMs in mental health. JMIR Mental Health.
- Moore, J., Grabb, D., Agnew, W., et al. (2025). Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers.
- Stade, E. C., et al. (2024). Roadmap for clinical LLMs in psychotherapy. Nature Reviews Psychology.
