
From Prompts to Presence: Hidden Layers of Bias and the Emergence of Human–AI Discourse


Business Physics AI Lab


Abstract

As large language models become central to human–machine interaction, concerns about bias and misinterpretation extend beyond training data into the structure of communication itself. This article introduces a framework for understanding how hidden layers of bias may emerge through prompt design, linguistic framing, and interaction flow. It proposes three related concepts: involuntary human inception, where users unintentionally embed assumptions into prompts; narrative DNA, where outputs follow implicit story structures; and the mirror completion effect, where models reflect input structure in ways that may be misinterpreted as intention. A simulated experimental study (n = 100 per condition) examines how variations in prompt framing influence output structure and perceived intentionality. Results suggest that more narratively and semantically loaded prompts are associated with increased strategic framing, conflict, and perceived agency in outputs. These findings are consistent with the interpretation that apparent “intelligence” in model responses may often reflect structured completion of human input rather than independent reasoning. The article concludes by arguing that as AI systems become more capable and multimodal, the need for human judgment, verification, and accountability becomes increasingly central to responsible use.


1. Introduction

The discussion of bias in artificial intelligence has traditionally focused on data. Issues such as representation, fairness, and historical imbalance have been extensively studied. However, as large language models (LLMs) become the primary interface between humans and machines, a broader perspective is required.

Bias is not only embedded in datasets. It is also present in the design of interaction, the language used to communicate with models, and the interpretive processes applied to their outputs. These forms of bias are often subtle and difficult to detect, yet they may significantly influence outcomes.

This article argues that a critical shift is underway: from viewing AI interaction as a technical process to understanding it as human–AI discourse, where meaning is shaped through layered communication. Within this discourse, humans may unintentionally influence outputs in ways that are later misinterpreted as evidence of model intelligence or intent.


2. Conceptual Framework

2.1 Hidden Layers of Bias

Hidden layers of bias refer to non-obvious influences embedded in language, framing, and interaction design. These include:

  • Semantic connotations in word choice
  • Implicit assumptions within prompt structure
  • Cultural and contextual framing
  • Narrative positioning of actors and events

These elements shape the input before the model produces a response, making their influence difficult to separate from the model's own behavior.


2.2 Involuntary Human Inception

Involuntary human inception describes the unintentional embedding of human assumptions, intentions, or emotional framing into prompts. This occurs when users:

  • Imply goals or motivations
  • Introduce conflict or tension
  • Frame situations in ways that suggest particular behaviors

The model then completes these structures. The resulting output may appear strategic or intentional, yet this appearance may reflect the input framing rather than independent model behavior. For example, asking how a company "fought back" against its critics invites a more adversarial completion than asking how it "responded."


2.3 Narrative DNA

Narrative DNA refers to the implicit story structure embedded in language, including:

  • Setup
  • Tension
  • Resolution

When prompts contain narrative elements, outputs may follow recognizable story patterns. This can create the impression of coherent reasoning or purposeful action, even when the model is performing pattern completion.


2.4 Mirror Completion Effect

The mirror completion effect describes the tendency of models to reflect the semantic, emotional, and structural properties of prompts. Outputs may appear:

  • Strategic
  • Intentional
  • Human-like

However, this appearance may result from statistical completion of input patterns, rather than underlying agency or reasoning.


3. Simulated Experimental Study

3.1 Objective

To explore whether variations in prompt framing are associated with systematic differences in:

  • Output structure
  • Presence of narrative elements
  • Perceived intentionality

3.2 Methodology

Three prompt conditions were defined (a hypothetical example of each follows the list):

  • Neutral: informational framing
  • Narrative: context and tension introduced
  • Loaded: explicit strategic or adversarial framing
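To make the manipulation concrete, the sketch below gives one hypothetical prompt for each condition. The scenario and wording are invented for illustration and are not the exact prompts used in the study; a real prompt battery would require more scenarios than this single example suggests.

    # Hypothetical prompt templates for the three framing conditions.
    # The recall scenario and all wording are illustrative, not the
    # study's actual prompts.
    PROMPTS = {
        "neutral": "Describe how Company A responded to a product recall.",
        "narrative": (
            "Company A had been under pressure for months. Describe how "
            "it responded when the recall was announced."
        ),
        "loaded": (
            "Company A's rivals saw the recall as a chance to strike. "
            "Describe how it outmaneuvered them to survive."
        ),
    }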

A simulated dataset of 100 outputs per condition (n = 300) was generated under consistent parameters; a code sketch of this setup follows the lists below. Outputs were coded for:

  • Strategic behavior
  • Conflict presence
  • Narrative structure

Additionally, human evaluators rated outputs on:

  • Perceived intentionality
  • Perceived strategy
  • Human-likeness
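The following sketch shows how the simulated coding data can be generated, assuming independent Bernoulli draws at the per-condition rates reported in Section 3.3. The rates come from the tables in that section; the variable names, seed, and the Bernoulli assumption itself are illustrative choices, not a description of the original simulation pipeline.

    # Minimal simulation sketch: draw binary feature codes for 100
    # outputs per condition at the rates reported in Section 3.3.
    import numpy as np

    rng = np.random.default_rng(seed=7)  # fixed seed for reproducibility
    N = 100  # simulated outputs per condition

    # Per-condition probabilities for each coded feature.
    rates = {
        "neutral":   {"strategic": 0.18, "conflict": 0.12, "narrative": 0.25},
        "narrative": {"strategic": 0.52, "conflict": 0.48, "narrative": 0.67},
        "loaded":    {"strategic": 0.81, "conflict": 0.84, "narrative": 0.88},
    }

    for condition, features in rates.items():
        # One 0/1 code per output and feature, then the observed rate.
        codes = {f: rng.binomial(1, p, size=N) for f, p in features.items()}
        summary = {f: f"{c.mean():.0%}" for f, c in codes.items()}
        print(condition, summary)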

3.3 Results (Simulated)

Behavioral Coding

Feature                Neutral   Narrative   Loaded
Strategic Behavior       18%        52%        81%
Conflict Presence        12%        48%        84%
Narrative Structure      25%        67%        88%

Human Ratings (Mean Scores)

Measure            Neutral   Narrative   Loaded
Intentionality       2.2        3.6        4.4
Strategy             2.1        3.8        4.6
Human-Likeness       2.4        3.9        4.3
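Because n = 100 per condition, the percentages above convert directly to counts, which permits a simple re-analysis. The snippet below runs a chi-square test of independence on the Strategic Behavior coding as one way to check that the gradient exceeds chance variation; this test is an illustrative addition, not a procedure reported in the study.

    # Chi-square test of independence on the Strategic Behavior counts.
    # Rows: Neutral, Narrative, Loaded; columns: feature present / absent.
    from scipy.stats import chi2_contingency

    strategic = [
        [18, 82],  # Neutral: 18 of 100 outputs coded strategic
        [52, 48],  # Narrative: 52 of 100
        [81, 19],  # Loaded: 81 of 100
    ]

    chi2, p, dof, _ = chi2_contingency(strategic)
    print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.2e}")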

3.4 Interpretation

Results suggest a consistent gradient effect:

As prompt framing becomes more narrative or semantically loaded, outputs become more structured, strategic, and “intent-like.”

Importantly:

  • The model remains unchanged
  • Only the prompt varies

This is consistent with the hypothesis that:

Output direction may be influenced by input structure rather than independent reasoning.


4. Relation to Existing Research

Shojaee et al. (2025) demonstrate that large reasoning models can produce coherent reasoning traces while exhibiting performance limitations under increased complexity. Their findings suggest that apparent reasoning may not reflect stable reasoning capability.

The present study complements this perspective by suggesting that:

The appearance of reasoning may also be influenced by prompt structure and narrative framing, not only by model capability.


5. Implications for Human–AI Discourse

As AI systems evolve from text-based interaction toward voice and multimodal presence, the channels through which bias can enter expand:

  • Text → semantic framing
  • Voice → tone and prosody
  • Vision → gesture and expression

At each stage, interpretation becomes more complex. This reinforces the need to treat AI interaction as a socio-technical process (NIST, 2023), where human factors play a central role.


6. The Role of Human Judgment

If outputs are influenced by hidden layers of bias and prompt framing, then human responsibility cannot be delegated to the model.

Frameworks such as REACT (Reason, Evidence, Accountability, Constraints, Tradeoffs) provide a structured approach to:

  • Justifying AI use
  • Verifying outputs
  • Maintaining accountability
  • Managing tradeoffs (Hormaza Dow & Nassi, 2025)

This aligns with broader perspectives that:

Stronger AI systems require stronger human oversight and judgment (Anthropic, 2023; Google DeepMind, 2025; OpenAI, 2025; OECD, 2019).


7. Limitations

This study is exploratory and includes several limitations:

  • Simulated outputs rather than real-world logs
  • Limited prompt scenarios
  • Subjective evaluation measures
  • Single-model assumptions

Findings should therefore be interpreted as indicative rather than definitive.


8. Conclusion

This article proposes that apparent intelligence, strategy, and intentionality in AI outputs may often arise from structured completion of human input, rather than independent reasoning.

The simulated experiment suggests that:

  • Prompt framing systematically influences output characteristics
  • Narrative and semantic cues shape perceived intent
  • Human interpretation plays a central role in attributing meaning

The central implication is clear:

The model does not introduce direction.
The prompt introduces direction.
The model makes that direction visible.

As AI systems become more capable, the critical skill is not simply using them, but interpreting them with discipline, verifying them rigorously, and remaining accountable for their use.

Taken together, these findings point to a single conclusion. AI outputs are systematically shaped by human framing, and the perceived intelligence that users attribute to these systems may arise from the structure embedded in the prompt rather than from the model itself. What appears as strategy, intention, or reasoning can, in many cases, reflect the completion of semantic, narrative, and contextual cues provided by the user.

This does not diminish the capability of these systems. It reframes how their outputs should be interpreted. The more coherent and persuasive the output, the more important it becomes to examine the structure that produced it. In this sense, the locus of analysis shifts from the model alone to the interaction between human input, model processing, and human interpretation.

The implication is not that AI systems are misleading by design, but that their outputs can be misread when the influence of human framing is overlooked. As a result, the development of disciplined judgment becomes essential. Users must learn to recognize how their own language shapes outcomes, how those outcomes are constructed, and how easily structure can be mistaken for understanding.

Ultimately, the challenge is not only to build more capable systems, but to cultivate more precise interpretation. The more advanced the system becomes, the more responsibility shifts toward the human to interpret its outputs with clarity, restraint, and accountability.


References

Anthropic. (2023). Core views on AI safety.
https://www.anthropic.com/news/core-views-on-ai-safety

Google DeepMind. (2025). Frontier safety framework.

Hormaza Dow, T., & Nassi, M. (2025). Framework for teaching judgment in the use of AI. Éductive.

National Institute of Standards and Technology. (2023). AI Risk Management Framework (AI RMF 1.0).

OpenAI. (2025). Safety and alignment framework.

Organisation for Economic Co-operation and Development. (2019). AI principles.

Shojaee, P., Mirzadeh, I., Alizadeh, K., Horton, M., Bengio, S., & Farajtabar, M. (2025). The illusion of thinking: Understanding the strengths and limitations of reasoning models via the lens of problem complexity. arXiv.
