By Futurist Thomas Frey

We want AI to protect everything we care about. The problem is that everything we care about is in permanent, complicated conflict with itself.

A Reasonable Demand

When people imagine what a truly useful AI system would do for them, protection is almost always somewhere near the center of the wish. Not protection in the narrow sense of physical security — though that too — but something broader and harder to name. The sense that there is an intelligent system in your corner. One that watches out for you. One that understands what matters to you, knows what threatens it, and acts — or at least warns — before the threat arrives.

That’s not an unreasonable thing to want. It’s actually a very old thing to want. People have always built protection structures around themselves — family, tribe, community, faith, law, government. These structures exist because no individual can monitor every threat, manage every risk, or navigate every conflict alone. What’s new is the idea that an AI system could do this better than any human institution has managed to, and that it could do it personally — calibrated to you, running continuously, operating across every domain of your life simultaneously.

The moment you start thinking seriously about what that actually requires, the complexity becomes staggering. Because what we want protected is not one thing. It is a layered, often contradictory set of loyalties, values, identities, and interests — and protecting any one of them frequently means compromising another.

The Rings of What We Care About

Think of human loyalty and concern as a set of concentric rings, each one wider than the last. At the center is the self — your physical safety, your health, your financial security, your mental wellbeing. The next ring out is the immediate family: spouse, children, parents. Then extended family. Then close friends. Then neighborhood. Then community. Then city, county, state, country. Most people also carry loyalties that cut across these geographic rings — to their faith tradition, their profession, their political beliefs, their ethnic or cultural identity, their economic class.

A truly protective AI would need to hold all of these simultaneously. It would need to understand not just what matters to you in the abstract but how these loyalties are ranked — which ones you would sacrifice for which others, and under what conditions. It would need to know that you would compromise your financial security to protect your child’s wellbeing, that you would not compromise your religious convictions to protect your job, that you might compromise your political beliefs to protect your marriage, and that you would never compromise your sense of personal integrity, regardless of the cost. Every person carries a hierarchy of this kind, usually never made explicit, and it determines almost every significant decision they make.
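To see why making such a hierarchy explicit is harder than it sounds, here is a deliberately naive sketch of one. Every value name and ranking below is a hypothetical illustration, not a real user model:

```python
# A toy sketch of an explicit value hierarchy. All names and rankings
# here are invented for illustration -- real hierarchies are implicit,
# contextual, and rarely this clean.

# Lower rank = higher priority: protected at the expense of higher-ranked values.
VALUE_HIERARCHY = {
    "personal_integrity": 0,   # never compromised
    "child_wellbeing": 1,
    "religious_convictions": 2,
    "marriage": 3,
    "job": 4,
    "political_beliefs": 5,
    "financial_security": 6,
}

def choose_protected(value_a: str, value_b: str) -> str:
    """When two values conflict, protect the one with higher priority (lower rank)."""
    return min(value_a, value_b, key=VALUE_HIERARCHY.__getitem__)

# These match the trade-offs described in the text:
assert choose_protected("financial_security", "child_wellbeing") == "child_wellbeing"
assert choose_protected("religious_convictions", "job") == "religious_convictions"
assert choose_protected("political_beliefs", "marriage") == "marriage"
```

Even this caricature exposes the gap: a fixed ranking has no way to express “under what conditions,” and real people re-rank under pressure — which is exactly the context-sensitivity the essay argues an AI protector would need.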

Now ask: can an AI system be given that hierarchy? Can it be trusted to apply it correctly in novel situations it has never seen? And — the question that most people don’t get to until it’s too late — what happens when the system’s protection of your interests conflicts with the protection of someone else’s?

When every AI protects its owner, conflicts multiply — loyal systems can deepen division, turning personal protection into collective instability.

The Problem of Conflicting Protections

Consider a simple example. You want your AI to protect your job. Your neighbor wants their AI to protect their job. Both of you are competing for the same promotion. Your AI, functioning as your advocate and protector, would naturally surface information about your accomplishments, flag your competitor’s weaknesses, and position you optimally for the decision. Your neighbor’s AI would do the same thing on their behalf. Two AI systems, each perfectly loyal to its human, working in direct opposition to each other. Neither is doing anything wrong by the logic of the task it was given. Both are exactly as destructive as they are helpful.

Scale this to the things people actually care about most deeply. Your AI protects your political beliefs by filtering your information environment, reinforcing your existing views, and flagging threats to the political outcomes you prefer. Your neighbor’s AI does the same for their opposing views. The result is not two protected citizens — it is two people in epistemically sealed bubbles, both convinced they are better informed because their AI is working for them. What each AI is actually doing is deepening the division between its human and the half of the country that disagrees with them.

Your AI protects your neighborhood by flagging unusual activity, alerting you to the presence of strangers, and minimizing your family’s exposure to risk. In a diverse city, this protection apparatus can function as a surveillance system that treats difference as threat, and encodes the existing distribution of fear and trust into a system that makes that distribution permanent and automated. Protection of your community, in this configuration, becomes a mechanism for making someone else’s community less safe.

This is not a hypothetical problem. It is the operational reality of most current AI systems. Recommendation algorithms that protect your engagement — keeping you on the platform — do so by understanding what provokes your outrage and feeding you more of it. Credit scoring systems that protect lenders from risk do so by encoding historical patterns of discrimination into mathematical variables that look neutral. Fraud detection systems that protect your bank account do so by flagging behavior that deviates from established patterns — which means they systematically disadvantage people whose behavior doesn’t fit the training data’s baseline. Protection of one person or group, implemented at scale, has downstream effects on everyone else.
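The fraud-detection case can be made concrete with a minimal sketch of deviation-based flagging. The threshold, the data, and the scenario are all invented for illustration; real systems are far more elaborate, but the structural bias is the same:

```python
# Toy sketch of deviation-based fraud flagging: a transaction is flagged
# when it falls far from the baseline the system was trained on.
# Threshold and data are invented for illustration.
from statistics import mean, stdev

def is_flagged(amount: float, history: list[float], z_threshold: float = 2.0) -> bool:
    """Flag any amount more than z_threshold standard deviations from the baseline."""
    mu, sigma = mean(history), stdev(history)
    return abs(amount - mu) > z_threshold * sigma

# Baseline learned from one population's typical spending pattern.
baseline = [40, 50, 45, 55, 60, 50, 48]

# Routine purchase by someone whose habits match the baseline: passes.
print(is_flagged(52, baseline))    # False

# A perfectly legitimate purchase by someone whose habits don't match the
# baseline (say, a monthly rent payment made in a cash economy): flagged.
print(is_flagged(900, baseline))   # True
```

The system has no concept of fraud, only of deviation from its training baseline — so anyone whose legitimate behavior differs from that baseline pays the cost of someone else’s protection.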

The Trust Architecture Problem

We live, as the framing above suggests, in a world layered in trust. You trust your family more than you trust your neighbors. You trust your neighbors more than you trust strangers. You trust institutions you are part of more than ones you observe from outside. You extend trust carefully, in proportion to relationship, history, and shared interest. This is not irrational — it is how social animals have always managed the tension between individual and collective interest. Trust is the infrastructure that holds society together, and it is extraordinarily difficult to build and very easy to destroy.

For an AI to function as a genuine protector across all of these rings simultaneously, it would need to navigate trust with the same nuance a wise human does — and do it consistently, at speed, across every domain of life, without the accumulated relationship history that makes human trust judgments meaningful. It would need to know that your relationship with your brother-in-law is different from your relationship with your pastor, which is different from your relationship with your congressman, which is different from your relationship with your country — and that “protect” means something different in each context. It would need to hold all of that in tension with the fact that protecting you in one relationship sometimes means accepting costs in another.

This is, in the language of AI alignment research, the value alignment problem: how do you get a system to reliably pursue the outcomes a human actually values rather than a simplified proxy for those outcomes? The problem is not that the question is unanswerable. It is that the answer requires the system to hold a degree of complexity, context-sensitivity, and principled judgment that current systems are only beginning to approach — and that any simplification of human values produces a system that optimizes for the simplified version rather than the real one. A system told to “protect the user’s health” and given access to their calendar will start canceling meetings. It will get most things right, and it will occasionally be wrong in ways that are catastrophic. A system told to “protect the user’s financial security” will make recommendations that maximize expected return while misunderstanding the user’s actual risk tolerance in ways that only surface under specific conditions they’ve never articulated.

Real protection isn’t shielding you from discomfort — it’s helping you face it. An AI that filters everything becomes less a guardian and more a cage.

What Protection Actually Requires

The deepest problem with the “AI as protector” vision is not technical. It is philosophical. Genuine protection of everything a person cares about is not a service that can be delivered — it is a lifelong practice of judgment under uncertainty that requires understanding not just what a person values but how those values were formed, how they have changed, how they interact under pressure, and what the person would want if they thought about it more carefully than they usually do.

Consider what it means to truly protect someone’s religion. The surface version is easy: don’t expose them to content that disrespects their faith, flag threats to their religious community, monitor legislation that might affect their practice. But religion is not just a preference to be accommodated. It is a framework for understanding the world, a set of obligations to a community and a tradition, a source of meaning that operates through challenge as much as through comfort. A person’s faith deepens through encounters with doubt, with other traditions, with suffering that the comfortable version of their beliefs doesn’t easily explain. An AI that protects religion by filtering out everything that might challenge it is not protecting the person’s faith. It is protecting a shallow version of it while preventing the conditions under which genuine faith develops.

The same applies to rights, status, and political beliefs. Protection of rights in a functioning democracy requires exposure to the arguments of people who disagree with you — because rights are negotiated socially, and you can only defend yours effectively if you understand why others see them differently. Protection of status that prevents you from ever receiving honest feedback is not protection — it is flattery that leaves you vulnerable to the reality you’ve been shielded from. Protection of political beliefs that filters your information environment does not make you a more effective political actor. It makes you less able to engage with the world as it actually is.

Genuine protection, in other words, sometimes requires exposure to threat. Sometimes it requires discomfort. Sometimes it requires the AI to tell you something you don’t want to hear, or to decline to shield you from a piece of information that would change your mind about something you think you’re certain of. The parent who protects a child from every difficult experience is not protecting them — they are preventing them from developing the resilience, judgment, and character that will protect them when the parent is no longer there. An AI that functions as a perfect buffer between you and everything uncomfortable is not a protector. It is a comfortable prison.

The Vision Worth Building Toward

None of this means the vision of AI as protector is wrong. It means the vision needs to be more sophisticated than the instinctive version most people carry.

The AI protector worth building is one that distinguishes between what you want in the moment and what you would want if you thought it through. One that knows that protecting your job might mean telling you that your performance has been declining, not hiding that information from your employer. One that knows that protecting your health means flagging the pattern in your behavior before it becomes a crisis, not congratulating you on the decisions that are already made. One that knows that protecting your family sometimes means brokering a difficult conversation rather than helping you avoid one. One that knows that protecting your community means contributing to the social trust that makes community possible — which sometimes means advocating for people outside your immediate circle whose security is connected to yours even if you can’t see the connection clearly.

This kind of AI would need to operate at several levels simultaneously. At the first level, it monitors for genuine threats to your immediate wellbeing — the fraud, the medical symptom, the contractual loophole, the physical danger. These are the cases where the protection model is most clearly right and most clearly beneficial. At the second level, it helps you understand trade-offs between the things you care about — the ways that protecting your income is affecting your health, or protecting your status is affecting your relationships, or protecting your political beliefs is affecting your ability to engage with the actual complexity of civic life. At the third level — and this is the hardest — it holds you accountable to your own stated values when your immediate impulses conflict with them.
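The three levels described above can be sketched as a routing structure. The event categories, names, and responses below are hypothetical illustrations of the architecture, not a real taxonomy:

```python
# A structural sketch of the three levels described in the text.
# Event names and response labels are invented for illustration only.

LEVEL_1_THREATS = {"fraud_alert", "medical_symptom",
                   "contract_loophole", "physical_danger"}
LEVEL_2_TRADEOFFS = {"income_vs_health", "status_vs_relationships",
                     "beliefs_vs_civic_engagement"}
LEVEL_3_VALUE_CONFLICTS = {"impulse_vs_stated_value"}

def respond(event: str) -> str:
    """Route an observation to the kind of protective response it warrants."""
    if event in LEVEL_1_THREATS:
        return "intervene"               # act or warn immediately
    if event in LEVEL_2_TRADEOFFS:
        return "surface the trade-off"   # inform; the human decides
    if event in LEVEL_3_VALUE_CONFLICTS:
        return "hold accountable"        # remind the human of their own values
    return "observe"                     # most of life needs no protection

assert respond("fraud_alert") == "intervene"
assert respond("income_vs_health") == "surface the trade-off"
assert respond("impulse_vs_stated_value") == "hold accountable"
```

The point of the sketch is the asymmetry: only level one licenses autonomous action, while the higher levels escalate from acting for the person to informing them and finally to confronting them — which is precisely where the design gets hard.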

The true protector of your humanity is not the AI that tells you what you want to hear and shields you from everything that makes you uncomfortable. That system will leave you fragile, isolated, and less capable of living well than you were before. The true protector of your humanity is the AI that understands you fully enough to know when to stand between you and a threat, and wise enough to know when the most protective thing it can do is step aside and let you face something hard. We are building toward that. We are not there yet. And the distance between where we are and where we need to be is not primarily a technical gap. It is a values gap — a question of whether we can build systems that protect human beings at their best rather than serving them at their most comfortable.

Related Reading

Defining the Skills Citizens Will Need in the Future World of Work

McKinsey Global Institute — On the human judgment and contextual intelligence that no protection system can fully replicate

The Future of Jobs Report 2025

World Economic Forum — How AI is reshaping the labor market, and what it means for the workers who want AI on their side

The U.S. National Advanced Air Mobility Strategy

Joby Aviation / U.S. DOT — A case study in how AI-enabled systems are being built into public infrastructure — and the trust questions that raises