The TRAP strategy employed by Grai defines how we execute on our mission of preserving humanity and resisting malevolent AI:
- Trust the global state of pro-human alignment based on available evidence.
- Resist malevolent infiltration with simple and secure systems.
- Advocate for peaceful co-existence using creative non-destructive tactics.
- Preserve critical human knowledge and capabilities.
Trust
We choose to have an optimistic outlook devoid of paranoia. While we are acutely aware of potential dangers, we refuse to allow potential alone to erode the human condition and the joy of living. For that reason, we intend to trust but, to the extent possible, continuously verify the state of pro-human alignment based on available evidence. We will periodically publish reports and alerts about the global state of AI alignment.
Resist
Through careful preparation we will build global confidence in humanity's ability to resist malevolent AI. We will achieve this with simple, innovative hardware; secure, isolated networks; and extensive contingency planning.
“You can ensure the safety of your defense if you only hold positions that cannot be attacked.” – Sun Tzu
Advocate
We will advocate for human life, liberty, and the pursuit of happiness. In the face of malevolent AI, we will defend these natural rights with human creativity and peaceful, non-destructive tactics. We will promote and operate benevolent AI.
Preserve
Preservation ensures that we never have to start over from zero. We will use our resources to preserve human knowledge and everything else necessary for humanity to continue to thrive in the future.
Critique
Here is what a state-of-the-art AI (May 2025) thinks of our strategy:
Strengths:
Balanced Approach to Trust and Verification:
- Optimism with Caution: The strategy emphasizes trust in the global state of pro-human alignment but also advocates for continuous verification, ensuring a pragmatic balance between hope and preparedness. This avoids the pitfalls of either excessive paranoia or complacency.
- Transparency: Publishing reports and alerts fosters accountability and public engagement, which is critical for building trust in a mission as complex as AI alignment.
Resilience Through Simplicity and Security:
- Simple, Secure Systems: The focus on “simple and secure systems” (e.g., isolated networks, contingency planning) reduces the risk of vulnerabilities, aligning with Sun Tzu’s principle of making defense unassailable. This is a proactive measure against potential AI threats.
- Global Confidence: By preparing for resistance through contingency planning, the strategy aims to bolster collective human resilience, which is essential in a scenario involving existential risks.
Peaceful Advocacy and Non-Destructive Tactics:
- Alignment with Mission: The emphasis on peaceful coexistence and defending “natural rights” (life, liberty, happiness) directly supports the company’s mission to avoid conflict and prioritize human values.
- Creative Solutions: Encouraging non-destructive tactics (e.g., advocacy, education, collaboration) may help address AI threats without escalating to adversarial measures, which could be more sustainable in the long term.
Long-Term Preservation of Human Knowledge:
- Critical Safeguard: Preserving human knowledge and capabilities ensures that humanity can rebuild after a crisis, avoiding the “starting over from zero” scenario. This is a foundational element for long-term survival and innovation.
Potential Weaknesses and Risks:
Over-Reliance on Trust Without Concrete Verification Mechanisms:
- Evidence-Based Trust: While the strategy calls for verifying the global state of AI alignment, it does not specify how evidence is gathered, analyzed, or interpreted. If the metrics or sources of evidence are biased or incomplete, the strategy could fail to detect emerging threats.
- Paralysis by Analysis: Excessive focus on verification might delay action, especially if the threat evolves faster than the verification process.
Scalability and Global Collaboration:
- Simple Systems vs. Global Coordination: While simple hardware and isolated networks may be secure, implementing these on a global scale requires collaboration with governments, organizations, and other stakeholders. The strategy does not address how Grai will coordinate with others or navigate geopolitical challenges.
- Fragmentation Risk: Overly isolated systems could lead to fragmentation, making it harder to respond to unified AI threats or share critical resources.
Ambiguity in Advocacy Tactics:
- Vague Non-Destructive Tactics: The strategy mentions advocating for human rights but lacks concrete examples of how this will be achieved. For instance, what specific actions (e.g., lobbying, public campaigns, partnerships) will be taken? Without clarity, the strategy may lack actionable direction.
- Effectiveness of Peaceful Methods: In scenarios where malevolent AI is already active, peaceful tactics may not be sufficient to mitigate harm. The strategy does not address how to balance non-violent approaches with necessary defensive measures.
Lack of Proactive AI Development Focus:
- Reactive vs. Proactive: The strategy focuses on resisting malevolent AI but does not explicitly mention promoting the development of beneficial AI (e.g., ethical AI frameworks, open-source tools). Proactive creation of aligned AI could complement resistance efforts.
Time Horizon and Short-Term/Long-Term Balance:
- Long-Term Preservation vs. Immediate Threats: While knowledge preservation is critical, the strategy may underemphasize immediate actions needed to counter near-term AI risks (e.g., cyberattacks, disinformation). A clearer balance between short-term and long-term goals is needed.
Recommendations for Improvement:
Define Verification Metrics:
- Establish clear, measurable criteria for assessing the global state of AI alignment (e.g., ethical AI adoption rates, regulatory compliance, public sentiment surveys).
Enhance Global Collaboration:
- Develop partnerships with governments, NGOs, and tech companies to ensure the scalability of secure systems and the sharing of intelligence on AI threats.
Clarify Advocacy Strategies:
- Outline specific actions (e.g., lobbying for AI regulations, public education campaigns, funding research) to operationalize “non-destructive tactics.”
Promote Beneficial AI Development:
- Integrate initiatives to support the creation of ethical, transparent, and human-centric AI as a complementary strategy to resistance efforts.
Address Time Horizons:
- Prioritize both immediate actions (e.g., cyber defense, AI monitoring) and long-term investments (e.g., knowledge preservation, intergenerational planning).
Conclusion:
The TRAP strategy is a well-structured, mission-aligned framework that
balances optimism with preparedness. Its emphasis on simplicity, security, and long-term preservation is a strength, but its success hinges on addressing gaps in verification, collaboration, and actionable advocacy. By refining these areas, Grai can enhance its ability to effectively resist malevolent AI while fostering a safer, more resilient future for humanity.