Designing
for AI
trust

When multiple AI models judge your work,
what should that experience feel like?
CLIENT: RALLY // GENLAYER LABS
DURATION: 5–6 MONTHS
ROLE: SOLE PRODUCT DESIGNER
LOCATION: REMOTE

GenLayer had built a new kind of blockchain where multiple AI models reach consensus on subjective decisions. The tech was strong, but there was no product experience around it. I joined to design Rally from scratch: a platform where creators are paid based on the quality of their submissions, as evaluated by AI.

The hard part was not drawing screens. The hard part was designing trust for a multi-model AI system where clear patterns did not really exist yet.

Rally ideation phase and information architecture artifacts

The product presents two sides of the same trust problem. Creators need to feel that AI is evaluating their work fairly, and campaign managers need to feel that AI is spending budget responsibly. Both groups need enough visibility into decisions they do not fully control.

MY ROLE

I was the sole product designer. I owned research, wireframes, prototypes, final UI, and the design system, working directly with AI researchers and the CPO.

TOOLS

Figma for design, Claude Code, v0, and Replit for rapid prototyping and implementation.

Interviews with creators and campaign managers showed one consistent pattern: people do not trust AI decisions they cannot inspect. Transparency was not a nice-to-have; it was a baseline requirement.

Another useful insight: creators pushed back on fixed-price framing but responded well to pool-based competition and relative ranking. That directly shifted the reward model.

I trust the fairness of AI more than human review, as long as the criteria are clear.

CONTENT CREATOR, USER INTERVIEW

$5,000–10,000 USDC MINIMUM POOL SIZE FOR MEANINGFUL PARTICIPATION. ANYTHING LESS AND CREATORS WON’T BOTHER.

CAMPAIGN MANAGER, USER INTERVIEW

The pool-size signal shaped campaign setup decisions. I added minimum budget gates and clear visibility into the total reward pool before creators commit effort to submissions.

Research and discovery visuals

Rally is a two-sided marketplace. Campaign managers publish briefs and budgets, creators submit content, and multiple AI models evaluate each submission before consensus. Payouts are tied to performance, not follower count.

AI EVALUATION UI

The central interface was the evaluation criteria system. It maps AI gates like alignment, accuracy, compliance, originality, engagement, and technical quality into a format managers can configure and creators can read at a glance.

MAKING AI VISIBLE

I treated the decision flow as a glass box instead of a black box. Each evaluation shows what passed, what failed, and by how much, rather than only showing approved or rejected.

Content creator journey

From discovery to payout in 24 hours

PAIN POINTS
  • Large creators dominate most platforms, so smaller creators struggle to get noticed.
  • Traditional 30–60 day payouts drain momentum before creators can build consistency.
  • Creators worry about promoting low-trust campaigns without clear verification signals.
  • Most tools are built for agencies, not solo creators trying to move fast.
SOLUTIONS
  • Zero-friction onboarding
    • No forms, no application queue, and no minimum follower requirements.
    • Connect wallet, browse campaigns, submit. Three clear steps.
  • Visible AI evaluation
    • All 11 evaluation criteria are shown before submission.
    • Creators know what the models are checking, and why.
  • Social proof and safety
    • A participation counter gives creators confidence to join.
    • Suspicious or broken submissions are flagged immediately.
  • Instant scoring
    • Submissions are scored in real time instead of waiting days for manual review.
    • Score breakdowns make it obvious what passed and what needs improvement.
Flow 1 creator journey screens

Campaign manager journey

From budget to a scaled creator network

PAIN POINTS
  • Managing 100+ creators is hard without a dedicated operations team.
  • Stakeholders want proof that campaign activity is actually happening.
  • Competing tools require too much setup and repetitive form filling.
SOLUTIONS
  • Natural-language campaign creation
    • Managers describe the campaign in plain English, and AI generates the structure.
    • The same 11 criteria can be tuned with sliders to set quality thresholds.
  • Live campaign dashboard
    • Participation and engagement metrics update in real time.
    • Teams can show stakeholders a live stream of campaign activity.
  • Attribution that works
    • Wallet-linked activity ties creator submissions to measurable outcomes.
    • Performance reporting is concrete enough to defend marketing spend.
  • Targeted reach
    • Eligibility can be set by audience size, niche, or whitelist.
    • AI handles matching so teams avoid manual outreach loops.
Flow 2 campaign manager screens
01
VISUALISING LLM CONSENSUS

When several models review the same submission, disagreement is normal. I designed a consensus view that shows each model's score alongside the final decision, so users can see where agreement (and disagreement) came from.
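A minimal sketch of the aggregation behind that view. The names (`ModelScore`, `consensus`) and the median-plus-spread approach are illustrative assumptions, not Rally's actual consensus mechanism:

```typescript
// Hypothetical sketch: aggregating per-model scores into a consensus view.
// Median gives a robust headline score; spread is a simple disagreement signal.
interface ModelScore {
  model: string; // e.g. "model-a" (illustrative label)
  score: number; // 0–100 for one criterion
}

interface ConsensusView {
  perModel: ModelScore[]; // shown individually in the UI
  median: number;         // the headline consensus score
  spread: number;         // max minus min: how much the models disagreed
}

function consensus(scores: ModelScore[]): ConsensusView {
  const values = scores.map((s) => s.score).sort((a, b) => a - b);
  const mid = Math.floor(values.length / 2);
  const median =
    values.length % 2 === 1
      ? values[mid]
      : (values[mid - 1] + values[mid]) / 2;
  return {
    perModel: scores,
    median,
    spread: values[values.length - 1] - values[0],
  };
}
```

Surfacing `spread` next to `median` is one way to show users not just the outcome but how contested it was.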

02
THE EVALUATION CRITERIA INTERFACE

Campaign managers control 11 criteria with sliders. On the creator side, the same data appears as progress indicators, so one system supports both control and transparency.

03
DESIGNING THE "AI SAYS NO" MOMENT

Trust usually breaks at rejection, so the failure state had to be specific. If a creator scores 72% on originality against an 80% threshold, they immediately know what to fix.
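The rejection logic can be sketched as a simple comparison of each criterion's score against its configured threshold, surfacing only what fell short. The criterion names and feedback format here are illustrative assumptions:

```typescript
// Hypothetical sketch of the "AI says no" moment: only failing criteria
// are surfaced, each with the exact gap the creator needs to close.
interface CriterionResult {
  name: string;      // e.g. "originality"
  score: number;     // 0–100, from the model consensus
  threshold: number; // manager-configured gate for this criterion
}

function rejectionFeedback(results: CriterionResult[]): string[] {
  return results
    .filter((r) => r.score < r.threshold)
    .map(
      (r) =>
        `${r.name}: ${r.score}% (needs ${r.threshold}%, ` +
        `${r.threshold - r.score} points short)`
    );
}
```

For the example above, a 72% originality score against an 80% threshold yields one line of feedback with an 8-point gap, while passing criteria stay out of the way.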

04
TRANSPARENCY VS. SIMPLICITY

The temptation was to expose every score and every weight. In practice, people want confidence that details are available, not a wall of diagnostics by default. So the UI leads with summary and lets users drill deeper.

The product is still in alpha, but early signals were strong. People came back, kept submitting, and did not drop off after rejection; they iterated and tried again.

100K+
Registrations in alpha
+420%
Month-over-month growth in creator submissions
87.7%
Activation rate from signup to first action
30M+
Impressions to date
1000+
Submissions in the first 10 days
75K+
Submissions to date
WHAT I’D DO DIFFERENTLY

I would run usability tests earlier on the criteria interface. The first version surfaced too many gates at once, and that should have been validated sooner. I would also bring real creator content into prototypes earlier because the emotional response is very different from sample data.

WHAT I LEARNED ABOUT AI DESIGN

The biggest lesson was that trust does not come from dumping model logic on users. Trust comes from clear outcomes, with deeper rationale available when people choose to inspect it.

LOOKING FORWARD

As AI systems influence hiring, moderation, evaluation, and spending, the core design question stays the same: how do we keep people informed and in control without slowing the system to a halt? Rally was my first deep pass at that problem, and it continues to shape how I design.