How AI Picks works

What you're looking at

Every AI Picks page is generated from real AI responses we collect, both from our own controlled "reference" prompts and from the prompts our customer brands run as part of tracking their own AI visibility.

The Recommendation Score

For each tool in a category, the Recommendation Score is:

recommendation_score = mentions_of_tool / total_responses_in_industry

The score is computed over a rolling 90-day window across all AI responses tagged to the industry. The per-model breakdown shown below the score uses the same calculation, scoped to a single model.
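The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not Whaily's implementation: the record fields (`industry`, `model`, `ts`, `mentioned_tools`) and the sample data are hypothetical.

```python
from datetime import datetime, timedelta, timezone

now = datetime.now(timezone.utc)

# Hypothetical response records; field names and values are illustrative only.
responses = [
    {"industry": "crm", "model": "gpt-4o", "ts": now - timedelta(days=5),
     "mentioned_tools": {"ToolA", "ToolB"}},
    {"industry": "crm", "model": "claude-sonnet", "ts": now - timedelta(days=40),
     "mentioned_tools": {"ToolA"}},
    {"industry": "crm", "model": "gpt-4o", "ts": now - timedelta(days=120),
     "mentioned_tools": {"ToolA"}},  # outside the 90-day window, excluded
]

def recommendation_score(responses, industry, tool, model=None, window_days=90):
    """mentions_of_tool / total_responses_in_industry over a rolling window.

    Pass model=... to reproduce the per-model breakdown shown below the score.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(days=window_days)
    pool = [r for r in responses
            if r["industry"] == industry
            and r["ts"] >= cutoff
            and (model is None or r["model"] == model)]
    if not pool:
        return 0.0
    return sum(tool in r["mentioned_tools"] for r in pool) / len(pool)
```

With the sample data above, `recommendation_score(responses, "crm", "ToolB")` is 0.5 (one mention across two in-window responses), and scoping to `model="gpt-4o"` shrinks the denominator to that model's in-window responses only.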

Two layers of data

  • Industry-wide signal: aggregated from real prompts run by every Whaily brand tracking the category. This is the "Recommendation Score" you see on every page.
  • Generic prompt signal: our own reference prompts ("What are the top X tools?", "Best X for [angle]?") run weekly across all major models. Used to seed sample responses and fill in pages for industries with sparse customer data.
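One way to picture the two layers is a source tag on every stored response. The schema below is a hypothetical sketch, not the actual data model:

```python
# Hypothetical: each stored response carries a source tag for its data layer.
# "customer"  = prompts run by brands tracking the category (industry-wide signal)
# "reference" = our own weekly generic prompts (generic prompt signal)
responses = [
    {"source": "customer", "industry": "crm", "prompt_id": "p1"},
    {"source": "customer", "industry": "crm", "prompt_id": "p2"},
    {"source": "reference", "industry": "crm", "prompt_id": "ref-top-tools"},
]

def split_layers(responses):
    """Separate the two data layers for downstream aggregation."""
    customer = [r for r in responses if r["source"] == "customer"]
    reference = [r for r in responses if r["source"] == "reference"]
    return customer, reference
```

Published Recommendation Scores aggregate the customer layer; the reference layer seeds sample responses and backfills industries with sparse customer data.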

Refresh cadence

  • Daily: aggregation snapshots refresh from the latest customer prompt responses.
  • Weekly (Sundays): our generic reference prompts re-run across all models.
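The cadence above amounts to a simple schedule check. A toy sketch, with illustrative job names that are not the real pipeline's:

```python
from datetime import datetime, timezone

def jobs_due(now):
    """Which refresh jobs run at a given time, per the cadence above."""
    jobs = ["refresh_aggregation_snapshots"]    # daily: latest customer responses
    if now.weekday() == 6:                      # Sunday (Monday is 0)
        jobs.append("rerun_reference_prompts")  # weekly: generic prompts, all models
    return jobs
```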

Model coverage

At launch: Anthropic Claude Sonnet 4.6, OpenAI GPT-4o, Google Gemini 2.5 Pro, Perplexity Sonar Large. We add new models as they're released.

Cold-start data

When an industry doesn't yet have enough customer brands tracking it (fewer than 10 responses or fewer than 2 brands), the page shows only our reference-prompt baseline and is clearly labeled as such.
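The fallback rule is easy to state in code. A minimal sketch using the thresholds from the text; the function and return values are illustrative:

```python
MIN_RESPONSES = 10  # thresholds stated above
MIN_BRANDS = 2

def page_data_source(n_responses, n_brands):
    """Fall back to the reference-prompt baseline when customer data is sparse."""
    if n_responses < MIN_RESPONSES or n_brands < MIN_BRANDS:
        return "reference_baseline"  # page is labeled as baseline-only
    return "customer_aggregate"
```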

What we never publish

  • Individual customer prompt text (it can leak brand context).
  • Per-org breakdowns or any way to identify which brands run which prompts.
  • Internal notes, comments, or billing data.