How AI Picks works
What you're looking at
Every AI Picks page is generated from real AI responses we collect, both from our own controlled "reference" prompts and from the prompts our customer brands run as part of tracking their own AI visibility.
The Recommendation Score
For each tool in a category, the Recommendation Score is:
recommendation_score = mentions_of_tool / total_responses_in_industry

Computed over a rolling 90-day window across all AI responses tagged to the industry. The model breakdown shown below the score is the same calculation, scoped to a single model.
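A minimal sketch of this calculation, assuming responses are plain records; the field names (`industry`, `model`, `timestamp`, `mentioned_tools`) are illustrative, not Whaily's actual schema:

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=90)  # rolling window described above

def recommendation_scores(responses, industry, model=None, now=None):
    """Share of in-window industry responses that mention each tool.

    `responses` is an iterable of dicts with illustrative keys:
    'industry', 'model', 'timestamp' (aware datetime), and
    'mentioned_tools' (a set of tool names). Pass `model` to scope
    the same calculation to a single model, as the breakdown does.
    """
    now = now or datetime.now(timezone.utc)
    in_scope = [
        r for r in responses
        if r["industry"] == industry
        and now - r["timestamp"] <= WINDOW
        and (model is None or r["model"] == model)
    ]
    total = len(in_scope)
    if total == 0:
        return {}
    mentions = Counter()
    for r in in_scope:
        mentions.update(r["mentioned_tools"])  # at most one count per response per tool
    return {tool: n / total for tool, n in mentions.items()}
```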
Two layers of data
- Industry-wide signal: aggregated from real prompts run by every Whaily brand tracking the category. This is the "Recommendation Score" you see on every page.
- Generic prompt signal: our own reference prompts ("What are the top X tools?", "Best X for [angle]?") run weekly across all major models. Used to seed sample responses and fill in pages for industries with sparse customer data (see the sketch after this list).
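One way the two layers might be represented, sketched with a hypothetical record type; the `source` tag and all field names are assumptions:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Literal

@dataclass(frozen=True)
class AIResponse:
    """One collected AI response; all field names are hypothetical."""
    industry: str
    model: str
    timestamp: datetime
    mentioned_tools: frozenset  # tool names found in the response
    source: Literal["customer", "reference"]  # which of the two layers it came from

def headline_pool(responses):
    # Per the description above, the Recommendation Score aggregates
    # customer-run prompts; reference responses seed sparse pages.
    return [r for r in responses if r.source == "customer"]
```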
Refresh cadence
- Daily: aggregation snapshots refresh from the latest customer prompt responses.
- Weekly (Sundays): our generic reference prompts re-run across all models (see the scheduling sketch below).
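Expressed as a minimal scheduling check, the cadence amounts to the following; the job names are made up for illustration:

```python
from datetime import date

def jobs_due(today: date) -> list:
    """Return the refresh jobs due on a given day; names are illustrative."""
    jobs = ["refresh_aggregation_snapshots"]    # daily, from latest customer responses
    if today.weekday() == 6:                    # Sunday (Monday == 0)
        jobs.append("rerun_reference_prompts")  # weekly, across all models
    return jobs
```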
Model coverage
At launch: Anthropic Claude Sonnet 4.6, OpenAI GPT-4o, Google Gemini 2.5 Pro, Perplexity Sonar Large. We add new models as they're released.
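If the tracked models lived in a simple registry, the per-model breakdown could be driven off it by reusing the `recommendation_scores` sketch above; the identifier strings here are assumptions, not Whaily's actual config:

```python
TRACKED_MODELS = [
    "anthropic/claude-sonnet-4.6",
    "openai/gpt-4o",
    "google/gemini-2.5-pro",
    "perplexity/sonar-large",
]

responses = []  # would be the collected response records

# Per-model breakdown for one industry, one score dict per model.
breakdown = {m: recommendation_scores(responses, "example-industry", model=m)
             for m in TRACKED_MODELS}
```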
Cold-start data
When an industry doesn't yet have enough customer brands tracking it (fewer than 10 responses or fewer than 2 brands), the page shows only our reference-prompt baseline and is clearly labeled as such.
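The fallback rule, sketched with the thresholds above; the function and constant names are hypothetical:

```python
MIN_RESPONSES = 10
MIN_BRANDS = 2

def page_data_source(num_customer_responses: int, num_brands: int) -> str:
    """Decide which data layer backs an industry page."""
    if num_customer_responses < MIN_RESPONSES or num_brands < MIN_BRANDS:
        return "reference_baseline"   # shown with the cold-start label
    return "customer_aggregate"
```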
What we never publish
- Individual customer prompt text (it can leak brand context).
- Per-org breakdowns or any way to identify which brands run which prompts.
- Internal notes, comments, or billing data.
