GAIA by Microsoft

A benchmark for General AI Assistants that tests AI systems on real-world tasks requiring reasoning, multimodal understanding, and tool use.
