Which Domains Should I Work On?

How agent benchmarking efforts are distributed across domains of human work

• Current benchmarks are concentrated in the Computer and Mathematical domain

• Other highly digitized domains (e.g., Legal, Management) remain underrepresented

What Benchmarks Contribute to This Domain?

What Do Tasks Look Like in This Domain?