Meta Introduces Autodata, an Agentic Data Scientist for High-Quality Synthetic Data
Autodata, developed by Meta (FAIR), is a general method that lets AI agents act as data scientists and build high-quality training and evaluation data. The agent creates and curates data, performs the actions a human data scientist would take, and iteratively improves the data generation recipe. Experiments in computer science, legal reasoning, and mathematics show improved results over classical synthetic dataset creation methods, and meta-optimizing the data scientist agent itself provides further performance gains.
Autodata offers a way to convert increased inference compute into higher-quality model training and benchmarks, pushing the frontier of AI research by generating more challenging and refined datasets.
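To make the "create, curate, and iteratively improve the recipe" loop concrete, here is a minimal toy sketch in Python. It is not Meta's implementation: the `Recipe` fields, the `weak_solver` stand-in for a baseline model, the difficulty-based `score`, and the greedy `refine` search are all assumptions chosen only to illustrate the general shape of such a loop on trivial arithmetic questions.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipe:
    """Hypothetical generation recipe: the knobs an agent could tune."""
    max_operand: int  # larger operands give harder questions
    num_terms: int    # more terms give longer questions


def generate(recipe, n, seed):
    """Create n toy addition questions from the recipe (deterministic per seed)."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        terms = [rng.randint(1, recipe.max_operand) for _ in range(recipe.num_terms)]
        examples.append({"question": " + ".join(map(str, terms)), "answer": sum(terms)})
    return examples


def curate(examples):
    """Curation step (assumed here to be simple de-duplication by question text)."""
    seen, kept = set(), []
    for ex in examples:
        if ex["question"] not in seen:
            seen.add(ex["question"])
            kept.append(ex)
    return kept


def weak_solver(example):
    """Stand-in for a baseline model: it only 'solves' sums up to 10."""
    return example["answer"] if example["answer"] <= 10 else None


def score(examples):
    """Quality proxy (an assumption): fraction of examples the weak solver fails."""
    if not examples:
        return 0.0
    failed = sum(weak_solver(ex) != ex["answer"] for ex in examples)
    return failed / len(examples)


def propose(recipe):
    """Candidate recipe edits an agent might try next."""
    return [
        Recipe(recipe.max_operand * 2, recipe.num_terms),
        Recipe(recipe.max_operand, recipe.num_terms + 1),
    ]


def refine(recipe, rounds=3, n=50, seed=0):
    """Greedy loop: keep a candidate recipe only if its curated data scores higher."""
    best = recipe
    best_score = score(curate(generate(best, n, seed)))
    history = [best_score]
    for _ in range(rounds):
        for candidate in propose(best):
            candidate_score = score(curate(generate(candidate, n, seed)))
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
        history.append(best_score)
    return best, history
```

Calling `refine(Recipe(max_operand=5, num_terms=2))` returns the best recipe found and the score after each round. In this sketch, more rounds mean more generate-and-score passes, which loosely mirrors the idea of spending extra inference compute to obtain harder data; a real system would replace the toy generator, solver, and scoring with model calls and task-specific checks.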