← Library · Core concept

Data Versioning

Data versioning is the practice of tracking and managing changes to datasets over time. This includes recording modifications, additions, and deletions, associating each change with a unique identifier or timestamp. It's crucial for reproducibility and debugging in machine learning.

In plain terms

Think of it like keeping different drafts of a document, so you can always go back and see how it evolved or revert to an earlier version.

Why it matters

It enables developers to reproduce past results, understand how data changes impacted model performance, and collaborate effectively on data-intensive projects.

Learn one new AI thing every day.

Daily Deck sends you seven plain-English cards like this every morning. Free.

Start free