Discussion about this post

Sam:

The pain is real, and automation here is overdue. That said, I would have liked more depth: for example, on how this differs from observability/metadata tools, and how "auto" plays out across modalities beyond tabular and text.

Today, observability tools like Monte Carlo, Metaplane, or even Databand, and metadata layers like OpenLineage, Marquez, or Feast for features, are becoming core to how preprocessing is validated and governed.

The $6–10B TAM makes sense if you include all ETL, wrangling, and data integration efforts. But if we're strictly talking about ML-specific preprocessing, the addressable market narrows fast!

Still, a good thesis kickoff, man!
