In this episode, we speak with Paul Singman, Developer Advocate at Treeverse / LakeFS. LakeFS is an open-source project that allows you to transform your object storage into a Git-like repository.
Top 3 takeaways
- LakeFS enables use cases such as debugging, where you can quickly view historical versions of your data at a specific point in time, and running ML experiments over the same set of data using branches.
- The current data landscape is very fragmented, with many tools available. Over the coming years, there will most likely be consolidation toward tools that are more open and integrated.
- Data quality and observability, including visibility into job runs, continue to be key components of successful data lakes.