📈Mo' data mo' problems: The rise of increasingly sophisticated tools (data warehousing, data ingestion, transformation, etc.) has empowered modern data teams to work with large, complex, and disparate datasets.

1/
🏭Large enterprises that lack a centralized data catalog find it challenging to:

- harness and curate massive data loads
- understand the provenance of the metadata on which their reports are built
- derive meaningful insights from the data
- trust the data

2/
🛠️To date, many tools (open source and otherwise) have emerged in this space from @NetflixOSS, @AirbnbData, @UberEng, @lyfteng, @LinkedInEng, @MarquezProject, and more recently, @googlecloud.

3/
🚀Next-gen tools will not only serve as the foundational framework for data governance but will also:

- improve internal operational efficiency
- promote transparency and fairness
- provide users of all skill levels access to the data they need, when they need it

4/
📩 I continue to care a lot about the broader metadata management / data catalog space, and if you're an enterprise startup tackling this problem, I'd love to chat!

fin.
You can follow @psomrah.