#QConSF: @criccomini shares that WePay's data infrastructure is based on Airflow, Kafka, and BigQuery.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                    
                                    
                    
                        
                        
                        This looks nuts to people who know databases, but at first it works well! Very real time. But soon users and reports get in each other's way.
                        
                        
                        
                        
                                                
                        
                                                
                    
                    
                                    
                    
                        
                        
                        So you give users their own data warehouse. But now you have lots of loading jobs and that gets complex. Data quality may be an issue.
                        
                        
                        
                        
                                                
                        
                                                
                    
                    
                                    
                    
                    
                                    
                    
                    
                                    
                    
                        
                        
                        But now there are many systems and there is operational pain. We need more integration.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                    
                                    
                    
                    
                                    
                    
                        
                        
                        You need automation! Not just for operations (everyone knows about that), but also automating data management! Do you have a data catalog?
                        
                        
                        
                        
                                                
                        
                                                
                    
                    
                                    
                    
                        
                        
                        "We use Terraform to manage Kafka topics and connectors. This topic has a compaction policy, which is an exciting policy to have when your system evolves." @criccomini at #QConSF
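For readers unfamiliar with compaction: a compacted Kafka topic (`cleanup.policy=compact`) keeps at least the latest value for each key, and a record with a null value (a tombstone) marks the key for deletion. The sketch below only simulates that behavior in plain Python. It is not how WePay or Kafka implements it, the `compact` function and sample keys are made up for illustration, and it drops tombstones immediately, whereas Kafka retains them for a configurable period.

```python
from typing import Iterable, Optional

# (key, value); a value of None stands in for a Kafka tombstone.
Record = tuple[str, Optional[str]]


def compact(log: Iterable[Record]) -> list[Record]:
    """Keep only the latest record per key and drop tombstoned keys."""
    latest: dict[str, Optional[str]] = {}
    for key, value in log:
        # Remove and re-insert so the survivor sits at the position of its last write.
        latest.pop(key, None)
        latest[key] = value
    # Simplification: tombstones are removed right away.
    return [(key, value) for key, value in latest.items() if value is not None]


log = [
    ("user-1", "email=a@example.com"),
    ("user-2", "email=b@example.com"),
    ("user-1", "email=a2@example.com"),  # supersedes the first user-1 record
    ("user-2", None),                    # tombstone: user-2 is deleted
]
print(compact(log))  # [('user-1', 'email=a2@example.com')]
```

This is why compaction matters as a system evolves: a new consumer can rebuild current state from the topic without replaying every historical update.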
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
                        You need a data catalog, or you'll spend all your time chasing compliance issues.
                        
                        
                        
                        
                                                
                        
                                                
                    
                    
                                    
                    
                        
                        
                        Shout-out to the Amundsen data catalog, but there are many others (@mark_grover).
                        
                        
                        
                        
                                                
                        
                                                
                    
                    
                                    
                    
                        
                        
                        You need all your systems talking to the data catalog. You, the data engineer, shouldn't enter the data yourself.
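One way to picture "systems talking to the catalog" is a load job that registers its own metadata as a side effect, so nobody types it in by hand. The sketch below is purely illustrative: `InMemoryCatalog`, `CatalogEntry`, and `load_table` are hypothetical names, not the API of Amundsen or any other catalog, and the dataset and owner values are invented.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CatalogEntry:
    dataset: str
    owner: str
    columns: dict[str, str]  # column name -> inferred type name
    source_system: str
    registered_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class InMemoryCatalog:
    """Hypothetical stand-in for a real data catalog service."""

    def __init__(self) -> None:
        self.entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Last writer wins, so the entry tracks the most recent load.
        self.entries[entry.dataset] = entry


def load_table(catalog, dataset, owner, rows, source_system):
    """Hypothetical load job that publishes metadata as part of the load."""
    # Infer a simple schema from the first row, if there is one.
    columns = {name: type(value).__name__ for name, value in rows[0].items()} if rows else {}
    # A real job would write `rows` to the warehouse here.
    catalog.register(CatalogEntry(dataset, owner, columns, source_system))
    return len(rows)


catalog = InMemoryCatalog()
load_table(
    catalog,
    dataset="payments.transactions",
    owner="payments-team",
    rows=[{"id": 1, "amount": 9.99}],
    source_system="kafka-connect",
)
print(catalog.entries["payments.transactions"].columns)
```

The design point is that registration lives inside the pipeline code path, so the catalog cannot drift from what was actually loaded.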
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
                        You can monitor for sensitive data landing in the wrong datasets and alert when it happens. GCP has tools to set this up.
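As a rough idea of what such monitoring does, here is a minimal sketch that scans rows for values that look like emails or US Social Security numbers and reports them when the dataset is not on an allow-list. It is a toy stand-in for a managed scanning service, not GCP's API; the patterns, the `ALLOWED_DATASETS` set, and the dataset names are assumptions made up for the example.

```python
import re

# Deliberately simple patterns; a real scanner uses far more robust detectors.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical allow-list of datasets permitted to hold sensitive data.
ALLOWED_DATASETS = {"secure.customers"}


def scan_dataset(dataset, rows):
    """Return (dataset, row_index, column, label) for each suspicious value."""
    findings = []
    if dataset in ALLOWED_DATASETS:
        return findings  # sensitive data is expected here
    for row_index, row in enumerate(rows):
        for column, value in row.items():
            for label, pattern in PATTERNS.items():
                if pattern.search(str(value)):
                    findings.append((dataset, row_index, column, label))
    return findings


rows = [
    {"note": "contact jane@example.com"},
    {"note": "ok"},
    {"ssn": "123-45-6789"},
]
for finding in scan_dataset("analytics.events", rows):
    print("ALERT:", finding)
```

In practice the findings would feed an alerting channel rather than `print`, and the scan would run on a schedule or on each load.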
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
                        Are you ready to decentralize your data flows? Can you let users spin up their own micro-DWHs and load and populate them on their own?
                        
                        
                        
                        
                                                
                        
                                                
                    
                    
                                    
                    
                        
                        
                        My biggest takeaway from @criccomini's talk
                        
                        
                        
                        
                                                
                        
                                                
                    
                    
                                    
                    
                    
                
                 