A large agri-business has multiple business verticals, each working in a silo with its own project team. As a result, the verticals often end up with separate data architectures, and each project team follows a different approach to implementing data lake solutions. We defined standards for the architecture and tools to be used by the different teams. We also defined the data ingestion approach, transformation approach, and data access security needed to migrate and upgrade the infrastructure applications, creating an effective transformation process that helped build a data lake on CDH.

Implementing a Data Lake on CDH