Key Role of AI and ML in Building a Logical Data Fabric
A real-time, flexible, and augmented data integration pipeline, combined with comprehensive data management capabilities, can serve both the most and the least data-savvy consumers within an organization.
Fremont, CA: In today's data-first economy, it is not uncommon for an organization to have multiple data engineering and data platform teams handling advanced analytics and data science for pricing, supply chain, and in-store shopping, all to propel the business and gain a competitive advantage. As a result, one of the most difficult challenges IT teams face is serving a diverse set of data consumers with varying skill levels. That is why the logical data fabric approach is gaining traction.
This approach to data integration and data management supports faster, more automated data access and sharing by drawing on knowledge graphs, data catalogues, and AI/ML applied to active metadata.
A data fabric is a modular architecture: components such as a data catalogue, data preparation layer, knowledge graph, recommendation engine, DataOps, and orchestration can be assembled from disparate tools. Even so, some of the best-in-class data fabrics are single platforms that provide all of the important data fabric capabilities. The logical data fabric is a unified data delivery platform that gives business consumers access to various data systems, hiding complexity and exposing data in business-friendly formats while ensuring that data is delivered according to predefined semantics and governance rules. In today's digital world, it is not a stretch to say that such a platform would make every CIO's dream come true.
A key criterion of today's self-service strategy is the ability of business users, such as data scientists, citizen analysts, and LoB developers, to discover which datasets are accessible in the data delivery layer and to determine which ones are relevant to their information needs. For personas such as citizen analysts and data scientists, relying on the IT team for data search and discovery has become a bottleneck. Worse, these specialists are a scarce resource, and their time cannot be squandered on wrangling data instead of building models or analyzing data. A data catalogue that is integrated with the data delivery layer and enhanced with an AI/ML-based recommendation engine allows users to quickly discover and explore data. Business stewards can build a catalogue of business views based on metadata, group the views into business categories, and tag them for easy access. A logical data fabric with collaboration features lets all users endorse datasets or register comments and warnings about them, so that users can better contextualize dataset usage and understand how their peers experience each dataset.
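The catalogue features described above (business categories, tags, endorsements, and warnings) can be pictured with a small in-memory sketch. This is an illustration only, not the API of any data fabric product: the names `CatalogEntry` and `DataCatalog`, the keyword-matching rule, and the sample datasets are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One business view in a hypothetical catalogue."""
    name: str
    description: str
    category: str
    tags: set = field(default_factory=set)
    endorsements: int = 0
    warnings: list = field(default_factory=list)


class DataCatalog:
    """Toy in-memory catalogue; a real one would sit on the delivery layer."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def endorse(self, name):
        self._entries[name].endorsements += 1

    def warn(self, name, message):
        self._entries[name].warnings.append(message)

    def search(self, keyword):
        """Match the keyword against name, description, category, and tags.

        Results are ordered by endorsements (highest first), then by name.
        """
        kw = keyword.lower()
        hits = [
            e for e in self._entries.values()
            if kw in e.name.lower()
            or kw in e.description.lower()
            or kw in e.category.lower()
            or any(kw in t.lower() for t in e.tags)
        ]
        return sorted(hits, key=lambda e: (-e.endorsements, e.name))


# Illustrative data only; these dataset names are made up for the example.
catalog = DataCatalog()
catalog.register(CatalogEntry(
    "store_sales_daily", "Daily in-store sales by store", "Sales",
    {"pricing", "in-store"}))
catalog.register(CatalogEntry(
    "supplier_lead_times", "Lead time per supplier", "Supply chain",
    {"supply-chain"}))
catalog.register(CatalogEntry(
    "price_elasticity", "Estimated elasticity by product", "Pricing",
    {"pricing"}))
catalog.endorse("price_elasticity")
catalog.warn("store_sales_daily", "Example warning left by a peer.")

print([e.name for e in catalog.search("pricing")])
# ['price_elasticity', 'store_sales_daily']
```

Ordering results by endorsements is one simple way to let peer feedback influence what a user sees first; a production catalogue would likely combine several signals.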
While data search, discovery, classification, and tagging make it easier for users to find the right data at the right time, a powerful AI/ML engine can improve them significantly. An AI/ML-powered logical data fabric can analyze past user activity to provide personalized recommendations and shortcuts to selected datasets, accelerating data science projects and advanced analytics. Other enhancements could include more detailed profiling information about datasets and columns, as well as smart search, meaning relevance-based ranking of results, similar to how Google search works but in the context of enterprise data access.
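As a rough illustration of recommendations derived from past user activity, the sketch below scores datasets by how often they were used by peers whose usage overlaps with the current user's. It is a minimal co-usage heuristic chosen for the example, not the method of any particular data fabric; the function name `recommend`, the log format, and the sample users and datasets are all hypothetical.

```python
from collections import Counter


def recommend(usage_log, user, top_n=3):
    """Suggest datasets the user has not touched yet.

    usage_log is a list of (user, dataset) pairs. A candidate dataset is
    scored by summing, over every peer who used it, the number of
    datasets that peer has in common with the target user.
    """
    by_user = {}
    for u, dataset in usage_log:
        by_user.setdefault(u, set()).add(dataset)

    mine = by_user.get(user, set())
    scores = Counter()
    for other, datasets in by_user.items():
        if other == user:
            continue
        overlap = len(mine & datasets)
        if overlap == 0:
            continue  # no shared history, so this peer tells us nothing
        for dataset in datasets - mine:
            scores[dataset] += overlap

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [dataset for dataset, _ in ranked[:top_n]]


# Illustrative activity log; users and datasets are made up.
log = [
    ("ana", "store_sales_daily"), ("ana", "price_elasticity"),
    ("ben", "store_sales_daily"), ("ben", "price_elasticity"),
    ("ben", "promo_calendar"),
    ("cho", "store_sales_daily"), ("cho", "supplier_lead_times"),
    ("dee", "supplier_lead_times"),
]

print(recommend(log, "ana"))
# ['promo_calendar', 'supplier_lead_times']
```

A user with no recorded activity gets an empty list here; a real engine would need a fallback, such as the most endorsed datasets, for that cold-start case.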