Databricks and Immuta Partner to Provide End-to-End Data Governance for Machine Learning
Databricks, a data processing and analytics platform with a strong focus on artificial intelligence (AI) and machine learning (ML), has partnered with Immuta to deliver automated end-to-end data governance for AI, data science, and ML projects.
The partnership addresses the old self-service dilemma: how does one balance speed and agility with control when a typical organization has hundreds – if not thousands – of policies that vary based on the business or functional groups involved, use cases, data types, geographies, technologies, and more?
Source: Data Governance and Data Security for Cloud Analytics webinar
Immuta, whose mission is to ensure “the legal and ethical use of data,” approaches this from three complementary angles:
- Policy orchestration
- Policy enforcement
- Self-service data catalog
The company was founded in 2015 by a team who spent ten years working with the US intelligence community. It is a Series B startup whose investors include Citi, DFJ, Dell Technologies, Daimler, Greycroft, Drive Capital, and others.
Immuta’s automated data governance solution integrates natively with Databricks’ Unified Data Analytics Platform. Its metadata-driven policy orchestration provides users with easy-to-manage, fine-grained end-to-end data governance controls for their data lake so that they can meet the data stewardship requirements of their organization.
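To make "metadata-driven" concrete, here is a minimal sketch in Python of how a policy can be written once against column tags rather than against individual tables. It is an illustration under stated assumptions, not Immuta's or Databricks' API: the `Policy` class, the `apply_policies` function, the tag names, and the user attributes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical rule: mask any column carrying `tag` unless the user
    holds `required_attribute`."""
    tag: str
    required_attribute: str


def apply_policies(row, column_tags, user_attributes, policies, mask="REDACTED"):
    """Return a copy of `row` with values masked wherever a policy matches
    one of the column's tags and the user lacks the required attribute."""
    result = dict(row)
    for column, tags in column_tags.items():
        if column not in result:
            continue
        for policy in policies:
            if policy.tag in tags and policy.required_attribute not in user_attributes:
                result[column] = mask
                break  # one failing policy is enough to mask the column
    return result


# Illustrative metadata: tags attached to columns, not to specific tables.
COLUMN_TAGS = {"email": {"pii"}, "diagnosis": {"phi"}, "region": set()}
POLICIES = [Policy("pii", "privacy_trained"), Policy("phi", "clinical_staff")]
ROW = {"email": "a@example.com", "diagnosis": "flu", "region": "EU"}

# A user with privacy training but no clinical role.
analyst_view = apply_policies(ROW, COLUMN_TAGS, {"privacy_trained"}, POLICIES)
# A user with no relevant attributes.
guest_view = apply_policies(ROW, COLUMN_TAGS, set(), POLICIES)
```

Because the rules key off tags, a newly added column that is tagged `pii` is covered without writing a new policy, which is the general appeal of the metadata-driven approach when an organization has hundreds of policies to maintain.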
Our take
As organizations are embarking on – and in some cases expanding – their ML and AI projects, the same old challenge not only persists but becomes more prevalent and acute: data governance.
How do you ensure that you have the right controls in place, that the right people have access to the right data, that the data is secure, that the organization is in compliance with existing regulations, and, more importantly, that you can trust the data? Because if you can’t trust your data, you can’t trust your AI. (Or machine learning, or data analytics output.)
And these challenges are magnified as organizations acquire more data – some of it sensitive – and store it in the same place, whether a warehouse or data lake, thereby increasing the risks.
Meanwhile, governments are tightening regulations and consumers are becoming better informed. In addition, privacy watchdogs are pressuring governments and businesses alike to strengthen and enforce data protection laws, implement regulations around algorithmic decision-making systems (i.e., anything that uses ML), and ensure that organizations’ business models respect human rights.
(For more information on why this is important, see our earlier note Amnesty International Calls Google and Facebook a Threat to Human Rights.)
As a result, many organizations are slowing down their AI/ML initiatives to reopen, resume, and rejuvenate data governance. Where do you stand with your data governance initiative?
Want to Know More?
To start with data governance or to optimize it, consult the Info-Tech blueprint Enable Shared Insights With an Effective Data Governance Engine.
To learn what else you need to govern – in addition to data – when deploying AI/ML-powered solutions, read the CIO Note Are You Ready For AI?
To get a quick introduction to why human rights matter for your business model, watch our two-minute brief Google and Facebook Called a Threat to Human Rights or read our tech note Amnesty International Calls Google and Facebook a Threat to Human Rights.