By David Maher
The era of big data and the cloud has hit an impasse. We have essentially recreated our previous data silos by stuffing even more data into cloud repositories. To truly realize the opportunities promised by the big data movement, we have to move to the next level. That level will not be large standalone silos of data controlled by one organization. It will be data cooperatives supported by trusted data exchanges built on distributed data sets.
One major issue has stymied big data projects. New revenue streams, as well as the vast new businesses made possible by the increased use of data, come only when multiple stakeholders can both contribute data and collaborate in its use. These stakeholders include not only business partners but also regulators, consumers, and potentially even competitors. Interests vary not only between these types of stakeholder but also within each type and within individual organizations, and they can change depending on the nature of the interaction. Accordingly, trusted data exchanges will need to dynamically support an extremely broad spectrum of data access and rights. Today’s big data projects are simply not set up to support these sorts of data cooperatives.
The centralized nature of current big data projects brings its own issues. The data has to come from somewhere, and the process of transferring it entails not only increased costs but also security and governance risks. The increasing reliance on public clouds run by third parties adds further complexity, because these clouds lack interoperability and make it expensive to transfer data out.
The trusted data exchanges needed to support these sorts of data cooperatives won’t have to rely on new, untested technologies. They can be built using technical approaches based on proven technologies. The following five elements show how control over data has the potential to reshape how companies consider and create trusted data exchanges.
Data Point No. 1: A highly scalable policy-based data governance layer
Here, data governance refers to the secure management of multi-party identity verification and access to data. This capability differs from traditional identity and access management (IAM) in that it is focused on access to data, not applications. The layer must also fully support a robust attribution and auditing function.
Data Point No. 2: Flexible policy-based data management
The data governance layer will use very flexible policy-based data management technology. This technology will allow all parties, from individuals to large organizations, to have fine-grained access to data-derived assets controlled by the trusted data exchange based on policies. These policies can be derived from internal policies, regulatory compliance requirements, or other factors.
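To make the idea of policy-based access concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular product: the `Policy`, `Request`, and `decide` names, the roles, and the asset types are all hypothetical. The sketch denies by default, grants access only when a policy matches, and records every decision for attribution and auditing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical rule: a role may perform an action on a type of data asset."""
    role: str
    action: str
    asset_type: str
    source: str  # where the rule comes from, e.g. "internal" or "regulatory"


@dataclass(frozen=True)
class Request:
    party: str
    role: str
    action: str
    asset_type: str


def decide(request, policies, audit_log):
    """Default-deny check: allow only if some policy matches the request.

    Every decision is appended to audit_log so access can be attributed later.
    """
    allowed = any(
        p.role == request.role
        and p.action == request.action
        and p.asset_type == request.asset_type
        for p in policies
    )
    audit_log.append((request.party, request.action, request.asset_type, allowed))
    return allowed


# Illustrative policies only; real ones would come from internal rules,
# regulatory compliance requirements, or other factors.
POLICIES = [
    Policy("regulator", "read", "audit_report", "regulatory"),
    Policy("partner", "read", "aggregate_stats", "internal"),
]

audit_log = []
ok = decide(Request("acme", "partner", "read", "aggregate_stats"), POLICIES, audit_log)
denied = decide(Request("acme", "partner", "read", "raw_records"), POLICIES, audit_log)
```

In this sketch a partner can read aggregate statistics but not raw records, and both attempts end up in the audit log.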
Data Point No. 3: Secure algorithm execution environments
Similar to secure execution environments in operating systems, secure algorithm execution environments are sandboxed environments where algorithms, which may be provided by third parties, are strictly controlled as to which data they can access and where they can send it. These environments can be set up to only provide desired data analysis results to partners without exposing the raw data used by the algorithms to produce the results.
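The data flow described here can be sketched in a few lines of Python. This is a toy model under loud assumptions: it shows only the control of inputs and outputs, while a real environment would also need operating-system or hardware isolation, which plain Python does not provide. The `run_in_sandbox` function, the `min_group_size` threshold, and the sample records are hypothetical.

```python
from statistics import mean


def run_in_sandbox(algorithm, records, allowed_fields, min_group_size=3):
    """Run a third-party algorithm on a restricted view of the data.

    The algorithm sees only allowed_fields, and only a single numeric
    result is allowed to leave. This models the data flow, not true isolation.
    """
    if len(records) < min_group_size:
        raise PermissionError("too few records to release an aggregate")
    # Build a copy containing only the permitted fields.
    view = [{k: r[k] for k in allowed_fields} for r in records]
    result = algorithm(view)
    # Release numbers only; reject rows, lists, or other raw structures.
    if isinstance(result, bool) or not isinstance(result, (int, float)):
        raise PermissionError("only numeric aggregates may leave the sandbox")
    return result


# Illustrative data held by the exchange.
RECORDS = [
    {"name": "A", "age": 34, "spend": 120.0},
    {"name": "B", "age": 41, "spend": 80.0},
    {"name": "C", "age": 29, "spend": 100.0},
]


def average_spend(view):
    """A well-behaved partner algorithm: returns one aggregate."""
    return mean(r["spend"] for r in view)


def leak_rows(view):
    """A misbehaving algorithm that tries to return the underlying rows."""
    return view


result = run_in_sandbox(average_spend, RECORDS, ["spend"])
```

Here the partner receives the average spend, while `leak_rows` is stopped with a `PermissionError` and never sees the `name` or `age` fields at all.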
Data Point No. 4: Data virtualization
Data virtualization technologies allow queries to be sent to distributed datasets. This avoids the need to transfer data from its original location to data lakes or similar centralized data repositories, maintaining maximum control over data for its rights owners.
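One simple way to picture this is a federated query, sketched below in Python. It assumes each data owner exposes a small query endpoint. The `DataSite` class and `federated_average` function are hypothetical stand-ins, and production data virtualization products do far more, such as query planning and pushdown. The point is that each site returns only a partial sum and count, so the rows never leave their original location.

```python
class DataSite:
    """Hypothetical data owner that answers queries locally."""

    def __init__(self, name, rows):
        self.name = name
        self._rows = rows  # raw rows stay at the site

    def partial_sum_count(self, column, predicate):
        """Return only (sum, count) for the matching rows, never the rows."""
        values = [r[column] for r in self._rows if predicate(r)]
        return sum(values), len(values)


def federated_average(sites, column, predicate=lambda r: True):
    """Send the query to every site and combine the partial aggregates."""
    total, count = 0.0, 0
    for site in sites:
        s, n = site.partial_sum_count(column, predicate)
        total += s
        count += n
    return total / count if count else None


# Illustrative datasets held by two different organizations.
site_a = DataSite("hospital_a", [{"region": "EU", "cost": 10.0},
                                 {"region": "US", "cost": 20.0}])
site_b = DataSite("hospital_b", [{"region": "EU", "cost": 30.0}])

overall = federated_average([site_a, site_b], "cost")
eu_only = federated_average([site_a, site_b], "cost",
                            lambda r: r["region"] == "EU")
```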
Data Point No. 5: Secure APIs for data ingress/egress
All the parties contributing data to the trusted data exchange can manage their data contributions via secure APIs.
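The article does not say how such APIs would be secured. One common building block, shown here purely as an assumption, is to have each contributor sign its submissions with a shared secret so the exchange can verify who sent the data and that it was not altered. The sketch uses HMAC-SHA256 from Python's standard library. The function names and payload fields are hypothetical, and a real deployment would also need transport encryption, key management, and replay protection.

```python
import hashlib
import hmac
import json


def sign_contribution(secret: bytes, payload: dict) -> str:
    """Return a hex HMAC-SHA256 signature over a canonical JSON encoding."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def accept_contribution(secret: bytes, payload: dict, signature: str) -> bool:
    """Accept the payload only if the signature matches (constant-time compare)."""
    expected = sign_contribution(secret, payload)
    return hmac.compare_digest(expected, signature)


# Illustrative values; the secret would be provisioned per contributor.
secret = b"contributor-shared-secret"
payload = {"dataset": "sensor_readings", "rows": 1200}
signature = sign_contribution(secret, payload)

accepted = accept_contribution(secret, payload, signature)
tampered = accept_contribution(secret, {"dataset": "sensor_readings", "rows": 9999}, signature)
```

With these values the untouched payload is accepted, and the altered one is rejected because its signature no longer matches.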
These elements are essential to building platforms on which trusted data exchanges can earn the trust of, and widespread adoption by, the many stakeholders who will need to participate. That participation, in turn, is what will truly deliver on the promise of big data.
About the author
David Maher is the CTO of Intertrust. He has over 30 years of experience in secure computing and is responsible for Research and Development at Intertrust. In addition, he is currently President of Seacert Corporation, a certificate authority for the Internet of Things, a developer of application security software, and Co-chairman of the Marlin Trust Management Organization, which oversees the world’s only independent digital rights management ecosystem.