Abstract: Hadoop and big data technologies can unlock the true potential of your business. But due to the organic evolution of different components of the data stack, security controls have become siloed in applications. Drawing from his experience solving these problems for Fortune 500 companies, Pratik Verma shares three principles of a data-centric security architecture for Hadoop that protects sensitive data without disrupting users:
Modifying requests to filter content makes security transparent to users. Pratik shares the benefits of this approach when compared with modifying data, along with some performance benchmarks collected from real-life scenarios.
Centralizing data-access decisions and distributing enforcement makes security scalable. The assumption that coupling decisions about “who can see what data” with enforcement of “how to protect and hide the data that is not allowed” improves performance no longer holds for distributed systems. Pratik shares examples of how the coupled approach does not provide enough flexibility to protect the rapidly evolving Hadoop ecosystem.
Using metadata instead of files/tables ensures systematic protection of sensitive data. Pratik shares examples of how the appropriate use of data, metadata, and business knowledge in policy rules makes it easy for business users to specify what they want protected and for security teams to accomplish that without becoming data scientists themselves.
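The three principles above can be illustrated with a minimal sketch. It is not BlueTalon's implementation; the catalog, the roles, and the function names (`decide`, `rewrite_query`) are hypothetical and chosen only for illustration. A central function makes the access decision over metadata tags rather than table names, and an enforcement point rewrites the incoming request instead of altering the stored data.

```python
# Illustrative sketch only: all names and data below are assumptions,
# not part of any real product or library API.

# Hypothetical metadata catalog: column name -> sensitivity tag.
COLUMN_TAGS = {
    "name": "public",
    "state": "public",
    "ssn": "pii",
    "diagnosis": "phi",
}

# Hypothetical central policy: role -> tags that role may read.
ROLE_ALLOWED_TAGS = {
    "analyst": {"public"},
    "clinician": {"public", "phi"},
}


def decide(role, requested_columns):
    """Central decision point: 'who can see what', expressed over tags.

    Columns with no tag in the catalog are denied by default.
    """
    allowed_tags = ROLE_ALLOWED_TAGS.get(role, set())
    return [c for c in requested_columns if COLUMN_TAGS.get(c) in allowed_tags]


def rewrite_query(table, requested_columns, role):
    """Enforcement point: modify the request, leave the stored data untouched.

    Any engine-specific enforcement point could call decide() and apply
    the result in its own way; here the result is a rewritten SELECT.
    The table name is assumed to be trusted in this sketch.
    """
    allowed = decide(role, requested_columns)
    if not allowed:
        raise PermissionError(f"role {role!r} may not read any requested column")
    return f"SELECT {', '.join(allowed)} FROM {table}"


if __name__ == "__main__":
    wanted = ["name", "ssn", "diagnosis"]
    print(rewrite_query("patients", wanted, "analyst"))
    print(rewrite_query("patients", wanted, "clinician"))
```

Because the policy refers to tags such as `pii` rather than to specific tables, a newly added column is covered as soon as it is tagged, and the same `decide` function can back enforcement points in different engines.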
Pratik Verma is the founder and chief product officer at BlueTalon. Pratik founded BlueTalon to accelerate big data deployments and remove security as a barrier to adoption. Previously, he led AgeTak, a healthcare startup built on technologies created by Rakesh Verma. He is an angel investor in several tech startups. Pratik’s scientific work has been published in peer-reviewed journals. He holds a PhD from Stanford.