How SQL / NoSQL Databases Enable Blockchain Applications to Become GDPR Compliant
Blockchain, a decentralized and disintermediated data store, is being considered for rapid adoption in several industries, such as Supply Chain Management, P2P Global Transactions, Internet of Things, Electoral Voting and Stock Exchanges.
GDPR, the General Data Protection Regulation of the European Union (EU), enables EU Residents to exercise several rights, such as the Right to Access, the Right to be Forgotten and the Right to Data Portability, with Businesses (European or otherwise) that act as Data Controllers and / or Data Processors. GDPR also requires Data Controllers to ensure that Data Processors hold and handle the Sensitive Personal Data of EU Residents on a need-to-hold and need-to-access basis, with the overall goal of upholding the Security & Privacy of EU Residents in this data-encroaching world.
This article is the gist of my attempts to practically understand the concept of immutability and how it applies to Databases (as in Datomic, and even MySQL with restricted user privileges), Cloud Storage (as in Wasabi), Message Queues (as in Apache Kafka) and, last but not least, the Blockchain, from Public options (as in Bitcoin, Ethereum) to Private & Federated options (as in BigchainDB, Multichain).
As a next step, I evaluated this understanding of immutability w.r.t. Blockchain against the "Encryption Key Life Cycle Management" mandate of both PCI-DSS & HIPAA Compliance. Then came the EU GDPR, whose "Right to be Forgotten" Clause helped me to perceive the underlying technical challenges and to come up with a thought process that can enable anyone to use Blockchain in a more logical way, while staying in compliance with all popular legal frameworks, including the recent EU GDPR and any similar Privacy Oriented Regulations that other countries adopt in the near future.
I. Blockchain and Immutable Data Structures
The Data in the Blockchain is considered immutable by default, though the level of immutability depends on the type of blockchain implementation and the corresponding Data Validator choices. The purpose of the Blockchain can be realized to the maximum extent only after core issues like scalability, interoperability and usability are clearly addressed, while staying compliant with the relevant industry specific mandates in the process.
Scope of prevailing Immutability
To understand better, the major types of blockchain classifications with corresponding levels of immutability are listed below:
1. Public Blockchain (Mutable once a group of miners that collectively controls 51% of the hashrate decides to recreate Blocks, starting from a chosen block and including all subsequent ones).
2. Private Blockchain (Mutable once a group of 51% of the Validators collectively decides to recreate Blocks, starting from a chosen block and including all subsequent ones).
3. Federated / Consortium Blockchain (Mutable once a group of 51% of the Validators collectively decides to recreate Blocks, starting from a chosen block and including all subsequent ones).
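To make "recreate Blocks, from a chosen block and all subsequent ones" concrete, here is a minimal Python sketch of a hash-linked chain. It is a toy model under stated assumptions, not how Bitcoin or any listed platform is implemented: there is no mining, consensus or networking, and the names block_hash, build_chain, first_invalid_block and rewrite_from are hypothetical helpers made up for illustration.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # assumed placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Build a list of blocks, each linked to its predecessor by hash."""
    chain = []
    prev = GENESIS_PREV
    for i, data in enumerate(records):
        h = block_hash(i, data, prev)
        chain.append({"index": i, "data": data, "prev": prev, "hash": h})
        prev = h
    return chain


def first_invalid_block(chain):
    """Return the index of the first block that fails verification, or None."""
    prev = GENESIS_PREV
    for block in chain:
        expected = block_hash(block["index"], block["data"], block["prev"])
        if block["prev"] != prev or block["hash"] != expected:
            return block["index"]
        prev = block["hash"]
    return None


def rewrite_from(chain, index, new_data):
    """Change one block's data and recreate that block and every later one."""
    records = [b["data"] for b in chain]
    records[index] = new_data
    return build_chain(records)


chain = build_chain(["tx-a", "tx-b", "tx-c", "tx-d"])

# Editing a block in place, without recomputing hashes, is detectable.
tampered = [dict(b) for b in chain]
tampered[1]["data"] = "tx-b-edited"

# A consistent edit requires recreating the chosen block and all later ones.
rewritten = rewrite_from(chain, 1, "tx-b-edited")
changed = [new["index"] for new, old in zip(rewritten, chain) if new["hash"] != old["hash"]]

print(first_invalid_block(tampered))
print(changed)
```

In this sketch the in-place edit fails verification at the edited block, while the consistent rewrite changes the hash of the edited block and of every block after it. That is the work a 51% group would have to agree to redo.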
II. Anonymity & KYC
While the Blockchain is considered to offer natural anonymity, irrespective of the context it is implemented in,
1. Be it as a Cryptocurrency system (example: Bitcoin) or
2. As a Decentralized Platform (example: Ethereum) or
3. As a Conglomerate that interconnects one or more Blockchain Platforms, with or without existing Enterprise Database Systems,
there is a visible need for SQL / NoSQL Databases to complement Blockchain implementations by storing Users' Know Your Customer (KYC) info, so that Banks, Fintech Companies, Government Organizations and other businesses can deeply adopt Blockchain in their context in the long run.
III. Confidentiality & Compliance
While a Security First Design is recommended for any industry, there exist a few compliance mandates, like PCI-DSS for Payments and HIPAA for Healthcare applications, whose primary goal is to protect Users' Sensitive Data with utmost priority.
Some of the highlights include:
1. Secure Data at Rest, i.e., by using well recommended Encryption Algorithms when securing sensitive data in the data store.
2. Secure Data in Transit, i.e., by using Secure Protocols, recommended Encryption Algorithms, proper encryption strength and trusted keys / certificates when transmitting data over public networks.
3. Information to be made available to the service provider's personnel on a business need-to-know basis.
4. Periodic Vulnerability Scans of External facing Applications, Security Audits of all major Software Versions and usage of Certified Hardware Devices in the mix.
5. Encryption Key Life Cycle Management, which has to be triggered on a periodic basis (quarterly / semi-annually / annually) and / or whenever an existing employee who has access to Keys moves to a different team or leaves the service provider's company.
Important Observations
Considering the prevailing immutable nature of Blockchain, any User Sensitive Data stored on the Blockchain makes it mandatory to regenerate the chain, either from its genesis block or from the specific block where the data change is required, every time one of the following events happens:
1. The "Encryption Key Life Cycle Management" Procedures are executed.
2. A User exercises the "Right to be Forgotten" Clause of GDPR w.r.t. the way the User's Critical Personal Data is handled by the Data Controller / Data Processor, and the User's Request has a clear legal standing.
This requirement to regenerate blocks in the chain can only be avoided when the Blockchain is complemented by SQL / NoSQL Databases in a sensible way, such that the Blockchain stays an enabler of Trust (by storing the Digital Signature of the corresponding data), while the SQL / NoSQL Databases take responsibility for storing the User's Sensitive Data and / or Critical Personal Data.
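The split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain list named ledger stands in for the Blockchain, an in-memory SQLite database stands in for the SQL / NoSQL store, and a salted SHA-256 digest stands in for the Digital Signature. The table name and the functions fingerprint, register, verify and forget are hypothetical names chosen for this example.

```python
import hashlib
import os
import sqlite3

# Assumption: an append-only list stands in for the blockchain.
ledger = []

# Assumption: an in-memory SQLite database stands in for the SQL / NoSQL store.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE personal_data (user_id TEXT PRIMARY KEY, salt BLOB, payload TEXT)")


def fingerprint(salt, payload):
    """Salted SHA-256 digest; the salt lives only in the database."""
    return hashlib.sha256(salt + payload.encode("utf-8")).hexdigest()


def register(user_id, payload):
    """Store the personal data off-chain and only its digest on the ledger."""
    salt = os.urandom(16)
    db.execute("INSERT INTO personal_data VALUES (?, ?, ?)", (user_id, salt, payload))
    ref = len(ledger)
    ledger.append({"ref": ref, "digest": fingerprint(salt, payload)})
    return ref


def verify(user_id, ref):
    """Check that the off-chain record still matches the digest on the ledger."""
    row = db.execute(
        "SELECT salt, payload FROM personal_data WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return False  # nothing left off-chain to verify
    salt, payload = row
    return fingerprint(salt, payload) == ledger[ref]["digest"]


def forget(user_id):
    """Delete the off-chain record; the ledger is not touched."""
    db.execute("DELETE FROM personal_data WHERE user_id = ?", (user_id,))


ref = register("user-42", "name=Jane Doe;passport=X1234567")
before = verify("user-42", ref)

snapshot = [dict(entry) for entry in ledger]
forget("user-42")
after = verify("user-42", ref)

print(before, after, ledger == snapshot)
```

In this sketch the record verifies before the erasure request and no longer verifies afterwards, while the ledger entries stay exactly as they were, so no block has to be regenerated. The same reasoning would apply to a key rotation: if encryption is handled in the database tier, re-encrypting the stored payload leaves the digest on the ledger untouched.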
IV. Summary
It was the Scalability aspect of the database tier that pushed people to innovate, and the result is Performance & Scalability centric NoSQL Databases (those that are flexible enough to handle semi-structured / unstructured data), in contrast to Data Integrity & Consistency oriented SQL Databases (those that are primarily meant to handle structured data). On the other hand, the implementation of Bitcoin as a Cryptocurrency highlighted the potential of the underlying concept of a Decentralized & Disintermediated Blockchain solution, which later became the base for many second generation blockchain solutions, starting with Ethereum, and for third generation blockchain solutions like EOSIO, Cardano and Quant's Overledger.
Summarizing the above, a lot of R&D activity is yet to happen to create innovative Blockchain Solutions that scale better while being interoperable and usable for different business requirements. When complemented with SQL and / or NoSQL Databases, the combination will become a Holy Grail for Businesses / Governments launching Solutions that are Compliant from both Security & Privacy perspectives.
Privacy Enhancing Regulations of the GDPR kind are the need of the hour, as many other countries across the Globe, including India, are starting to realize. This will soon lead to a new era of Privacy Oriented Legal Frameworks in force in different countries, each implemented in the respective country's context.
Please share your thoughts and I welcome everyone's feedback and comments on this post...
About the Author
Raghuveer Dendukuri is an Application Architect, Product Manager & Tech Entrepreneur who is passionate about executing Fintech & Digital Transformation Projects in Enterprise & SaaS Deployment Models. He advocates Secure & Scalable Application Development Practices in PHP, while ensuring that appropriate Cryptographic Algorithm Choices are in place for the application to stay in compliance with industry specific mandates like PCI-DSS, HIPAA and GDPR.
Raghuveer works as a Consultant and serves his clients in Application Architect & Product Management roles. This study grew out of his personal thirst to separate the hype from the pragmatic advantages of Blockchain, so that he can make educated technology choices among data stores like SQL / NoSQL / Blockchain Systems, either as a single option or as a combination, for the long term advantage of his client applications.
Contact him for more info through his LinkedIn Profile.