Big data has become a popular term for the massive amounts of data being generated that can potentially be mined for information. Such information is valuable to companies, governments and other players. Doug Laney, an analyst with Gartner, defined the three dimensions of big data as volume, velocity and variety. The sheer amount of data, the high speed at which it is produced and the range of data types make data analytics an extremely dynamic field today. Big data, if used responsibly, has immense potential for inclusive and informed development. The 2015 Global Information Technology Report recognises this possibility and points to the need to leverage big data to benefit all sectors of society and to use it for better decision making and governance. However, the predictive analysis and algorithms inherent in big data analytics pose a problem for inclusive growth, as they might reinforce stereotypes and discrimination and threaten privacy and anonymity.
Surveillance
Digital surveillance targets groups, communities and individuals by seeking what is known as a ‘pattern of life’, which indicates residential, educational and occupational profiles. The European Data Protection Supervisor’s report gives the example of the banking and insurance sectors, which have an interest in gaining granular information about the risk posed by individuals. This could be revealed by combining datasets from online activity with other personal information from connected devices. In a process called reverse redlining, financial institutions target communities based on race and income by purchasing metadata on minority neighbourhoods to gain financial leverage. In an effort to build a ‘social credit’ system, China is working on a national database that compiles fiscal and government information into a single numerical ranking for each citizen. As part of this, financial companies such as Sesame Credit are encouraging consumers to share their good credit scores to move up the social credit ladder. With smart cities initiatives forming a big part of Digital India, some of these concerns about data aggregation become more relevant to the Indian context.
Big Data and Threats to Privacy
Privacy is a major concern in contexts where anonymity is eroded. As we become more and more connected, interacting datasets could reveal the identity of a person even if the data is anonymised. There is a real need for transparency on the part of organisations that process and analyse the data of users. The imbalance of power between these organisations and users is reflected in Richards and King’s conception of the three paradoxes of big data:
- The transparency paradox: big data collects all kinds of personal data about individuals, but its own operations are completely opaque.
- The identity paradox: big data seeks to identify at the expense of individual and collective identity.
- The power paradox: while seen as useful for public interest, big data privileges governments and corporate entities at the expense of ordinary individuals.
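The claim that interacting datasets can reveal identity even when data is anonymised can be illustrated with a small linkage sketch. The example below is only an illustration under stated assumptions: all records, field names and the choice of quasi-identifiers (postcode, birth year, gender) are hypothetical and not drawn from any real dataset. It joins a name-free dataset with a public register and reports the records that match exactly one named person.

```python
from collections import defaultdict

# Hypothetical "anonymised" records: names removed, quasi-identifiers kept.
anonymised_records = [
    {"postcode": "560001", "birth_year": 1984, "gender": "F", "diagnosis": "asthma"},
    {"postcode": "560001", "birth_year": 1990, "gender": "M", "diagnosis": "diabetes"},
    {"postcode": "110021", "birth_year": 1975, "gender": "M", "diagnosis": "hypertension"},
]

# Hypothetical public register that carries names alongside the same fields.
public_register = [
    {"name": "Person A", "postcode": "560001", "birth_year": 1984, "gender": "F"},
    {"name": "Person B", "postcode": "560001", "birth_year": 1990, "gender": "M"},
    {"name": "Person C", "postcode": "560001", "birth_year": 1990, "gender": "M"},
    {"name": "Person D", "postcode": "110021", "birth_year": 1975, "gender": "M"},
]

# Assumed quasi-identifiers shared by both datasets.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "gender")


def quasi_key(record):
    """Build a tuple of the quasi-identifier values of a record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link_records(anonymised, public):
    """Return (name, diagnosis) pairs for records matching exactly one person."""
    names_by_key = defaultdict(list)
    for person in public:
        names_by_key[quasi_key(person)].append(person["name"])

    reidentified = []
    for record in anonymised:
        candidates = names_by_key.get(quasi_key(record), [])
        # A single candidate means the "anonymous" record points to one person.
        if len(candidates) == 1:
            reidentified.append((candidates[0], record["diagnosis"]))
    return reidentified


matches = link_records(anonymised_records, public_register)
for name, diagnosis in matches:
    print(f"{name}: {diagnosis}")
```

In this toy data, the second record stays ambiguous because two people share its postcode, birth year and gender, while the other two records are each linked to a single name. The more attributes two datasets share, the fewer people fall into each combination, which is why removing names alone may not preserve anonymity.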
Implications for Policy
The increasing role that big data plays in our social, economic and political lives has created a need to articulate policy frameworks that can address problems of surveillance, privacy and discrimination. Current policy concerns around big data span a number of issues. These include:
- Resolving ownership, control and dissemination issues over data generated from everyday activities and the Internet of Things
- Enforcing data retention policies
- Creating effective legal regulation for the cloud
- Optimizing big data while maintaining safeguards for privacy, anonymity and security
- Articulating clear safeguards against creating corporate or state back doors with surveillance potential
- Pushing for increased transparency and openness in the big data process, with provisions for pro-active disclosure and more accountability to users
- Creating opt-out laws