Data security is a central concern in today's technology landscape. Advances in information technology have transformed the way data is processed, and nearly all real-world data is now handled electronically. As a result, a tremendous amount of digital information is generated every day, which has led to the emergence of a popular technology and concept known as Big Data.
It has been adopted by large companies, e-commerce giants, and government projects. Large data sets come from many sources, including online transactions, pictures, posts, videos, audio, emails, medical records, search queries, scientific applications, and social networking interfaces. Another source is the data generated by the use and management of devices such as smartphones.
So what actually is Big Data? The term describes any massive amount of structured, semi-structured, or unstructured data. In essence, it means data that is large in volume. Companies like Amazon, Facebook, Google, and Twitter, as well as government projects such as digital identity cards, maintain large and continuously growing databases of user data. The security of big data is therefore considered very important, and a lot of research is actively conducted to tackle the security issues in these projects. The use of biometric technologies in these projects can help to create a secure database.
The emergence and the security challenges of big data
Until 2010, the term “Big Data” was little known, but today it is touted as the latest technology trend. Its role as a factor in production, market competitiveness, and growth is continuously increasing, and it is now being adopted by everyone from product vendors to large-scale outsourcing and cloud service providers. It is making inroads into all areas of our digital life and transforming our day-to-day online activities.
The big data concept is based on extracting business value from a high volume, variety, and velocity of data in a timely and cost-effective manner. Most analysts use this 3-V model to define it. The first aspect is volume: the large quantities of data that need to be harnessed to improve decision-making across the organization. Variety involves handling the complexities of a range of new and emerging data sources, such as data generated from social media, location data from smartphones, and public data available online. The third aspect is velocity, which refers to the speed at which data is disseminated, updated, or refreshed.
A number of challenges must be overcome to reap the benefits of large data sets. Because big data systems handle large amounts of data with varying structures and process it in real time, the most important challenge is to maintain data security and adopt proper data privacy policies. There is an urgent need to implement strict mechanisms that ensure data security and meet the rising quality expectations of the stakeholders involved.
The risks to implementation should not be underestimated. The environment consists of any number of stakeholders compiling, storing, and analyzing data for any number of different reasons. There is therefore a strong need to prevent the misuse of data so that people’s trust in digital channels is not broken. Biometrics is one strategy that can respond to this digital revolution and reduce security risks significantly.
How can biometrics help to secure big data?
In this section, we highlight how biometrics can be leveraged to secure a huge amount of data. We also discuss the process of biometric scans and the technology used in these projects. Recent years have seen a steady increase in the number of computer and cyber-crimes, which have become a serious problem for governments, the public, and businesses. To counteract these crimes, it is crucial to capture digital evidence, and so the focus is on how to obtain such evidence. However, users often do not host their data themselves; it has become increasingly popular to use a third-party data service provider to store data and emails. In some cases, a large server may also be shared among many different users, which makes it harder to capture data for an investigation.
The problem is further amplified in a big data environment. The technology stores data in a distributed manner that may involve a large number of servers and storage devices. Furthermore, these storage devices may be remote, so traditional forensic techniques cannot be applied easily. For example, the huge volume of data and the distributed nature of the storage devices make it extremely difficult to clone a copy of the data. The introduction of this technology, and the need to move such information throughout an organization, has also exposed many vulnerabilities. The influx of huge amounts of data that are valuable to organizations has become a massive target for hackers and other cybercriminals.
Data that was previously unusable is now valuable to organizations. Such data is therefore subject to privacy laws and compliance regulations and must be protected. Biometrics offers a very high level of security, accuracy, and privacy because it is based on the inherent characteristics of individuals. These characteristics may include the iris, fingerprints, voice, etc., and they act as a strong deterrent to hacking attacks. Biometric traits are extremely difficult to replicate and are among the most accurate methods known for verifying individual identity. We can thus conclude that biometrics can play a vital role in maintaining privacy and providing a highly secure environment for large data projects.
What biometric technologies can be used to enhance security in big data projects?
To understand how different biometric modalities can be used in large data projects, we will look at one of the most ambitious big data projects. The Aadhaar project combines enormous data volumes with biometrics to build an identity verification database of more than a billion Indian residents. This is the world’s largest biometric database, and it aims to provide residents with a unique identification number that will help them access government welfare and services. The project uses a combination of iris scans and digital fingerprints for each resident.
Let us understand how an iris scan works in big data projects. Iris recognition is the technology behind these eye scans, and the whole scanning process is subdivided into three key stages: image acquisition and processing, feature extraction, and matching.
The first stage of an iris scan is image acquisition, in which the user positions his or her eye close to the lens and must remain perfectly still. The image of the user’s iris is captured and then converted to the desired digital format. In the feature extraction stage, the unique features of the iris are extracted and stored in a template database, which contains the unique information for each user. In the matching stage, the existing template is compared with the current data captured by the sensor. If the two scans match, the user is authenticated successfully.
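The matching stage described above can be illustrated with a small sketch. The article does not say which matching algorithm is used, so this example assumes a common approach for eye-scan templates: each template is a fixed-length sequence of bits, and two templates match when the fraction of differing bits (the fractional Hamming distance) falls below a threshold. The function names, the 16-bit templates, and the 0.32 threshold are all illustrative assumptions, not details of any particular project.

```python
from typing import Sequence


def hamming_fraction(a: Sequence[int], b: Sequence[int]) -> float:
    """Return the fraction of positions at which two binary templates differ."""
    if len(a) != len(b):
        raise ValueError("templates must have the same length")
    if not a:
        raise ValueError("templates must not be empty")
    differing = sum(x != y for x, y in zip(a, b))
    return differing / len(a)


def is_match(stored: Sequence[int], probe: Sequence[int],
             threshold: float = 0.32) -> bool:
    """Accept the probe if it differs from the stored template in at most
    `threshold` of its bits. The default threshold is an assumed value."""
    return hamming_fraction(stored, probe) <= threshold


# Hypothetical 16-bit templates; real templates are far longer.
stored = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
# Same eye, one bit flipped by sensor noise.
probe_same = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
# Different eye: the first eight bits are inverted.
probe_other = [0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0]

print(hamming_fraction(stored, probe_same), is_match(stored, probe_same))
print(hamming_fraction(stored, probe_other), is_match(stored, probe_other))
```

A threshold is needed because two scans of the same eye are never bit-for-bit identical; the value chosen trades false accepts against false rejects and would have to be tuned on real data.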
Fingerprint scanning is a very widely used biometric technique and plays a key role in maintaining privacy and security in various industries. Fingerprint biometrics has also been extremely helpful in forensic science for identifying criminals. In our example of a large data project, Aadhaar, fingerprint biometrics is used to store the information of each resident by scanning their fingers and thumbs for future reference.
The fingerprint scanning process used in the Aadhaar project has two parts: enrollment and authentication. Both processes use a common database.
The enrollment process captures snapshots of the user’s fingerprints, extracts the distinguishing features from these snapshot images, and stores this information in a database. A fingerprint sensor captures the images, and a feature extraction function processes them. Feature extraction transforms each image into a compact template, loosely comparable to an encoding of the fingerprint, and each template is associated with a unique ID and stored in the database.
The key step in authentication is the matching of data objects. When an already enrolled user of the Aadhaar scheme wants to access a service, he or she scans a fingerprint. The new scan again passes through the feature extraction phase, and the matching function checks the resulting template against the existing template database. If a match is found with the current fingerprint scan, the user is successfully authenticated.
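The enrollment and authentication flow described above can be sketched in a few lines. This is a toy model under stated assumptions, not the method used by any real system: a fingerprint is assumed to arrive as a list of hypothetical minutiae points `(x, y, angle)`, feature extraction is reduced to coarse quantization of those points, and matching is a simple set-overlap score (Jaccard similarity) against an assumed threshold of 0.6. The names `extract_features`, `similarity`, and `TemplateDatabase` are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Tuple

# Hypothetical input format: (x, y, ridge angle in degrees).
Minutia = Tuple[int, int, int]


def extract_features(minutiae: Iterable[Minutia], cell: int = 10,
                     angle_step: int = 30) -> FrozenSet[Minutia]:
    """Toy feature extraction: quantize each point so that small sensor
    jitter between two scans of the same finger maps to the same feature."""
    return frozenset(
        (x // cell, y // cell, (angle % 360) // angle_step)
        for x, y, angle in minutiae
    )


def similarity(a: FrozenSet[Minutia], b: FrozenSet[Minutia]) -> float:
    """Jaccard similarity: shared features divided by all features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass
class TemplateDatabase:
    """Common store used by both enrollment and authentication."""
    templates: Dict[str, FrozenSet[Minutia]] = field(default_factory=dict)

    def enroll(self, user_id: str, minutiae: Iterable[Minutia]) -> None:
        """Extract features and store them under a unique ID."""
        if user_id in self.templates:
            raise ValueError(f"{user_id} is already enrolled")
        self.templates[user_id] = extract_features(minutiae)

    def authenticate(self, user_id: str, minutiae: Iterable[Minutia],
                     threshold: float = 0.6) -> bool:
        """Extract features from the new scan and compare them with the
        stored template for the claimed ID."""
        stored = self.templates.get(user_id)
        if stored is None:
            return False
        return similarity(stored, extract_features(minutiae)) >= threshold


db = TemplateDatabase()
db.enroll("user-001", [(12, 34, 45), (56, 78, 90), (101, 22, 180),
                       (150, 160, 270), (200, 40, 10)])

# Second scan of the same finger: slight jitter, one point not detected.
rescan = [(13, 35, 47), (57, 79, 92), (102, 23, 182), (151, 161, 272)]
# Scan of a different finger.
impostor = [(300, 300, 0), (12, 34, 200), (56, 10, 90), (90, 90, 90)]

print(db.authenticate("user-001", rescan))
print(db.authenticate("user-001", impostor))
```

The sketch stores only the extracted template, never the raw image, which mirrors the description above. A production system would additionally need to handle rotation and translation of the finger and protect the stored templates, both of which are outside the scope of this example.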
In this article, we have discussed how biometric devices and technologies can be used in big data projects to improve security and privacy. Fingerprint and iris scans are popular biometric technologies adopted by many organizations to protect their user data.