Cloud Data Security Using Group Multi-Keyword Top-K Similarity Search Using Asymmetric Encryption
Cloud computing is an emerging computing model in which data owners outsource their data to cloud storage. It has become a disruptive trend in both the IT industry and the research community, with salient characteristics such as high scalability and pay-as-you-go pricing that let cloud consumers purchase powerful computing resources as services according to their requirements, so that they no longer need to worry about the complexity of hardware platform management. Outsourcing data files to the cloud benefits large enterprises as well as individual users, because they can dynamically increase their storage space as and when required without buying any storage devices (Armbrust et al., 2009). The benefits are:
(1) Users can access remotely stored data at any time and from anywhere, and can grant authorized users permission to share it.
(2) Users are relieved of the burden of local storage management.
(3) Capital expenditure on hardware and software is avoided.
To date, a number of cloud storage services are available, such as Amazon Simple Storage Service (S3), Rackspace, Google, and Microsoft. Despite these advantages of outsourcing data to the cloud, a number of significant issues remain.
One of the major issues is the privacy of outsourced data in the cloud: sensitive information such as e-mail, health records, and government data may leak to unauthorized users or even be hacked. Since the cloud is an open platform, it is subject to attacks from both malicious insiders and outsiders. Cloud service providers (CSPs) usually offer data security through mechanisms such as virtualization and firewalls. However, these mechanisms do not protect users' privacy from the CSP itself, because the remote cloud storage servers are untrusted. A natural approach to protecting sensitive data is to encrypt it before outsourcing it to the cloud and to retrieve it later through keyword-based search over the encrypted data. Although encryption protects against illegal access, it significantly increases the computation overhead on data owners, especially when they use resource-constrained mobile devices and hold large data files.
Cloud computing is a promising new technology that greatly accelerates the development of large-scale data storage, processing, and distribution. Security and privacy become the major concerns when data owners outsource their private data to public cloud servers that are outside their trusted administrative domains. To avoid information leakage, sensitive data must be encrypted before it is uploaded to the cloud servers, which makes it a major challenge to support efficient keyword-based queries and to rank the matching results over the encrypted data. Most existing work considers only single-keyword queries without appropriate ranking schemes.
To allow search over encrypted data, many Searchable Encryption (SE) schemes have been proposed in recent years. SE solutions build a searchable index whose contents are hidden from the remote cloud server yet still permit document search. The index is a data structure that keeps track of a stored document collection while supporting efficient keyword search: given a keyword, the index returns pointers to the documents that contain it. These solutions differ in whether they allow single-keyword or multi-keyword search and in the techniques used to build the search query. A few of them support similarity search. A similarity search problem comprises a collection of data items characterized by some features, a query that specifies a value for a particular feature, and a similarity metric that measures the relevance between the query and the data items. Nevertheless, these techniques either do not permit searching on multiple keywords and ranking the retrieved documents by similarity score, or are very computationally intensive.
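To make the idea of a hidden index with ranked multi-keyword search concrete, the following is a minimal Python sketch. It is not the scheme proposed here or any published SE construction. It assumes a toy design in which keywords are replaced by HMAC-SHA256 tokens, so the server stores no plaintext keywords, and documents are ranked by a simple cosine-style score. The names `token`, `build_index`, and `top_k` are hypothetical, and the relevance weights are stored in the clear, which a real scheme would also need to protect.

```python
import hashlib
import hmac
import math
from collections import Counter, defaultdict


def token(key: bytes, keyword: str) -> str:
    """Keyed hash of a keyword; a stand-in for a search trapdoor (assumption)."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key: bytes, docs: dict) -> dict:
    """Map each keyword token to {doc_id: length-normalised term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        counts = Counter(text.lower().split())
        norm = math.sqrt(sum(c * c for c in counts.values()))
        for word, count in counts.items():
            index[token(key, word)][doc_id] = count / norm
    return dict(index)


def top_k(index: dict, query_tokens: list, k: int) -> list:
    """Return the k best (doc_id, score) pairs for a multi-keyword query.

    The score sums the normalised weights of the matching query tokens,
    which is proportional to cosine similarity with a binary query vector.
    """
    scores = defaultdict(float)
    for t in query_tokens:
        for doc_id, weight in index.get(t, {}).items():
            scores[doc_id] += weight
    # Sort by descending score, breaking ties by document id.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Illustrative data only; the key and documents are made up for this sketch.
key = b"demo-key-not-for-production"
docs = {
    "d1": "cloud storage privacy",
    "d2": "cloud search encrypted index search",
    "d3": "health records",
}
index = build_index(key, docs)            # built by the data owner
query = [token(key, w) for w in ("cloud", "search")]
results = top_k(index, query, k=2)        # evaluated by the server
print([doc_id for doc_id, _ in results])  # ['d2', 'd1']
```

In this sketch the data owner builds the index and issues query tokens, while the server only matches opaque tokens and sums weights. Deterministic tokens still reveal which queries repeat and which documents match, so the example illustrates the data structure rather than a secure design.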