Demo: Combination and fragmentation, an alternative to encryption for secure outsourced data storage

11 March 2012

Outsourcing data storage is common practice, and it will become even more popular with the emergence of the cloud computing paradigm, which aims to provide computing resources as utilities and services. One potential issue is that, by default, all stored data can be accessed by the cloud operator, who can potentially use it for its own benefit. Furthermore, even an honest cloud operator can be compromised by attackers, who have a strong interest in targeting data centers that aggregate the data of many companies and users. The challenge is therefore to protect the confidentiality of data in an outsourced storage scenario, and in particular to protect documents stored in the cloud against curious or compromised storage service operators.

As the number of cloud offerings is increasing, we explore the possibility of leveraging a multi-cloud architecture to enhance security. The first application we looked at is secure data storage: we propose to fragment data into n chunks and store the fragments at different cloud operators, so that no single cloud operator has access to the whole data. This alone is not sufficient from a confidentiality perspective, because each chunk still contains part of the information in the clear. We thus propose to additionally perform random linear combinations of the chunks, seen as vectors over a Galois field. With high probability, the coefficient vectors of n such random linear combinations form a full-rank matrix, which allows the data to be decoded if the associated coefficients are known while preserving its confidentiality if they are not. It is also possible to generate k additional chunks for resilience, by performing further random linear combinations of either the original chunks or already combined chunks. The latter option means that a mandated honest-but-curious intermediary can monitor the available chunks and generate additional ones on the fly should some clouds fail, thus maintaining resilience at a given level without compromising the confidentiality of the data. The solution is therefore interesting because it provides both confidentiality and resilience with the same mechanism.
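The fragmentation and combination steps described above can be illustrated with a minimal sketch. It is not the demo's implementation: the field GF(257), the zero-padding scheme, and the helper names `fragment`, `combine`, `encode`, `recombine` and `solve` are all assumptions made for illustration. A prime field is chosen only because it keeps the arithmetic short; coded symbols can then take the value 256 and no longer fit in a byte, which a practical system would likely avoid by working in GF(2^8).

```python
import secrets

P = 257  # assumed prime field GF(257); every byte value 0..255 is a field element

def fragment(data, n):
    """Split data into n equal-length chunks (zero-padded), as vectors over GF(P)."""
    size = -(-len(data) // n)  # ceiling division
    padded = data.ljust(size * n, b"\0")
    return [list(padded[i * size:(i + 1) * size]) for i in range(n)]

def combine(vectors, coeffs):
    """Return sum_i coeffs[i] * vectors[i], computed symbol-wise mod P."""
    return [sum(c * v[j] for c, v in zip(coeffs, vectors)) % P
            for j in range(len(vectors[0]))]

def encode(chunks, m):
    """Draw m random coefficient rows and return (rows, coded chunks)."""
    rows = [[secrets.randbelow(P) for _ in chunks] for _ in range(m)]
    return rows, [combine(chunks, row) for row in rows]

def recombine(rows, coded):
    """Build one extra chunk from already combined chunks only.

    The effective coefficient row (with respect to the original chunks) is the
    same weighted sum applied to the existing rows.
    """
    weights = [secrets.randbelow(P) for _ in coded]
    return combine(rows, weights), combine(coded, weights)

def solve(rows, coded):
    """Gauss-Jordan elimination mod P; returns the original chunks,
    or None if the coefficient rows are not full rank."""
    n = len(rows[0])
    aug = [list(r) + list(c) for r, c in zip(rows, coded)]
    for col in range(n):
        pivot = next((r for r in range(col, len(aug)) if aug[r][col]), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], P - 2, P)  # inverse via Fermat's little theorem
        aug[col] = [x * inv % P for x in aug[col]]
        for r in range(len(aug)):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(x - f * y) % P for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug[:n]]

# Usage: n = 4 combinations plus k = 2 spares, then decode from any 4 survivors.
chunks = fragment(b"confidential report", 4)
rows, coded = encode(chunks, 6)
keep = [0, 2, 3, 5]  # pretend the clouds holding chunks 1 and 4 failed
decoded = solve([rows[i] for i in keep], [coded[i] for i in keep])
if decoded is not None:  # None only in the unlikely rank-deficient case
    print(bytes(sum(decoded, [])).rstrip(b"\0"))
```

In this sketch the coefficient rows play the role of the secret: whoever holds the rows and any n linearly independent coded chunks can call `solve`. `recombine` mirrors the option of producing additional chunks from already combined ones without touching the originals; how the resulting coefficients are tracked and kept from the intermediary is not specified here and is left open.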