Combining Chunk Boundary and Chunk Signature Calculations for Deduplication
Litwin, Witold; Long, Darrell; Schwarz, Thomas (2012), Combining Chunk Boundary and Chunk Signature Calculations for Deduplication, Revista IEEE Latin America, 10, 1, p. 1305-1311. 10.1109/TLA.2012.6142477
Type: Article accepted for publication or published
Journal name: Revista IEEE Latin America
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Abstract (EN): Many modern, large-scale storage solutions offer deduplication, which can achieve impressive compression rates for many workloads, especially backups. When accepting new data for storage, a deduplicating system checks whether parts of the data are already stored. If so, the system does not store that part of the new data but replaces it with a reference to the location where the data already resides. A typical deduplication system breaks data into chunks, hashes each chunk, and uses an index to see whether the chunk has already been stored. Variable-chunk systems offer better compression, but they process the data byte by byte twice: first to calculate the chunk boundaries and then to calculate the hash. This limits the ingress bandwidth of a system. We propose a method that reuses the chunk boundary calculations to strengthen the collision resistance of the hash. This allows us either to use a faster hashing method with fewer bytes, or to support a much larger storage system (256 times larger by adding two bytes) with the same high assurance against chunk collision and the resulting data loss.
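The baseline pipeline described in the abstract (find variable chunk boundaries, hash each chunk, look the hash up in an index) can be illustrated with a minimal sketch. It is not the authors' method: it does not use algebraic signatures and does not reuse the boundary calculation for the hash. A simple sliding-window byte sum stands in for a real rolling hash, and the names `chunk_boundaries` and `DedupStore`, as well as the constants `WINDOW`, `MASK`, `MIN_CHUNK`, and `MAX_CHUNK`, are assumptions chosen for illustration only.

```python
import hashlib

# All constants below are illustrative assumptions, not values from the paper.
WINDOW = 16            # bytes covered by the sliding window
MASK = (1 << 6) - 1    # declare a boundary when the low 6 bits of the window sum are zero
MIN_CHUNK = 16         # never cut a chunk shorter than this
MAX_CHUNK = 1024       # always cut once a chunk reaches this length


def chunk_boundaries(data: bytes):
    """Yield (start, end) pairs for content-defined chunks of data.

    A plain byte sum over the last WINDOW bytes stands in for a real
    rolling hash; it is weak but keeps the sketch short.
    """
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i - start >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte that left the window
        length = i - start + 1
        at_boundary = length >= MIN_CHUNK and (rolling & MASK) == 0
        if at_boundary or length >= MAX_CHUNK:
            yield start, i + 1
            start = i + 1
            rolling = 0
    if start < len(data):
        yield start, len(data)  # trailing partial chunk


class DedupStore:
    """Toy in-memory chunk store keyed by the chunk digest."""

    def __init__(self):
        self.index = {}  # digest -> chunk bytes

    def put(self, data: bytes):
        """Store data and return its recipe (list of chunk digests)."""
        recipe = []
        for start, end in chunk_boundaries(data):    # first pass over the bytes
            chunk = data[start:end]
            digest = hashlib.sha256(chunk).digest()  # second pass over the bytes
            if digest not in self.index:             # only unseen chunks are stored
                self.index[digest] = chunk
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original data from a recipe."""
        return b"".join(self.index[d] for d in recipe)
```

The two comments marked "first pass" and "second pass" show where every byte is touched twice, which is the ingress bottleneck the abstract refers to. Storing the same data a second time adds no new entries to `index`, since every chunk digest is already present.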
Subjects / Keywords: Algebraic Signatures; Deduplication
Showing items related by title and author.
Constantin, Camelia; du Mouza, Cedric; Litwin, Witold; Rigaux, Philippe; Schwarz, Thomas (2016) Article accepted for publication or published
Cumulative Algebraic Signatures for Fast String Search, Protection Against Incidental Viewing and Corruption of Data in an SDDS. Litwin, Witold; Mokadem, Riad; Schwarz, Thomas (2007) Communication / Conference