Reducing Fragmentation for In-line Deduplication Backup Storage via Exploiting Backup History and Cache Knowledge

Abstract

In backup systems, the chunks of each backup are physically scattered after deduplication, which causes a challenging fragmentation problem. We observe that fragmentation takes the form of sparse containers and out-of-order containers. Sparse containers reduce both restore performance and garbage collection efficiency, while out-of-order containers reduce restore performance when the restore cache is small. We propose the History-Aware Rewriting algorithm (HAR) and the Cache-Aware Filter (CAF) to reduce fragmentation.
HAR exploits historical information in backup systems to accurately identify and reduce sparse containers, and CAF exploits restore cache knowledge to identify the out-of-order containers that hurt restore performance. In datasets where out-of-order containers are dominant, CAF efficiently complements HAR.
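
As a rough illustration of the two container types described above, the following Python sketch flags sparse containers for one backup and counts how many container reads a restore would need with a small cache. It is only a sketch under stated assumptions: the function names, the recipe layout (a list of container ID and chunk size pairs in restore order, each chunk listed once), the 50% utilization threshold, and the LRU cache policy are all hypothetical choices made for illustration and are not specified in this abstract.

```python
from collections import OrderedDict


def sparse_containers(recipe, container_size, threshold=0.5):
    """Return IDs of containers that this backup uses only sparsely.

    recipe: list of (container_id, chunk_size) pairs in restore order.
    Assumption: a container is "sparse" when the bytes this backup
    references in it are below `threshold` of the container size.
    """
    used = {}
    for container_id, chunk_size in recipe:
        used[container_id] = used.get(container_id, 0) + chunk_size
    return {cid for cid, size in used.items()
            if size / container_size < threshold}


def container_reads(recipe, cache_slots):
    """Count container reads during restore with an LRU container cache.

    Assumption: the restore cache holds `cache_slots` whole containers
    and evicts the least recently used one when full.
    """
    cache = OrderedDict()
    reads = 0
    for container_id, _ in recipe:
        if container_id in cache:
            cache.move_to_end(container_id)   # cache hit: refresh recency
        else:
            reads += 1                        # cache miss: read the container
            cache[container_id] = True
            if len(cache) > cache_slots:
                cache.popitem(last=False)     # evict least recently used
    return reads


# Hypothetical recipe: containers 1 and 3 are fully used, container 2 is not,
# and the chunks of containers 1, 2 and 3 are interleaved (out of order).
recipe = [(1, 4), (2, 1), (1, 4), (3, 4), (2, 1), (3, 4)]
sparse = sparse_containers(recipe, container_size=8)
reads_small_cache = container_reads(recipe, cache_slots=1)
reads_large_cache = container_reads(recipe, cache_slots=3)
```

In this toy recipe only container 2 falls below the assumed threshold, and the interleaved access order makes the one-slot cache read a container for every chunk, while a three-slot cache reads each container once. This mirrors the observation that out-of-order containers matter mainly when the restore cache is small.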

System Configuration

H/W System Configuration

Speed             : 1.1 GHz
RAM               : 256 MB (min)
Hard Disk         : 20 GB
Floppy Drive      : 1.44 MB
Keyboard          : Standard Windows Keyboard
Mouse             : Two or Three Button Mouse
Monitor           : SVGA

S/W System Configuration

Platform          : Cloud computing
Operating System  : Windows XP, 7
Server            : WAMP/Apache
Working on        : Browser such as Firefox or IE

Conclusion

The hybrid scheme is useful to further improve restore performance in datasets where out-of-order containers are dominant. To avoid a significant decrease in the hybrid scheme's deduplication ratio, we develop two algorithms, the Container-Marker Algorithm (CMA) and the History-Aware Rewriting algorithm (HAR), to exploit backup history and cache knowledge. With the help of CMA, the hybrid scheme significantly improves the deduplication ratio without reducing restore performance. Note that CMA can also be used to optimize existing rewriting algorithms.