Improved Backup Scheduling with Data Deduplication Techniques for SaaS in Cloud

Cloud computing is a technology that provides resources as a service. Cloud providers offer many services, including Software-as-a-Service, Platform-as-a-Service, and Infrastructure-as-a-Service. Cloud computing also offers Storage-as-a-Service, which is used to back up users' data to the cloud. Storage-as-a-Service is provided by a Storage Service Provider or Cloud Service Provider and is effective, reliable, and cost-effective. Existing backup scheduling achieves reliability by keeping two copies of the same data. It provides reliability and backup speed, but it does not consider data redundancy, which causes the same file to consume more storage space. Existing backup scheduling also does not take security issues into consideration. This work addresses these limitations by proposing Improved Backup Scheduling with Deduplication (IBSD), a backup scheduling algorithm that aims to reduce redundancy without compromising availability. IBSD reduces redundancy through deduplication, a technique that identifies duplicate data and eliminates it by storing only one copy of the original data. When a duplicate occurs, a link to the existing data is added instead of a second copy. Deduplication techniques reported in the literature include Whole File Chunking, Fixed Length Partition, and Content Defined Chunking. IBSD performs backup scheduling with data deduplication at both the file level and the chunk level. It reduces bandwidth consumption by detecting duplicate data on the consumer side, before the data is sent. A new metric, the Storage Utilization Ratio (SUR), is also proposed.
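To make the two deduplication levels concrete, the following minimal Python sketch checks for a duplicate at the file level first and then falls back to fixed-length chunks. It is an illustration only and not the IBSD algorithm itself: the class name `DedupStore`, the use of SHA-256 digests, the in-memory dictionaries, and the 4-byte chunk size are all assumptions made for the example. The `logical_bytes` and `stored_bytes` counters are likewise hypothetical bookkeeping and should not be read as the definition of SUR, which the abstract does not give.

```python
import hashlib


class DedupStore:
    """Toy in-memory store (hypothetical): whole-file check, then fixed-length chunks."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}        # chunk digest -> chunk bytes, each stored once
        self.files = {}         # file digest -> ordered list of chunk digests
        self.names = {}         # file name -> file digest (the "link" to existing data)
        self.logical_bytes = 0  # bytes the users asked to back up
        self.stored_bytes = 0   # bytes actually kept after deduplication

    @staticmethod
    def digest(data):
        # SHA-256 is an assumed fingerprint function for this sketch.
        return hashlib.sha256(data).hexdigest()

    def backup(self, name, data):
        self.logical_bytes += len(data)
        file_id = self.digest(data)

        # File-level deduplication: an identical file only needs a new link.
        if file_id not in self.files:
            recipe = []
            # Chunk-level deduplication with fixed-length partitions.
            for start in range(0, len(data), self.chunk_size):
                chunk = data[start:start + self.chunk_size]
                chunk_id = self.digest(chunk)
                if chunk_id not in self.chunks:
                    self.chunks[chunk_id] = chunk
                    self.stored_bytes += len(chunk)
                recipe.append(chunk_id)
            self.files[file_id] = recipe

        self.names[name] = file_id

    def restore(self, name):
        # Rebuild the file by following its list of chunk digests in order.
        recipe = self.files[self.names[name]]
        return b"".join(self.chunks[chunk_id] for chunk_id in recipe)


store = DedupStore(chunk_size=4)
store.backup("a.txt", b"AAAABBBBCCCC")  # three new chunks are stored
store.backup("b.txt", b"AAAABBBBCCCC")  # whole-file duplicate, only a link is added
store.backup("c.txt", b"AAAABBBBDDDD")  # shares two chunks, one new chunk is stored
print(store.logical_bytes, store.stored_bytes)
```

In a consumer-side arrangement such as the one the abstract describes, the digests would be computed before upload so that only chunks the provider does not already hold are sent. In this sketch everything runs in one process, so that exchange is not modelled.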