The internet never forgets, which means data that should have been deleted doesn’t always stay deleted. Call it “zombie data,” and unless your organization has a complete understanding of how your cloud providers handle file deletion requests, it can come back to haunt you.
Ever since the PC revolution, the concept of data deletion has been a bit misunderstood. After all, dragging a file to the Recycle Bin simply removed the pointer to the file, freeing up disk space to write new data. Until then, the original data remained on the disk, rediscoverable using readily accessible data recovery tools. Even when new data was written to that disk space, parts of the file often lingered, and the original file could be reconstructed from the fragments.
Desktop—and mobile—users still believe that deleting a file means the file is permanently erased, but that’s not always the case. That perception gap is even more problematic when it comes to data in the cloud.
Cloud service providers have to juggle retention rules, backup policies, and user preferences to make sure that when a user deletes a file in the cloud, it actually gets removed from all servers. If your organization is storing or considering storing data in the cloud, you must research your service provider’s data deletion policy to determine whether it’s sufficient for your needs. Otherwise, you’ll be on the hook if a data breach exposes your files to a third party or stuck in a regulatory nightmare because data wasn’t disposed of properly.
With the European Union General Data Protection Regulation expected to go into effect in May 2018, any company doing business in Europe or with European citizens will have to make sure it complies with rules for removing personal data from its systems, including the cloud, or face hefty fines.
Data deletion challenges in the cloud
Deleting data in the cloud differs vastly from deleting data on a PC or smartphone. The cloud’s redundancy and availability model ensures there are multiple copies of any given file at any given time, and each must be removed for the file to be truly deleted from the cloud. When a user deletes a file from a cloud account, the expectation is that all these copies are gone, but that really isn’t the case.
Consider the following scenario: A user with a cloud storage account accesses files from her laptop, smartphone, and tablet. The files are stored locally on her laptop, and every change is automatically synced to the cloud copy so that all her other devices can access the most up-to-date version of the file. Depending on the cloud service, previous file versions may also be stored. Since the provider wants to make sure the files are always available for all devices at all times, copies of the file live across different servers in multiple datacenters. Each of those servers is backed up regularly in case of a disaster. That single file now has many copies.
“When a user ‘deletes’ a file [in the cloud], there could be copies of the actual data in many places,” says Richard Stiennon, chief strategy officer of Blancco Technology Group.
Deleting locally and in the user account simply takes care of the most visible version of the file. In most cases, the service marks the file as deleted and removes it from view but leaves it on the servers. If the user changes his or her mind, the service removes the deletion mark on the file, and it’s visible in the account again.
In some cases, providers adopt a 30-day retention policy (Gmail has a 60-day policy), where the file no longer appears in the user’s account but stays on servers until the period is up. Then the file and all its copies are automatically purged. Others offer users a permanent-delete option, similar to emptying the Recycle Bin on Windows.
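The deletion mark and retention window described above can be sketched in a few lines of Python. This is a toy in-memory model, not any provider’s actual implementation: the class name `SoftDeleteStore`, its methods, and the 30-day `RETENTION` constant are all assumptions made for illustration, and replicas and backups are deliberately left out.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # assumed retention window, for illustration


class SoftDeleteStore:
    """Toy store: delete() only hides a file; purge() actually removes it."""

    def __init__(self):
        self._files = {}       # name -> bytes
        self._deleted_at = {}  # name -> time the deletion mark was set

    def put(self, name, data):
        self._files[name] = data
        self._deleted_at.pop(name, None)

    def delete(self, name, now):
        # Only sets a mark; the bytes stay in self._files.
        if name in self._files:
            self._deleted_at[name] = now

    def restore(self, name):
        # Clearing the mark makes the file visible again.
        self._deleted_at.pop(name, None)

    def visible(self):
        return sorted(n for n in self._files if n not in self._deleted_at)

    def still_stored(self, name):
        return name in self._files

    def purge(self, now):
        # Remove every marked file whose retention window has elapsed.
        expired = [n for n, t in self._deleted_at.items() if now - t >= RETENTION]
        for name in expired:
            del self._files[name]
            del self._deleted_at[name]
        return sorted(expired)
```

In this model, a file that has been deleted disappears from `visible()` immediately, yet `still_stored()` keeps returning `True` until a `purge()` call runs after the window has passed. That gap is exactly where the user’s idea of “deleted” and the provider’s diverge.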
Service providers make mistakes. In February, forensics firm Elcomsoft found copies of Safari browser history still on iCloud, even after users had deleted the records. The company’s analysts found that when the user deleted their browsing history, iCloud moved the data to a format invisible to the user instead of actually removing the data from the servers. Earlier, in January, Dropbox users were surprised to find files that had been deleted years ago reappearing in their accounts. A bug had prevented files from being permanently deleted from Dropbox servers, and when engineers tried to fix the bug, they inadvertently restored the files.
The impact of these incidents was limited (in Dropbox’s case, the users saw only their own files, not other people’s deleted files), but they still highlight why data deletion mistakes make organizations nervous.
There are also cases in which the user’s concept of deletion doesn’t match the cloud provider’s in practice. It took Facebook more than three years to remove from public view photographs that a user had deleted back in 2009; even then, there was no assurance that the photographs weren’t still lurking in secondary backups or cloud snapshots. There are stories of users who have removed their social media accounts entirely, only to find that the photos they shared remain accessible to others.
Bottom line, between backups, data redundancy, and data retention policies, it’s risky to assume that data is ever completely removed from the cloud.
What deleting data from the cloud looks like
Stiennon declined to speculate on how specific cloud companies handle deleting files from archives but said that providers typically store data backups and disaster recovery files in the cloud and not as offsite tape backups. In those situations, when a file is deleted from the user’s account, the pointers to the file in the backup get removed, but the actual file contents remain in the backup blob. While that may be sufficient in most cases, if that archive ever gets stolen, the thief would be able to forensically retrieve the supposedly deleted contents.
“We know that basic deletion only removes pointers to the data, not the data itself, and leaves data recoverable and vulnerable to a data breach,” says Stiennon.
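To make the pointer-versus-data distinction concrete, here is a minimal sketch of a backup archive modeled as one blob plus an index of pointers. The `BackupArchive` class and its method names are hypothetical and only illustrate the idea; real backup formats are far more involved.

```python
class BackupArchive:
    """Toy archive: one append-only blob plus an index of (offset, length) pointers."""

    def __init__(self):
        self.blob = bytearray()
        self.index = {}  # name -> (offset, length)

    def add(self, name, data):
        self.index[name] = (len(self.blob), len(data))
        self.blob += data

    def read(self, name):
        offset, length = self.index[name]  # KeyError once the pointer is gone
        return bytes(self.blob[offset:offset + length])

    def delete(self, name):
        # "Basic deletion": drop the pointer, leave the bytes in the blob.
        self.index.pop(name, None)

    def secure_delete(self, name):
        # Overwrite the bytes in place with zeros, then drop the pointer.
        offset, length = self.index.pop(name)
        self.blob[offset:offset + length] = bytes(length)


archive = BackupArchive()
archive.add("memo.txt", b"SECRET-MEMO")
archive.delete("memo.txt")
# The file can no longer be read by name, but a scan of the raw blob finds it.
leaked = b"SECRET-MEMO" in archive.blob
```

After `delete()`, the name is gone from the index while `leaked` is `True`: anyone holding the raw blob can still find the contents. Only `secure_delete()` changes the bytes themselves.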
Some service providers wipe disks, Stiennon says. Typically in those situations, when the user sends a deletion command, the marked files are moved to a separate disk. The provider relies on normal day-to-day operations to overwrite the original disk space. Considering there are thousands of transactions per day, that’s a reasonable assumption. Once the junk disk is full or the retention time period has elapsed, the provider can reformat and degauss the disk to ensure the files are truly erased.
Most modern cloud providers encrypt data stored on their servers. While some ahead-of-the-game providers encrypt data with the user’s private keys, most use their own keys, frequently the same key for all users’ data. In those cases, the provider might destroy the encryption key and not bother with actually erasing the files, but that approach doesn’t work when the user is trying to delete a single file, because destroying a shared key would make every other file encrypted with it unreadable too.
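One way around the single-file problem, offered here as an assumption and not as something the providers discussed above are known to do, is to give every file its own key so that destroying one key affects one file. The sketch below uses a deliberately simple toy cipher built from SHA-256 so that it runs with only the standard library; it is not secure, and the `PerFileKeyStore` name and methods are hypothetical.

```python
import hashlib
import secrets


def _keystream_xor(key, data):
    # Toy stream cipher for illustration only: XOR data with
    # SHA-256(key || offset) blocks. NOT secure; a real system
    # would use a vetted authenticated cipher instead.
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out += bytes(a ^ b for a, b in zip(data[i:i + 32], block))
    return bytes(out)


class PerFileKeyStore:
    """Hypothetical store in which each file is encrypted under its own random key."""

    def __init__(self):
        self.ciphertexts = {}  # name -> encrypted bytes (may linger in backups)
        self.keys = {}         # name -> key (small, kept apart, easy to destroy)

    def put(self, name, data):
        key = secrets.token_bytes(32)
        self.keys[name] = key
        self.ciphertexts[name] = _keystream_xor(key, data)

    def get(self, name):
        # Raises KeyError if the key has been destroyed.
        return _keystream_xor(self.keys[name], self.ciphertexts[name])

    def shred(self, name):
        # Destroy only this file's key; the ciphertext stays where it is.
        del self.keys[name]
```

After `shred("a")`, the ciphertext for `a` is still stored but `get("a")` fails, while other files remain readable. The guarantee is only as good as the key handling: every copy of the key, including any in backups, has to be destroyed as well.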
Here’s another reason to be paranoid in the likely event that not every copy of a file gets scrubbed from the cloud: There are forensics tools capable of looking into cloud services and recovering deleted information. Elcomsoft used such a tool on iCloud to find the deleted browser history, for example. Knowing that copies of deleted files exist somewhere in the cloud, the question becomes: How safe are these orphaned copies from government investigators and other snoops?
The bits left behind
Research has shown that companies struggle to properly dispose of disks and the data stored on them. In a Blancco Technology Group study, engineers purchased more than 200 drives from third-party sellers and found personal and corporate data could still be recovered, despite previous attempts to delete it. A separate Blancco Technology Group survey found that one-third of IT teams reformat SSDs before disposing of them but don’t verify that all the information has been removed.
“If you do not overwrite the data on the media, then test to see if it has been destroyed, you cannot be certain the data is truly gone,” Stiennon says.
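The overwrite-then-test idea can be illustrated at the level of a single file. This is only a sketch under strong assumptions: the function name `overwrite_and_verify` is made up for this example, and the read-back checks the file’s logical contents, not the physical media. On SSDs, copy-on-write filesystems, or volumes with snapshots, the old blocks may survive elsewhere, which is why dedicated erasure tools and device-level verification exist.

```python
import os
import tempfile


def overwrite_and_verify(path, chunk_size=1 << 20):
    """Overwrite a file with zeros in place, then read it back to confirm."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(bytes(n))
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the writes to the device

    # Verification pass: every byte read back must be zero.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            if chunk.count(0) != len(chunk):
                return False
    return True


def demo():
    # Write a throwaway file, wipe it, and report both checks.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"customer record 12345\n" * 100)
        os.close(fd)
        ok = overwrite_and_verify(path, chunk_size=64)
        with open(path, "rb") as f:
            all_zero = f.read() == bytes(os.path.getsize(path))
        return ok, all_zero
    finally:
        os.remove(path)
```

The point of the second pass is the one Stiennon makes: an overwrite that is never checked is an assumption, not a result.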
While there have always been concerns about removing specific files from the cloud, enterprise IT teams are only now beginning to think about broader data erasure requirements for cloud storage. Many compliance regimes specify data retention periods in years, ranging from seven years to as long as 25 years, which means early cloud adopters are starting to think about how to remove the data that, per policy, now has to be destroyed.
GDPR is also on the way, with its rules that companies must wipe personal data belonging to EU residents from all their systems once the reasons for having the data expire. Thus, enterprises have to make sure they can regularly and thoroughly remove user data. Failure to do so can result in fines of up to 4 percent of a company’s global annual revenue.
That’s incentive, right there, for enterprises to make sure they are in agreement with their service providers on how to delete data.
How to protect your organization from “zombie” cloud data
Given these issues, it’s imperative that you ask to see your service provider’s data policy to determine how unneeded data is removed and how your provider verifies that data removal is permanent. Your service-level agreement needs to specify when files are moved and how all copies of your data are removed. A cloud compliance audit can review your storage provider’s deletion policies and procedures, as well as the technology used to protect and securely dispose of the data.
Considering all the other details to worry about in the cloud, it’s easy to push concerns about data deletion aside, but if you can’t guarantee that data you store in the cloud is effectively destroyed when needed, your organization will be out of compliance. And if supposedly deleted data is stolen from the cloud—or your storage provider mistakenly exposes data that should have been already destroyed—your company will ultimately pay the price.
“It’s more of a false sense of security than anything else when the wrong data removal method is used,” Stiennon says. “It makes you think the data can never be accessed, but that’s just not true.”