Protecting Against Data Recovery

Problem Definition

If your laptop, computer, memory stick or PDA is sold, stolen or lost, hackers and identity thieves can easily recover the files that you think were deleted. These files are not necessarily erased from your computer despite the Recycle Bin being empty. When your computer saves a file, it writes data on a disk (or memory stick) and takes note of the file location. When you delete the file, it is transferred to the Recycle Bin. When you empty the Recycle Bin, the computer erases only the note about the previous file location, but file content is left intact on the disk until some other file is eventually written over it. The entire content of an erased file can remain untouched anywhere from a few seconds to a few years. Hackers can easily uncover and read it using public data recovery software.

Many computer users trust the Recycle Bin. When they empty it, they think their files are completely removed from the computer. Some users, from individuals to international banks, have learned that this belief is mistaken when data thieves bought their old hard drives and easily recovered the "deleted" files.

Most people would never throw a credit card into a public trashcan, but they do something similar with their computers. A 2003 study found that about 35% of the hard drives on the used market contained easily accessible confidential information (see Garfinkel and Shelat).

Emptying the Recycle Bin does not actually remove a file's contents from the disc. When a computer saves a file, it writes the data on a disc (or memory stick) and makes a note of the file's location. When you put the file in the Recycle Bin and "empty" it, the computer erases its note of the file's location, freeing the disc space for further use. The actual data is left intact on the disc until some other file is eventually written over it. On a large hard drive, it may be a long time before the space is used again, so a lot of data can accumulate on it over the years.

A computer hard drive is like a library with a lazy librarian. When the director tells her to remove certain books, she removes them from the library catalogue but leaves the books on the shelves. Since the director only checks the catalogue, he never discovers the difference, but people with the patience to look through the shelves can read the officially nonexistent books.
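The difference between the catalogue and the shelves can be shown with a toy model. This is only an illustrative sketch, not how any real file system is implemented: the ToyDisk class, its method names and the sample file are all hypothetical.

```python
class ToyDisk:
    """A toy disc: raw storage plus a catalogue of file locations."""

    def __init__(self, size):
        self.storage = bytearray(size)  # the "shelves": raw bytes
        self.catalogue = {}             # the "catalogue": name -> (offset, length)
        self.next_free = 0

    def save(self, name, data):
        if self.next_free + len(data) > len(self.storage):
            raise ValueError("toy disc is full")
        offset = self.next_free
        self.storage[offset:offset + len(data)] = data
        self.catalogue[name] = (offset, len(data))
        self.next_free += len(data)

    def delete(self, name):
        # Only the catalogue entry is removed; the bytes stay on the disc.
        del self.catalogue[name]

    def raw_find(self, needle):
        # A recovery tool ignores the catalogue and reads the raw storage.
        return self.storage.find(needle)


disk = ToyDisk(64)
disk.save("pin.txt", b"secret PIN 1234")
disk.delete("pin.txt")
print("pin.txt" in disk.catalogue)       # the catalogue no longer lists the file
print(disk.raw_find(b"secret PIN 1234"))  # but the bytes are still found in storage
```

After delete, the file is gone as far as the catalogue is concerned, yet a raw scan of the storage still finds its contents.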

There is a world market for used computers and hard drives. Discarded, donated and stolen computers are shipped to developing countries where they are stripped for reusable parts. Data thieves can cheaply buy large quantities of hard drives and examine them quickly. They use publicly available data recovery software that was designed to help people retrieve mistakenly deleted files or rescue data from damaged discs. Such software looks for old versions of the file catalogues and scans the disc for file signatures. Unless new files have been saved on top of the old ones, the data can be easily recovered. This is good if it was an important file that was accidentally deleted, but it is bad if it is confidential information that could be used for fraud or blackmail.
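Scanning for file signatures can be sketched in a few lines. The example below is a simplified illustration, not the method of any particular recovery product: it looks only for JPEG images, using the byte markers that begin and end JPEG files, and it assumes each file is stored in one contiguous piece. The function name carve_jpegs is hypothetical.

```python
JPEG_START = b"\xff\xd8\xff"  # bytes found at the start of a JPEG file
JPEG_END = b"\xff\xd9"        # JPEG end-of-image marker


def carve_jpegs(raw):
    """Return (start, end) byte ranges in raw that look like JPEG files."""
    found = []
    pos = 0
    while True:
        start = raw.find(JPEG_START, pos)
        if start == -1:
            break
        end = raw.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break
        stop = end + len(JPEG_END)
        found.append((start, stop))
        pos = stop  # continue scanning after this candidate file
    return found
```

A real tool would read the raw disc image in chunks, recognise many more file types and cope with fragmented files, but the principle is the same: the catalogue is not needed to find the data.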

Even files that have been overwritten by new files can be read with a machine called a magnetic force microscope. Used by police, spies, and private security consultants, these instruments can detect the traces of previous data in the tiny spaces between the individual bits of the current data. The magnetic head that writes on the disc does not always line up in exactly the same place, so small magnetized areas of the previously written data are visible at the edges of the new, like the blurred edges of a poorly printed color picture. With specialized equipment and patience the original data can be reconstructed.

If you are selling or recycling an old computer, you can protect your confidential information by removing the hard drive and physically destroying it. This is the safest method, but it substantially reduces the value of the computer.

A better solution is to use data recovery prevention software. It is also a good idea to regularly clean laptop hard discs and memory sticks that could be lost or stolen.

Discs cannot simply be erased. All the bits can be written as zeros, but traces of the files beneath will still be there. Cleaning a disc involves overwriting the data on it several times, until no traces can be recovered. Usually alternating bit patterns are used along with random numbers, instead of just zeros.
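A multi-pass overwrite of a single file might look like the sketch below. It is a minimal illustration under stated assumptions, not a finished wiping tool: the function name overwrite_file, the choice of the patterns 0x55 and 0xAA and the 1 MB chunk size are all hypothetical, and the code assumes the file system writes the new bytes over the same physical location, which is not guaranteed on flash memory or on file systems that relocate data when writing.

```python
import os

CHUNK = 1024 * 1024  # write in 1 MB pieces to keep memory use small


def overwrite_file(path, passes=3):
    """Overwrite a file's contents in place, several times over."""
    fixed = [b"\x55", b"\xaa"]  # alternating bit patterns 01010101 and 10101010
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for n in range(passes):
            last = (n == passes - 1)
            f.seek(0)
            remaining = size
            while remaining > 0:
                step = min(CHUNK, remaining)
                if last:
                    f.write(os.urandom(step))     # final pass: random bytes
                else:
                    f.write(fixed[n % 2] * step)  # earlier passes: fixed patterns
                remaining -= step
            f.flush()
            os.fsync(f.fileno())  # ask the operating system to write this pass out
```

The file keeps its original size, so it can be deleted in the usual way afterwards. Only the bytes of that one file are overwritten; older copies left elsewhere on the disc are not touched.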

Data recovery protection programs must figure out which parts of the disc must be overwritten and which parts must be left alone, so the program does not damage current files. Two common methods exist. The first one, called low-level access, scans the disc and examines everything it finds to see if it is current or inactive. The second method, called high-level access, uses Windows' (or another operating system's) catalogue of files. High-level access has less risk of damaging existing files because it uses the same index as the computer itself.

How many overwrites are enough to make erased data unrecoverable? Security must be balanced with speed, practicality with paranoia. Overwriting a large hard drive can take hours, depending on the disc's size, its connection to the computer and the fragmentation of the file structure. On most hard drives made after 2000, three passes provide complete protection against ordinary data recovery programs, and five passes give very good protection against magnetic force microscopes. One hundred percent security is of course unobtainable, but each additional pass adds a little more. In 1996 Peter Gutmann of the University of Auckland calculated that as many as 35 overwrites were needed to achieve a high level of security on older discs. But as discs have become smaller and faster, the surface area and depth used to store each 'word' of data have been reduced, so there is less area between bits for blurs of previously written data. This means that modern hard drives need far fewer than 35 passes. The US Department of Defense has published a standard of 7 passes, which is more than enough for ordinary users.
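The trade-off between security and speed can be made concrete with simple arithmetic: the time needed is roughly the disc capacity multiplied by the number of passes, divided by the write speed. The sketch below assumes a hypothetical 500 GB disc and a hypothetical sustained write speed of 100 MB per second; real drives and connections vary widely, and fragmentation is ignored.

```python
def estimate_overwrite_hours(capacity_bytes, passes, write_bytes_per_second):
    """Rough number of hours needed to overwrite a whole disc several times."""
    total_bytes = capacity_bytes * passes
    return total_bytes / write_bytes_per_second / 3600


CAPACITY = 500e9     # hypothetical 500 GB disc
WRITE_SPEED = 100e6  # hypothetical 100 MB per second

for passes in (3, 5, 7, 35):
    hours = estimate_overwrite_hours(CAPACITY, passes, WRITE_SPEED)
    print(f"{passes:2d} passes: about {hours:.1f} hours")
```

Under these assumptions three passes take a little over four hours, while 35 passes take about two days, which is why the number of passes matters in practice.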

Ultimately, each user can decide how many passes are appropriate in their own case. Good data recovery protection software should let users customize the process and help them make an informed choice.

References

Author

LAZgroup - Business and Technology Solutions. "LAZgroup S.A." is a software research and development company located in Geneva, Switzerland. Our primary focus is the creation of new algorithms and methods to improve various business processes through improvements in security, efficiency and speed.