Continuing with our digital forensics series, today we cover how to find and preserve digital evidence.

One of the most important innovations in the field of digital forensics is the hash function. A hash function maps a sequence of characters (called a string) to a binary number of a fixed size (that is, a fixed number of bits). For example, a 16-bit hash function can produce 2^16 = 65,536 possible values; a 32-bit hash function can produce 2^32 = 4,294,967,296 possible values; and so on. Because there are more possible strings than hash values, some strings will share the same hash value (known as a hash collision), but the more bits a hash function has, the lower the likelihood of a collision. Hash functions are a means of ensuring the integrity of forensic data and of recognizing specific files.
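
To make the idea of a fixed number of bits concrete, here is a minimal Python sketch. The helper name hash_bits is hypothetical, and truncating a SHA-256 digest is only an assumption chosen for illustration; it is not a hash function anyone should use in real casework.

```python
import hashlib


def hash_bits(text: str, bits: int) -> int:
    """Map a string to an integer that fits in `bits` bits (illustration only)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()  # 256-bit digest
    value = int.from_bytes(digest, "big")
    return value >> (256 - bits)  # keep only the top `bits` bits


# Number of possible values for each hash size mentioned above.
for bits in (16, 32):
    print(bits, "bits ->", 2 ** bits, "possible values")

# The same string always maps to the same value, whatever its length.
print(hash_bits("hello", 16))
print(hash_bits("a much longer string than the one above", 16))
```

Whatever the length of the input, the 16-bit result always falls between 0 and 65,535, which is why collisions become likely once enough strings are hashed.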

For example, since every paragraph in a document can be treated as a string, you can quickly determine whether the same paragraph is repeated in a long document. Simply compute the hash value for each paragraph, put the values into a list, and see if any value occurs more than once. If not, then there is no duplication. If so, then you have to look at the corresponding paragraphs and see whether there was actual duplication or a hash collision.
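
The duplicate check described above can be sketched in a few lines of Python. This is only an illustration under some assumptions: paragraphs are taken to be separated by blank lines, SHA-256 stands in for whatever hash function you prefer, and find_repeated_paragraphs is a hypothetical name.

```python
import hashlib
from collections import defaultdict


def find_repeated_paragraphs(document: str):
    """Return groups of paragraph indexes whose text is identical."""
    # Assumption: paragraphs are separated by blank lines.
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]

    # Group paragraph indexes by hash value.
    by_hash = defaultdict(list)
    for index, paragraph in enumerate(paragraphs):
        digest = hashlib.sha256(paragraph.encode("utf-8")).hexdigest()
        by_hash[digest].append(index)

    repeats = []
    for indexes in by_hash.values():
        if len(indexes) > 1:
            # Matching hashes only suggest duplication, so compare the
            # actual text to rule out a hash collision.
            first = paragraphs[indexes[0]]
            confirmed = [i for i in indexes if paragraphs[i] == first]
            if len(confirmed) > 1:
                repeats.append(confirmed)
    return repeats


text = "alpha paragraph\n\nbeta paragraph\n\nalpha paragraph"
print(find_repeated_paragraphs(text))  # [[0, 2]]
```

The final text comparison is the "look at the corresponding paragraphs" step: it costs little because it only runs for paragraphs whose hash values already match.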

In 1979, a Stanford doctoral student named Ralph Merkle invented a way to use hash functions for computer security. His idea was to use a hash function that produced over 100 bits of data and was one-way. In other words, it would be easy to compute the hash value of a string, but extremely difficult to find a string that produces a given hash value. Therefore, one could use the 100-bit hash value as a stand-in for the document itself and certify the document by digitally signing that value. Because there are so many possible values for a 100-bit hash function, an attacker cannot in practice take the digital signature from one document and use it to certify a second document, because this would require finding a second document with the same hash value.
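
As a rough illustration of a hash value acting as a stand-in for a document, the Python sketch below uses SHA-256, a modern 256-bit hash function that is our choice for the example rather than the function Merkle described. The name fingerprint and the two sample documents are hypothetical.

```python
import hashlib


def fingerprint(document: bytes) -> str:
    """Return the SHA-256 hash value of a document as a hex string."""
    return hashlib.sha256(document).hexdigest()


original = b"Pay Alice 100 dollars."
altered = b"Pay Alice 900 dollars."  # a single character changed

print(fingerprint(original))
print(fingerprint(altered))

# A signature made over the first hash value does not carry over to the
# altered document, because the two hash values differ.
print(fingerprint(original) == fingerprint(altered))  # False
```

Computing each hash value takes a fraction of a second, while working backward from a hash value to a matching document is not feasible in practice. That asymmetry is what "one-way" means here.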

This innovation has been used to protect credit card numbers sent over the Internet, to certify the authenticity of code run on an iPhone, and to establish chains of custody for forensic data. In the case of forensic data, a hash function can be applied to an entire disk image. Law enforcement organizations can create two disk images of a drive and compute the hash value of each. If the values match, the two images are assumed to be true copies of the data that were on the drive. Because it is unlikely that two different files will have the same hash value, hash values can also be used to label specific files, much in the same way that people can be identified by their fingerprints.
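
Comparing two disk images by hash value might look like the following Python sketch. It assumes the images are ordinary files on disk and uses SHA-256. The function names hash_file and images_match and the paths in the usage note are hypothetical, and the sketch does not replace a validated forensic imaging tool.

```python
import hashlib


def hash_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-256 hash value of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as image:
        while True:
            chunk = image.read(chunk_size)  # avoid loading the whole image
            if not chunk:
                break
            hasher.update(chunk)
    return hasher.hexdigest()


def images_match(first_path: str, second_path: str) -> bool:
    """Return True if both image files have the same hash value."""
    return hash_file(first_path) == hash_file(second_path)
```

A call such as images_match("copy_a.img", "copy_b.img") returns True only when the two files have the same hash value. Recording that value alongside the evidence lets anyone recompute it later and confirm that the image has not changed.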

For assistance with any digital forensics investigations, contact Forletta. Our Pittsburgh and Cleveland private investigators are knowledgeable and are ready to help you with any of your investigation needs.
