JavaScript – Getting the md5sum of a file through CryptoJS

I am trying to get the same MD5 hash for a tar file from the Linux md5sum command and from CryptoJS's MD5 method.

In JavaScript I do (after a file has been put in an HTML form):

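var reader = new FileReader();

reader.onloadend = function () {
    // reader.result is a "binary string": one character per byte of the file
    var text = reader.result;

    // The read is asynchronous, so the hash has to be computed inside the callback
    var hash = CryptoJS.MD5(text);
    console.log(hash.toString());
};

reader.readAsBinaryString(document.getElementById("firmware_firmware").files[0]);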

In Linux I do:

md5sum name_of_file.tar

Currently these two produce different results. How can I get JavaScript to hash the contents of the tar file the same way md5sum does on Linux?

For a simple string, md5sum and CryptoJS produce the same value.

Edit: With a file called Fred.txt containing the text "Fred", both md5sum and CryptoJS produce the same value: c624decb46fa3d60e824389311b252f6.

For the update.tar file, md5sum on Linux gives me 1f046eedb7d8279953d233e590830e4f, while CryptoJS gives me f0c3730e5a9863cffa0ba3fadd531788.

Edit 2: Further testing suggests that the problem shows up with large files, for example one of about 7 MB.

Best Answer

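All strings in JavaScript, even "binary strings", are sequences of UTF-16 code units. A "binary string" is one that uses only the first 256 code points, one character per byte. When CryptoJS.MD5 is given a plain string, it converts the string to bytes as UTF-8, so every character above 0x7F turns into two bytes and the hash no longer matches the file. That is why the ASCII-only Fred.txt gave matching results and the tar file did not. Since the Latin-1 encoding also uses exactly the first 256 code points, you can recover the original bytes by parsing the string as Latin-1: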

var hash = CryptoJS.MD5(CryptoJS.enc.Latin1.parse(text));
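
To see why the encoding matters, here is a minimal sketch that runs in Node.js. It uses Node's built-in crypto module in place of CryptoJS, purely so that the example is self-contained; md5Hex is a hypothetical helper and the sample strings are made up. It hashes the same binary string once as Latin-1 bytes and once as UTF-8 bytes:

```javascript
// Assumption: Node.js (CommonJS); the built-in crypto module stands in for CryptoJS.
const nodeCrypto = require("crypto");

// Hypothetical helper: MD5 hex digest of a Buffer.
function md5Hex(buf) {
  return nodeCrypto.createHash("md5").update(buf).digest("hex");
}

// Two "binary strings": one pure ASCII, one containing a byte >= 0x80.
const asciiOnly = "Fred";
const withHighByte = "Fred\u00ff";

// Latin-1: one byte per character, i.e. the bytes the file actually contains.
const latin1Bytes = Buffer.from(withHighByte, "latin1");
// UTF-8: U+00FF is encoded as two bytes (0xC3 0xBF), so the input to MD5 changes.
const utf8Bytes = Buffer.from(withHighByte, "utf8");

console.log(latin1Bytes.length, utf8Bytes.length); // 5 6
console.log(md5Hex(latin1Bytes) === md5Hex(utf8Bytes)); // false

// For ASCII-only content both encodings yield identical bytes, hence identical hashes.
console.log(
  md5Hex(Buffer.from(asciiOnly, "latin1")) === md5Hex(Buffer.from(asciiOnly, "utf8"))
); // true
```

A tar archive will almost always contain bytes of 0x80 or above, so hashing its binary string as UTF-8 gives a different digest from md5sum, while a short ASCII text file is unaffected.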