MD5 Sums in Java

William John Holden
2011-01-30
  1. Abstract
  2. Requirement
  3. Hashing a file
  4. Hashing a file as it downloads
  5. Hashing a byte array
  6. Building the XML Package
  7. Reconstructing the XML Package
  8. Reconstructing the XML Package over the Web
  9. Conclusion

Abstract

This tutorial should provide a reasonable launch point for anyone wanting to use MD5 sums in Java to hash files, streams, and data structures.

Requirement

I needed a reasonably good downloader program. It had to transfer large files in separate chunks, and each chunk had to be hashed along the way to ensure reliability. Although I'm not finished with the project itself, I thought it was interesting enough to share my findings and my code with the world through a tutorial. Read, enjoy, and e-mail me if you have any constructive criticism.
The final product created in this tutorial is a simple JAR run from the command line:
C:\Users\John\Desktop\md5test>java -classpath md5.jar com.wjholden.md5test.Md5Test
java -classpath md5test.jar com.wjholden.md5test.Md5Test [-file|-http|-wget] [filename|uri]
java -classpath md5test.jar com.wjholden.md5test.Md5Test [-xmlpkg] [input] [output] [chunk size in bytes]
java -classpath md5test.jar com.wjholden.md5test.Md5Test [-xmlasm] [filename]
java -classpath md5test.jar com.wjholden.md5test.Md5Test [-xmlhttp] [uri]

Hashing a file

Notes

Hashing a file is relatively easy: create a MessageDigest object and attach it to your InputStream with a DigestInputStream. As you read bytes in, the digest is updated automatically. Notice the use of BigInteger: the constructor BigInteger(1, bytes) interprets the digest bytes as an unsigned number in big-endian order, and toString(16) prints that number in hexadecimal (which is what you're used to seeing from programs such as md5sums). Because BigInteger drops leading zeros, the result is padded back to 32 characters.

Code

private static String hashFile(final String filename, boolean showOutput) throws NoSuchAlgorithmException, IOException {
  MessageDigest md = MessageDigest.getInstance("MD5");
  InputStream is = new FileInputStream(filename);
  // Every byte read through the DigestInputStream also updates md.
  is = new DigestInputStream(is, md);
  while (is.read() != -1)
    ; // read to end of file; the data itself is not needed
  is.close();
  // Signum 1: treat the digest as an unsigned big-endian number.
  String signature = new BigInteger(1, md.digest()).toString(16);
  // BigInteger drops leading zeros, so pad back to 32 hex digits.
  while (signature.length() < 32) {
    signature = "0" + signature;
  }
  if (showOutput)
    System.out.println("Digest = " + signature);
  return signature;
}
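
The padding loop in hashFile is needed because BigInteger drops leading zeros. As a minimal sketch (the class name HexDemo and the helper toHex are hypothetical and not part of the original program), the same 32-character result can probably be produced in one step with String.format:

```java
import java.math.BigInteger;

public class HexDemo {
    // Hypothetical helper: formats a 16-byte MD5 digest as 32 lowercase hex digits.
    public static String toHex(byte[] digest) {
        // Signum 1 treats the bytes as an unsigned, big-endian magnitude.
        // %032x left-pads with the zeros that BigInteger.toString(16) would drop.
        return String.format("%032x", new BigInteger(1, digest));
    }

    public static void main(String[] args) {
        byte[] digest = new byte[16];
        digest[15] = 0x2a; // only the last byte is non-zero
        System.out.println(new BigInteger(1, digest).toString(16)); // unpadded
        System.out.println(toHex(digest));                          // padded to 32 digits
    }
}
```

Whether to use the loop or the format string is a matter of taste; both yield the same string for a 16-byte digest.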

Result

C:\Users\John\Desktop\md5test>md5sums abc.txt

MD5sums 1.2 freeware for Win9x/ME/NT/2000/XP+
Copyright (C) 2001-2005 Jem Berkes - http://www.pc-tools.net/
Type md5sums -h for help

[Path] / filename                              MD5 sum
-------------------------------------------------------------------------------
[C:\Users\John\Desktop\md5test\]
abc.txt                                        3620fd5aa097423f5aabe60d0bacdff7

C:\Users\John\Desktop\md5test>java -classpath md5.jar com.wjholden.md5test.Md5Test -file abc.txt
Digest = 3620fd5aa097423f5aabe60d0bacdff7

Hashing a file as it downloads

Notes

This algorithm is very similar to the last one, except that it gets its InputStream from URL.openStream() rather than from a FileInputStream. The method also lets the caller choose whether to save the file as it downloads or to discard the data.

Code

private static String hashHttp(final String uri, final boolean writeFile, boolean showOutput) throws NoSuchAlgorithmException, IOException {
  MessageDigest md = MessageDigest.getInstance("MD5");
  URL url = new URL(uri);
  InputStream is = url.openStream();
  is = new DigestInputStream(is, md);
  if (writeFile) { // save output into some file.
    String filename = uri.substring(uri.lastIndexOf('/') + 1);
    if (showOutput)
      System.out.println("Writing file to " + filename);
    FileOutputStream out = new FileOutputStream(filename);
    int i;
    while ((i = is.read()) != -1) {
      out.write(i);
    }
    out.close();
  } else {
    while (is.read() != -1)
      ; // discard data downloaded
  }
  is.close();
  String signature = new BigInteger(1, md.digest()).toString(16);
  while (signature.length() < 32) {
    signature = "0" + signature;
  }
  if (showOutput)
    System.out.println("Digest = " + signature);
  return signature;
}
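
Both hashFile and hashHttp read one byte at a time, which is simple but can be slow on large files. The following is a minimal sketch of a buffered alternative, not the program's actual method: the class BufferedHashDemo, the helper copyAndHash, and the 8192-byte buffer size are all assumptions made for illustration. It works on any InputStream, so the same loop could serve a file or a URL stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.math.BigInteger;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class BufferedHashDemo {
    // Hypothetical helper: copies 'in' to 'out' (pass null to discard the data)
    // and returns the MD5 of everything read, as 32 hex digits.
    public static String copyAndHash(InputStream in, OutputStream out)
            throws NoSuchAlgorithmException, IOException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        DigestInputStream dis = new DigestInputStream(in, md);
        byte[] buffer = new byte[8192]; // buffer size is an arbitrary choice
        int n;
        // Block reads update the digest too, with far fewer calls than read().
        while ((n = dis.read(buffer)) != -1) {
            if (out != null) {
                out.write(buffer, 0, n);
            }
        }
        return String.format("%032x", new BigInteger(1, md.digest()));
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "abc".getBytes("US-ASCII");
        ByteArrayOutputStream saved = new ByteArrayOutputStream();
        System.out.println(copyAndHash(new ByteArrayInputStream(data), saved));
        System.out.println(saved.size() + " bytes copied");
    }
}
```

The caller stays responsible for closing both streams, which keeps the helper usable with in-memory streams as in the main method here.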

Result

C:\Users\John\Desktop\md5test>curl -o s1.png http://wjholden.com/nmap/s1.png
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  114k  100  114k    0     0   108k      0  0:00:01  0:00:01 --:--:--  134k

C:\Users\John\Desktop\md5test>md5sums s1.png

MD5sums 1.2 freeware for Win9x/ME/NT/2000/XP+
Copyright (C) 2001-2005 Jem Berkes - http://www.pc-tools.net/
Type md5sums -h for help

[Path] / filename                              MD5 sum
-------------------------------------------------------------------------------
[C:\Users\John\Desktop\md5test\]
s1.png                                         fc381f0763021cc474fe962e5c0c82f9

C:\Users\John\Desktop\md5test>java -classpath md5.jar com.wjholden.md5test.Md5Test -http http://wjholden.com/nmap/s1.png
Digest = fc381f0763021cc474fe962e5c0c82f9

Hashing a byte array

Notes

Hashing a byte array is easy, but you must be careful about the size of the array: every byte in it is hashed, including any unused space at the end. "Hello".getBytes() is 5 bytes long; the same 5 bytes sitting in an 8-byte buffer will produce a very different hash, because the 3 unused trailing bytes are hashed as well.

Code

private static String hashByteArray(byte array[]) throws NoSuchAlgorithmException {
  MessageDigest md = MessageDigest.getInstance("MD5");
  String signature = new String(Hex.encodeHex(md.digest(array)));
  return signature;
}
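
To make the array-size caveat concrete, here is a small sketch (the class PartialBufferDemo and the helper hashPrefix are hypothetical names, not part of the original program). It shows that an oversized buffer hashes differently, and that MessageDigest.update(buffer, 0, length) is one way to hash only the bytes actually read without making a trimmed copy.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class PartialBufferDemo {
    // Hypothetical helper: hashes only the first 'length' bytes of 'buffer'.
    public static byte[] hashPrefix(byte[] buffer, int length) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        md.update(buffer, 0, length);
        return md.digest();
    }

    public static void main(String[] args) throws Exception {
        byte[] exact = "Hello".getBytes("US-ASCII"); // 5 bytes
        byte[] padded = Arrays.copyOf(exact, 8);      // same 5 bytes plus 3 trailing zeros

        byte[] hashExact = MessageDigest.getInstance("MD5").digest(exact);
        byte[] hashPadded = MessageDigest.getInstance("MD5").digest(padded);

        System.out.println(Arrays.equals(hashExact, hashPadded));            // false
        System.out.println(Arrays.equals(hashExact, hashPrefix(padded, 5))); // true
    }
}
```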

A basic XML package

Notes

This is an example of the XML package that I will create. The XML file records the original file's name and the MD5 sums of its 'chunks' once a split-style operation has completed.

Code

<?xml version='1.0' encoding='ISO-8859-1' standalone='yes'?>
<SecureDownload>
  <file name='abc.txt'>
    <chunk>26b6350aacc2c3105b7c5628080bae0b</chunk>
    <chunk>ab070fbcc4aeb75e742a3d0e71ab03db</chunk>
  </file>
</SecureDownload>

Building the XML Package

Notes

This algorithm performs an ugly but functional 'split' operation (notice the tricky double casting), then begins writing the XML file. Chunks resulting from the split are saved into files where the filename is equal to the MD5 hash. The MD5 hash of the file written is equal to the MD5 hash of the byte array before writing (precisely what we need).

Code

private static void createXmlPackage(final String inputFile, final String outputFile, final int chunksize, boolean showOutput) throws IOException, NoSuchAlgorithmException {
  File in = new File(inputFile);
  InputStream is = new FileInputStream(in);
  long filesize = in.length();

  // Round up so that a final, shorter chunk is counted too.
  double numberOfChunksDouble = ((double) filesize) / ((double) chunksize);
  int numberOfChunks = (int) (Math.ceil(numberOfChunksDouble));

  if (showOutput)
    System.out.println(inputFile + " is " + filesize + " bytes, with chunk size " + chunksize + " this results in " + numberOfChunks + " chunks.");

  FileOutputStream xmlout = new FileOutputStream(outputFile);
  xmlout.write("<?xml version='1.0' encoding='ISO-8859-1' standalone='yes'?>\n".getBytes());
  xmlout.write("<SecureDownload>\n".getBytes());
  xmlout.write(("\t<file name='" + inputFile + "'>\n").getBytes());

  for (int i = 0; i < numberOfChunks; i++) {
    if (showOutput)
      System.out.print("Chunk " + i + "\t");
    int readBytes = 0;
    int c;
    byte chunk[] = new byte[chunksize];

    while (readBytes < chunksize && (c = is.read()) != -1) {
      // notice that the cast does not occur until here.
      // If c was type byte then the check for -1 causes
      // strange behavior on binary files.
      chunk[readBytes++] = (byte) c;
    }
    if (showOutput)
      System.out.print(readBytes + " bytes\t");

    // due to caveat with hashByteArray method, create new byte array of length equal to number of bytes read.
    byte hashArray[] = new byte[readBytes];
    for (int k = 0; k < readBytes; k++) {
      hashArray[k] = chunk[k];
    }
    String signature = hashByteArray(hashArray);
    if (showOutput)
      System.out.println(signature);

    xmlout.write(("\t\t<chunk>" + signature + "</chunk>\n").getBytes());

    // Save the chunk in a file named after its own MD5 sum.
    int writeBytes = 0;
    FileOutputStream out = new FileOutputStream(signature);
    while (writeBytes < hashArray.length) {
      out.write(hashArray[writeBytes++]);
    }
    out.close();
  }

  is.close();
  xmlout.write("\t</file>\n</SecureDownload>\n\n".getBytes());
  xmlout.close();
}

Result

C:\Users\John\Desktop\md5test>java -classpath md5.jar com.wjholden.md5test.Md5Test -xmlpkg s1.png s1.xml 20000
s1.png is 117608 bytes, with chunk size 20000 this results in 6 chunks.
Chunk 0 20000 bytes     6de1a7f670ea9fcfba815ae638e59a3b
Chunk 1 20000 bytes     95db88aa33c8553566c784e8b309725e
Chunk 2 20000 bytes     e6123866650a1a545124da921934f137
Chunk 3 20000 bytes     92d1cc26e3cfb09c6e37571be9f7c2a3
Chunk 4 20000 bytes     95985139b60373118010813960aae632
Chunk 5 17608 bytes     b36d6bd2ef406f9242655eca42fdc6bf
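
The double-cast ceiling computation in createXmlPackage could also be written with integer arithmetic alone. This is only a sketch under the assumption that the chunk size is positive; the class ChunkCountDemo and the helper chunkCount are hypothetical names.

```java
public class ChunkCountDemo {
    // Ceiling division with integers only; assumes chunkSize > 0 and fileSize >= 0.
    public static long chunkCount(long fileSize, long chunkSize) {
        return (fileSize + chunkSize - 1) / chunkSize;
    }

    public static void main(String[] args) {
        System.out.println(chunkCount(117608, 20000)); // 6, matching the run above
        System.out.println(117608 - 5 * 20000);        // 17608 bytes in the last chunk
    }
}
```

Staying in integer arithmetic avoids any rounding concerns for very large files, at the cost of being slightly less obvious to read.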

Reconstructing the XML Package

Notes

Now you've got an XML file and a corresponding set of split data files, where each 'chunk' has a filename equal to its MD5 sum. Time to reconstruct the original file. The call to myMd5.equals(hashFile(myMd5, false)) is where the computed hash of each chunk file is checked against the filename given in the XML. I'm no expert on parsing XML in Java, but this code, hacked together from bits and pieces on the internet, works.

Code

private static void reassembleXmlPkg(final String xmlpkg, boolean showOutput) throws ParserConfigurationException, SAXException, IOException, NoSuchAlgorithmException {
  DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
  DocumentBuilder db = dbf.newDocumentBuilder();
  Document doc = db.parse(new File(xmlpkg));

  doc.getDocumentElement().normalize();
  if (showOutput)
    System.out.println("Root element of " + xmlpkg + " is " + doc.getDocumentElement().getNodeName());

  NodeList listOfFiles = doc.getElementsByTagName("file");
  int totalFiles = listOfFiles.getLength();
  if (showOutput)
    System.out.println("Total no of files = " + totalFiles);

  for (int s = 0; s < totalFiles; s++) {
    Node fileNode = listOfFiles.item(s);

    if (fileNode.getNodeType() == Node.ELEMENT_NODE) {
      if (showOutput)
        System.out.println("Node " + s + " is an ELEMENT_NODE.");
      Element e = (Element) fileNode;
      if (showOutput)
        System.out.println("\t" + e.getAttribute("name"));
      FileOutputStream out = new FileOutputStream(e.getAttribute("name"));

      NodeList chunksList = e.getElementsByTagName("chunk");
      int numberOfChunks = chunksList.getLength();
      if (showOutput)
        System.out.println("\t\tNumber of chunks = " + numberOfChunks);

      for (int t = 0; t < numberOfChunks; t++) {
        /*
         * This is pretty lame. You have to get a list of the
         * 'chunk' items before you can get the value out of the
         * Node. I guess there's something I don't yet understand
         * about XML (or maybe just the DOM parser).
         */
        Element chunkElement = (Element) chunksList.item(t);
        NodeList chunkListFinal = chunkElement.getChildNodes();
        Node chunk = chunkListFinal.item(0);
        if (showOutput)
          System.out.println("\t\t" + chunk.getNodeValue());

        String myMd5 = chunk.getNodeValue();
        // Each chunk file is named after its MD5 sum, so re-hash it and compare.
        if (myMd5.equals(hashFile(myMd5, false))) {
          int c;
          InputStream is = new FileInputStream(chunk.getNodeValue());
          while ((c = is.read()) != -1) {
            out.write(c);
          }
          is.close();
        } else {
          System.err.println(myMd5 + " failed hashing check. Aborting write.");
          out.close();
          return;
        }
      }
      out.close();
    }
  }
}

Result

C:\Users\John\Desktop\md5test>move s1.png s1_original.png
        1 file(s) moved.

C:\Users\John\Desktop\md5test>java -classpath md5.jar com.wjholden.md5test.Md5Test -xmlasm s1.xml
Root element of s1.xml is SecureDownload
Total no of files = 1
Node 0 is an ELEMENT_NODE.
        s1.png
                Number of chunks = 6
                6de1a7f670ea9fcfba815ae638e59a3b
                95db88aa33c8553566c784e8b309725e
                e6123866650a1a545124da921934f137
                92d1cc26e3cfb09c6e37571be9f7c2a3
                95985139b60373118010813960aae632
                b36d6bd2ef406f9242655eca42fdc6bf

C:\Users\John\Desktop\md5test>md5sums s1.png s1_original.png

MD5sums 1.2 freeware for Win9x/ME/NT/2000/XP+
Copyright (C) 2001-2005 Jem Berkes - http://www.pc-tools.net/
Type md5sums -h for help

[Path] / filename                              MD5 sum
-------------------------------------------------------------------------------
[C:\Users\John\Desktop\md5test\]
s1.png                                         fc381f0763021cc474fe962e5c0c82f9
s1_original.png                                fc381f0763021cc474fe962e5c0c82f9
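
The comment in reassembleXmlPkg complains about having to fetch a child node before the chunk's value can be read. One possible shortcut, sketched here under the assumption that a DOM Level 3 parser is available (as it is in the JDK's built-in one), is Node.getTextContent(). The class ChunkListDemo and the helper readChunks are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ChunkListDemo {
    // Hypothetical helper: returns the text of every <chunk> element, in document order.
    public static List<String> readChunks(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new ByteArrayInputStream(xml.getBytes("ISO-8859-1")));
        NodeList chunks = doc.getElementsByTagName("chunk");
        List<String> hashes = new ArrayList<String>();
        for (int i = 0; i < chunks.getLength(); i++) {
            Element chunk = (Element) chunks.item(i);
            // getTextContent() concatenates the element's text children,
            // so the child Text node does not have to be fetched by hand.
            hashes.add(chunk.getTextContent().trim());
        }
        return hashes;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version='1.0' encoding='ISO-8859-1' standalone='yes'?>\n"
                + "<SecureDownload>\n"
                + "  <file name='abc.txt'>\n"
                + "    <chunk>26b6350aacc2c3105b7c5628080bae0b</chunk>\n"
                + "    <chunk>ab070fbcc4aeb75e742a3d0e71ab03db</chunk>\n"
                + "  </file>\n"
                + "</SecureDownload>\n";
        System.out.println(readChunks(xml));
    }
}
```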

Reconstructing the XML Package over the Web

Notes

We now have the necessary components to slightly modify reassembleXmlPkg to:
  1. Download the XML file.
  2. Use hashHttp instead of hashFile to download and hash the chunks.
One caveat here is that FTP can corrupt binary files if you transfer them in ASCII mode, so be careful when uploading! I believe (but cannot immediately confirm) that SCP doesn't suffer from this problem.

Code

private static void reassembleXmlPkgFromURI(final String uri, boolean showOutput) throws ParserConfigurationException, SAXException, IOException, NoSuchAlgorithmException {
  // Download the XML file, then parse what you receive.
  URL url = new URL(uri);
  InputStream is = url.openStream();
  String xmlpkg = uri.substring(uri.lastIndexOf('/') + 1);
  FileOutputStream out = new FileOutputStream(xmlpkg);
  int c;
  while ((c = is.read()) != -1) {
    out.write(c);
  }
  out.close();
  is.close();

  DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
  DocumentBuilder db = dbf.newDocumentBuilder();
  Document doc = db.parse(new File(xmlpkg));

  doc.getDocumentElement().normalize();
  if (showOutput)
    System.out.println("Root element of " + xmlpkg + " is " + doc.getDocumentElement().getNodeName());

  NodeList listOfFiles = doc.getElementsByTagName("file");
  int totalFiles = listOfFiles.getLength();
  if (showOutput)
    System.out.println("Total no of files = " + totalFiles);

  for (int s = 0; s < totalFiles; s++) {
    Node fileNode = listOfFiles.item(s);

    if (fileNode.getNodeType() == Node.ELEMENT_NODE) {
      if (showOutput)
        System.out.println("Node " + s + " is an ELEMENT_NODE.");
      Element e = (Element) fileNode;
      if (showOutput)
        System.out.println("\t" + e.getAttribute("name"));
      FileOutputStream out1 = new FileOutputStream(e.getAttribute("name"));

      NodeList chunksList = e.getElementsByTagName("chunk");
      int numberOfChunks = chunksList.getLength();
      if (showOutput)
        System.out.println("\t\tNumber of chunks = " + numberOfChunks);

      for (int t = 0; t < numberOfChunks; t++) {
        Element chunkElement = (Element) chunksList.item(t);
        NodeList chunkListFinal = chunkElement.getChildNodes();
        Node chunk = chunkListFinal.item(0);
        if (showOutput)
          System.out.println("\t\t" + chunk.getNodeValue());

        String myMd5 = chunk.getNodeValue();
        // Use hashHttp to download and hash chunk at once.
        // The chunk is expected in the same remote directory as the XML file.
        String myUri = uri.substring(0, uri.lastIndexOf('/') + 1) + myMd5;
        if (showOutput)
          System.out.println("Downloading and hashing " + myUri);
        if (myMd5.equals(hashHttp(myUri, true, false))) {
          // hashHttp saved the chunk locally under its MD5 name; append it to the output.
          int c1;
          InputStream is1 = new FileInputStream(chunk.getNodeValue());
          while ((c1 = is1.read()) != -1) {
            out1.write(c1);
          }
          is1.close();
        } else {
          System.err.println(myMd5 + " failed hashing check. Aborting write.");
          out1.close();
          return;
        }
      }
      out1.close();
    }
  }
}
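
The chunk addresses above are built by cutting the package URI at its last slash. As an alternative sketch (not what the program does), java.net.URI.resolve performs the same relative lookup; the class ChunkUriDemo and the helper chunkUri are hypothetical names.

```java
import java.net.URI;

public class ChunkUriDemo {
    // Hypothetical helper: resolves a chunk name against the URI of the XML package,
    // replacing the last path segment (the XML filename) with the chunk name.
    public static String chunkUri(String packageUri, String chunkName) {
        return URI.create(packageUri).resolve(chunkName).toString();
    }

    public static void main(String[] args) {
        System.out.println(chunkUri("http://wjholden.com/md5/s1.xml",
                "6de1a7f670ea9fcfba815ae638e59a3b"));
    }
}
```

For plain hexadecimal chunk names the two approaches should give the same address; URI.resolve mainly helps if the package URI ever carries a query string or fragment.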

Result

C:\Users\John\Desktop\md5test>curl -o s1.png http://wjholden.com/nmap/s1.png
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  114k  100  114k    0     0  33961      0  0:00:03  0:00:03 --:--:-- 36040

C:\Users\John\Desktop\md5test>md5sums s1.png

MD5sums 1.2 freeware for Win9x/ME/NT/2000/XP+
Copyright (C) 2001-2005 Jem Berkes - http://www.pc-tools.net/
Type md5sums -h for help

[Path] / filename                              MD5 sum
-------------------------------------------------------------------------------
[C:\Users\John\Desktop\md5test\]
s1.png                                         fc381f0763021cc474fe962e5c0c82f9

C:\Users\John\Desktop\md5test>java -classpath md5.jar com.wjholden.md5test.Md5Test -file s1.png
Digest = fc381f0763021cc474fe962e5c0c82f9

C:\Users\John\Desktop\md5test>java -classpath md5.jar com.wjholden.md5test.Md5Test -http http://wjholden.com/nmap/s1.png
Digest = fc381f0763021cc474fe962e5c0c82f9

C:\Users\John\Desktop\md5test>java -classpath md5.jar com.wjholden.md5test.Md5Test -
java -classpath md5test.jar com.wjholden.md5test.Md5Test [-file|-http|-wget] [filename|uri]
java -classpath md5test.jar com.wjholden.md5test.Md5Test [-xmlpkg] [input] [output] [chunk size in bytes]
java -classpath md5test.jar com.wjholden.md5test.Md5Test [-xmlasm] [filename]
java -classpath md5test.jar com.wjholden.md5test.Md5Test [-xmlhttp] [uri]

C:\Users\John\Desktop\md5test>java -classpath md5.jar com.wjholden.md5test.Md5Test -xmlpkg s1.png s1.xml 20000
s1.png is 117608 bytes, with chunk size 20000 this results in 6 chunks.
Chunk 0 20000 bytes     6de1a7f670ea9fcfba815ae638e59a3b
Chunk 1 20000 bytes     95db88aa33c8553566c784e8b309725e
Chunk 2 20000 bytes     e6123866650a1a545124da921934f137
Chunk 3 20000 bytes     92d1cc26e3cfb09c6e37571be9f7c2a3
Chunk 4 20000 bytes     95985139b60373118010813960aae632
Chunk 5 17608 bytes     b36d6bd2ef406f9242655eca42fdc6bf

C:\Users\John\Desktop\md5test>del s1.png

C:\Users\John\Desktop\md5test>java -classpath md5.jar com.wjholden.md5test.Md5Test -xmlasm s1.xml
Root element of s1.xml is SecureDownload
Total no of files = 1
Node 0 is an ELEMENT_NODE.
        s1.png
                Number of chunks = 6
                6de1a7f670ea9fcfba815ae638e59a3b
                95db88aa33c8553566c784e8b309725e
                e6123866650a1a545124da921934f137
                92d1cc26e3cfb09c6e37571be9f7c2a3
                95985139b60373118010813960aae632
                b36d6bd2ef406f9242655eca42fdc6bf

C:\Users\John\Desktop\md5test>java -classpath md5.jar com.wjholden.md5test.Md5Test -file s1.png
Digest = fc381f0763021cc474fe962e5c0c82f9

C:\Users\John\Desktop\md5test>del s1.png

C:\Users\John\Desktop\md5test>del s1.xml

C:\Users\John\Desktop\md5test>REM I've just uploaded all these files to my webserver.

C:\Users\John\Desktop\md5test>REM Note that you can really screw yourself over with FTP if you don't switch to BIN or ASCII.

C:\Users\John\Desktop\md5test>java -classpath md5.jar com.wjholden.md5test.Md5Test -xmlhttp http://wjholden.com/md5/s1.xml
Root element of s1.xml is SecureDownload
Total no of files = 1
Node 0 is an ELEMENT_NODE.
        s1.png
                Number of chunks = 6
                6de1a7f670ea9fcfba815ae638e59a3b
Downloading and hashing http://wjholden.com/md5/6de1a7f670ea9fcfba815ae638e59a3b
                95db88aa33c8553566c784e8b309725e
Downloading and hashing http://wjholden.com/md5/95db88aa33c8553566c784e8b309725e
                e6123866650a1a545124da921934f137
Downloading and hashing http://wjholden.com/md5/e6123866650a1a545124da921934f137
                92d1cc26e3cfb09c6e37571be9f7c2a3
Downloading and hashing http://wjholden.com/md5/92d1cc26e3cfb09c6e37571be9f7c2a3
                95985139b60373118010813960aae632
Downloading and hashing http://wjholden.com/md5/95985139b60373118010813960aae632
                b36d6bd2ef406f9242655eca42fdc6bf
Downloading and hashing http://wjholden.com/md5/b36d6bd2ef406f9242655eca42fdc6bf

C:\Users\John\Desktop\md5test>java -classpath md5.jar com.wjholden.md5test.Md5Test -file s1.png
Digest = fc381f0763021cc474fe962e5c0c82f9

Conclusion

Acknowledgements

This software was written in Eclipse with Oracle Java SE 6.
This software depends on the Apache Commons Codec.
Syntax highlighting for this web page is provided by Quick Highlighter.

Downloads

Here is the source code, executable JAR, and JavaDoc.
I hope this is, in any way, useful to you. It's been a pleasure to write.

Code

  1. package com.wjholden.md5test;
  2.  
  3. import java.io.File;
  4. import java.io.FileInputStream;
  5. import java.io.FileOutputStream;
  6. import java.io.IOException;
  7. import java.io.InputStream;
  8. import java.math.BigInteger;
  9. import java.net.URL;
  10. import java.security.DigestInputStream;
  11. import java.security.MessageDigest;
  12. import java.security.NoSuchAlgorithmException;
  13.  
  14. import javax.xml.parsers.DocumentBuilder;
  15. import javax.xml.parsers.DocumentBuilderFactory;
  16. import javax.xml.parsers.ParserConfigurationException;
  17.  
  18. import org.apache.commons.codec.binary.Hex;
  19. import org.w3c.dom.Document;
  20. import org.w3c.dom.Element;
  21. import org.w3c.dom.Node;
  22. import org.w3c.dom.NodeList;
  23. import org.xml.sax.SAXException;
  24.  
  25. /**
  26.  * &nbsp;&nbsp;&lt;?xml version='1.0' encoding='ISO-8859-1' standalone='yes'?&gt;<br>
  27.  * &nbsp;&nbsp;&lt;SecureDownload&gt;<br>
  28.  * &nbsp;&nbsp;&nbsp;&nbsp;&lt;file name='abc.txt'&gt;<br>
  29.  * &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;chunk&gt;26b6350aacc2c3105b7c5628080bae0b&lt;/chunk&gt;<br>
  30.  * &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;chunk&gt;ab070fbcc4aeb75e742a3d0e71ab03db&lt;/chunk&gt;<br>
  31.  * &nbsp;&nbsp;&nbsp;&nbsp;&lt;/file&gt;<br>
  32.  * &nbsp;&nbsp;&lt;/SecureDownload&gt;<br>
  33.  * @author William John Holden (wjholden@gmail.com)
  34.  * @version 1
  35.  */
  36. public class Md5Test {
  37.  
  38.   private static final String USAGE = "java -classpath md5test.jar com.wjholden.md5test.Md5Test [-file|-http|-wget] [filename|uri]\n"
  39.       + "java -classpath md5test.jar com.wjholden.md5test.Md5Test [-xmlpkg] [input] [output] [chunk size in bytes]\n"
  40.       + "java -classpath md5test.jar com.wjholden.md5test.Md5Test [-xmlasm] [filename]\n"
  41.       + "java -classpath md5test.jar com.wjholden.md5test.Md5Test [-xmlhttp] [uri]";
  42.  
  43.   /**
  44.    * Calculate the MD5 hash of a file.
  45.    * @param filename
  46.    * @param showOutput
  47.    * @return
  48.    * @throws NoSuchAlgorithmException
  49.    * @throws IOException
  50.    */
  51.   private static String hashFile(final String filename, boolean showOutput) throws NoSuchAlgorithmException, IOException {
  52.     MessageDigest md = MessageDigest.getInstance("MD5");
  53.     InputStream is = new FileInputStream(filename);
  54.     is = new DigestInputStream(is, md);
  55.     while (is.read() != -1)
  56.       ;
  57.     is.close();
  58.     String signature = new BigInteger(1, md.digest()).toString(16);
  59.     while (signature.length() < 32) {
  60.       signature = "0" + signature;
  61.     }
  62.     if (showOutput)
  63.       System.out.println("Digest = " + signature);
  64.     return (signature);
  65.   }
  66.  
  67.   /**
  68.    * Grabs a single file from the world wide web and calculates it's hash as it downloads.
  69.    * @param uri
  70.    * @param writeFile If true, actually save the file to the current working directory, otherwise discard data.
  71.    * @param showOutput
  72.    * @throws NoSuchAlgorithmException
  73.    * @throws IOException
  74.    */
  75.   private static String hashHttp(final String uri, final boolean writeFile, boolean showOutput) throws NoSuchAlgorithmException, IOException {
  76.     MessageDigest md = MessageDigest.getInstance("MD5");
  77.     URL url = new URL(uri);
  78.     InputStream is = url.openStream();
  79.     is = new DigestInputStream(is, md);
  80.     if (writeFile) { // save output into some file.
  81.       String filename = uri.substring(uri.lastIndexOf('/') + 1);
  82.       if (showOutput)
  83.         System.out.println("Writing file to " + filename);
  84.       FileOutputStream out = new FileOutputStream(filename);
  85.       int i;
  86.       while ((i = is.read()) != -1) {
  87.         out.write(i);
  88.       }
  89.       out.close();
  90.     } else {
  91.       while (is.read() != -1)
  92.         ; // discard data downloaded
  93.     }
  94.     is.close();
  95.     String signature = new BigInteger(1, md.digest()).toString(16);
  96.     while (signature.length() < 32) {
  97.       signature = "0" + signature;
  98.     }
  99.     if (showOutput)
  100.       System.out.println("Digest = " + signature);
  101.     return signature;
  102.   }
  103.  
  104.   /**
  105.    * Doesn't actually perform the hashes, this was just me developing a method to parse the XML.
  106.    * Use reassembleXmlPkg.
  107.    * @param filename
  108.    * @param showOutput
  109.    * @throws ParserConfigurationException
  110.    * @throws SAXException
  111.    * @throws IOException
  112.    */
  113.   private static void hashXml(final String filename, boolean showOutput) throws ParserConfigurationException, SAXException, IOException {
  114.     DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
  115.     DocumentBuilder db = dbf.newDocumentBuilder();
  116.     Document doc = db.parse(new File(filename));
  117.  
  118.     doc.getDocumentElement().normalize();
  119.     if (showOutput)
  120.       System.out.println("Root element of " + filename + " is " + doc.getDocumentElement().getNodeName());
  121.  
  122.     NodeList listOfFiles = doc.getElementsByTagName("file");
  123.     int totalFiles = listOfFiles.getLength();
  124.     if (showOutput)
  125.       System.out.println("Total no of files = " + totalFiles);
  126.  
  127.     for (int s = 0; s < totalFiles; s++) {
  128.       Node fileNode = listOfFiles.item(s);
  129.  
  130.       if (fileNode.getNodeType() == Node.ELEMENT_NODE) {
  131.         if (showOutput)
  132.           System.out.println("Node " + s + " is an ELEMENT_NODE.");
  133.         Element e = (Element) fileNode;
  134.         if (showOutput)
  135.           System.out.println("\t" + e.getAttribute("name"));
  136.  
  137.         NodeList chunksList = e.getElementsByTagName("chunk");
  138.         int numberOfChunks = chunksList.getLength();
  139.         if (showOutput)
  140.           System.out.println("\t\tNumber of chunks = " + numberOfChunks);
  141.  
  142.         for (int t = 0; t < numberOfChunks; t++) {
  143.           /*
  144.            * This is pretty lame. You have to get a list of the
  145.            * 'chunk' items before you can get the value out of the
  146.            * Node. I guess there's something I don't yet understand
  147.            * about XML (or maybe just the SAX parser).
  148.            */
  149.           Element chunkElement = (Element) chunksList.item(t);
  150.           NodeList chunkListFinal = chunkElement.getChildNodes();
  151.           Node chunk = chunkListFinal.item(0);
  152.           if (showOutput)
  153.             System.out.println("\t\t" + chunk.getNodeValue());
  154.         }
  155.       }
  156.     }
  157.   }
  158.  
  159.   /**
  160.    * Calculate the MD5 sum of a byte array.<br><br>
  161.    * Warning: this method has a very tricky bug. If you're using this before writing a
  162.    * byte array to file/network, make sure you don't give this method a longer array than
  163.    * you need or the hash will be different.<br>
  164.    * &nbsp;&nbsp;&nbsp;byte tooBig[8] = "Hello".getBytes()<br>
  165.    * &nbsp;&nbsp;&nbsp;byte justRight[6] = "Hello".getBytes()<br>
  166.    * Hashing the two arrays shown above will result in different results. Stay safe.
  167.    * @param array
  168.    * @return
  169.    * @throws NoSuchAlgorithmException
  170.    */
  171.   private static String hashByteArray(byte array[]) throws NoSuchAlgorithmException {
  172.     MessageDigest md = MessageDigest.getInstance("MD5");
  173.     String signature = new String(Hex.encodeHex(md.digest(array)));
  174.     return signature;
  175.   }
  176.  
  177.   /**
   * Build an XML file AND split inputFile into chunk files, each named after its own MD5 hash.
   * @param inputFile path of the file to split
   * @param outputFile path of the XML file to write
   * @param chunksize maximum chunk size in bytes
   * @param showOutput print progress to standard output if true
   * @throws IOException
   * @throws NoSuchAlgorithmException
   */
  private static void createXmlPackage(final String inputFile, final String outputFile, final int chunksize, boolean showOutput) throws IOException, NoSuchAlgorithmException {
    File in = new File(inputFile);
    InputStream is = new FileInputStream(in);
    long filesize = in.length();

    double numberOfChunksDouble = ((double) filesize) / ((double) chunksize);
    int numberOfChunks = (int) (Math.ceil(numberOfChunksDouble));

    if (showOutput)
      System.out.println(inputFile + " is " + filesize + " bytes, with chunk size " + chunksize + " this results in " + numberOfChunks + " chunks.");

    // Encode the XML with the charset named in its own declaration
    // instead of relying on the platform default.
    final String encoding = "ISO-8859-1";
    FileOutputStream xmlout = new FileOutputStream(outputFile);
    xmlout.write("<?xml version='1.0' encoding='ISO-8859-1' standalone='yes'?>\n".getBytes(encoding));
    xmlout.write("<SecureDownload>\n".getBytes(encoding));
    xmlout.write(("\t<file name='" + inputFile + "'>\n").getBytes(encoding));

    for (int i = 0; i < numberOfChunks; i++) {
      if (showOutput)
        System.out.print("Chunk " + i + "\t");
      int readBytes = 0;
      int c;
      byte chunk[] = new byte[chunksize];

      while (readBytes < chunksize && (c = is.read()) != -1) {
        // notice that the cast does not occur until here.
        // If c was type byte then the check for -1 causes
        // strange behavior on binary files.
        chunk[readBytes++] = (byte) c;
      }
      if (showOutput)
        System.out.print(readBytes + " bytes\t");

      // due to caveat with hashByteArray method, create new byte array of length equal to number of bytes read.
      byte hashArray[] = new byte[readBytes];
      System.arraycopy(chunk, 0, hashArray, 0, readBytes);
      String signature = hashByteArray(hashArray);
      if (showOutput)
        System.out.println(signature);

      xmlout.write(("\t\t<chunk>" + signature + "</chunk>\n").getBytes(encoding));

      // Write the chunk to a file named after its signature.
      FileOutputStream out = new FileOutputStream(signature);
      out.write(hashArray);
      out.close();
    }

    is.close();
    xmlout.write("\t</file>\n</SecureDownload>\n\n".getBytes(encoding));
    xmlout.close();
  }

  /**
   * On the local filesystem, reassemble a file that was previously broken
   * up by createXmlPackage, where xmlpkg is the filename of the XML file.
   * @param xmlpkg
   * @param showOutput
   * @throws ParserConfigurationException
   * @throws SAXException
   * @throws IOException
   * @throws NoSuchAlgorithmException
   */
  private static void reassembleXmlPkg(final String xmlpkg, boolean showOutput) throws ParserConfigurationException, SAXException, IOException, NoSuchAlgorithmException {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document doc = db.parse(new File(xmlpkg));

    doc.getDocumentElement().normalize();
    if (showOutput)
      System.out.println("Root element of " + xmlpkg + " is " + doc.getDocumentElement().getNodeName());

    NodeList listOfFiles = doc.getElementsByTagName("file");
    int totalFiles = listOfFiles.getLength();
    if (showOutput)
      System.out.println("Total no of files = " + totalFiles);

    for (int s = 0; s < totalFiles; s++) {
      Node fileNode = listOfFiles.item(s);

      if (fileNode.getNodeType() == Node.ELEMENT_NODE) {
        if (showOutput)
          System.out.println("Node " + s + " is an ELEMENT_NODE.");
        Element e = (Element) fileNode;
        if (showOutput)
          System.out.println("\t" + e.getAttribute("name"));
        FileOutputStream out = new FileOutputStream(e.getAttribute("name"));

        NodeList chunksList = e.getElementsByTagName("chunk");
        int numberOfChunks = chunksList.getLength();
        if (showOutput)
          System.out.println("\t\tNumber of chunks = " + numberOfChunks);

        for (int t = 0; t < numberOfChunks; t++) {
          /*
           * The hash is not stored on the 'chunk' element itself but
           * in its child text node, so fetch the element's children
           * and read the value of the first one. (This is the DOM
           * API, not SAX; Element.getTextContent() would return the
           * same text directly.)
           */
          Element chunkElement = (Element) chunksList.item(t);
          NodeList chunkListFinal = chunkElement.getChildNodes();
          Node chunk = chunkListFinal.item(0);
          if (showOutput)
            System.out.println("\t\t" + chunk.getNodeValue());

          String myMd5 = chunk.getNodeValue();
          if (myMd5.equals(hashFile(myMd5, false))) {
            int c;
            InputStream is = new FileInputStream(myMd5);
            while ((c = is.read()) != -1) {
              out.write(c);
            }
            is.close();
          } else {
            System.err.println(myMd5 + " failed hashing check. Aborting write.");
            out.close();
            return;
          }
        }
        out.close();
      }
    }
  }

  /**
   * This is mostly a copy-and-paste job from my earlier hashHttp and reassembleXmlPkg methods.
   * @param uri
   * @param showOutput
   * @throws ParserConfigurationException
   * @throws SAXException
   * @throws IOException
   * @throws NoSuchAlgorithmException
   */
  private static void reassembleXmlPkgFromURI(final String uri, boolean showOutput) throws ParserConfigurationException, SAXException, IOException, NoSuchAlgorithmException {
    // Download the XML file, then parse what you receive.
    URL url = new URL(uri);
    InputStream is = url.openStream();
    String xmlpkg = uri.substring(uri.lastIndexOf('/') + 1);
    FileOutputStream out = new FileOutputStream(xmlpkg);
    int c;
    while ((c = is.read()) != -1) {
      out.write(c);
    }
    out.close();
    is.close();

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document doc = db.parse(new File(xmlpkg));

    doc.getDocumentElement().normalize();
    if (showOutput)
      System.out.println("Root element of " + xmlpkg + " is " + doc.getDocumentElement().getNodeName());

    NodeList listOfFiles = doc.getElementsByTagName("file");
    int totalFiles = listOfFiles.getLength();
    if (showOutput)
      System.out.println("Total no of files = " + totalFiles);

    for (int s = 0; s < totalFiles; s++) {
      Node fileNode = listOfFiles.item(s);

      if (fileNode.getNodeType() == Node.ELEMENT_NODE) {
        if (showOutput)
          System.out.println("Node " + s + " is an ELEMENT_NODE.");
        Element e = (Element) fileNode;
        if (showOutput)
          System.out.println("\t" + e.getAttribute("name"));
        FileOutputStream out1 = new FileOutputStream(e.getAttribute("name"));

        NodeList chunksList = e.getElementsByTagName("chunk");
        int numberOfChunks = chunksList.getLength();
        if (showOutput)
          System.out.println("\t\tNumber of chunks = " + numberOfChunks);

        for (int t = 0; t < numberOfChunks; t++) {
          Element chunkElement = (Element) chunksList.item(t);
          NodeList chunkListFinal = chunkElement.getChildNodes();
          Node chunk = chunkListFinal.item(0);
          if (showOutput)
            System.out.println("\t\t" + chunk.getNodeValue());

          String myMd5 = chunk.getNodeValue();
          // Use hashHttp to download and hash chunk at once.
          // The chunk lives next to the XML file, under its hash as filename.
          String myUri = uri.substring(0, uri.lastIndexOf('/') + 1) + myMd5;
          if (showOutput)
            System.out.println("Downloading and hashing " + myUri);
          if (myMd5.equals(hashHttp(myUri, true, false))) {
            int c1;
            InputStream is1 = new FileInputStream(myMd5);
            while ((c1 = is1.read()) != -1) {
              out1.write(c1);
            }
            is1.close();
          } else {
            System.err.println(myMd5 + " failed hashing check. Aborting write.");
            out1.close();
            return;
          }
        }
        out1.close();
      }
    }
  }

  public static void main(String[] args) {
    try {
      if (args.length == 2 && "-file".equals(args[0])) {
        hashFile(args[1], true);
      } else if (args.length == 2 && "-http".equals(args[0])) {
        hashHttp(args[1], false, true);
      } else if (args.length == 2 && "-wget".equals(args[0])) {
        hashHttp(args[1], true, true);
      } else if (args.length == 2 && "-xml".equals(args[0])) {
        hashXml(args[1], true);
      } else if (args.length == 4 && "-xmlpkg".equals(args[0])) {
        createXmlPackage(args[1], args[2], Integer.parseInt(args[3]), true);
      } else if (args.length == 2 && "-xmlasm".equals(args[0])) {
        reassembleXmlPkg(args[1], true);
      } else if (args.length == 2 && "-xmlhttp".equals(args[0])) {
        reassembleXmlPkgFromURI(args[1], true);
      } else {
        System.out.println(USAGE);
      }
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}
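The chunking loop in createXmlPackage reads one byte at a time and then copies the chunk into a second array before hashing it. The following is a minimal sketch of one alternative, assuming bulk reads are acceptable: it fills a buffer with InputStream.read(byte[], int, int) and passes only the bytes actually read to MessageDigest.update, so no copy is needed. The class ChunkHasher and its methods readChunk, md5Hex, and hashChunks are hypothetical names invented for this illustration and are not part of Md5Test. The sketch also pads the hex string to 32 digits, because BigInteger.toString(16) omits leading zeros.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Md5Test.
public class ChunkHasher {

  // Fill buf from the stream until it is full or the stream ends.
  // Returns the number of bytes placed in buf (0 at end of stream).
  public static int readChunk(InputStream is, byte[] buf) throws IOException {
    int total = 0;
    while (total < buf.length) {
      int n = is.read(buf, total, buf.length - total);
      if (n == -1)
        break;
      total += n;
    }
    return total;
  }

  // MD5 of the first len bytes of data, as exactly 32 hex digits.
  public static String md5Hex(byte[] data, int len) throws NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("MD5");
    md.update(data, 0, len);
    // signum 1 treats the digest as a non-negative number;
    // %032x restores any leading zeros.
    return String.format("%032x", new BigInteger(1, md.digest()));
  }

  // Hash each chunk of the stream in order and return the signatures.
  public static List<String> hashChunks(InputStream is, int chunksize) throws IOException, NoSuchAlgorithmException {
    List<String> signatures = new ArrayList<String>();
    byte[] buf = new byte[chunksize];
    int n;
    while ((n = readChunk(is, buf)) > 0) {
      signatures.add(md5Hex(buf, n));
    }
    return signatures;
  }

  public static void main(String[] args) throws Exception {
    // Seven bytes with a chunk size of three gives chunks "abc", "abc", "a".
    byte[] data = "abcabca".getBytes("US-ASCII");
    for (String signature : hashChunks(new ByteArrayInputStream(data), 3)) {
      System.out.println(signature);
    }
  }
}
```

The last chunk in this example, the single byte "a", hashes to 0cc175b9c0f1b6a831c399e269772661. Without the padding its leading zero would be dropped, and a file named after that shorter string would no longer match a signature produced by a tool such as md5sum.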
