How to extract the .jpg/.png components of an .hpi file

binaryfiles

I stumbled across my rather ancient Photo Objects disks and sadly found out that the company (Hemera) no longer provides support for them. This has left me with a whole pile of .hpi files. Luckily, I found this information on extracting the JPG and PNG components of the file.

Unfortunately, I haven't been able to get it to work. Can anyone figure out what's wrong with this code? I'd be happy with a PHP or Python solution if Perl isn't your thing. 🙂

open(I, "$name") || die;
binmode(I);
$_ = <I>;
close(I);

my ($j, $p) = m|^.{32}(.*)(\211PNG.*)$|s;
open(J, ">$name.jpg") &&
    do { binmode(J); print J $j; close J; };
open(P, ">$name.png") &&
    do { binmode(P); print P $p; close P; };

The hexdump of the current test file I snagged off a CD is here, if it helps at all:

0000000 89 48 50 49 0d 0a 1a 0a 64 00 00 00 20 00 00 00
0000010 45 89 00 00 65 89 00 00 0a 21 00 00 00 d0 d0 00
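For reference, the 32 bytes in the dump can be decoded as little-endian 32-bit integers. The sketch below is only an illustration: the field names `j_off`, `j_len`, `p_off` and `p_len` are borrowed from the C program in the answer, and `unknown` and `trailer` are placeholder names for fields whose meaning is not documented here.

```python
import struct

# The 32 header bytes copied from the hexdump above.
header = bytes.fromhex(
    "89485049 0d0a1a0a 64000000 20000000"
    "45890000 65890000 0a210000 00d0d000"
)

magic = header[:8]
# Six little-endian unsigned 32-bit values follow the 8-byte magic.
# "unknown" and "trailer" are placeholder names (assumptions).
unknown, j_off, j_len, p_off, p_len, trailer = struct.unpack_from("<6I", header, 8)

print(magic)         # b'\x89HPI\r\n\x1a\n'
print(j_off, j_len)  # 32 35141
print(p_off, p_len)  # 35173 8458
```

Read this way, the JPEG starts right after the 32-byte header and the PNG starts where the JPEG ends (32 + 35141 = 35173).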

Best Solution

It seems the regexp is wrong. That's why I wrote a little C program to do it for me:
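One thing that may also be going on, offered as an observation rather than a confirmed diagnosis: Perl's `<I>` reads a single line by default, and the HPI magic contains a 0x0a byte at offset 5, so `$_` would hold only the first 6 bytes unless `$/` is undefined before the read. The Python sketch below mimics that on a made-up file (the payloads are fake, not real images) and shows the same pattern matching once the whole file is in memory.

```python
import io
import re

# Synthetic stand-in for an .hpi file: the real 8-byte magic, 24 filler
# header bytes, then made-up "JPEG" and "PNG" payloads.
header = b"\x89HPI\r\n\x1a\n" + bytes(24)
data = header + b"\xff\xd8fake-jpeg\xff\xd9" + b"\x89PNG\r\n\x1a\nfake-png"

# Same pattern as the Perl snippet; DOTALL plays the role of the /s flag.
pattern = re.compile(rb"^.{32}(.*)(\x89PNG.*)$", re.DOTALL)

# Reading one "line" stops at the first 0x0a, inside the magic itself.
first_line = io.BytesIO(data).readline()
print(len(first_line))            # 6
print(pattern.match(first_line))  # None

# With the whole file available, the pattern splits the two parts.
jpg, png = pattern.match(data).groups()
print(jpg)  # b'\xff\xd8fake-jpeg\xff\xd9'
print(png)  # b'\x89PNG\r\n\x1a\nfake-png'
```

Even when it matches, a pattern like this depends on the bytes `\211PNG` not turning up by chance elsewhere in the data, so reading the offsets from the header is the more robust route.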

#include <stdio.h>
#include <stdlib.h>

/* Largest JPEG or PNG chunk that fits in the buffer. */
#define MAX_SIZE 1048576

char stuff[MAX_SIZE];

int main (int argc, char **argv)
{
    /* The fields are read raw, so this assumes a little-endian machine
       where unsigned int is 4 bytes. */
    unsigned int j_off, j_len, p_off, p_len;
    FILE *fp, *jp, *pp;
    if (argc != 4) {
        fprintf (stderr, "usage: %s input.hpi output.jpg output.png\n", argv[0]);
        return EXIT_FAILURE;
    }
    /* "rb"/"wb" select binary mode, which matters on Windows. */
    fp = fopen (argv[1], "rb");
    if (!fp)    goto error;
    /* Offset and length of the JPEG, then of the PNG, start at byte 12. */
    if (fseek (fp, 12, SEEK_SET))   goto error;
    if (!fread (&j_off, 4, 1, fp))  goto error;
    if (!fread (&j_len, 4, 1, fp))  goto error;
    if (!fread (&p_off, 4, 1, fp))  goto error;
    if (!fread (&p_len, 4, 1, fp))  goto error;
    fprintf (stderr, "INFO %s \t%u %u %u %u\n",
        argv[1], j_off, j_len, p_off, p_len);
    if (j_len > MAX_SIZE || p_len > MAX_SIZE) {
        fprintf (stderr, "%s: Chunk size too big!\n", argv[1]);
        return EXIT_FAILURE;
    }

    /* Copy the JPEG part. */
    jp = fopen (argv[2], "wb");
    if (!jp)    goto error;
    if (fseek (fp, j_off, SEEK_SET))    goto error;
    if (!fread (stuff, j_len, 1, fp))   goto error;
    if (!fwrite (stuff, j_len, 1, jp))  goto error;
    fclose (jp);

    /* Copy the PNG (mask) part. */
    pp = fopen (argv[3], "wb");
    if (!pp)    goto error;
    if (fseek (fp, p_off, SEEK_SET))    goto error;
    if (!fread (stuff, p_len, 1, fp))   goto error;
    if (!fwrite (stuff, p_len, 1, pp))  goto error;
    fclose (pp);
    fclose (fp);
    return EXIT_SUCCESS;

error:
    perror (argv[1]);
    return EXIT_FAILURE;
}

It takes the command-line parameters input.hpi output.jpg output.png. The error handling is not 100% correct, but it is good enough to always tell you if something's wrong, and most of the time what it is. If the JPEG or PNG part of a file is larger than MAX_SIZE (1 MiB here), you will have to enlarge MAX_SIZE.
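Since the question also asked about Python, here is a rough equivalent of the C program as a sketch. It assumes the same layout the C code relies on (four little-endian 32-bit fields at byte 12); the function names `split_hpi` and `extract` are made up for this example.

```python
import struct


def split_hpi(data):
    """Return (jpeg_bytes, png_bytes) from the raw bytes of an .hpi file."""
    if len(data) < 28:
        raise ValueError("file too short for an HPI header")
    # The same four fields the C program reads at offset 12.
    j_off, j_len, p_off, p_len = struct.unpack_from("<4I", data, 12)
    if j_off + j_len > len(data) or p_off + p_len > len(data):
        raise ValueError("chunk extends past the end of the file")
    return data[j_off:j_off + j_len], data[p_off:p_off + p_len]


def extract(hpi_path, jpg_path, png_path):
    """Write the JPEG and PNG parts of hpi_path to the two output paths."""
    with open(hpi_path, "rb") as f:
        jpg, png = split_hpi(f.read())
    with open(jpg_path, "wb") as f:
        f.write(jpg)
    with open(png_path, "wb") as f:
        f.write(png)
```

Calling `extract("input.hpi", "output.jpg", "output.png")` mirrors the C program's command line. Because the whole file is read into memory, there is no fixed chunk-size limit to adjust.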

Here is a shell script which you can call with *.hpi:

#!/bin/bash

dest=<destination-folder>

for arg in "$@"
do
  # $dest/original and $dest/mask (and $dest/rgba, if used) must already exist
  base=$(basename "$arg" .hpi)
  <executable> "$arg" "$dest/original/$base.jpg" "$dest/mask/$base.png" 2>>"$dest/log"
  #composite -compose CopyOpacity "$dest/mask/$base.png" "$dest/original/$base.jpg" "$dest/rgba/$base.png"
done

The optional composite command (it comes with ImageMagick) will create a new PNG image with the mask applied as its alpha channel. Note that this file will be about 5 times larger than the original files.

Note that some HPI files come without a mask. In this case, my program will still work, but give an empty PNG file.
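The output layout used by the shell script (original/ for the JPEGs, mask/ for the PNGs) can be sketched in Python as well. The helper name `output_paths` is hypothetical. It takes the base name with `Path.stem`, which drops the directory and only the final extension, so a file name containing extra dots keeps them.

```python
from pathlib import Path


def output_paths(hpi_path, dest):
    """Map one .hpi path to its (jpg, png) output paths under dest."""
    base = Path(hpi_path).stem  # directory and final extension removed
    dest = Path(dest)
    return dest / "original" / f"{base}.jpg", dest / "mask" / f"{base}.png"


jpg_path, png_path = output_paths("red.apple.hpi", "out")
print(jpg_path.name)  # red.apple.jpg
print(png_path.name)  # red.apple.png
```

As with the shell script, the original/ and mask/ directories are assumed to exist before anything is written to them.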