Wednesday 22 February 2012

Making "exif2map.pl" recursively search


Recently Doppiamunnezza commented that it might be helpful if we could point the exif2map.pl script at a folder and have it automagically search all files below that for EXIF geotag data.

Being the code-monkey hack that I am, here's my quick/dirty solution ...

Code Listing

# START CODE


#!/usr/bin/perl -w

# Perl script to take the output of exiftool and conjure up a web link
# to google maps if the image has stored GPS lat/long info.

use strict;

use Image::ExifTool;
use Image::ExifTool::Location;
use Getopt::Long;
use HTML::QuickTable;
use File::Find;

# commented out for now - apparently File::Find can issue some weird warnings
#no warnings 'File::Find';

my $version = "exif2map.pl v2012.02.21";
my $help = ''; # help flag
my $htmloutput = ''; #html flag
my @filenames; # input files from -f flag
my @directories; # input directories from -dir flag (must use absolute paths)

my %file_listing; # stored results

GetOptions('help|h' => \$help,
    'html' => \$htmloutput,
    'f=s@' => \@filenames,
    'dir=s@' => \@directories);

if ($help||(@filenames == 0 && @directories == 0))
{
    print("\n$version\n");
    print("Perl script to take the output of exiftool and conjure up a web link\n");
    print("to google maps if the image has stored GPS lat/long info.\n");

    print("\nUsage: exif2map.pl [-h|help] [-f filename] [-dir directory] [-html]\n");
    print("-h|help .......... Help (print this information). Does not run anything else.\n");
    print("-f filename ...... File(s) to extract lat/long from\n");
    print("-dir directory ... Absolute path to folder containing file(s) to extract lat/long from\n");
    print("-html ............ Also output results as a timestamped html file in current directory\n");

    print("\nExample: exif2map.pl -f /cases/galloping-gonzo.jpg");
    print("\nExample: exif2map.pl -f /cases/krazy-kermit.jpg -dir /cases/rockin-rowlf-pics/ -html\n\n");
    print("Note: Outputs results to command line and (if specified) to a timestamped html file\n");
    print("in the current directory (e.g. exif2map-output-TIMESTAMP.html)\n\n");
   
    exit;
}

# Main processing loop
print("\n$version\n");

# Process filenames specified using the -f flag first
if (@filenames)
{
    foreach my $name (@filenames)
    {
        ProcessFilename($name);
    }
}

# Process folders specified using the -dir flag
# Note: Will NOT follow symbolic links to files
if (@directories)
{
    find(\&ProcessDir, @directories);
}

# If html output required AND we have actually retrieved some data ...
if ( ($htmloutput) && (keys(%file_listing) > 0) )
{   
    #timestamped output filename
    my $htmloutputfile = "exif2map-output-".time.".html";

    open(my $html_output_file, ">", $htmloutputfile) || die("Unable to open $htmloutputfile for writing: $!\n");

    my $htmltable = HTML::QuickTable->new(border => 1, labels => 1);

    # Added preceding "/" to "Filename" so that the HTML::QuickTable sorting doesn't result in
    # the column headings being re-ordered after / below a filename beginning with a "\".
    $file_listing{"/Filename"} = "GoogleMaps Link";

    print $html_output_file "<HTML>";
    print $html_output_file $htmltable->render(\%file_listing);
    print $html_output_file "<\/HTML>";

    close($html_output_file); # close the filehandle, not the filename string
    print("\nPlease refer to \"$htmloutputfile\" for a clickable link output table\n\n");
}

sub ProcessFilename
{
    my $filename = shift;

    if (-e $filename) #file must exist
    {
        my $exif = Image::ExifTool->new();
        # Extract all info from existing image
        if ($exif->ExtractInfo($filename))
        {
            # Ensure all 4 GPS params are present
            # ie GPSLatitude, GPSLatitudeRef, GPSLongitude, GPSLongitudeRef
            # The Ref values indicate North/South and East/West
            if ($exif->HasLocation())
            {
                my ($lat, $lon) = $exif->GetLocation();
                print("\n$filename contains Lat: $lat, Long: $lon\n");
                print("URL: http://maps.google.com/maps?q=$lat,+$lon($filename)&iwloc=A&hl=en\n");
                if ($htmloutput) # save GoogleMaps URL to global hashmap indexed by filename
                {
                    $file_listing{$filename} = "<A HREF = \"http://maps.google.com/maps?q=$lat,+$lon($filename)&iwloc=A&hl=en\"> http://maps.google.com/maps?q=$lat,+$lon($filename)&iwloc=A&hl=en</A>";
                }
                return 1;
            }
            else
            {
                print("\n$filename : No Location Info available!\n");
                return 0;
            }
        }
        else
        {
            print("\n$filename : Cannot Extract Info!\n");
            return 0;
        }
    }
    else
    {
        print("\n$filename does not exist!\n");
        return 0;
    }
}

sub ProcessDir
{
    # $File::Find::dir is the current directory name,
    # $_ is the current filename within that directory
    # $File::Find::name is the complete pathname to the file.
    my $filename = $File::Find::name; # should contain absolute path eg /cases/pics/krazy-kermit.jpg

    if (-f $filename) # must be a file not a directory name ...
    {
        my $exif = Image::ExifTool->new();
        # Extract all info from existing image
        if ($exif->ExtractInfo($filename))
        {
            # Ensure all 4 GPS params are present
            # ie GPSLatitude, GPSLatitudeRef, GPSLongitude, GPSLongitudeRef
            # The Ref values indicate North/South and East/West
            if ($exif->HasLocation())
            {
                my ($lat, $lon) = $exif->GetLocation();
                print("\n$filename contains Lat: $lat, Long: $lon\n");
                print("URL: http://maps.google.com/maps?q=$lat,+$lon($filename)&iwloc=A&hl=en\n");
                if ($htmloutput) # save GoogleMaps URL to global hashmap indexed by filename
                {
                    $file_listing{$filename} = "<A HREF = \"http://maps.google.com/maps?q=$lat,+$lon($filename)&iwloc=A&hl=en\"> http://maps.google.com/maps?q=$lat,+$lon($filename)&iwloc=A&hl=en</A>";
                }
                return 1;
            }
            else
            {
                print("\n$filename : No Location Info available!\n");
                return 0;
            }
        }
        else
        {
            print("\n$filename : Cannot Extract Info!\n");
            return 0;
        }
    }
}

# END CODE


Code Summary

The code mostly works as before - I've just added some extra code to handle any user specified folders.
It could probably be re-written so only one function was required but I reckon the code would become a bit harder to explain/understand. Plus, I can be a pretty lazy monkey ;)
Anyhoo, I've added a function called "ProcessDir" which gets called for each file/directory under the user specified folder(s). It is essentially the same code as "ProcessFilename" except it gets its filenames from the File::Find module. File::Find's "find" function (ie "find(\&ProcessDir, @directories);") walks each directory in the given array and calls "ProcessDir" for every directory/file it finds. Consequently, "ProcessDir" should only run our EXIF checks if it's looking at a file (that's what the "if (-f $filename)" condition is for). The good news is that "File::Find" ships with Perl itself, so it is already available on the SIFT VM and we don't need to install it separately.
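For anyone curious about the "only one function" idea, here's a rough sketch of how "ProcessDir" could shrink to a thin wrapper that hands plain files to "ProcessFilename". To keep it runnable without Image::ExifTool, the "ProcessFilename" below is a hypothetical stand-in that just records the path, and the throwaway test folder (with its "a.jpg" and "subpics/b.jpg") is made up for illustration. It's not the script's actual code, just one way it might look.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical stand-in for the script's ProcessFilename(); it only records
# the path so this sketch runs without Image::ExifTool installed.
my @seen;
sub ProcessFilename
{
    my $filename = shift;
    push(@seen, $filename);
    return 1;
}

# Thin wrapper: find() calls this for every directory AND file it visits,
# so skip anything that is not a plain file and delegate the rest.
sub ProcessDir
{
    my $filename = $File::Find::name;
    return unless (-f $filename);
    return ProcessFilename($filename);
}

# Build a small throwaway tree to walk (made-up names, created under the
# system temp directory, which is normally an absolute path)
my $top = tempdir(CLEANUP => 1);
mkdir(File::Spec->catdir($top, "subpics")) || die("mkdir failed: $!\n");
foreach my $path (File::Spec->catfile($top, "a.jpg"),
                  File::Spec->catfile($top, "subpics", "b.jpg"))
{
    open(my $fh, ">", $path) || die("Unable to create $path: $!\n");
    close($fh);
}

find(\&ProcessDir, $top);
print("$_\n") foreach (sort(@seen));
```

With this arrangement the EXIF/GPS logic would live in one place, and "ProcessDir" only decides whether a path is worth handing over.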
Apart from those changes, the command line output now also tells the user the output HTML filename (if the -html flag is set).
One caveat I noticed during testing was that folder names MUST use ABSOLUTE paths (eg "/home/sansforensics/testpics"). Otherwise, the file test mentioned above fails.
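A likely explanation for that caveat: by default, File::Find changes into each directory as it descends, so a relative "$File::Find::name" gets tested against the wrong working directory and "-f" fails. I haven't confirmed that this is what happened on the SIFT VM, but the sketch below reproduces the behaviour with a made-up "testpics/case1/pic.jpg" tree and shows one possible workaround, File::Find's "no_chdir" option.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Find;
use File::Temp qw(tempdir);

# Throwaway tree with made-up names: <tmp>/testpics/case1/pic.jpg
my $top = tempdir(CLEANUP => 1);
mkdir("$top/testpics")       || die("mkdir failed: $!\n");
mkdir("$top/testpics/case1") || die("mkdir failed: $!\n");
open(my $fh, ">", "$top/testpics/case1/pic.jpg") || die("Unable to create file: $!\n");
close($fh);

# Work from the parent folder so that "testpics" is a RELATIVE path
my $orig_dir = getcwd();
chdir($top) || die("chdir failed: $!\n");

my (@default_hits, @nochdir_hits);

# Default behaviour: find() has chdir()'d into testpics/case1 by the time it
# reports pic.jpg, so the relative name "testpics/case1/pic.jpg" does not
# resolve from there and the -f test fails.
find(sub { push(@default_hits, $File::Find::name) if (-f $File::Find::name); },
     "testpics");

# With no_chdir => 1 the working directory never changes, so the same
# relative name resolves and the -f test passes.
find({ wanted   => sub { push(@nochdir_hits, $File::Find::name) if (-f $File::Find::name); },
       no_chdir => 1 },
     "testpics");

chdir($orig_dir) || die("chdir failed: $!\n");

print("Default find  : ", scalar(@default_hits), " file(s) passed the -f test\n");
print("no_chdir find : ", scalar(@nochdir_hits), " file(s) passed the -f test\n");
```

Other options would be to test "-f $_" (the name relative to the directory File::Find is currently in) or to convert each -dir argument with Cwd's "abs_path" before calling "find". Sticking with absolute paths, as the script currently requires, sidesteps the issue entirely.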


Testing:

Here's the file/folder structure I set up for testing:

/home/sansforensics/wheres-Cheeky4n6Monkey.jpg
/home/sansforensics/testpics/case1/Vodafone710.jpg
/home/sansforensics/testpics/case1/wheres-Cheeky4n6Monkey.jpg
/home/sansforensics/testpics/case1/subpics/GPS_location_stamped_with_GPStamper.jpg
/home/sansforensics/testpics/case2/wheres-Cheeky4n6Monkey.jpg

The various "wheres-Cheeky4n6Monkey.jpg" copies have GPS Lat/Long info as does "GPS_location_stamped_with_GPStamper.jpg". I also added in "Vodafone710.jpg" which has no GPS Lat/Long data. Both of our tests will be run from "/home/sansforensics/".

For the first test, we shall specify a single file from the current directory ("wheres-Cheeky4n6Monkey.jpg") and also the "case1" folder which contains some images ("Vodafone710.jpg" and another copy of "wheres-Cheeky4n6Monkey.jpg").  The "case1" folder also contains a sub folder containing another image ("case1/subpics/GPS_location_stamped_with_GPStamper.jpg").

Here's the command line input/output ...

sansforensics@SIFT-Workstation:~$ exif2map.pl -dir /home/sansforensics/testpics/case1/ -f wheres-Cheeky4n6Monkey.jpg -html


exif2map.pl v2012.02.21


wheres-Cheeky4n6Monkey.jpg contains Lat: 36.1147630001389, Long: -115.172811
URL: http://maps.google.com/maps?q=36.1147630001389,+-115.172811(wheres-Cheeky4n6Monkey.jpg)&iwloc=A&hl=en


/home/sansforensics/testpics/case1/Vodafone710.jpg : No Location Info available!


/home/sansforensics/testpics/case1/wheres-Cheeky4n6Monkey.jpg contains Lat: 36.1147630001389, Long: -115.172811
URL: http://maps.google.com/maps?q=36.1147630001389,+-115.172811(/home/sansforensics/testpics/case1/wheres-Cheeky4n6Monkey.jpg)&iwloc=A&hl=en


/home/sansforensics/testpics/case1/subpics/GPS_location_stamped_with_GPStamper.jpg contains Lat: 41.888948, Long: -87.624494
URL: http://maps.google.com/maps?q=41.888948,+-87.624494(/home/sansforensics/testpics/case1/subpics/GPS_location_stamped_with_GPStamper.jpg)&iwloc=A&hl=en


Please refer to "exif2map-output-1329903003.html" for a clickable link output table


sansforensics@SIFT-Workstation:~$


The output file "exif2map-output-1329903003.html" looks like:

Results for "case1" Folder + 1 x Local File

For the second test, we now specify the parent "testpics" directory (plus the local "wheres-Cheeky4n6Monkey.jpg" file). The script should pick up both "case1" and "case2" directories.

sansforensics@SIFT-Workstation:~$ exif2map.pl -dir /home/sansforensics/testpics/ -f wheres-Cheeky4n6Monkey.jpg -html


exif2map.pl v2012.02.21


wheres-Cheeky4n6Monkey.jpg contains Lat: 36.1147630001389, Long: -115.172811
URL: http://maps.google.com/maps?q=36.1147630001389,+-115.172811(wheres-Cheeky4n6Monkey.jpg)&iwloc=A&hl=en


/home/sansforensics/testpics/case1/Vodafone710.jpg : No Location Info available!


/home/sansforensics/testpics/case1/wheres-Cheeky4n6Monkey.jpg contains Lat: 36.1147630001389, Long: -115.172811
URL: http://maps.google.com/maps?q=36.1147630001389,+-115.172811(/home/sansforensics/testpics/case1/wheres-Cheeky4n6Monkey.jpg)&iwloc=A&hl=en


/home/sansforensics/testpics/case1/subpics/GPS_location_stamped_with_GPStamper.jpg contains Lat: 41.888948, Long: -87.624494
URL: http://maps.google.com/maps?q=41.888948,+-87.624494(/home/sansforensics/testpics/case1/subpics/GPS_location_stamped_with_GPStamper.jpg)&iwloc=A&hl=en


/home/sansforensics/testpics/case2/wheres-Cheeky4n6Monkey.jpg contains Lat: 36.1147630001389, Long: -115.172811
URL: http://maps.google.com/maps?q=36.1147630001389,+-115.172811(/home/sansforensics/testpics/case2/wheres-Cheeky4n6Monkey.jpg)&iwloc=A&hl=en


Please refer to "exif2map-output-1329903236.html" for a clickable link output table


sansforensics@SIFT-Workstation:~$

This produces the following output HTML file:

Results for "testpics" Parent Folder + 1 x Local File

We can see that our newly modified "exif2map.pl" script can now also recursively search given folders for files with EXIF GPS Lat/Long info. Hooray!
The good thing about Perl is that someone has probably already thought of what you might need and has already written a module for it (eg File::Find). I highly recommend searching CPAN before starting any new project.