CarLibrary.org - Metadata

May 30, 2016

"Metadata" has been introduced in other CarLibrary.org webpages. Those pages also explain how car historians, collectors, museums and collections can use several types of software to create and maintain traditional inventories or digital archives.

This webpage describes using the open-source ExifTool software and a separate open-source add-on program, the ExifToolGUI (Graphical User Interface), to improve the identification - and eventual classification - of digital photos and scanned documents in many image formats and PDF files through the use of "embedded metadata".

The ExifTool is a very powerful tool for reading metadata from an entire folder of images or documents - or any set of folders and subfolders of digital files - into a Comma Separated Value (CSV) file. This CSV file is easily opened in Excel or another spreadsheet program.  In the Excel (spreadsheet) file, incorrect or missing metadata for each file/record/image can be corrected or added.  That Excel file can be used to import this "inventory of digital assets" into digital library software (Greenstone), a database or a Collections Management System (CMS).  It can also be used to "write" the updated metadata back into the image (or other) digital files. Although the ExifTool must be used from Windows (or Linux) through the "command line", which may be daunting for users accustomed to a Windows or Mac graphical interface, its power to create archive/collections inventories - and files with useful embedded metadata - should justify the effort necessary to master its use.  Highly recommended!

This guide's sections are:

1. ExifTool - Reading and Writing Embedded Metadata

a. More Complex Directories with Thousands of Photos

b. Writing Metadata into Photos

2. ExifToolGUI

3. Summary of the ExifTool and ExifToolGUI

a. Videos

b. Recommendations

1. ExifTool - Reading and Writing Embedded Metadata

Phil Harvey's ExifTool is described as able to extract, add, copy or move nearly all types of metadata, not only EXIF metadata.  It will display metadata in PDF, Word, Excel and MP3 files and edit metadata in image and PDF files.  The basic program is used from a "command line", but with the correct configuration, it is very powerful and has proven to "do everything" to image metadata. Downloads for Windows, Mac and Linux systems and a very complete explanation of its functions are on Phil Harvey's website.

This webpage is an introduction to using a few of the command line functions.

Download the ExifTool and install as follows:

A.  Go to Phil Harvey's website and use one of the download links near the top of the page.  The "Windows Executable" is recommended for Windows systems and there is a Mac OS X package just below.

B.  Installation instructions are on a link just below the download links.  On Windows systems, follow the "exiftool.exe" move instructions carefully so the program will work from all directories.

C.  As noted in these instructions, you must first get to the Windows command prompt by selecting "Run..." from the Windows "Start" menu, then type "cmd" and press Return. It appears that you have time-traveled to the '80s world of DOS!  Then type "exiftool".

D.  The "exiftool" program expects arguments/inputs; otherwise it will only display information about the program.  This is a sample command:

exiftool -csv -r -DateTimeOriginal -Description -Subject e:\cars\Vespa\Vespa-Pics > Vespa.csv

Figure 1 shows what this looks like:

Figure 1 - The ExifTool in a Command Window

which gives the result in Figure 2, below.  The "inputs", above, to the tool are:

1.  "-csv"  sets the option to produce a "Comma Separated Value" file.  This format can be directly read by Excel.

2.  "-r"  does this recursively (for all files in the chosen directory and its subdirectories)

3.  "-DateTimeOriginal" extracts the metatag "DateTimeOriginal" and writes it to the output

4. "-Description" extracts the metatag "Description", which is the Picasa caption on the image

5. "-Subject" extracts the metatag "Subject", which is also a Picasa "tag".  Multiple tags can be added in Picasa

6. "e:\cars\Vespa\Vespa-Pics" do this action for all the files in this directory

7. "> Vespa.csv" sends the output to the file named "Vespa.csv"

Figure 2 - the Excel file output from the Vespa.csv file

8.  If the CSV file is opened with Excel, it will appear similar to the one above.  The columns and headings have been formatted to improve readability.  The "Subject" field column will have multiple "tags" separated by commas.  Highlight the entire column, use Excel's "Text to Columns" function in the "Data" menu, choose "delimited" separation and check the "comma" box.  Sort the "tags" columns one at a time to group similar data together - such as the ID numbers - and move them to the relevant "correct" column.

9.  If you will be using ExifTool on a regular basis, you can keep a "library" of commands in a Word or Excel file that can be copied and pasted into the command Window.  To copy or paste, right click in the title bar of the command window and choose "Edit".  Options will pop up, including "Paste".  To adjust the appearance of the command window, choose the "Default" menu item and make desired changes.

For example, "a." below is the original example command; "b." will produce a similar file from the "Vespa-VBB" directory, but add the "FileName" to the "Vespa-VBB.csv" output file.

a. exiftool -csv -r -DateTimeOriginal -Description -Subject e:\cars\Vespa\Vespa-Pics > Vespa.csv

b. exiftool -csv -r -FileName -DateTimeOriginal -Description -Subject e:\cars\Vespa\Vespa-VBB > Vespa-VBB.csv
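A command "library" like this can also be kept as a short script instead of a Word or Excel file. The Python sketch below is only an illustration, not part of the ExifTool: the function names and the "TAG_SETS" dictionary are made up for this example, and it assumes the "exiftool" program is installed and can be run from any directory. It builds the same arguments as commands a. and b. above.

```python
import subprocess


def build_read_command(directory, tags, recursive=True):
    """Build the argument list for an ExifTool CSV export of the given tags."""
    args = ["exiftool", "-csv"]
    if recursive:
        args.append("-r")  # also read the subdirectories
    args.extend("-" + tag for tag in tags)
    args.append(directory)
    return args


def export_to_csv(directory, tags, csv_path):
    """Run ExifTool and save its output, like '> Vespa.csv' on the command line.

    Assumes the exiftool program is installed and on the PATH.
    """
    with open(csv_path, "wb") as out:
        subprocess.run(build_read_command(directory, tags), stdout=out)


# A small "library" of tag lists, mirroring commands a. and b. above.
TAG_SETS = {
    "basic": ["DateTimeOriginal", "Description", "Subject"],
    "with_filename": ["FileName", "DateTimeOriginal", "Description", "Subject"],
}

command = build_read_command(r"e:\cars\Vespa\Vespa-VBB", TAG_SETS["with_filename"])
print(" ".join(command))
```

Calling export_to_csv(r"e:\cars\Vespa\Vespa-VBB", TAG_SETS["with_filename"], "Vespa-VBB.csv") would then run the command and save the output file.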

Complete documentation for the ExifTool is on Mr. Harvey's website.  The program may seem complex, but it is potentially powerful.  There are many metadata categories, with many options to read and extract data from files.  However, if you have an extensive photo collection, or similar files to manage, much benefit can be gained by "reading" selected metadata from the files, as shown above, and creating lists in Excel or a database program.
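The "Text to Columns" step for the "Subject" column can also be done outside Excel. This is a minimal Python sketch using only the standard "csv" module; the function name, the numbered column names ("Subject1", "Subject2", ...) and the sample row are invented for illustration, and it assumes the tags in a cell are separated by commas as described above.

```python
import csv
import io


def split_subject(rows, column="Subject"):
    """Split a comma-separated tag cell into numbered columns (Subject1, Subject2, ...)."""
    result = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        cell = row.get(column) or ""
        tags = [tag.strip() for tag in cell.split(",") if tag.strip()]
        for number, tag in enumerate(tags, start=1):
            new_row[f"{column}{number}"] = tag
        result.append(new_row)
    return result


# Made-up sample data in the shape of the CSV output described above.
sample = io.StringIO(
    "SourceFile,Subject\n"
    'e:/cars/Vespa/Vespa-Pics/a.jpg,"Vespa, VBB, 1962"\n'
)
rows = split_subject(csv.DictReader(sample))
print(rows[0])
```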

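These are the XMP metadata values in the "Dublin Core" category that the ExifTool can read and modify; there are many more categories and hundreds of metadata values.  In the "Writable" column, a "+" marks a tag that can hold a list of values and a "/" marks a tag that the ExifTool avoids creating when another tag of the same name can be written instead: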

Tag Name       Writable
Contributor    string+
Coverage       string
Creator        string+
Date           date+
Description    lang-alt
Format         string
Identifier     string
Language       string+
Publisher      string+
Relation       string+
Rights         lang-alt
Source         string/
Subject        string+
Title          lang-alt
Type           string+

These are the metadata values in PDF files that the ExifTool can read and modify (XMP category):

Tag Name       Writable
Author         string
Copyright      string/
CreationDate   date
Creator        string/
Keywords       string
Marked         boolean/
ModDate        date
PDFVersion     string
Producer       string
Subject        string/
Title          string/

a.  More Complex Directories with Thousands of Photos

The example above used a single directory with a small number of files that have been prepared with significant metadata.  How does the ExifTool work with a complex folder/directory with many subdirectories and thousands of photos?

This ExifTool command was used (from the command line) on the "AFNpics" (photos of Frazer Nash cars) directory, which had 53 subdirectories and an unknown number of photos/files before its execution. As an experiment, the command was first used with the "All" parameter:

exiftool -csv -r -All f:\AFN\AFNpics > AFNPics.csv

1046 categories of metadata were returned!  More than 95% of these categories were "camera/photo technical type" data, with many variations of the same basic data from different camera manufacturers.  The CSV output file had far too many columns to display in Excel.

Figure 2a - Command to run the ExifTool with the "All" parameter.  Note it has "scanned" 53 directories and extracted data from 2058 image files.

To get more useful results, the metadata categories were trimmed down to 43. The command below is the result:

exiftool -csv -r -Artist -Author -Caption-Abstract -Categories -Category -Comment -Comments -Company -Contributor -Copyright -Coverage -CreateDate -CreationDate -Creator -Date -DateCreated -DateTimeOriginal -Description -Directory -DocumentID -FileID -FileName -FileNumber -FileSize -FileSource -FileType -Format -GPSLatitude -GPSLongitude -ImageHeight -ImageSize -Keywords -Label -Location -ModifyDate -ObjectName -PDFVersion -Producer -Relation -Source -SourceFile -Subject -Title f:\afn\afnpics > afnpics4.csv

When it was executed, it produced a very useful Excel file, after some formatting for better display.  Further trials should confirm whether the alphabetical order of the metadata categories, the default shown above, can be changed.  If so, the category/column order of the CSV output file could be made a close or exact match to an existing spreadsheet file, eliminating many potential "cut and paste" steps.

Figure 2b - Command to run the ExifTool with a trimmed down group of 43 parameters.  Not all parameters produced useful data.

The screen shot in Figure 2a shows there are more than 2000 photos in the entire "AFNPics" directory.  After running the command shown in Figure 2b, the result was an Excel file with 43 columns and 2058 rows.  From this file, it's easier to pick out the more useful categories, delete the columns with no data (or data not considered useful) and start adding data in the empty cells!
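The clean-up described here (deleting columns with no data and putting the remaining columns in a preferred order) can be sketched in Python with the standard "csv" module. This is only one possible approach, not an ExifTool feature; the function names and the sample rows are hypothetical.

```python
import csv
import io


def drop_empty_columns(rows, fieldnames):
    """Return (rows, kept) with every column that is empty in all rows removed."""
    kept = [name for name in fieldnames
            if any((row.get(name) or "").strip() for row in rows)]
    trimmed = [{name: row.get(name) or "" for name in kept} for row in rows]
    return trimmed, kept


def reorder(fieldnames, preferred):
    """Put the preferred columns first; keep the others in their existing order."""
    front = [name for name in preferred if name in fieldnames]
    return front + [name for name in fieldnames if name not in front]


# Made-up sample in the shape of the 43-column output described above.
sample = io.StringIO(
    "SourceFile,Artist,Comment,Title\n"
    "f:/afn/afnpics/a.jpg,,,Example title\n"
    "f:/afn/afnpics/b.jpg,,,\n"
)
reader = csv.DictReader(sample)
rows, kept = drop_empty_columns(list(reader), reader.fieldnames)
columns = reorder(kept, ["Title", "SourceFile"])

output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=columns)
writer.writeheader()
writer.writerows(rows)
print(output.getvalue())
```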

b.  Writing Metadata into Photos

The ExifTool also has the ability to "write" metadata to single files or entire directories.  This is a preliminary guide to writing metadata from a CSV file with the "-csv=" option:

1.   Extract the metadata from your photos by making a CSV/Excel file as described in the section above.

2.   Inspect the cells for each photo's metadata in the relevant rows and columns.

3.   After deciding what needs to be added, fill in the desired metadata in the appropriate cells.  The Excel "copy" and "paste" functions can speed this process.

4.   You can optionally delete rows and columns that have no new data to be added to the photos or for photos that are not to be changed.  In our example from above, the "Description" column/metadata was changed to "Title" and missing data filled in.   New columns were added for "Creator" and "Author".  The Excel file below was saved in "CSV" format.

5.   When saving the Excel file planned for input in the "CSV" format, it's a good idea to use a new file name, such as "Vespa-input.csv", to prevent confusion with the CSV file which extracted the metadata from your photos.

6.  Open a Command window (see step 1.C. above) and change to the drive and directory where the photos to be changed are located.

7.  Run an ExifTool command similar to this:

a. exiftool -csv=vespa-input.csv -ext jpg e:\cars\vespa\vespa-pics\

8.  When the command runs correctly (no error messages), the new metadata will be added to the chosen photos and the original photos will be saved with "_original" added to the file name - just in case!   If you get error messages, check your input file to make sure the full path in the "SourceFile" column is all lower case and the case of each file name matches the actual file.  If the command still fails to execute, change the path in the command to lower case.  For the example above, it should be "e:\cars\vespa\vespa-pics\"

This is a screen shot of this command before running:
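The input file can be checked before running the write command. The Python sketch below is a hypothetical helper, not part of the ExifTool: it assumes the CSV has a "SourceFile" column, as in the files created in the section above, and it applies the lower-case advice from step 8. The function name and the sample rows are made up.

```python
import csv
import io


def check_input_csv(text, photo_dir):
    """Return a list of problems found in a CSV intended for 'exiftool -csv=...'."""
    reader = csv.DictReader(io.StringIO(text))
    if "SourceFile" not in (reader.fieldnames or []):
        return ["missing SourceFile column"]
    # Compare with forward slashes so both slash styles are accepted.
    prefix = photo_dir.replace("\\", "/").rstrip("/").lower() + "/"
    problems = []
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        path = (row["SourceFile"] or "").replace("\\", "/")
        if not path.lower().startswith(prefix):
            problems.append(f"row {line_number}: {path} is outside {photo_dir}")
        elif not path.startswith(prefix):
            problems.append(f"row {line_number}: directory part of {path} is not lower case")
    return problems


# Made-up input rows for illustration.
sample = (
    "SourceFile,Title\n"
    "e:/cars/vespa/vespa-pics/a.jpg,Vespa VBB\n"
    "E:/Cars/Vespa/Vespa-Pics/b.jpg,Vespa GS\n"
)
for problem in check_input_csv(sample, r"e:\cars\vespa\vespa-pics"):
    print(problem)
```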

And the result after running - not overly dramatic!

See Figure 3a below for a screen shot of the ExifToolGUI after this operation.

In October, 2014 this "write" function was used (from a CSV file) to add 14 categories of metadata to 675 travel photos.  This included a unique accession number for each photo.  See the "Videos" section below for more detail.  Very useful and impressive!

The ExifTool has another function to add geo-data to digital photos using a track recorded on a GPS.  The function was tested in early December, 2014 with great success.  Although the mapping accuracy depends on the camera time and date setting being the same as on the GPS device, the photo dates and times can be adjusted in a program such as Picasa, or a sync parameter can be used directly in the ExifTool.  See the full "Geotagging with ExifTool" documentation.
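To find how far the camera clock is from the GPS clock, photograph the GPS display and compare the time shown with the time recorded in the photo. The arithmetic can be sketched in Python as below. The function name is made up, both times must be expressed in the same time zone, and the "+H:MM:SS" text format is only an assumption; check the "Geotagging with ExifTool" documentation for the exact form the sync parameter expects.

```python
from datetime import datetime


def clock_offset(camera_time, gps_time):
    """Return the signed offset to add to the camera time to reach the GPS time."""
    seconds = int((gps_time - camera_time).total_seconds())
    sign = "-" if seconds < 0 else "+"
    seconds = abs(seconds)
    hours = seconds // 3600
    minutes = (seconds % 3600) // 60
    return f"{sign}{hours}:{minutes:02d}:{seconds % 60:02d}"


# Hypothetical readings: the camera is 1 minute 20 seconds behind the GPS.
camera = datetime(2014, 12, 1, 12, 0, 0)
gps = datetime(2014, 12, 1, 12, 1, 20)
print(clock_offset(camera, gps))
```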

2. ExifToolGUI

If using the command line for ExifTool seems too "technical" (it can be daunting the first few times of use!), developer Bogdan Hrastnik has written a GUI (Graphical User Interface, Windows only) for ExifTool, which allows very easy access to many of the ExifTool functions.  The "how to" page for ExifToolGUI is here and you can download the zipped file here.

A.  Figure 3 below shows the ExifToolGUI open on the same directory of Vespa photos used for the example above before new data was added from an Excel file.

Figure 3 - ExifToolGUI open on the "Vespa-pics" photos directory.  Note the output files from the prior example also appear.

Figure 3a - The ExifToolGUI open on the "Vespa-pics" directory after an Excel file was used to "write" metadata into each photo.  The "Workspace" tab shows the new Title, Author and Creator data.  The output files from the prior example also appear.

B.  This program is an excellent starting point to understand how much metadata is contained in digital files, especially digital photographs.  

All digital files contain metadata: Word, Excel, MP3, video, etc. files can be examined and lists of these files can be created, as shown in section 1. above.  One step towards reducing chaos! 

The panel on the right of the screen offers tabbed choices to examine specific metadata categories. "All" has been selected.  Note the metadata in the "XMP" category and the IPTC category show the metatags that are the same in the prior example, although the metadata sub-categories have different names.

C.  The GUI has a function to add or modify metadata in a single file or group of files.  This is done in the "Workspace" tab of the same right panel.  Choose a category to add new data and type in the new data in the window at the bottom of this panel.   When complete, press the "Enter" key and click on the "Save" button.

The metadata categories for editing in the right panel's "Workspace Manager" can be customized, but must be set up.   The desired categories/options are created and saved in an "ini" file: "ExifToolGUIv5.ini"

It can be slightly tedious to set up a Workspace Manager. The ".ini" file can be edited directly in a text editor (Notepad).  Below is a sample of the Workspace tags section - not the full "ini" file - from an "ini" file for common operations:

[WorkspaceTags]

Accession/ID Number=-Identifier^Accession or ID Number

DC:Title=-Title^DC Title (name of object)

DC:Description=-Description^DC Description

DC:Subject=-Subject^DC subject

IPTC:Keywords=-IPTC:Keywords^Keywords/tags (use comma)

DC:Resource Identifier=-Identifier^DC Resource Identifier

DC:Relation=-Relation^DC Relation (to primary object)

DC:Source=-Source^DC source (is part of)

DC:Creator=-Creator^DC creator

DC:Date=-Date^DC date

DC:Contributor=-Contributor^DC contrib

DC:Coverage=-Coverage^DC coverage

DC:Format=-Format^DC format

DC:Type=-Type^DC type

DC:Language=-Language^DC language

DC:Publisher=-Publisher^DC publisher

DC:Rights=-Rights^DC rights

Artist/Author=-Author^Bob Schmitt

Location=-Location^Where created

Primary Object Number=-XMP:Relation^Relation to primary Object

PDF Title=-pdf:Title^Document title

PDF Subject=-pdf:Subject^Document subject

PDF Keywords=-pdf:Keywords^Document keywords

CreateDate=-exif:CreateDate^[2012:01:14 20:00:00]

DateTimeOriginal=-exif:DateTimeOriginal^[2012:01:14 20:00:00]

FileAccessDate=-FileAccessDate

FileName=-FileName

FileSize=-FileSize

FileType=-FileType

ImageSize=-ImageSize

PhotoShop: TextLayer=-TextLayerText^Copyright

[TagList]

The newly customized "ExifToolGUIv5.ini" can be copied and transferred between computers.  This could be useful to set up a workgroup which will contribute to the metadata for an archive.
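Before sharing an "ini" file with a workgroup, it can help to list which ExifTool tag each Workspace label writes to. The Python sketch below reads lines in the "Label=-Tag^Hint" shape seen in the sample above. That shape is inferred from the sample only, not from the ExifToolGUI documentation, and the function names are made up.

```python
from collections import defaultdict


def parse_workspace_line(line):
    """Split 'Label=-Tag^Hint' into (label, tag, hint); the hint may be empty."""
    label, _, rest = line.partition("=")
    tag, _, hint = rest.partition("^")
    return label.strip(), tag.strip().lstrip("-"), hint.strip()


def labels_by_tag(lines):
    """Group Workspace labels by the tag they write to, skipping section headers."""
    groups = defaultdict(list)
    for line in lines:
        if "=" not in line or line.startswith("["):
            continue
        label, tag, _ = parse_workspace_line(line)
        groups[tag].append(label)
    return dict(groups)


# A few lines copied from the sample section above.
SAMPLE = [
    "[WorkspaceTags]",
    "Accession/ID Number=-Identifier^Accession or ID Number",
    "DC:Title=-Title^DC Title (name of object)",
    "DC:Resource Identifier=-Identifier^DC Resource Identifier",
    "FileName=-FileName",
]
for tag, labels in labels_by_tag(SAMPLE).items():
    print(tag, labels)
```

In the sample, two labels point at the same "Identifier" tag, which is the kind of overlap worth noticing before several people start editing.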

D.  The GUI can also run ExifTool commands, as in 1.D., above, by clicking on the "ExifTool direct" button at the bottom of the middle panel. If the file output goes to a screen "log" instead of being written to a file, copy the entire log (Ctrl-A, then Ctrl-C) and paste it into an open spreadsheet (Ctrl-V).

For a further trial with a different directory of images, the ExifToolGUI was used to embed "captions" and an "ID number" in each image.  This was done in the "IPTC edit function" window.  These images were then brought into a new Greenstone collection, titled "Vespa" and the (new) captions were extracted from the "IPTC.Caption-Abstract" metadata item.  The trial ID numbers were extracted and displayed from the "IPTC.ObjectName" metadata category.  

However, since this 2013 trial, the widely-accepted standard "accession number" format is now used for the ID number.  In the second photo below, this would be "2012.9.1" - the first photo in the 9th month of 2012.  The "DC:Identifier" category is the standard now used for the accession number.
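Accession numbers in this "year.month.sequence" form can be generated rather than typed. The Python sketch below assumes the format is exactly as described above ("2012.9.1" is the first photo in the 9th month of 2012) and that photos are numbered in the order they are listed; the function names are made up.

```python
from datetime import date


def accession_number(taken, sequence):
    """Format an accession number as year.month.sequence, e.g. 2012.9.1."""
    return f"{taken.year}.{taken.month}.{sequence}"


def assign_accession_numbers(dates):
    """Number photos in the order given, restarting the sequence for each month."""
    counters = {}
    numbers = []
    for taken in dates:
        key = (taken.year, taken.month)
        counters[key] = counters.get(key, 0) + 1
        numbers.append(accession_number(taken, counters[key]))
    return numbers


# Made-up photo dates for illustration.
photo_dates = [date(2012, 9, 1), date(2012, 9, 20), date(2012, 10, 2)]
print(assign_accession_numbers(photo_dates))
```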

Figure 4 - Note the file now has "Object ID", added by using the "ExifToolGUI". This screen shot shows browse results on "Alternate Captions" starting with "1".  The "Alternate Caption" is from "IPTC.Caption-Abstract" and the "Caption" is from "XMP.Description", as described above.

The ExifTool and ExifToolGUI will display and edit the metadata for many file types.  PDF files, Word and Excel documents, music and video files all have embedded metadata.  Check your files and you may be surprised!

In a further example, the screen shot below (Figure 5) shows the particular metadata embedded in these test images, extracted by ExifTool from the command line, through the GUI.  Note the "SourceFile" shows that these are image files in a Greenstone digital library collection.  

Although there are only image (JPG) files in this sample directory, this GUI function shows the file name and metadata (if correctly specified) for all files.  This is a handy way to make a list in CSV, then in Excel, to annotate or mark files for further action.

From this type of list, the photo unique ID - here "ObjectName" - can be reviewed and updated if desired.  The FAQ on the ExifTool website, under question 13, shows examples of using the command parameters to make this export.

Figure 5 - The ExifToolGUI was used to request a list of the "FileName", "Caption-Abstract", "ObjectName" and "Description" metadata from a single directory.  This is only a small subset of the metadata in any digital image.

The ExifToolGUI was used to correct a few old captions embedded by Picasa.

3. Summary of the ExifTool and ExifToolGUI

The ExifTool ability to add and/or replace specific metadata in images (and other file types) with new data from a CSV file (a standard file type export from Excel) can be VERY useful.  To conform to "best museum practice", an accession number as a "unique ID" can be very easily added as a column in the Excel table (as above), which is "saved as" a CSV file and then used to update a full directory of images.

Another ExifTool function can copy each image's metadata from one category to another, when the program is used from the command line.  This may be the solution for adding metadata to images when it can be copied/transferred from/to other metadata descriptors, making those files conform to any "standard".
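As a rough illustration of this copy function, the Python sketch below builds a command that uses the ExifTool "-DESTINATION<SOURCE" argument form to copy one tag into another in each file. The function name is made up, the two tag names are the ones mentioned with Figure 4, and the command should be tried on copies of a few photos first; see the ExifTool documentation for the details of copying tags.

```python
def build_copy_command(source_tag, destination_tag, target):
    """Build an ExifTool command that copies one tag to another in each file."""
    return ["exiftool", f"-{destination_tag}<{source_tag}", target]


command = build_copy_command("IPTC:Caption-Abstract", "XMP:Description",
                             r"e:\cars\vespa\vespa-pics")
print(command)
```

When typing such a command directly in the Windows command window, put the copy argument in double quotes, because "<" otherwise has a special meaning there.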

The ExifTool and ExifToolGUI promise to be a powerful combination to improve embedded metadata of digital images. I use these tools (also with Picasa) for my collections and archives, primarily with the Greenstone digital library software, as discussed on other webpages.

a. Videos

We've made an introductory video guide on YouTube for the downloading, setup and basic operations of the ExifTool and ExifToolGUI (also: http://youtu.be/CB6F19zRqSw). 

A second video on YouTube explains the steps for reading the metadata to an Excel file from a directory with hundreds of photos, adding metadata in the Excel file and then writing it back to the same group of 650+ photos (also: http://youtu.be/T4gTQEwUwwY).

These are the ExifTool commands used in the video to read the data from the "England-2014" directory:

exiftool -csv -r -Caption-Abstract -Coverage -Creator -Date -Description -FileType -Format -Identifier -Keywords -Location -Relation -Source -SourceFile -Subject -Title f:\england-2014 > england-2014.csv

After metadata updates were made in Excel, this command wrote the new metadata into the photos:

exiftool -csv=england2014-input.csv -ext jpg f:\england-2014

These commands really work!  You can use these commands directly if you change the red italic file names and locations to your file locations and names. 

b. Recommendations:

1.  Use the ExifTool from the command line to read entire folders/subfolders of photos.

2.  Review the resulting Excel file ("save as" from the CSV output file) to determine which metadata categories will help organize these collection assets. Using Dublin Core categories should be a high priority.

3.  Use the Excel copy/paste functions to fill in missing metadata.  Use the Excel "data fill" command to create accession numbers in the DC:Identifier category.

4.  Save the Excel file in the CSV format ("save as"), then use the ExifTool from the command line to write it back to the entire set of photo folders/subfolders.

5.  The ExifTool may be used again at any time from the command line to read the metadata-updated photo folders/subfolders, creating "data-rich" Excel files for import into a collections management, content management or digital library (e.g. Greenstone) software program.

This Metadata/ExifTool webpage will be updated after more experience with these very useful tools.

Email me with any comments, suggestions or questions: Bob Schmitt, rgschmitt@gmail.com