531 CAPTION AND KEYWORD INFORMATION
January 9, 2003
Another advantage of digital capture is the ease with which important textual
information about the subject can be stored with the images. If attention is
paid to this feature, it can make tracking and finding images in the future
much easier. In the film environment we have had to deal with problems such as:
- How do I store the image so I can find it in the future?
- What should be the primary category where the image will be stored?
- How do I keep track of information related to the image so I can make it
available when needed?
- How do I get all the necessary caption information on a small label that
fits on the slide?
- Will my agent have to re-type information on the slide to integrate it
into his system and what kind of errors or loss of data are likely to occur in
When images are used for advertising, information like who, what, when or where
the image was created may be unimportant. On the other hand, for editorial uses
this type of
additional information is often necessary in order to make the sale. (Keep in
mind the volume of sales that are editorial in nature. See 528.) Even with
keywords are an important aid in locating a particular image among millions.
Thus, the basic principle is that in order to market digital images effectively
it is necessary to have certain types of textual information connected to every
Images that are captured digitally have what is called an IPTC header. At the
time of capture the camera also automatically stores a great deal of
information about how and when that image was created: shutter speed, f/stop,
ISO, time, date, etc. (Strictly speaking, cameras record these capture
settings as EXIF data alongside the IPTC fields.)
However, what is really interesting for the stock shooter is the additional
to captions and keywords that can be placed in this IPTC space as well.
Information can be added when the images are copied from the Compact Flash Card
to a hard drive. And
by using a browser additional information may be added to selected groups of
Developing effective workflow processes for connecting textual information to
the images can greatly enhance the tracking and marketing of the images. In the
environment photographers have developed procedures for storing, tracking and
getting images to market. When the photographer moves to digital capture, an
entirely new workflow process must be developed to maximize the efficiency of
the technology. If these
processes are well thought out and the photographer takes advantage of all that
offers, it should be possible to more effectively market stock images with less
after shoot effort.
In my research, I have found photographers who are using some of the available
tools to streamline parts of their workflow process, but so far I haven't found
one who has
been able to maximize the workflow potential all the way from capture to stock
photo customer. Part of the reason for this is that to really maximize the
potential an agent,
portal or other intermediary is often needed between the creator and the
customer to help manage the workflow process. As yet, most of these service
providers have not
developed systems that allow them to take advantage of what their photographers
Steps In the Workflow Process
While photographers can do all the work from creation to delivering the finished
image to the customer, most will find that when it comes to licensing stock
usages it will
be more efficient to delegate some of the work. In the remainder of this article
I will discuss how that process might work. There are at least three elements to
process that takes the image from creation to market.
1 - The photographer will need to be diligent in supplying basic before and
after shoot information that is difficult for others to reconstruct later.
2 - Then the digital files go to an assistant who takes the photographer's
output and edits, adds keywords and further prepares the work for delivery to:
3 - Some type of portal that consolidates the images of many photographers and
makes them available to the customers.
Photographer's Job
One of the first things a photographer needs to do is develop a system to
provide a unique number for every image created. This is important because the
to build a thumbnail catalog of every image shot. In many analog filing systems
the number often supplies information about the category where the image would
That is not necessary in a digital system. In digital a simple sequence
numbering system is satisfactory. The images can be filed in sequence and
located by using keywords
to search a digital catalog rather than filing them by category.
I've talked to some digital photographers who are still trying to store and
track images in a folder system similar to the file drawers that worked well for
them in the
analog environment. They have the folders organized by shoots and usually in
date order, but if someone wants to use an image that was taken a year or more
ago they have to
remember approximately when that image was taken. That works as long as the
photographer is doing the searching, but how does a new assistant, or the
find an image? A folder system is fine for initially getting the images out of
the camera and preliminary editing, but it fails to take advantage of the power
The camera's numbering system is not satisfactory. If you are using two cameras
on the same shoot you may get duplicate numbers. Your system can't be tied to a
camera or flash card. It needs to provide unique numbers for every image you
will create in the future.
A simple solution is to have a date, or some shorthand for the date, and a
sequence number for everything shot on that date. For example: "2JAN1234". The
"2" is for the year 2002, followed by the month and a four-digit sequence
number. The images are then
named, not at the time they are shot, but as they are downloaded from the
Compact Flash Card
onto your computer hard drive.
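The naming scheme above can be sketched in a few lines of Python. This is only an illustration of the "2JAN1234" pattern, not a tool any photographer mentioned here is using; the function name `image_number` and the choice to reject sequence numbers above 9999 are my own assumptions.

```python
from datetime import date

# Month abbreviations are spelled out so the result does not depend on locale.
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def image_number(shot_on: date, sequence: int) -> str:
    """Build a name like "2JAN1234": year digit, month, four-digit sequence."""
    if not 0 <= sequence <= 9999:
        raise ValueError("sequence must fit in four digits")
    year_digit = shot_on.year % 10  # 2002 -> 2, as in the example above
    return f"{year_digit}{MONTHS[shot_on.month - 1]}{sequence:04d}"


print(image_number(date(2002, 1, 15), 1234))  # 2JAN1234
print(image_number(date(2002, 1, 15), 7))     # 2JAN0007
```

A single year digit repeats every ten years, so anyone planning an archive that will run longer than a decade may want to use two or four digits for the year. That is a variation on the scheme, not part of it.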
The full size files may be stored on a hard drive or CD-ROMs and organized
according to number. From that point on they are found by searching the
catalog using keywords. Once a thumbnail has been found the image number is used
to locate the full size image.
Having images numbered and keyworded makes it possible to construct a large
searchable database of thumbnails of everything created. Using a cataloging
program such as Extensis Portfolio [www.extensis.com/assetman/] makes creating
such a database relatively simple. The key to being able to find images is to
have a few keywords attached to every thumbnail. The date and the name of the
project may be
keywords. The more keywords, the more different ways the image can be located.
Recognize that there
are different types of keywords. Some may simply serve the purpose of helping
the photographer or his staff identify different groups of images within his own
These may not be the kind of "keywords" that would be listed on a portal to
enable buyers to locate certain types of imagery. Don't be afraid to use
keywords that only have
meaning to you and your staff. More about keywording later.
By using this system, instead of the image being in one file drawer, it is
effectively available in as many drawers as you choose to define using unique
you put in the keyword field doesn't have to be a legitimate word. Take for
example someone who covers golf and wants to be able to find images from a
easily. Part of the problem is that you have a Masters, a US Open and a British
Open every year and the photographer knows he is going to be covering these same
the next decade or more. The photographer may want to use shorthand keywords
like masters02, usopen03, bopen04 (02, 03 and 04 being the years). As long as
or his staff will be the only ones searching this database the made up keywords
work fine. If the public is going to be doing the searching then you must use
The other way to solve this problem is to add the year as a keyword. Thus, if
you want something from the 2002 US Open you do a search for "2002 and US and
Open". As long
as we're on this example, when you're keywording such images you should also add
United States as a keyword so it is spelled out in addition to having the
will help when it comes time to prepare the keywords that will be used to
identify the image on a portal. Thus, if I were shooting the U.S. Open, I would
attach at least the
following keywords to every image I shot as they were being uploaded to the hard
drive from the Compact Flash Card. They are: 2002, United States, Open, U.S.,
PGA Tour, Professional Golfers Association, (plus the name and location of the
course), and any shortcut search codes I had decided to use for my own purposes
Photographers need to think in terms of storing in the IPTC header any textual
information that is easy to capture. Such information may be useful in the
future and takes up
an insignificant amount of digital storage space. There is information about
what was happening during the shoot that only the photographer is likely to know
the who, what, where, when and why information. The photographer should take
responsibility for getting this information into the IPTC header.
In my opinion, at this initial stage the more information the better. The
material can always be edited and polished later. If it is not there it may be
difficult for anyone
to reconstruct months or years later. I wish I had more information about some
of the images I shot in Vietnam. Notes I took about what was going on at that
time have been
lost. Had today's digital equipment been available I could have easily stored in
the IPTC header basic notes about what was happening, and it would have been
use or review today. Often I get simple questions from buyers like "When or
where was the image shot?" I can't give precise answers, and yet the answer
could make the
difference as to whether or not the image will be used when it comes time to
Keep in mind that it is often easier to add large blocks of pre-prepared
information than to take the time to edit and polish. Many in the industry are
having too many keywords and captions that are too long. My strategy would be to
store extensive caption information in the IPTC header, but to later supply an
version of that information to be used by the portal for the purpose of
searching. The edited information allows for efficient search without bringing
images but at the same time you continue to retain all the information in case
it is needed later.
One of the critical elements in making the IPTC header useful as a place to
store information is having a convenient way to export that data to a program
where it can be
easily manipulated and polished.
Some photographers ask event organizers to e-mail them schedules and press
releases before the shoot. Having this information in digital form makes it
simple to cut and
paste into a copy block that will be attached to each image produced on the
shoot. Charlie Mann told me that when he is on a plane on the way to a shoot he
generic copy that will be attached to each image on the shoot. Another digital
resource may be the web sites of organizations being photographed. At the very
the URL in your copy block can provide a resource to refer to later. The danger
in using a URL address is that two years down the road when you need the
information the URL
may no longer be active.
From a stock point of view I recommend storing at least JPEGs of almost
everything shot. This may seem like a massive amount of imagery when a careful
edit might determine
that only a few frames are relevant to the immediate need. However, by adding a
few simple generic keywords to each shoot it becomes possible to easily retrieve
material later if a need arises. And it may be faster to store everything than
to do a careful edit. To drive this point home you only need to think of two
words -- Monica
Lewinsky. If all the irrelevant images of the "Clinton hugging" event had been
dumped after the shoot that image would never have been seen. The same goes for
to any event. Photographers have always saved their film, but many are not
saving their digital files.
Photographers may also want to think about storing at least a JPEG of everything
captured, rather than dumping a lot of RAW files that seem inappropriate for the
use. Storing all the images may take less time than doing a careful edit. Many
photographers seem to think that if they can't afford the space to store the RAW
is no point in storing anything. And yet it has been proved (see 529) that
small JPEG files can produce excellent print output using interpolation software.
Interpolation software will only get better.
Example Of A Shoot
Here's how this might work on a shoot at an elementary school. On the way to the
school, the photographer enters into his browser software general information
school, and its location in both the caption and keyword fields. He may also
want to enter special information about why this school was chosen for the shoot
one school in the state in 2002". It may be helpful to include contact
information for who to call to get more details later.
After the shoot the photographer downloads the captured files from the Compact
Flash Card into his computer and all the generic information is automatically
and instantaneously added to the IPTC header of every image. The images are also
re-named using the naming convention the photographer has developed.
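As a rough sketch of that download step, the Python below renames a batch of camera files and attaches the same generic caption and keywords to each one. It only builds records in memory; it does not write a real IPTC header, which the browser software would do. The function name `ingest`, the record layout, the camera file names and the sample caption are all made up for the illustration.

```python
from datetime import date

MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def ingest(camera_files, shot_on, caption, keywords, first_sequence=1):
    """Return one record per file with a new image number and generic text."""
    records = []
    # Sort so the sequence follows the camera's own file order.
    for offset, original in enumerate(sorted(camera_files)):
        sequence = first_sequence + offset
        number = f"{shot_on.year % 10}{MONTHS[shot_on.month - 1]}{sequence:04d}"
        records.append({
            "image_number": number,
            "original_name": original,
            "caption": caption,
            "keywords": list(keywords),  # each image gets its own copy
        })
    return records


batch = ingest(
    ["DSC_0002.JPG", "DSC_0001.JPG"],       # hypothetical camera names
    date(2002, 9, 12),
    caption="Elementary school shoot",       # hypothetical generic caption
    keywords=["elementary school", "education"],
)
for record in batch:
    print(record["image_number"], record["original_name"], record["keywords"])
```

Because every record carries its own keyword list, supplemental keywords can later be added to a selected group of images without touching the rest of the shoot.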
Once the images are on the computer hard drive, the photographer can then add
supplemental information to certain sets of images. Using a browser it is
possible to choose
sequences of images and add additional information to every image in the group.
It may be useful to know whether the class was 1st, 2nd or 3rd grade, or whether
science, social studies or geography and who the teacher was.
Normally, in such a shoot, someone might be taking notes throughout the day to
use later in captioning. Since the images are available for immediate viewing
after the shoot
(no film to process), it may be more productive to sit down with someone from
the school who could assist in identifying the various teachers and activities.
Allowing a half
hour at the end of the shoot for this activity could be easier than trying to
take notes while shooting. Completing the captions while still on location will
to more accurate and complete captioning than setting aside time at a later date
to do it. The same is probably true for a travel shoot. Getting a local to
after the images have been shot is likely to lead to better captions.
The photographer's goal should be: input basic simple data; attach data to all
images as they are downloaded to the hard drive; save a copy and then hand off
Jobs For Assistants
The following are post-camera processing jobs that could be done by an assistant
and thus free the photographer to produce new images. Some photographers may want
responsibility for these things in order to save on overhead, but too often
images with stock potential never get out the door as they wait for the
photographer to find time
to deal with the administrative details.
The extra jobs include:
Editing the take and choosing images that will be sent forward for marketing.
While I recommend storing all images captured, every frame will not be sent to a
customer or put on a portal. In the initial edit images might be grouped in
Some will be needed immediately by a specific customer. Others may be judged to
be worthy of being loaded on a portal and this group will probably require
captioning and keywording. Still others (like the Monica Lewinsky photos) will
simply remain as part of the photographer's internal thumbnail catalog. No
or keyword information will be added to these images, but they can still be
easily located, if necessary, from the basic generic information added to the
image during the
download from the camera to the hard drive.
Once any group of images has been identified, a keyword can be added to every
image in the group, enabling the entire group to be recovered later. If images
selected to go immediately to customers should also be placed on a portal at a
later point, keywords identifying both selections can be put on those images.
In this way they can be called up as part of either selection.
Building A Catalog
Once the caption and keyword information has been added to groups of images from
a particular shoot it is time to import the images into the photographer's
This catalog will only store thumbnail and preview images. The large files will
be stored elsewhere and don't need to be linked because they will be located
using the image
number of the thumbnail.
It is much easier to do the basic captioning and keywording while the images are
still grouped according to the way they were shot, but the master catalog will
and a major time saver when it comes to future tracking of images. Many
photographers are not taking this step, but once the workflow process is
organized the work can be
done with scripts overnight and require virtually none of the photographer's
Captioning For Portals
The way a caption is written depends to a great extent on how the Portal search
engine operates. The ideal search engine, in my estimation, would only search
on a defined
keyword block. It would not search the caption. If it is going to search the
caption at the very least it should give the user options of searching for
keywords, captions or
both. A good caption is usually a narrative and will often have general
information that may be useful background, but may not necessarily relate
specifically to the image.
The problem with searching captions that are free form text is that you often
get inappropriate hits. For example, if a caption talks about baby boomers, any
search for "baby" will bring up this image. Another example is a picture
captioned "a couple of tanks" that comes up whenever you search for "couple".
A lot of this kind of thing happens when you are searching captions. If the
search engine is looking at captions you've got to keep them short and very
specific.
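The difference between the two kinds of search can be shown with a small Python sketch. The three images, their numbers, captions and keywords are invented for the illustration, and the matching rules (whole-word match on captions, exact match on keywords) are assumptions about how a simple portal search might behave, not a description of any particular portal.

```python
import re

# Invented sample records: image number, free-form caption, keyword block.
IMAGES = [
    {"id": "A1", "caption": "Baby boomers at a retirement seminar",
     "keywords": {"baby boomers", "retirement", "seminar"}},
    {"id": "A2", "caption": "A couple of tanks on maneuvers",
     "keywords": {"tank", "military"}},
    {"id": "A3", "caption": "Young couple with their baby",
     "keywords": {"couple", "baby", "family"}},
]


def search_captions(term):
    """Match the term as a whole word anywhere in the caption text."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [img["id"] for img in IMAGES if pattern.search(img["caption"])]


def search_keywords(term):
    """Match the term only against the defined keyword block."""
    return [img["id"] for img in IMAGES if term.lower() in img["keywords"]]


print(search_captions("baby"))    # ['A1', 'A3']  (the boomers are a false hit)
print(search_keywords("baby"))    # ['A3']
print(search_captions("couple"))  # ['A2', 'A3']  (the tanks are a false hit)
print(search_keywords("couple"))  # ['A3']
```

The keyword search returns only the image that is actually about a baby or a couple, which is why a search restricted to a defined keyword block gives cleaner results.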
On the other hand, sometimes when you are looking for something unique and
specific it is very helpful to have more expansive information about the image.
I think the industry -- particularly the editorial side -- will recognize the
value of having this additional general information when trying to locate an
image for
use. That's part of the reason why I urge photographers to store this
information now. But, from a search point of view we have to deal with the
realities of how the
current search engines work. If the photographer plans to put images on
several portals -- which is advisable -- the requirements for each may be
It is likely that the basic captions and keywords may need to be customized and
modified to fit the specific needs of each portal.
In addition to putting a basic caption in the IPTC header, keywords can be added
in a separate field. In my discussions with photographers who are doing digital
who do extensive captions, I haven't been able to find anyone who has given much
thought to using keywords that would help them locate and market the images. The
should include all words from the caption that specifically relate to the image.
This offers tremendous potential.
If you shoot specialist subject matter there may be certain keywords connected
with the specialty that would be useful, but which an agent, or someone further
marketing chain might not think of. If those words are in the IPTC header, and
the agent is encouraged to first look at what's in the header, it can help ensure
image could be found by a specialist user.
When medical or scientific subjects are photographed it could be very helpful
for the photographer or staff with an understanding of the situation to provide
keywords. Often very specific terms are used to identify certain processes or
procedures. Having these terms as keywords can ensure that an expert in the
field would be
able to find your image when he needs something very specific. At the other end
of the spectrum, sometimes keyworders forget to use the very broad terms. They
may identify a
shot in India as being the Taj Mahal, but forget to include "Travel" as a
keyword. If a customer puts in the keywords "Travel and India" this particular
picture of the Taj
Mahal will not appear because it didn't have the keyword "Travel". If the
searcher used "Travel or India" then the image would have appeared because it
had the keyword
"India", but nobody would ever search in this way because they would get
"Travel" from every other part of the world as well as "India".
A photographer who shoots travel exclusively may not think to put such broad
terms into their keywords. It is also unlikely that the term "Travel" will
appear in the
caption. But it is very important to have such broad terms in order to narrow a
search. This is particularly true if the image is eventually put on a general
handles all types of imagery rather than a site devoted exclusively to travel.
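The Taj Mahal example comes down to the difference between an "and" search and an "or" search. A minimal Python sketch follows, using an invented three-image catalog; the image numbers, keyword sets and function names are assumptions for the illustration only.

```python
# Invented catalog: image number -> keyword set (all lower case).
CATALOG = {
    "T1": {"taj mahal", "india"},              # broad term "travel" forgotten
    "T2": {"travel", "india", "market"},
    "T3": {"travel", "france", "eiffel tower"},
}


def search_all(terms):
    """'and' search: every term must be among the image's keywords."""
    wanted = {t.lower() for t in terms}
    return sorted(num for num, kws in CATALOG.items() if wanted <= kws)


def search_any(terms):
    """'or' search: at least one term must be among the image's keywords."""
    wanted = {t.lower() for t in terms}
    return sorted(num for num, kws in CATALOG.items() if wanted & kws)


print(search_all(["Travel", "India"]))  # ['T2']  (the Taj Mahal is missed)
print(search_any(["Travel", "India"]))  # ['T1', 'T2', 'T3']  (too broad)
```

Adding "travel" to the keywords for T1 is the only change needed for the "and" search to find it, which is the point of including broad terms.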
Be very careful about compound keywords as they may produce inappropriate hits
and yet sometimes they are absolutely necessary. Washington DC is not
Washington State so
compound keywords are needed although it may be better to include DC and WA as
separate keywords when trying to identify something from one of these locations.
keyword is one that has several different meanings, like "turkey", be sure to add
other keywords such as "Middle East", "meat" or "bird" so the searcher can use
to narrow the search.
Some of the selected images may need some color correcting work with Photoshop
and this can be a job for an assistant. Combined, these activities can require a
lot of time.
While the photographer could do them it may be more cost effective to pass the
responsibility on to an assistant or someone with skills in keywording.
At this point the weakest link in getting digitally captured images where they
can be seen by customers is the stock agent or portal. Most stock agents and
focused on having large original files and consequently will not accept those
created by 35mm digital cameras. They argue that the customers need BIG files.
In taking this
position they are missing the opportunity to sell to that editorial market we
talked about in (DC - Market Segments), but that is likely to change.
If agents want to stick with offering large files, they need to carefully
consider the potential of using Genuine Fractals, or something like it, to res
up digitally captured images to the file size they think is needed. In the
final analysis the customer doesn't need a particular file size; he needs a
file that will offer sharpness, detail and color fidelity at a certain
reproduction size. If agents examine the science of digital capture they will
discover that they can get better quality from much smaller files than is
possible when the capture method is film (see DC - Reproduction Quality).
I've had editorial photographers tell me that they must capture in RAW format
because that gives them the largest file size and the most information. But,
complain that they can't store many images because the file sizes are too large
and take too much time to store. These photographers also resist resing up their
because that would be "manipulation of the image" and somehow "unethical". On
the other hand these photographers have no problem in providing dupe film of an
customers and these film dupes are a lot less accurate to the original than a
resed up digital file. And these photographers have no problem shooting with
35mm instead of
4x5 despite the fact that the 4x5's would have given them a larger original and
more information. Such attitudes are driven by what photographers are hearing
agents and sales representatives. The industry needs to get rid of the myths and
deal with facts. Agents need to educate themselves on the benefits of digital
then start promoting these benefits to their customers, rather than resisting
Alamy.com, TheImageWorks.com and Auroraphotos.com are portals that currently
will accept smaller files. I would expect many others to start to move in this
direction in 2003
and 2004. Agents and portals need to develop procedures to make it easier for
their suppliers to upload images, and they need to work out ways to effectively
caption and keyword data that photographers will be able to supply.
Agents need to develop procedures to extract the caption and keyword data from
the IPTC headers and provide their photographers with guidance as to how to
prepare images for delivery. They also need to make it easier for
photographers to upload images and data to their sites.