644 VISUAL SEARCH
June 21, 2004
The stock photo industry is on the verge of the next major change in the way images are searched and found online. This change could make it easier for small players to get more of their images online, and significantly change the playing field for larger players depending on how quickly they adapt to, and promote, the new technology.
Visual search, coupled with keyword search, has the potential to make it much easier and quicker for customers to find the right image for their needs in any database that offers a broad range of image options. It is also likely to dramatically change many of the current assumptions about how photo databases should be structured.
At the recent CEPIC International Congress in Copenhagen I saw two examples of this visual search technology in action -- Espion from Idee Inc. ( www.ideeinc.com ) and VIMA Technologies ( www.vimatech.com ). Two other companies with visual search offerings are www.pixlogic.com and www.ltutech.com .
The one that most intrigued me was VIMA's VisualSearch product. Here's how it works. First of all, it is designed not as a totally stand-alone product, but to work in conjunction with existing keyword searching software as a type of "Advanced User Option". The larger the database of images, the more useful this search technique is likely to be.
For example, if you enter a keyword like "couple" on Getty you get 31,225 images. On Corbis you get several thousand (they don't tell you how many). On Alamy you get 38,955. You may be able to narrow these groupings by adding other keywords to your search such as: senior, young, minority, ethnic, latino, European, happy, beach, etc. However, the user is still likely to get a lot of images to search through. And if some of the images are not keyworded properly with these qualifying words, you may miss seeing them altogether.
Rather than scrolling through such a large group of images, page by page, VIMA gives you a side screen with 12 examples of images picked randomly from this entire group. The user then clicks on one or more of these 12 to indicate those images that most nearly have the attributes she is looking for. The important thing here is not necessarily to find exactly the image she is looking for, but to identify those images (by not clicking them) that are not at all close to what she wants. In all likelihood none of the images will be close to what the buyer finally chooses - but some will be a lot closer (have similar attributes) than others.
The total selection of images is then automatically re-indexed putting at the top of the search results images with characteristics similar to those chosen. One of the important factors in this process is that images with characteristics similar to those not chosen drop to the bottom of the search results, or they can be removed entirely from the search results.
VIMA has spent a decade researching the most perceptually-sound and statistically useful visual features from over 500 possibilities. The software has narrowed the attributes considered to between 100 and 200 and indexes each image at the time it is added to the database. This index file takes up less than 1K of space. After the customer makes a selection, the software determines in milliseconds all the other images that have any similar characteristics to the images chosen. Some may have over 100 similar attributes, some 90 or 70 all the way down to two or three. The new return order of thumbnails is based on putting those with the highest number of similar characteristics at the top of the images shown and prioritizing the rest based on the number of similar attributes they have.
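The ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not VIMA's method: it assumes each image has been reduced to a set of attribute ids, that "similar" means sharing an attribute id, and that attributes of the images the user did not click count against a candidate. The names `rerank`, `index`, `chosen` and `rejected` are hypothetical.

```python
from typing import Dict, List, Sequence, Set

# Hypothetical index: image id -> set of visual attribute ids (an assumption;
# a real system would likely store numeric feature values instead).
Index = Dict[str, Set[int]]


def rerank(index: Index, chosen: Sequence[str], rejected: Sequence[str] = ()) -> List[str]:
    """Order images by attributes shared with chosen images, minus those
    shared with rejected images. Ties keep their original order."""
    liked = set().union(*(index[i] for i in chosen))
    disliked = set().union(*(index[i] for i in rejected))
    disliked -= liked  # an attribute seen in a chosen image is never penalized

    def score(image_id: str) -> int:
        attrs = index[image_id]
        return len(attrs & liked) - len(attrs & disliked)

    # sorted() is stable, so images with equal scores stay in their prior order
    return sorted(index, key=score, reverse=True)


sample = {"a": {1, 2, 3}, "b": {3, 4, 5}, "c": {6, 7}, "d": {1, 2, 9}}
print(rerank(sample, chosen=["a"], rejected=["c"]))
```

In this toy data image "d" was third from the top of the original order but shares two attributes with the chosen image, so it moves up ahead of "b", while "c" drops to the bottom.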
So suddenly, if an image with a lot of similar characteristics to the ones this user has chosen happened to be at the bottom of the pile in the initial search, it is moved near the top and this buyer has a chance to see it.
This "adaptive learning" process uses positive and negative feedback in a quick, intuitive manner to zero in on the best matching images faster than is possible with scrolling or text searching. Using this unique approach, every image searcher molds their search process and the results to reflect their individual and cultural idiosyncrasies.
The next person who comes in to search the same subject matter may have a totally different idea of what they want for their particular project and the image that was moved to the top in this search may remain at the bottom in the next, or new, searches.
But picking from 12 images is not the end of the refinement that this engine allows. Once you've made your first pick and the database has been re-indexed you get another 12 images selected from those that had a majority of the characteristics in the first selection. This search-within-a-search process makes it possible to further focus the selection. The user can do this as many times as she wants - continually narrowing the returns along the lines of her particular need at the moment. If the results seem to be taking her in the wrong direction she can page back at any time, make additional or different selections and head off in a new direction. Once it appears that many of the images in the return are on target, then it's the time to begin scrolling page by page to find the exact image for the project.
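A rough sketch of one round of this search-within-a-search might look like the following. It assumes the same hypothetical set-of-attribute-ids index, reads "a majority of the characteristics" as "more than half of the attributes of the clicked images", and keeps a simple history list so the user can page back. None of these names or rules come from VIMA.

```python
import random
from typing import Dict, List, Optional, Sequence, Set, Tuple


def refine(
    index: Dict[str, Set[int]],
    candidates: Sequence[str],
    chosen: Sequence[str],
    sample_size: int = 12,
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], List[str]]:
    """Return (narrowed candidates, new example images to show the user)."""
    rng = rng or random.Random()
    liked = set().union(*(index[i] for i in chosen))
    # Keep only candidates sharing more than half of the liked attributes.
    kept = [c for c in candidates if len(index[c] & liked) * 2 > len(liked)]
    # Show up to `sample_size` examples drawn from the narrowed set.
    examples = rng.sample(kept, min(sample_size, len(kept)))
    return kept, examples


index = {"a": {1, 2, 3}, "b": {3, 4, 5}, "c": {6, 7}, "d": {1, 2, 9}}
history = [list(index)]                 # round 0: the full keyword result
kept, examples = refine(index, history[-1], chosen=["a"])
history.append(kept)                    # round 1: narrowed result
previous = history[-2]                  # paging back is just an earlier entry
```

Because each round narrows the previous round's candidates rather than the original keyword result, the user can change direction by returning to any earlier entry in `history` and making different picks.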
In addition, if the customer has a picture or a comp drawing and would like to find something similar, but different, she can upload that image and search on its characteristics. If customers want to find other similar images in the database that might be licensed by competitors they can easily identify such images using visual search.
Factors To Consider
Important to the success of this technique is the number of attributes being considered.
With too few attributes you may get a less accurate and refined search and more images that do not quite match the characteristics the user was trying to identify. However, the fewer features the faster the processing. Software developers need to exercise great care in determining the attributes to consider and the weight to attach to each of them. In the demonstrations I observed, processing time appeared not to be a factor. Everything came up very quickly, and as fast as, or faster than, basic keyword search. On the other hand, these demonstrations were on laptops. A hookup over the internet might produce different results.
Weighting The Attributes
Currently the systems are designed to find as many images as possible with all the similar attributes. On the other hand, it is possible for an agency to make available other advanced search features that give added weight to certain attributes over others.
Consider a situation where an art director has designed a campaign with an overall color tone, or mood. The subject matter of the picture is not particularly important as long as the basic color tone of the picture is consistent. In fact, the more subject variety the better because the color tone is the driving force of this campaign. Pictures like this will probably be very difficult to find using keywords, but with a weighted visual search they could be identified quickly.
The same might be true for a graphic pattern in pictures. If the designer is developing a campaign that emphasizes a certain graphic pattern, and it is at all complex (not just negative space in the top left corner), chances are it is going to be very difficult to find pictures in all subject areas that fit that pattern. A weighted visual search that lets the user emphasize pattern could be a quick solution.
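To make the idea of weighting concrete, here is a minimal sketch. It assumes each image is described by two hypothetical groups of numeric features, "color" and "pattern", and that the user supplies a weight per group; the feature groups, the distance measure and the function names are illustrative assumptions rather than anything a vendor has described.

```python
import math
from typing import Dict, List

Features = Dict[str, List[float]]  # feature group name -> numeric vector


def weighted_distance(a: Features, b: Features, weights: Dict[str, float]) -> float:
    """Sum of per-group Euclidean distances, each scaled by its weight."""
    return sum(w * math.dist(a[group], b[group]) for group, w in weights.items())


def weighted_search(features: Dict[str, Features], query_id: str,
                    weights: Dict[str, float]) -> List[str]:
    """Rank all other images from nearest to farthest from the query image."""
    query = features[query_id]
    others = [i for i in features if i != query_id]
    return sorted(others, key=lambda i: weighted_distance(query, features[i], weights))


features = {
    "query":        {"color": [1.0, 0.0, 0.0], "pattern": [0.0, 1.0]},
    "same_color":   {"color": [1.0, 0.0, 0.0], "pattern": [1.0, 0.0]},
    "same_pattern": {"color": [0.0, 0.0, 1.0], "pattern": [0.0, 1.0]},
}
# A color-driven campaign: ignore pattern entirely.
print(weighted_search(features, "query", {"color": 1.0, "pattern": 0.0}))
# A pattern-driven campaign: ignore color entirely.
print(weighted_search(features, "query", {"color": 0.0, "pattern": 1.0}))
```

The same query image produces opposite orderings depending on which group is weighted, which is the behavior the art director in the color tone example would want.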
While there are lots of possibilities, developers caution against offering too many options initially that might overwhelm the user. Too many switches, levers, scrolling, reading and decisions can get the user tied in knots and they may never get around to FINDING. I point out these features mostly to indicate that they are currently possible - the technology is there - but not to suggest that they should be part of the initial tool.
As buyers get used to visual search they may be willing to use more tools, in the same way that they started out using single keywords to find images, but now many use complex phrases to narrow their searches.
Another thing to consider is a 100% match. David Telleen-Lawton, CEO of VIMA Technologies says, "100% matching is actually easy. The tough part is a 98% match or a 93% match and 'knowing' that it is an 'exact' match with just some slight color or cropping or whatever changes versus perceptually not so good match that just happens to score well. A very useful application is to use the VIMA ImageSearch to review incoming new images against an existing database to find duplicates."
In evaluating visual search tools it will be important to consider whether the user can narrow the search "within a selection", or whether each time the user makes a selection it re-indexes the larger original keyword selection.
The Idee Sim Search is currently being used by Masterfile on its Wonderfile RF site (www.wonderfile.com). There is a little Sim Search icon under each picture. After doing a keyword search you can click on this icon under any image and it will re-index the file based on images that have at least a 65% physical match with the image selected. It drops all images with less than a 65% match. I did a search for couples and got 6018 images. After choosing one image I got 863 images with at least a 65% match. Wonderfile has been using Sim Search since September of 2003 and you can expect to see it integrated into Masterfile's main site in the near future.
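A threshold filter of this kind can be sketched as follows. How Idee actually computes a "65% physical match" is not something I know, so this example substitutes a simple stand-in: the Jaccard overlap between hypothetical sets of attribute ids. Only the 65% cutoff comes from the Wonderfile behavior described above.

```python
from typing import Dict, List, Set, Tuple


def similar_images(index: Dict[str, Set[int]], selected: str,
                   threshold: float = 0.65) -> List[Tuple[str, float]]:
    """Return (image id, similarity) pairs at or above the threshold,
    best match first. Similarity here is Jaccard overlap, a stand-in measure."""
    reference = index[selected]
    results = []
    for image_id, attrs in index.items():
        if image_id == selected:
            continue
        union = attrs | reference
        similarity = len(attrs & reference) / len(union) if union else 0.0
        if similarity >= threshold:      # everything below the cutoff is dropped
            results.append((image_id, similarity))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results


index = {"sel": {1, 2, 3, 4}, "near": {1, 2, 3, 4, 5}, "mid": {1, 2, 3}, "far": {1, 9}}
for image_id, similarity in similar_images(index, "sel"):
    print(image_id, round(similarity, 2))
```

Unlike the re-ranking approach, which only reorders the results, a hard cutoff removes images outright, which is why a keyword result of several thousand images can shrink to a few hundred after one click.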
Right Brain, Left Brain
One of the great advantages of this method of search is that it takes full advantage of the right-brain intuitive, subjective senses that are the strength of most creative people, rather than the left-brain functions that require them to define in words what it is they want.
Art directors will love playing with such search results.
Currently many online sites try to help the user by having their editors pick the images they believe customers will want and prioritizing them in the way they appear on the site so the images of interest to the average user will appear first in the search results. There are two problems with this strategy. First, there are huge variations in what different users want and need at any given moment. Secondly, it is impossible for any editor to know all these different needs, and even if they did, it is impossible to organize any collection so that the top priorities of ANY user always come up first. Visual search overcomes both of these problems. Each customer organizes the returns based on their priorities at the moment. It also allows customers to consider a greater number of photos while subjecting them to many fewer inappropriate photos - by the customer's definition of what is inappropriate.
With visual search it is no longer as necessary to have your images at the top of the initial search results because the order will be continually adjusted based on the needs of the specific user. Now each buyer organizes the search results based on his or her individual artistic and cultural idiosyncrasies.
Often the newest image is not the one that is of greatest interest to the buyer. In many subject areas where there is nothing in the images (such as clothes) that might identify when the image was shot, an image produced five, ten or fifteen years previously may be the one that best fits the buyer's needs. If images are organized in the database based on when they were added, older images that might once have been best sellers may get rapidly pushed to the bottom. This is no longer a problem with visual search. Images that have been in the database for a long time are no longer hard to find if they have common characteristics with what the buyer is looking for.
It also becomes unimportant to purge the database of what might be considered old, or outdated images. These images will naturally work their way to the bottom of the pile and will only be seen when they fit the characteristics of what some buyer is looking for. They will not detract from the general experience of most buyers, but will be there on those rare occasions when a buyer is looking for exactly that image.
Some buyers indicate that they will occasionally start searching from the last pages in a group of search returns in an effort to find something different from what everyone else is using. With visual search this technique will no longer be useful or necessary.
Large or Small Database
Once visual search is adopted the entire logic behind keeping a database small and tightly edited so users are not forced to wade through many inappropriate images is no longer valid. When each user can quickly define visually what is appropriate for him it is no longer necessary to be concerned about the total number of images in the database.
I predict that those looking for images will find visual search so attractive that they will only want to use sites that offer it, and will not want to struggle to define in words what they are seeking. If this turns out to be the case, tightly edited sites will be at an additional disadvantage because they are not likely to have enough images in any particular category to make visual search worthwhile.
Creative vs. Editorial
Visual search will work much better for people looking for concept illustrations than for those looking for very specific editorial subjects. If you're looking for a specific person or event, keywords will work much better than a visual search. On the other hand, if you're looking for a specific location - Paris for example - but there are lots of results, visual search may help in narrowing and prioritizing the images within those results. Offering both visual and keyword search allows the customer to choose the best technique for their particular search at that moment.
Keywording has been a major cost and thus an impediment in getting large numbers of images online. By coupling visual search with keyword search, it may be possible to get by with a simpler set of keywords, focusing on general category words and forgetting about some of the concept words and synonyms. This may make it possible for sellers to cut costs and put a lot more images online quickly while at the same time providing a more efficient and customer friendly search that makes it possible for the buyer to more quickly locate the image she wants.
Part of the problem with keyword search is that often the searcher's vocabulary and the image-owner's vocabulary are not synchronized.
Visual search also makes it much easier to sell into countries where the language spoken is different from that in which the image was originally keyworded. Many English words have very different meanings when they are directly translated into other languages and sometimes even in other English speaking countries. If someone in the UK is looking for a picture of a boot of a car they'll be more likely to find U.S.-keyworded images by using a visual search technique than by inputting the keywords "boot" and "car".
Agencies in non-English speaking countries will be able to add images from English speakers to their sites at a much more rapid pace, and may also find that agencies in English speaking countries are more willing to accept more of their images.