Search Algorithms: How Do They Work?

Posted on 10/19/2016 by Jim Pickerell

Microstock sites used to surface new images for weeks or months after they were uploaded. Now, photographers say that this no longer seems to happen. It would help if photographers had more information about, and a better understanding of, how the search algorithms work.

I have no secret information and have not been let into the confidence of any of the stock agencies. The agencies argue that information about how their algorithms work is proprietary. If they were to reveal their “secret sauce” it would give their competitors an advantage.

Nevertheless, here’s what I think is happening.



We know that the search return order (SRO) includes some combination of new images and images that have been previously downloaded by customers. We don’t know the exact split, but for the sake of this discussion let’s assume 50/50.

We believe that most customers will not look at more than 500 thumbnails resulting from any search before changing the search parameters.



Shutterstock is uploading more than 100,000 new images a week. Some agencies are uploading a lot less, but still some pretty huge numbers.

It seems reasonable to assume that at least 500 (1/200th) of any 100,000 images would have the same or very similar keywords. There may be some with very unusual keywords, but these images tend to be of subjects that are seldom requested by customers.

In any given day there are 1,440 minutes. If a new image is uploaded every 10 seconds, then 8,640 images could be uploaded in 24 hours.
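The arithmetic above can be checked with a short Python sketch. The function names are illustrative only, and the sketch assumes uploads arrive at a perfectly even pace, which no real agency guarantees. It also shows what upload interval a volume of 100,000 images a week would imply under that same assumption.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 1,440 minutes x 60 seconds = 86,400


def uploads_per_day(interval_seconds: float) -> float:
    """Images added in 24 hours if one arrives every `interval_seconds`."""
    return SECONDS_PER_DAY / interval_seconds


def interval_for_weekly_volume(images_per_week: int) -> float:
    """Average seconds between uploads for a given weekly volume (even pace assumed)."""
    return 7 * SECONDS_PER_DAY / images_per_week


print(uploads_per_day(10))                  # 8640.0
print(interval_for_weekly_volume(100_000))  # 6.048
```

In other words, 100,000 new images a week works out to roughly one image every six seconds, so the 10-second interval used in this discussion is, if anything, on the slow side.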



So, assume that one of your images is uploaded at 8:00am EST on a Monday morning. A new image with the same keywords is uploaded at 8:00:10; another at 8:00:20; another at 8:00:30 and so on throughout the day. By 8:01 there are six newer images ahead of yours in the line of newest images. By 9:00am there are 360 newer images ahead of it.

But remember, the average customer is looking at fewer than 500 images in a search return, and half of those are images that have been downloaded previously. So whenever a customer does a search, only the 250 newest images have a chance of being seen. By 9:00am on that Monday morning, the image uploaded at 8:00am has little chance of ever being seen again.
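To make the timing concrete, here is a minimal Python sketch of the scenario. It assumes the figures used in this discussion (one matching upload every 10 seconds, 250 visible slots for new images) and a strictly newest-first ordering of new images. None of this is a confirmed description of how any agency actually ranks results, and the function names are hypothetical.

```python
def newer_images_ahead(elapsed_seconds: int, interval_seconds: int = 10) -> int:
    """Newer matching images uploaded since yours, at one per interval."""
    return elapsed_seconds // interval_seconds


def seconds_until_buried(visible_new_slots: int = 250,
                         interval_seconds: int = 10) -> int:
    """Time until `visible_new_slots` newer images sit ahead of yours.

    At that point your image is in position visible_new_slots + 1,
    just outside the part of the results a customer is assumed to view.
    """
    return visible_new_slots * interval_seconds


print(newer_images_ahead(60))     # 6 newer images by 8:01
print(newer_images_ahead(3600))   # 360 newer images by 9:00
minutes, seconds = divmod(seconds_until_buried(), 60)
print(f"{minutes} min {seconds} s")  # 41 min 40 s
```

Under these assumptions, the image uploaded at 8:00am drops out of the visible "new" half of the results at about 8:41:40, well before 9:00am.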

If during that hour a customer saw the image, liked it and downloaded it, then the image moves into the “downloaded” category and has the potential of a longer useful life. But if that customer happened to sleep in that morning and did not get started searching until 9:00am, tough luck for the photographer.

Now, of course, not every image uploaded in that hour will have the same keywords. Different images will have lots of different keywords. So it may be several hours, even a day or so, before there have been 250 newer images with most of the same keywords as the image uploaded at 8:00am Monday. But think about the most popular subject matter and how many images with similar keywords must be uploaded on a regular basis.
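The effect of keyword overlap can be sketched the same way. The Python example below uses the figures mentioned earlier (100,000 uploads a week, 1 in 200 sharing your keywords, 250 visible slots for new images) and assumes uploads are spread evenly through the week. The 5% share for a popular subject is a purely hypothetical number chosen for illustration.

```python
def days_until_buried(weekly_uploads: int,
                      keyword_share: float,
                      visible_new_slots: int = 250) -> float:
    """Days until `visible_new_slots` newer images with matching keywords exist.

    keyword_share is the fraction of all uploads that share your keywords.
    """
    matching_per_day = weekly_uploads * keyword_share / 7
    return visible_new_slots / matching_per_day


# Share taken from the estimate above: 500 of every 100,000 images.
typical = days_until_buried(100_000, 500 / 100_000)

# Hypothetical popular subject where 5% of uploads share the keywords.
popular = days_until_buried(100_000, 0.05)

print(f"1/200 share: about {typical:.1f} days")
print(f"5% share: about {popular * 24:.1f} hours")
```

With a 1-in-200 share the window comes out at about three and a half days, while at the hypothetical 5% share it shrinks to well under half a day. The window is inversely proportional to how crowded the subject is.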

Suppose, also, that you submit 10 images from the same shoot, all with the same keywords. Those images will be uploaded every 10 seconds, one after another. All will have the same very short useful life. If a customer sees one of them and downloads it, that image moves into the “used” category. But within a very short period of time the other 9 get buried so deeply in the search return order that it is unlikely they will ever be seen again.

If you do have 10 similar images, it probably makes more sense to upload one every day, or every week, rather than uploading them all at once. That way you have a better chance that 10 different people will see them. However, I don’t think the agencies approve of submissions in that manner.

What happens to images uploaded on Saturday and Sunday, when very few customers are actually searching the site? If the same volume of images is being uploaded on a Saturday as on a Monday, there is much less chance that a customer will be reviewing images on Saturday than on Monday. Thus, there is much less chance that any Saturday image will move into the “used” category. By Monday, none of the Saturday images will still be “new,” and there is a big likelihood that almost all of them will be so deep in the SRO that they will never be seen.
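One rough way to picture the weekend effect is to ask how likely it is that any customer searches at all while an image is still in the visible window. The Python sketch below assumes searches arrive at random at a steady average rate (a Poisson model, which is my simplification, not something the agencies have described). The search rates are invented purely for illustration, and a search is of course no guarantee of a download.

```python
import math


def chance_of_at_least_one_search(searches_per_hour: float,
                                  visible_hours: float) -> float:
    """Probability that one or more searches occur while the image is visible.

    Assumes searches arrive independently at a constant average rate.
    """
    return 1 - math.exp(-searches_per_hour * visible_hours)


VISIBLE_HOURS = 2500 / 3600   # 250 slots x 10 seconds, from the earlier example
WEEKDAY_RATE = 2.0            # hypothetical searches per hour for these keywords
WEEKEND_RATE = 0.2            # hypothetical: one tenth of weekday traffic

weekday = chance_of_at_least_one_search(WEEKDAY_RATE, VISIBLE_HOURS)
weekend = chance_of_at_least_one_search(WEEKEND_RATE, VISIBLE_HOURS)
print(f"weekday: {weekday:.0%}, weekend: {weekend:.0%}")
```

Whatever the real numbers are, the shape of the result is the same: if the upload volume holds steady while search traffic drops, the chance that an image is seen before it is buried drops with it.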

There may be no solution to this problem, but we can be sure that as more and more images are added it will get worse.


Copyright © 2016 Jim Pickerell. The above article may not be copied, reproduced, excerpted or distributed in any manner without written permission from the author. All requests should be submitted to Selling Stock at 10319 Westlake Drive, Suite 162, Bethesda, MD 20817, phone 301-461-7627, e-mail: wvz@fpcubgbf.pbz

Jim Pickerell is founder of www.selling-stock.com, an online newsletter that publishes daily. He is also available for personal telephone consultations on pricing and other matters related to stock photography. He occasionally acts as an expert witness on matters related to stock photography. For his current curriculum vitae go to: http://www.jimpickerell.com/Curriculum-Vitae.aspx.  

Comments

  • Richard Gardette Posted Oct 24, 2016
    When we nowadays analyse the battles of WW1, where 1 000 infantry soldiers were sent on one machine gun, with 10 survivors at the end,
    we call it “criminal”, “stupid”, “primitive”… dated and irrational.
    But for Shutterstock, the same behavior in stock imagery competition is “smart and modern”.
