How To Build an Image Search Engine with Python and OpenCV
Let's face it. Trying to search for images based on text and tags sucks.
Whether you are tagging and categorizing your personal images, searching for stock photos for your company website, or simply trying to find the right image for your next epic blog post, trying to use text and keywords to describe something that is inherently visual is a real pain.
I faced this pain myself last Tuesday as I was going through some old family photo albums that were scanned and digitized nine years ago.
You see, I was looking for a bunch of photos that were taken along the beaches of Hawaii with my family. I opened up iPhoto, and slowly made my way through the photographs. It was a painstaking process. The meta-information for each JPEG contained incorrect dates. The photos were not organized in folders like I remembered — I simply couldn't find the beach photos that I was desperately searching for.
Perhaps by luck, I stumbled across one of the beach photographs. It was a beautiful, almost surreal beach shot. Puffy white clouds in the sky. Crystal clear ocean water, lapping at the golden sands. You could literally feel the breeze on your skin and smell the ocean air.
After seeing this photo, I stopped my manual search and opened up a code editor.
While applications such as iPhoto let you organize your photos into collections and even detect and recognize faces, we can certainly do more.
No, I'm not talking about manually tagging your images. I'm talking about something more powerful. What if you could actually search your collection of images using another image?
Wouldn't that be cool? It would allow you to apply visual search to your own images, in just a single click.
And that's exactly what I did. I spent the next half-hour coding, and when I was done I had created a visual search engine for my family vacation photos.
I then took the sole beach image that I had found and submitted it to my image search engine. Within seconds I had found all of the other beach photos, all without labeling or tagging a single image.
Sound interesting? Read on.
In the rest of this blog post I'll show you how to build an image search engine of your own.
What's an Image Search Engine?
So you're probably wondering, what actually is an image search engine?
I mean, we're all familiar with text based search engines such as Google, Bing, and DuckDuckGo — you simply enter a few keywords related to the content you want to find (i.e., your "query"), and your results are returned to you. But for image search engines, things work a little differently — you're not using text as your query, you are instead using an image.
Sounds pretty hard to do, right? I mean, how do you quantify the contents of an image to make it search-able?
We'll cover the answer to that question in a bit. But to start, let's learn a little more about image search engines.
In general, there tend to be three types of image search engines: search by meta-data, search by example, and a hybrid approach of the two.
Search by Meta-Information
Searching by meta-data is only marginally different from your standard keyword-based search engines mentioned above. Search by meta-data systems rarely examine the contents of the image itself. Instead, they rely on textual clues such as (1) manual annotations and tagging performed by humans, along with (2) automatic contextual hints, such as the text that appears near the image on a webpage.
When a user performs a search on a search by meta-data system they provide a query, just like in a traditional text search engine, and then images that have similar tags or annotations are returned.
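A toy sketch of this kind of meta-data search in pure Python might look like the following (all of the image names and tags here are made up for illustration):

```python
# a toy search-by-meta-data index: each image is described only by
# human-supplied tags, never by its pixel contents
index = {
    "beach_01.png": {"beach", "ocean", "hawaii"},
    "dinner_02.png": {"food", "wine", "restaurant"},
    "beach_03.png": {"beach", "sunset"},
}

def search_by_tags(query_tags, index):
    # score each image by how many query tags it shares, then return
    # the matching images sorted from most to least tag overlap
    scores = {name: len(tags & query_tags) for (name, tags) in index.items()}
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [name for (name, score) in ranked if score > 0]

print(search_by_tags({"beach", "sunset"}, index))
```

Notice that the actual pixels never enter the picture: if an image is mis-tagged or untagged, this system simply cannot find it.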
Again, when utilizing a search by meta-data system the actual image itself is rarely examined.
A great example of a search by meta-data image search engine is Flickr. After uploading an image to Flickr you are presented with a text field to enter tags describing the contents of the images you have uploaded. Flickr then takes these keywords, indexes them, and utilizes them to find and recommend other relevant images.
Search by Example
Search by example systems, on the other hand, rely solely on the contents of the image — no keywords are assumed to be provided. The image is analyzed, quantified, and stored so that similar images are returned by the system during a search.
Image search engines that quantify the contents of an image are called Content-Based Image Retrieval (CBIR) systems. The term CBIR is commonly used in the academic literature, but in reality, it's just a fancier way of saying "image search engine", with the added implication that the search engine is relying strictly on the contents of the image and not any textual annotations associated with the image.
A great example of a search by example system is TinEye. TinEye is actually a reverse image search engine where you provide a query image, and then TinEye returns near-identical matches of the same image, along with the webpage that the original image appeared on.
Take a look at the example image at the top of this section. Here I have uploaded an image of the Google logo. TinEye has examined the contents of the image and returned the 13,000+ webpages that the Google logo appears on, after searching through an index of over 6 billion images.
So consider this: Are you going to manually label each of these 6 billion images in TinEye? Of course not. That would take an army of employees and would be extremely costly.
Instead, you utilize some sort of algorithm to extract "features" (i.e., a list of numbers to quantify and abstractly represent the image) from the image itself. Then, when a user submits a query image, you extract features from the query image, compare them to your database of features, and try to find similar images.
Again, it's important to reinforce the point that search by example systems rely strictly on the contents of the image. These types of systems tend to be extremely hard to build and scale, but allow for a fully automated algorithm to govern the search — no human intervention is required.
Hybrid Approach
Of course, there is a middle ground between the two — consider Twitter, for example.
On Twitter you can upload photos to accompany your tweets. A hybrid approach would be to correlate the features extracted from the image with the text of the tweet. Using this approach you could build an image search engine that takes both contextual hints and a search by example strategy into account.
Note: Interested in reading more about the different types of image search engines? I have an entire blog post dedicated to comparing and contrasting them, available here.
Let's move on to defining some important terms that we'll use regularly when describing and building image search engines.
Some Important Terms
Before we get too in-depth, let's take a little bit of time to define a few important terms.
When building an image search engine we will first have to index our dataset. Indexing a dataset is the process of quantifying our dataset by utilizing an image descriptor to extract features from each image.
An image descriptor defines the algorithm that we are utilizing to describe our image.
For example:
- The mean and standard deviation of each Red, Green, and Blue channel, respectively.
- The statistical moments of the image to characterize shape.
- The gradient magnitude and orientation to describe both shape and texture.
The important takeaway here is that the image descriptor governs how the image is quantified.
Features, on the other hand, are the output of an image descriptor. When you put an image into an image descriptor, you will get features out the other end.
In the most basic terms, features (or feature vectors) are just a list of numbers used to abstractly represent and quantify images.
Take a look at the example figure below:
Here we are presented with an input image, we apply our image descriptor, and then our output is a list of features used to quantify the image.
Feature vectors can then be compared for similarity by using a distance metric or similarity function. Distance metrics and similarity functions take two feature vectors as inputs and output a number that represents how "similar" the two feature vectors are.
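As a quick illustration (with made-up feature vectors), here is how a distance function collapses two feature vectors into a single number, where smaller means more similar. The Euclidean distance is used here purely as an example:

```python
import numpy as np

def euclidean_distance(a, b):
    # smaller values mean the two feature vectors are more alike
    return float(np.sqrt(np.sum((np.asarray(a) - np.asarray(b)) ** 2)))

# three hypothetical 4-bin color histograms (already normalized)
hist_a = [0.25, 0.25, 0.25, 0.25]
hist_b = [0.30, 0.20, 0.25, 0.25]
hist_c = [0.90, 0.05, 0.03, 0.02]

# hist_b is much closer to hist_a than hist_c is
print(euclidean_distance(hist_a, hist_b) < euclidean_distance(hist_a, hist_c))
```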
The figure below visualizes the process of comparing two images:
Given two feature vectors, a distance function is used to determine how similar the two feature vectors are. The output of the distance function is a single floating point value used to represent the similarity between the two images.
The 4 Steps of Any CBIR System
No matter what Content-Based Image Retrieval system you are building, they can all be boiled down into 4 distinct steps:
- Defining your image descriptor: At this stage you need to decide what aspect of the image you want to describe. Are you interested in the color of the image? The shape of an object in the image? Or do you want to characterize texture?
- Indexing your dataset: Now that you have your image descriptor defined, your job is to apply this image descriptor to each image in your dataset, extract features from these images, and write the features to storage (ex. CSV file, RDBMS, Redis, etc.) so that they can later be compared for similarity.
- Defining your similarity metric: Cool, now you have a bunch of feature vectors. But how are you going to compare them? Popular choices include the Euclidean distance, Cosine distance, and chi-squared distance, but the actual choice is highly dependent on (1) your dataset and (2) the types of features you extracted.
- Searching: The final step is to perform an actual search. A user will submit a query image to your system (from an upload form or via a mobile app, for example) and your job will be to (1) extract features from this query image and then (2) apply your similarity function to compare the query features to the features already indexed. From there, you simply return the most relevant results according to your similarity function.
Again, these are the most basic 4 steps of any CBIR system. As CBIR systems become more complex and utilize different feature representations, the number of steps grows and you'll add a substantial number of sub-steps to each step mentioned above. But for the time being, let's keep things simple and utilize just these 4 steps.
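To make the four steps concrete, here is a bare-bones skeleton in Python. Every name here (describe, index_dataset, distance, search) is a hypothetical stand-in for the real components we build later in this post, and the "images" are just toy lists of numbers:

```python
def describe(image):
    # Step 1: quantify the image as a fixed-length feature vector
    # (a toy descriptor: total intensity and pixel count)
    return [float(sum(image)), float(len(image))]

def index_dataset(dataset):
    # Step 2: apply the descriptor to every image and store the result
    return {name: describe(image) for (name, image) in dataset.items()}

def distance(a, b):
    # Step 3: define how two feature vectors are compared
    # (squared Euclidean distance; smaller means more similar)
    return sum((x - y) ** 2 for (x, y) in zip(a, b))

def search(query_image, index, limit=10):
    # Step 4: describe the query, compare it to every indexed image,
    # and return the most similar results first
    q = describe(query_image)
    results = sorted((distance(q, feats), name) for (name, feats) in index.items())
    return results[:limit]

# toy "images" represented as flat lists of pixel intensities
dataset = {"a.png": [1, 2, 3], "b.png": [9, 9, 9]}
print(search([1, 2, 4], index_dataset(dataset), limit=1))
```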
Let's take a look at a few graphics to make these high-level steps a little more concrete. The figure below details Steps 1 and 2:
We start by taking our dataset of images, extracting features from each image, and then storing these features in a database.
We can then move on to performing a search (Steps 3 and 4):
First, a user must submit a query image to our image search engine. We then take the query image and extract features from it. These "query features" are then compared to the features of the images we have already indexed in our dataset. Finally, the results are sorted by relevancy and presented to the user.
Our Dataset of Vacation Photos
We'll be utilizing the INRIA Holidays Dataset for our dataset of images.
This dataset consists of various vacation trips from all over the world, including photos of the Egyptian pyramids, underwater diving with sea-life, forests in the mountains, wine bottles and plates of food at dinner, boating excursions, and sunsets across the ocean.
Here are a few samples from the dataset:
In general, this dataset does an extremely good job of modeling what we would expect a tourist to photograph on a scenic trip.
The Goal
Our goal here is to build a personal image search engine. Given our dataset of vacation photos, we want to make this dataset "search-able" by creating a "more like this" functionality — this will be a "search by example" image search engine. For example, if I submit a photo of sailboats gliding across a river, our image search engine should be able to find and retrieve our vacation photos of when we toured the marina and docks.
Take a look at the example below where I have submitted a photo of the boats on the water and have found relevant images in our vacation photo collection:
In order to build this system, we'll be using a simple, yet effective image descriptor: the color histogram.
By utilizing a color histogram as our image descriptor, we'll be relying on the color distribution of the image. Because of this, we have to make an important assumption regarding our image search engine:
Assumption: Images that have similar color distributions will be considered relevant to each other. Even if images have dramatically different contents, they will still be considered "similar" provided that their color distributions are similar as well.
This is a really important assumption, but it is normally a fair and reasonable one to make when using color histograms as image descriptors.
Step 1: Defining our Image Descriptor
Instead of using a standard color histogram, we are going to apply a few tricks to make it a little more robust and powerful.
Our image descriptor will be a 3D color histogram in the HSV color space (Hue, Saturation, Value). Typically, images are represented as a 3-tuple of Red, Green, and Blue (RGB). We often think of the RGB color space as a "cube", as shown below:
However, while RGB values are simple to understand, the RGB color space fails to mimic how humans perceive color. Instead, we are going to use the HSV color space, which maps pixel intensities into a cylinder:
There are other color spaces that do an even better job of mimicking how humans perceive color, such as the CIE L*a*b* and CIE XYZ spaces, but let's keep our color model relatively simple for our first image search engine implementation.
So now that we have selected a color space, we need to define the number of bins for our histogram. Histograms are used to give a (rough) sense of the density of pixel intensities in an image. Essentially, our histogram will estimate the probability density of the underlying function, or in this case, the probability P of a pixel color C occurring in our image I.
It's important to note that there is a trade-off with the number of bins you select for your histogram. If you select too few bins, then your histogram will have fewer components and be unable to disambiguate between images with substantially different color distributions. Likewise, if you use too many bins your histogram will have many components, and images with very similar contents may be regarded as "not similar" when in reality they are.
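You can see this trade-off directly with a quick NumPy experiment (the pixel values here are synthetic):

```python
import numpy as np

# synthetic grayscale "images": one mostly dark, one mostly bright
rng = np.random.default_rng(0)
dark = rng.integers(0, 100, size=1000)
bright = rng.integers(150, 256, size=1000)

# with a single bin, both images look identical (every pixel lands
# in the same bin), so the histogram cannot tell them apart
h_dark_1, _ = np.histogram(dark, bins=1, range=(0, 256))
h_bright_1, _ = np.histogram(bright, bins=1, range=(0, 256))
print((h_dark_1 == h_bright_1).all())   # True -- no discriminative power

# with 8 bins the two distributions separate clearly: the dark pixels
# fall in the low bins, the bright pixels in the high bins
h_dark_8, _ = np.histogram(dark, bins=8, range=(0, 256))
h_bright_8, _ = np.histogram(bright, bins=8, range=(0, 256))
print((h_dark_8 == h_bright_8).all())   # False -- now distinguishable
```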
Here's an example of a histogram with only a few bins:
Notice how there are very few bins that a pixel can be placed into.
And here's an example of a histogram with lots of bins:
In the above example you can see that many bins are utilized, but with the larger number of bins you lose your ability to "generalize" between images with similar perceptual content, since all of the peaks and valleys of the histogram will have to match in order for two images to be considered "similar".
Personally, I like an iterative, experimental approach to tuning the number of bins. This iterative approach is usually based on the size of my dataset. The smaller my dataset is, the fewer bins I use. And if my dataset is large, I use more bins, making my histograms larger and more discriminative.
In general, you'll want to experiment with the number of bins for your color histogram descriptor, as it is dependent on (1) the size of your dataset and (2) how similar the color distributions in your dataset are to each other.
For our vacation photo image search engine, we'll be utilizing a 3D color histogram in the HSV color space with 8 bins for the Hue channel, 12 bins for the Saturation channel, and 3 bins for the Value channel, yielding a total feature vector of dimension 8 x 12 x 3 = 288.
This means that for every image in our dataset, no matter if the image is 36 x 36 pixels or 2000 x 1800 pixels, all images will be abstractly represented and quantified using only a list of 288 floating point numbers.
I think the best way to explain a 3D histogram is to use the conjunctive AND. A 3D HSV color descriptor will ask a given image how many pixels have a Hue value that falls into bin #1 AND how many pixels have a Saturation value that falls into bin #1 AND how many pixels have a Value intensity that falls into bin #1. The number of pixels that meet these requirements is then tabulated. This process is repeated for each combination of bins; however, we are able to do it in an extremely computationally efficient manner.
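Here is the conjunctive AND idea in miniature using NumPy's histogramdd function, with a handful of synthetic HSV pixels and just 2 bins per channel (our actual descriptor uses 8 x 12 x 3 bins and OpenCV's calcHist instead):

```python
import numpy as np

# five synthetic pixels, each an (H, S, V) triple; H lies in [0, 180)
# and S, V in [0, 256) to match OpenCV's HSV conventions
pixels = np.array([
    [10, 200, 30],
    [12, 210, 40],
    [170, 30, 250],
    [11, 205, 35],
    [90, 128, 128],
])

# a tiny 2 x 2 x 2 3D histogram: a pixel is counted in a bin only if
# its H AND S AND V values all fall into that bin's ranges
hist, _ = np.histogramdd(pixels, bins=(2, 2, 2),
                         range=((0, 180), (0, 256), (0, 256)))

print(hist.shape)   # (2, 2, 2) -- 8 bins total
print(hist.sum())   # 5.0 -- every pixel lands in exactly one bin
```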
Pretty cool, right?
Anyway, enough talk. Let's get into some code.
Open a new file in your favorite code editor, name it colordescriptor.py, and let's get started:
# import the necessary packages
import numpy as np
import cv2
import imutils

class ColorDescriptor:
	def __init__(self, bins):
		# store the number of bins for the 3D histogram
		self.bins = bins

	def describe(self, image):
		# convert the image to the HSV color space and initialize
		# the features used to quantify the image
		image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
		features = []

		# grab the dimensions and compute the center of the image
		(h, w) = image.shape[:2]
		(cX, cY) = (int(w * 0.5), int(h * 0.5))
We'll start by importing the Python packages we'll need. We'll use NumPy for numerical processing, cv2 for our OpenCV bindings, and imutils to check the OpenCV version.
We then define our ColorDescriptor class on Line 6. This class will encapsulate all the necessary logic to extract our 3D HSV color histogram from our images.
The __init__ method of the ColorDescriptor takes just a single argument, bins, which is the number of bins for our color histogram.
We can then define our describe method on Line 11. This method requires an image, which is the image we want to describe.
Inside of our describe method we'll convert from the RGB color space (or rather, the BGR color space; OpenCV represents RGB images as NumPy arrays, but in reverse channel order) to the HSV color space, followed by initializing our list of features used to quantify and represent our image.
Lines 18 and 19 simply grab the dimensions of the image and compute the center (x, y)-coordinates.
So now the hard work starts.
Instead of computing a 3D HSV color histogram for the entire image, let's instead compute a 3D HSV color histogram for different regions of the image.
Using region-based histograms rather than global histograms allows us to simulate locality in a color distribution. For example, take a look at this image below:
In this photo we can clearly see a blue sky at the top of the image and a sandy beach at the bottom. Using a global histogram we would be unable to determine where in the image the "blue" occurs and where the "brown" sand occurs. Instead, we would just know that there exists some percentage of blue and some percentage of brown.
To remedy this problem, we can compute color histograms in regions of the image:
For our image descriptor, we are going to divide our image into 5 different regions: (1) the top-left corner, (2) the top-right corner, (3) the bottom-right corner, (4) the bottom-left corner, and finally (5) the center of the image.
By utilizing these regions we'll be able to mimic a crude form of localization, being able to represent our above beach image as having shades of blue sky in the top-left and top-right corners, brown sand in the bottom-left and bottom-right corners, and then a combination of blue sky and brown sand in the center region.
That all said, here is the code to create our region-based color descriptor:
		# divide the image into four rectangles/segments (top-left,
		# top-right, bottom-right, bottom-left)
		segments = [(0, cX, 0, cY), (cX, w, 0, cY), (cX, w, cY, h),
			(0, cX, cY, h)]

		# construct an elliptical mask representing the center of the
		# image
		(axesX, axesY) = (int(w * 0.75) // 2, int(h * 0.75) // 2)
		ellipMask = np.zeros(image.shape[:2], dtype = "uint8")
		cv2.ellipse(ellipMask, (cX, cY), (axesX, axesY), 0, 0, 360, 255, -1)

		# loop over the segments
		for (startX, endX, startY, endY) in segments:
			# construct a mask for each corner of the image, subtracting
			# the elliptical center from it
			cornerMask = np.zeros(image.shape[:2], dtype = "uint8")
			cv2.rectangle(cornerMask, (startX, startY), (endX, endY), 255, -1)
			cornerMask = cv2.subtract(cornerMask, ellipMask)

			# extract a color histogram from the image, then update the
			# feature vector
			hist = self.histogram(image, cornerMask)
			features.extend(hist)

		# extract a color histogram from the elliptical region and
		# update the feature vector
		hist = self.histogram(image, ellipMask)
		features.extend(hist)

		# return the feature vector
		return features
Lines 23 and 24 start by defining the indexes of our top-left, top-right, bottom-right, and bottom-left regions, respectively.
From there, we'll need to construct an ellipse to represent the center region of the image. We'll do this by defining ellipse axes that are 75% of the width and height of the image on Line 28.
We then initialize a blank image (filled with zeros to represent a black background) with the same dimensions as the image we want to describe on Line 29.
Finally, let's draw the actual ellipse on Line 30 using the cv2.ellipse function. This function requires eight different parameters:
- ellipMask: The image we want to draw the ellipse on. We'll be using a concept of "masks", which I'll discuss shortly.
- (cX, cY): A 2-tuple representing the center (x, y)-coordinates of the image.
- (axesX, axesY): A 2-tuple representing the length of the axes of the ellipse. In this case, the ellipse will stretch to be 75% of the width and height of the image that we are describing.
- 0: The rotation of the ellipse. In this case, no rotation is required, so we supply a value of 0 degrees.
- 0: The starting angle of the ellipse.
- 360: The ending angle of the ellipse. Looking at the previous parameter, this indicates that we'll be drawing an ellipse from 0 to 360 degrees (a full "circle").
- 255: The color of the ellipse. The value of 255 indicates "white", meaning that our ellipse will be drawn white on a black background.
- -1: The border size of the ellipse. Supplying a positive integer r will draw a border of size r pixels. Supplying a negative value for r will make the ellipse "filled in".
We then allocate memory for each corner mask on Line 36, draw a white rectangle representing the corner of the image on Line 37, and then subtract the center ellipse from the rectangle on Line 38.
If we were to animate this process of looping over the corner segments, it would look something like this:
As this animation shows, we examine each of the corner segments individually, removing the elliptical center from the rectangle at each iteration.
So you may be wondering, "Aren't we supposed to be extracting color histograms from our image? Why are we doing all this 'masking' business?"
Great question.
The reason is because we need the mask to instruct the OpenCV histogram function where to extract the color histogram from.
Remember, our goal is to describe each of these segments individually. The most efficient way of representing each of these segments is to use a mask. Only (x, y)-coordinates in the image that have a corresponding (x, y) location in the mask with a white (255) pixel value will be included in the histogram calculation. If the pixel value for an (x, y)-coordinate in the mask has a value of black (0), it will be ignored.
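The masking idea itself is easy to demonstrate with plain NumPy boolean indexing (synthetic data; in our actual code, cv2.calcHist performs this filtering for us when given a mask):

```python
import numpy as np

# a 4x4 single-channel "image" and a mask that is white (255) only
# in the top-left 2x2 corner
image = np.arange(16, dtype=np.uint8).reshape(4, 4)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[:2, :2] = 255

# only pixels whose mask value is white contribute to the histogram;
# every black (0) mask location is ignored entirely
masked_pixels = image[mask == 255]
hist, _ = np.histogram(masked_pixels, bins=4, range=(0, 16))

print(masked_pixels.tolist())   # [0, 1, 4, 5] -- just the corner pixels
print(int(hist.sum()))          # 4 -- the other 12 pixels were ignored
```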
To reiterate this concept of only including pixels in the histogram with a corresponding mask value of white, take a look at the following animation:
As you can see, only pixels in the masked region of the image will be included in the histogram calculation.
Makes sense now, right?
So now, for each of our segments, we make a call to the histogram method on Line 42, extracting the color histogram by using the image we want to extract features from as the first argument and the mask representing the region we want to describe as the second argument.
The histogram method then returns a color histogram representing the current region, which we append to our features list.
Lines 47 and 48 extract a color histogram for the center (ellipse) region and update the features list as well.
Finally, Line 51 returns our feature vector to the calling function.
Now, let's quickly look at our actual histogram method:
	def histogram(self, image, mask):
		# extract a 3D color histogram from the masked region of the
		# image, using the supplied number of bins per channel
		hist = cv2.calcHist([image], [0, 1, 2], mask, self.bins,
			[0, 180, 0, 256, 0, 256])

		# normalize the histogram if we are using OpenCV 2.4
		if imutils.is_cv2():
			hist = cv2.normalize(hist).flatten()

		# otherwise handle for OpenCV 3+
		else:
			hist = cv2.normalize(hist, hist).flatten()

		# return the histogram
		return hist
Our histogram method requires two arguments: the first is the image that we want to describe and the second is the mask that represents the region of the image we want to describe.
Calculating the histogram of the masked region of the image is handled on Lines 56 and 57 by making a call to cv2.calcHist using the supplied number of bins from our constructor.
Our color histogram is normalized on Line 61 or 65 (depending on the OpenCV version) to obtain scale invariance. This means that if we computed a color histogram for two identical images, except that one was 50% larger than the other, our color histograms would be (nearly) identical. It is very important that you normalize your color histograms so each histogram is represented by the relative percentage counts for a particular bin and not the raw integer counts for each bin. Again, performing this normalization will ensure that images with similar content but dramatically different dimensions will still be "similar" once we apply our similarity function.
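You can verify the scale-invariance claim with a quick NumPy experiment. Here a synthetic image is tiled to four times its area, which changes the raw histogram counts but not the normalized ones:

```python
import numpy as np

rng = np.random.default_rng(1)
small = rng.integers(0, 256, size=(10, 10))
# tiling the image 2x2 quadruples its area but leaves the *relative*
# pixel intensity distribution untouched
large = np.tile(small, (2, 2))

h_small, _ = np.histogram(small, bins=8, range=(0, 256))
h_large, _ = np.histogram(large, bins=8, range=(0, 256))

# raw counts differ (the large image has 4x the pixels)...
print((h_small == h_large).all())       # False
# ...but the normalized histograms are identical
n_small = h_small / h_small.sum()
n_large = h_large / h_large.sum()
print(np.allclose(n_small, n_large))    # True
```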
Finally, the normalized 3D HSV color histogram is returned to the calling function on Line 68.
Step 2: Extracting Features from Our Dataset
Now that we have our image descriptor defined, we can move on to Step 2, and extract features (i.e., color histograms) from each image in our dataset. The process of extracting features and storing them on persistent storage is commonly called "indexing".
Let's go ahead and dive into some code to index our vacation photo dataset. Open up a new file, name it index.py, and let's get indexing:
# import the necessary packages
from pyimagesearch.colordescriptor import ColorDescriptor
import argparse
import glob
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required = True,
	help = "Path to the directory that contains the images to be indexed")
ap.add_argument("-i", "--index", required = True,
	help = "Path to where the computed index will be stored")
args = vars(ap.parse_args())

# initialize the color descriptor
cd = ColorDescriptor((8, 12, 3))

We'll start by importing the packages we'll need. You'll remember the ColorDescriptor class from Step 1 — I decided to place it in the pyimagesearch module for organizational purposes.
We'll also need argparse for parsing command line arguments, glob for grabbing the file paths to our images, and cv2 for our OpenCV bindings.
Parsing our command line arguments is handled on Lines 8-13. We'll need two switches: --dataset, which is the path to our vacation photos directory, and --index, which is the output CSV file containing the image filename and the features associated with each image.
Finally, we initialize our ColorDescriptor on Line 16 using 8 Hue bins, 12 Saturation bins, and 3 Value bins.
Now that everything is initialized, we can extract features from our dataset:
# open the output index file for writing
output = open(args["index"], "w")

# use glob to grab the image paths and loop over them
for imagePath in glob.glob(args["dataset"] + "/*.png"):
	# extract the image ID (i.e. the unique filename) from the image
	# path and load the image itself
	imageID = imagePath[imagePath.rfind("/") + 1:]
	image = cv2.imread(imagePath)

	# describe the image
	features = cd.describe(image)

	# write the features to file
	features = [str(f) for f in features]
	output.write("%s,%s\n" % (imageID, ",".join(features)))

# close the index file
output.close()

Let's open our output file for writing on Line 19, then loop over all the images in our dataset on Line 22.
For each of the images we'll extract an imageID, which is simply the filename of the image. For this example search engine, we'll assume that all filenames are unique, but we could just as easily generate a UUID for each image. We'll then load the image off disk on Line 26.
Now that the image is loaded, let's go ahead and apply our image descriptor and extract features from the image on Line 29. The describe method of our ColorDescriptor returns a list of floating point values used to represent and quantify our image.
This list of numbers, or feature vector, contains representations for each of the 5 image regions we described in Step 1. Each region is represented by a histogram with 8 x 12 x 3 = 288 entries. Given 5 regions, our overall feature vector has 5 x 288 = 1440 dimensions. Thus each image is quantified and represented using 1,440 numbers.
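As a quick sanity check on those dimensions (pure arithmetic):

```python
# 8 hue bins x 12 saturation bins x 3 value bins per region
bins_per_region = 8 * 12 * 3
regions = 5  # four corners plus the elliptical center

print(bins_per_region)            # 288
print(regions * bins_per_region)  # 1440
```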
Lines 32 and 33 simply write the filename of the image and its associated feature vector to file.
To index our vacation photo dataset, open a shell and issue the following command:
$ python index.py --dataset dataset --index index.csv
This script shouldn't take longer than a few seconds to run. After it is finished you will have a new file, index.csv.
Open this file using your favorite text editor and take a look inside.
You'll see that for each row in the .csv file, the first entry is the filename, followed by a list of numbers. These numbers are the feature vector used to represent and quantify the image.
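Each row can be parsed back apart in a couple of lines (the filename and numbers here are made up for illustration):

```python
# a made-up index.csv row: the filename followed by feature values
row = "beach_01.png,0.25,0.10,0.65".split(",")

# the first entry is the image ID, the rest are the features
imageID = row[0]
features = [float(x) for x in row[1:]]

print(imageID)    # beach_01.png
print(features)   # [0.25, 0.1, 0.65]
```

This is exactly the parsing our Searcher class performs below, just with the csv module doing the splitting for us.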
Running wc on the index, we can see that we have successfully indexed our dataset of 805 images:
$ wc -l index.csv
805 index.csv
Step 3: The Searcher
Now that we've extracted features from our dataset, we need a method to compare these features for similarity. That's where Step 3 comes in — we are now ready to create a class that will define the actual similarity metric between two images.
Create a new file, name it searcher.py, and let's make some magic happen:
# import the necessary packages
import numpy as np
import csv

class Searcher:
	def __init__(self, indexPath):
		# store our index path
		self.indexPath = indexPath

	def search(self, queryFeatures, limit = 10):
		# initialize our dictionary of results
		results = {}

We'll go ahead and import NumPy for numerical processing and csv for convenience to make parsing our index.csv file easier.
From there let's define our Searcher class on Line 5. The constructor for our Searcher will only require a single argument, indexPath , which is the path to where our index.csv file resides on disk.
To actually perform a search, we'll be making a call to the search method on Line 10. This method will take two parameters: the queryFeatures extracted from the query image (i.e. the image we'll be submitting to our CBIR system and asking for similar images to), and limit , which is the maximum number of results to return.
Finally, we initialize our results dictionary on Line 12. A dictionary is a good data type in this situation as it will allow us to use the (unique) imageID for a given image as the key and the similarity to the query as the value.
Okay, so pay attention here. This is where the magic happens:
        # open the index file for reading
        with open(self.indexPath) as f:
            # initialize the CSV reader
            reader = csv.reader(f)

            # loop over the rows in the index
            for row in reader:
                # parse out the image ID and features, then compute the
                # chi-squared distance between the features in our index
                # and our query features
                features = [float(x) for x in row[1:]]
                d = self.chi2_distance(features, queryFeatures)

                # now that we have the distance between the two feature
                # vectors, we can update the results dictionary -- the
                # key is the current image ID in the index and the
                # value is the distance we just computed, representing
                # how 'similar' the image in the index is to our query
                results[row[0]] = d

            # close the reader
            f.close()

        # sort our results, so that the smaller distances (i.e. the
        # more relevant images) are at the front of the list
        results = sorted([(v, k) for (k, v) in results.items()])

        # return our (limited) results
        return results[:limit]
We open our index.csv file on Line 15, grab a handle to our CSV reader on Line 17, and then start looping over each row of the index.csv file on Line 20.
For each row, we extract the color histogram associated with the indexed image and then compare it to the query image features using the chi2_distance (Line 25), which I'll define in a second.
Our results dictionary is updated on Line 32 using the unique image filename as the key and the similarity of the query image to the indexed image as the value.
Lastly, all we have to do is sort the results dictionary according to the similarity value in ascending order.
Images that have a chi-squared similarity of 0 will be deemed to be identical to each other. As the chi-squared similarity value increases, the images are considered to be less similar to each other.
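That sort is a one-liner in Python. A minimal sketch of the same idiom used in Searcher.search, with made-up image IDs and distances:

```python
# results maps image ID -> chi-squared distance to the query
results = {"pyramid_02.png": 3.2, "beach_07.png": 0.0, "sunset_01.png": 1.4}

# swap (key, value) to (value, key) so sorted() orders by distance,
# placing the most similar images (smallest distances) first
ranked = sorted([(v, k) for (k, v) in results.items()])

print(ranked[0])   # (0.0, 'beach_07.png') -- an identical image
print(ranked[-1])  # (3.2, 'pyramid_02.png') -- the least similar
```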
Speaking of chi-squared similarity, let's go ahead and define that function:
    def chi2_distance(self, histA, histB, eps = 1e-10):
        # compute the chi-squared distance
        d = 0.5 * np.sum([((a - b) ** 2) / (a + b + eps)
            for (a, b) in zip(histA, histB)])

        # return the chi-squared distance
        return d
Our chi2_distance function requires two arguments, which are the two histograms we want to compare for similarity. An optional eps value is used to prevent division-by-zero errors.
The function gets its name from Pearson's chi-squared test statistic, which is used to compare discrete probability distributions.
Since we are comparing color histograms, which are by definition probability distributions, the chi-squared function is an excellent choice.
In general, the difference between large bins vs. small bins is less important and should be weighted as such, and this is exactly what the chi-squared distance function does.
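To see these properties in action, here is the same formula as a standalone function applied to two toy histograms (the 3-bin histograms are invented for illustration; real comparisons use our 1,440-entry vectors):

```python
import numpy as np

def chi2_distance(histA, histB, eps=1e-10):
    # same formula as the Searcher method: 0.5 * sum((a-b)^2 / (a+b+eps))
    histA, histB = np.array(histA), np.array(histB)
    return 0.5 * np.sum(((histA - histB) ** 2) / (histA + histB + eps))

a = [0.25, 0.25, 0.50]  # toy 3-bin histogram
b = [0.25, 0.25, 0.50]  # identical to a
c = [0.50, 0.25, 0.25]  # mass shifted toward the first bin

print(chi2_distance(a, b))      # 0.0 -- identical histograms
print(chi2_distance(a, c) > 0)  # True -- dissimilar distributions
```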
Are you still with us? We're getting there, I promise. The last step is actually the easiest and is simply a driver that glues all the pieces together.
Step 4: Performing a Search
Would you believe it if I told you that performing the actual search is the easiest part? In reality, it's just a driver that imports all of the packages that we have defined before and uses them in conjunction with each other to build a full-fledged Content-Based Image Retrieval system.
So open up one last file, name it search.py , and we'll bring this example home:
# import the necessary packages
from pyimagesearch.colordescriptor import ColorDescriptor
from pyimagesearch.searcher import Searcher
import argparse
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--index", required = True,
    help = "Path to where the computed index will be stored")
ap.add_argument("-q", "--query", required = True,
    help = "Path to the query image")
ap.add_argument("-r", "--result-path", required = True,
    help = "Path to the result path")
args = vars(ap.parse_args())

# initialize the image descriptor
cd = ColorDescriptor((8, 12, 3))

The first thing we'll do is import our necessary packages. We'll import our ColorDescriptor from Step 1 so that we can extract features from the query image. And we'll also import the Searcher that we defined in Step 3 so that we can perform the actual search.
The argparse and cv2 packages round out our imports.
We then parse command line arguments on Lines 8-15. We'll need an --index , which is the path to where our index.csv file resides.
We'll also need a --query , which is the path to our query image. This image will be compared to each image in our index. The goal will be to find images in the index that are similar to our query image.
Think of it this way: when you go to Google and type in the term "Python OpenCV tutorials", you would expect to find search results that contain information relevant to learning Python and OpenCV.
Similarly, if we are building an image search engine for our vacation photos and we submit an image of a sailboat on a blue ocean with white puffy clouds, we would expect to get similar ocean view images back from our image search engine.
We'll then ask for a --result-path , which is the path to our vacation photos dataset. We require this switch because we'll need to display the actual result images to the user.
Finally, we initialize our image descriptor on Line 18 using the exact same parameters as we did in the indexing step. If our intention is to compare images for similarity (which it is), it wouldn't make sense to change the number of bins in our color histograms from indexing to search.
Simply put: use the exact same number of bins for your color histogram during Step 4 as you did during indexing in Step 2.
This will ensure that your images are described in a consistent manner and are thus comparable.
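A quick illustration of why this matters: mismatched bin counts yield feature vectors of different lengths, which cannot be compared element-wise (the alternate bin configuration below is hypothetical):

```python
# feature vector lengths for two different bin configurations
regions = 5
index_bins = (8, 12, 3)   # the configuration used when indexing
query_bins = (4, 6, 3)    # a hypothetical, mismatched configuration

index_dim = regions * index_bins[0] * index_bins[1] * index_bins[2]
query_dim = regions * query_bins[0] * query_bins[1] * query_bins[2]

print(index_dim)               # 1440
print(query_dim)               # 360
print(index_dim == query_dim)  # False -- the vectors are incomparable
```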
Okay, time to perform the actual search:
# load the query image and describe it
query = cv2.imread(args["query"])
features = cd.describe(query)

# perform the search
searcher = Searcher(args["index"])
results = searcher.search(features)

# display the query
cv2.imshow("Query", query)

# loop over the results
for (score, resultID) in results:
    # load the result image and display it
    result = cv2.imread(args["result_path"] + "/" + resultID)
    cv2.imshow("Result", result)
    cv2.waitKey(0)

We load our query image off disk on Line 21 and extract features from it on Line 22.
The search is then performed on Lines 25 and 26 using the features extracted from the query image, returning our list of ranked results .
From here, all we need to do is display the results to the user.
We display the query image to the user on Line 29. And then we loop over our search results on Lines 32-36 and display them to the screen.
After all this work I'm sure you're ready to see this system in action, aren't you?
Well, keep reading. This is where all our hard work pays off.
Our CBIR System in Action
Open up your terminal, navigate to the directory where your code lives, and issue the following command:
$ python search.py --index index.csv --query queries/108100.png --result-path dataset
The first image you'll see is our query image of the Egyptian pyramids. Our goal is to find similar images in our dataset. As you can see, we have clearly found the other photos of the dataset from when we visited the pyramids.
We also spent some time visiting other areas of Egypt. Let's try another query image:
$ python search.py --index index.csv --query queries/115100.png --result-path dataset
Be sure to pay close attention to our query image. Notice how the sky is a brilliant shade of blue in the upper regions of the image. And notice how we have brown and tan desert and buildings at the bottom and middle of the image.
And sure enough, in our results the images returned to us have blue sky in the upper regions and tan/brown desert and structures at the bottom.
This is due to the region-based color histogram descriptor that we detailed earlier in this post. By utilizing this image descriptor we have been able to perform a crude form of localization, providing our color histogram with hints as to "where" in the image the pixel intensities occurred.
Next up on our vacation we stopped at the beach. Execute the following command to search for beach photos:
$ python search.py --index index.csv --query queries/103300.png --result-path dataset
Notice how the first three results are from the exact same location on the trip to the beach. And the rest of the result images contain shades of blue.
Of course, no trip to the beach is complete without scuba diving:
$ python search.py --index index.csv --query queries/103100.png --result-path dataset
The results from this search are particularly impressive. The top 5 results are of the same fish, and all but one of the top 10 results are from the underwater excursion.
Finally, after a long day, it's time to watch the sunset:
$ python search.py --index index.csv --query queries/127502.png --result-path dataset
These search results are also quite good: all of the images returned are of the sunset at dusk.
So there you have it! Your first image search engine.
What's next? I recommend PyImageSearch University.
Course information:
35+ total classes • 39h 44m video • Last updated: February 2022
★★★★★ 4.84 (128 Ratings) • 3,000+ Students Enrolled
I strongly believe that if you had the right teacher you could master computer vision and deep learning.
Do you think learning computer vision and deep learning has to be time-consuming, overwhelming, and complicated? Or has to involve complex mathematics and equations? Or requires a degree in computer science?
That's not the case.
All you need to master computer vision and deep learning is for someone to explain things to you in simple, intuitive terms. And that's exactly what I do. My mission is to change education and how complex Artificial Intelligence topics are taught.
If you're serious about learning computer vision, your next stop should be PyImageSearch University, the most comprehensive computer vision, deep learning, and OpenCV course online today. Here you'll learn how to successfully and confidently apply computer vision to your work, research, and projects. Join me in computer vision mastery.
Inside PyImageSearch University you'll find:
- ✓ 35+ courses on essential computer vision, deep learning, and OpenCV topics
- ✓ 35+ Certificates of Completion
- ✓ 39h 44m on-demand video
- ✓ Brand new courses released every month, ensuring you can keep up with state-of-the-art techniques
- ✓ Pre-configured Jupyter Notebooks in Google Colab
- ✓ Run all code examples in your web browser: works on Windows, macOS, and Linux (no dev environment configuration required!)
- ✓ Access to centralized code repos for all 500+ tutorials on PyImageSearch
- ✓ Easy one-click downloads for code, datasets, pre-trained models, etc.
- ✓ Access on mobile, laptop, desktop, etc.
Click here to join PyImageSearch University
Summary
In this blog post we explored how to build an image search engine to make our vacation photos searchable.
We utilized a color histogram to characterize the color distribution of our photos. Then, we indexed our dataset using our color descriptor, extracting color histograms from each of the images in the dataset.
To compare images we utilized the chi-squared distance, a popular choice when comparing discrete probability distributions.
From there, we implemented the necessary logic to accept a query image and then return relevant results.
Next Steps
So what are the next steps?
Well, as you can see, the only way to interact with our image search engine is via the command line, and that's not very attractive.
In the next post we'll explore how to wrap our image search engine in a Python web framework to make it easier and sexier to use.
Download the Source Code and Free 17-page Resource Guide
Enter your email address below to get a .zip of the code and a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL!
Source: https://pyimagesearch.com/2014/12/01/complete-guide-building-image-search-engine-python-opencv/