Content based Image Retrieval (CBIR) using MATLAB

Contributor: Anshu Raj, India

1. INTRODUCTION

1.1 Aim of the Project

The aim of this project is to review the current state of the art in content-based image retrieval (CBIR), a technique for retrieving images on the basis of automatically-derived features such as color, texture and shape. Our findings are based both on a review of the relevant literature and on discussions with researchers in the field.

The need to find a desired image from a collection is shared by many professional groups, including journalists, design engineers and art historians. While the requirements of image users can vary considerably, it can be useful to characterize image queries into three levels of abstraction: primitive features such as color or shape, logical features such as the identity of objects shown and abstract attributes such as the significance of the scenes depicted. While CBIR systems currently operate effectively only at the lowest of these levels, most users demand higher levels of retrieval.

1.2 General Introduction

1.2.1 Content Based Image Retrieval

Content-based image retrieval (CBIR), also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR), is the application of computer vision to the image retrieval problem, that is, the problem of searching for digital images in large databases.

"Content-based" means that the search analyzes the actual contents of the image. The term "content" in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself. Without the ability to examine image content, searches must rely on metadata such as captions or keywords. Such metadata must be generated by a human and stored alongside each image in the database.

Problems with traditional methods of image indexing [Enser, 1995] have led to rising interest in techniques for retrieving images on the basis of automatically derived features such as color, texture and shape, a technology now generally referred to as content-based image retrieval (CBIR). However, the technology still lacks maturity and is not yet being used on a significant scale. In the absence of hard evidence on the effectiveness of CBIR techniques in practice, opinion is still sharply divided about their usefulness in handling real-life queries in large and diverse image collections. The concepts currently used in CBIR systems are all still under research.

1.2.2 Images

Let us start with the word “image”. The surrounding world is composed of images. Humans use their eyes, which contain about 1.5x10^8 sensors, to obtain images of the surrounding world in the visible portion of the electromagnetic spectrum (wavelengths between 400 and 700 nanometers). Changes in the light falling on the retina are sent to the image-processing center in the cortex.

In image database systems, the following are all considered images: geographical maps, pictures, medical images, pictures in medical atlases, pictures obtained by cameras, microscopes, telescopes and video cameras, paintings, drawings and architectural plans, drawings of industrial parts, and space images.

There are different models for color image representation. In the seventeenth century, Sir Isaac Newton showed that a beam of sunlight passing through a glass prism emerges as a rainbow of colors. He was thus the first to understand that white light is composed of many colors. Typically, a computer screen can display 2^8 = 256 different shades of gray. For color images with three 8-bit channels, this gives 2^(3x8) = 16,777,216 different colors.
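The color-depth arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `distinct_values` is an assumption made for this example and does not come from the project.

```python
def distinct_values(bits_per_channel, channels=1):
    """Number of distinct values representable with the given bit depth."""
    return 2 ** (bits_per_channel * channels)

gray_levels = distinct_values(8)              # one 8-bit channel
rgb_colors = distinct_values(8, channels=3)   # three 8-bit channels (R, G, B)

print(gray_levels)  # 256
print(rgb_colors)   # 16777216
```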

James Clerk Maxwell showed in the late nineteenth century that every color image could be created from three images: a red, a green and a blue image. A mix of these three images can produce every color. This model, named the RGB model, is the one primarily used in image representation. An RGB pixel can be represented as a triple (R, G, B), where R, G and B usually take values in the range [0, 255]. Another color model is the YIQ model (luminance (Y), in-phase (I), quadrature (Q)). It is the basis for the NTSC color television standard.

Images are represented in computers as a matrix of pixels, each of which has a finite area. As the pixel size decreases, the brightness recorded for each pixel approaches the real brightness of the scene at that point.

[Figure: the same image shown at different pixel dimensions.]
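As a rough illustration of how the RGB and YIQ models relate, the sketch below converts an 8-bit RGB triple to YIQ with Python's standard `colorsys` module. The helper name `rgb255_to_yiq` is an assumption for this example, and the coefficients used by `colorsys` are rounded approximations, so the values may differ slightly from those of other tools.

```python
import colorsys

def rgb255_to_yiq(r, g, b):
    """Convert 8-bit RGB values (0-255) to a (Y, I, Q) triple.

    colorsys expects floating-point channels in [0, 1], so the
    8-bit values are scaled first.
    """
    return colorsys.rgb_to_yiq(r / 255.0, g / 255.0, b / 255.0)

# A neutral gray pixel carries no chrominance: I and Q are close to zero
# and Y is close to the gray level itself.
y, i, q = rgb255_to_yiq(128, 128, 128)
print(f"Y={y:.3f} I={i:.3f} Q={q:.3f}")
```

Because Y alone carries the brightness, a grayscale version of an image can be obtained by keeping only the Y component.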

1.2.3 Image Database systems

Sets of images are collected, analyzed and stored in multimedia information systems, office systems, geographical information systems (GIS), robotics systems, CAD/CAM systems, earth resources systems, medical databases, virtual reality systems, information retrieval systems, art gallery and museum catalogues, animal and plant atlases, sky star maps, meteorological maps, catalogues in shops and many other places.

A number of international organizations deal with different aspects of image storage, analysis and retrieval. Some of them are: AIA (Automated Imaging/Machine Vision), AIIM (Document Imaging), ASPRS (Remote Sensing/Photogrammetry), etc.

There are also many international centers storing images, such as: Advanced Imaging, Scientific/Industrial Imaging, Microscopy Imaging, Industrial Imaging, etc. Different international working groups are also active in the fields of image compression, TV images, office documents, medical images, industrial images, multimedia images, graphical images, etc.

1.2.4 Logical Image Representation in Database Systems

The logical image representation in image database systems is based on different image data models. An image object is either an entire image or some other meaningful portion (consisting of a union of one or more disjoint regions) of an image. The logical image description includes: meta, semantic, color, texture, shape, and spatial attributes.

Color attributes can be represented as a histogram of the intensities of the pixel colors. A histogram refinement technique is also used, which partitions histogram bins based on the spatial coherence of pixels. Statistical methods have also been proposed to index an image by color correlograms. A correlogram is a table indexed by color pairs, where the k-th entry for <i,j> specifies the probability of finding a pixel of color j at a distance k from a pixel of color i in the image.
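The two color descriptors described above can be sketched in plain Python as follows. This is a minimal illustration, not the project's implementation: the function names, the uniform quantization into `levels` steps per channel, and the use of the Chebyshev (chessboard) distance for "distance k" are all assumptions made for this example. Images are given as nested lists of (R, G, B) tuples.

```python
from collections import Counter
from itertools import product

def quantize(pixel, levels=2):
    """Map an (R, G, B) pixel with 8-bit channels to a single color-bin index."""
    r, g, b = (channel * levels // 256 for channel in pixel)
    return (r * levels + g) * levels + b

def color_histogram(image, levels=2):
    """Fraction of pixels that fall into each quantized color bin."""
    counts = Counter(quantize(p, levels) for row in image for p in row)
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}

def color_correlogram(image, k, levels=2):
    """For each color pair (i, j), estimate the probability that a pixel at
    Chebyshev distance k from a pixel of color i has color j."""
    height, width = len(image), len(image[0])
    bins = [[quantize(p, levels) for p in row] for row in image]
    # Offsets lying exactly on the square ring at distance k.
    ring = [(dy, dx) for dy, dx in product(range(-k, k + 1), repeat=2)
            if max(abs(dy), abs(dx)) == k]
    pair_counts, neighbor_counts = Counter(), Counter()
    for y in range(height):
        for x in range(width):
            i = bins[y][x]
            for dy, dx in ring:
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width:
                    pair_counts[(i, bins[ny][nx])] += 1
                    neighbor_counts[i] += 1
    return {pair: n / neighbor_counts[pair[0]]
            for pair, n in pair_counts.items()}

# A 2x2 example: a red row on top of a blue row.
red, blue = (255, 0, 0), (0, 0, 255)
example = [[red, red], [blue, blue]]
print(color_histogram(example))
print(color_correlogram(example, k=1))
```

Unlike the histogram, the correlogram changes when the same pixels are rearranged, which is why it captures some spatial information about the colors.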

1.2.5 Classification and indexing schemes

Many picture libraries use keywords as their main form of retrieval – often using indexing schemes developed in-house, which reflect the special nature of their collections. A good example of this is the system developed by Getty Images to index their collection of contemporary stock photographs. Their thesaurus comprises just over 10 000 keywords, divided into nine semantic groups, including geography, people, activities and concepts. Index terms are assigned to the whole image, the main objects depicted, and their setting. Retrieval software has been developed to allow users to submit and refine queries at a range of levels, from the broad (e.g. “freedom”) to the specific (e.g. “a child pushing a swing”).

Probably the best-known indexing scheme in the public domain is the Art and Architecture Thesaurus (AAT), originating at Rensselaer Polytechnic Institute in the early 1980s, and now used in art libraries across the world. AAT is maintained by the Getty Information Institute and consists of nearly 120,000 terms for describing objects, textual materials, images, architecture and other cultural heritage material. There are seven facets or categories, which are further subdivided into 33 subfacets or hierarchies. The facets, which progress from the abstract to the concrete, are: associated concepts, physical attributes, styles and periods, agents, activities, materials, and objects. AAT is available on the Web from the Getty Information Institute at http://www.ahip.getty.edu/aat_browser/. Other tools from Getty include the Union List of Artist Names (ULAN) and the Getty Thesaurus of Geographic Names (TGN). Another popular source for providing subject access to visual material is the Library of Congress Thesaurus for Graphic Materials (LCTGM). Derived from the Library of Congress Subject Headings (LCSH), LCTGM is designed to assist with the indexing of historical image collections in the automated environment. Greenberg [1993] provides a useful comparison between AAT and LCTGM.

A number of indexing schemes use classification codes rather than keywords or subject descriptors to describe image content, as these can give a greater degree of language independence and show concept hierarchies more clearly. Examples of this genre include ICONCLASS from the University of Leiden [Gordon, 1990], and TELCLASS from the BBC [Evans, 1987]. Like AAT, ICONCLASS was designed for the classification of works of art, and to some extent duplicates its function; an example of its use is described by Franklin [1998]. TELCLASS was designed with TV and video programmes in mind, and is hence rather more general in its outlook. The Social History and Industrial Classification (SHIC), maintained by the Museum Documentation Association, is a subject classification for museum cataloguing. It is designed to make links between a wide variety of material including objects, photographs, archival material, tape recordings and information files.

A number of less widely known schemes have been devised to classify images and drawings for specialist purposes. Examples include the Vienna classification for trademark images [World Intellectual Property Organization, 1998], used by registries worldwide to identify potentially conflicting trademark applications, and the Opitz coding system for machined parts [Opitz et al, 1969], used to identify families of similar parts which can be manufactured together.

A survey of art librarians conducted for this report suggests that, despite the existence of specialist classification schemes for images, general classification schemes, such as Dewey Decimal Classification (DDC), Library of Congress (LC), BLISS and the Universal Decimal Classification (UDC), are still widely used in photographic, slide and video libraries. The first of these, DDC, is the most popular, which is not surprising when one considers its dominance in the UK public and academic library sectors. ICONCLASS, AAT, LCTGM and SHIC are each in use in at least one of the institutions in the survey. However, many libraries and archives use in-house schemes for the description of the subject content. For example, nearly a third of all respondents have their own in-house scheme for indexing slides.

When discussing the indexing of images and videos, one needs to distinguish between systems which are geared to the formal description of the image and those concerned with subject indexing and retrieval. The former is comparable to the bibliographical description of a book. However, there is still no one standard in use for image description, although much effort is being expended in this area by a range of organizations such as the Museum Documentation Association, the Getty Information Institute, the Visual Resources Association, the International Federation of Library Associations/Art Libraries and the International Committee for Documentation (CIDOC) of the International Council of Museums (ICOM).

The descriptive cataloguing of photographs presents a number of special challenges. Photographs, for example, are not self-identifying. Unlike textual works, which provide such essential cataloguing aids as title pages, abstracts and tables of contents, photographs often contain no indication of author or photographer, names of persons or places depicted, dates, or any textual information whatever. Cataloguing of images is more complex than that of text documents, since records should contain information about the standards used for image capture and how the data is stored, as well as descriptive information such as title and photographer (or painter, artist, etc). In addition, copies of certain types of images may involve many layers of intellectual property rights, pertaining to the original work, its copy (e.g. a photograph), a digital image scanned from the photograph, and any subsequent digital image derived from that image.

Traditional indexing practices for images and video have been reviewed in the published literature, and many writers discuss the difficulties of indexing images and the problems of managing a large image collection. Baser notes that, unlike books, images make no attempt to tell us what they are about and that often they may be used for purposes not anticipated by their originators. Images are rich in information and can be used by researchers from a broad range of disciplines. As Baser comments:

“A set of photographs of a busy street scene a century ago might be useful to historians wanting a ‘snapshot’ of the times, to architects looking at buildings, to urban planners looking at traffic patterns or building shadows, to cultural historians looking at changes in fashion, to medical researchers looking at female smoking habits, to sociologists looking at class distinctions, or to students looking at the use of certain photographic processes or techniques.”

Svenonius [1994] discusses the question of whether it is possible to use words to express the “aboutness” of a work in a wordless medium, like art. To get around the problem of the needs of different user groups, van der Starre [1995] advocates that indexers should “stick to ‘plain and simple’ indexing, using index terms accepted by the users, and using preferably a thesaurus with many lead-ins,” thus placing the burden of further selection on the user. Shatford Layne [1994] suggests that, when indexing images, it may be necessary to determine which attributes provide useful groupings of images; which attributes provide information that is useful once the images are found; and which attributes may, or even should, be left to the searcher or researcher to identify. She also advocates further research into the ways images are sought and the reasons that they are useful, in order to improve the indexing process. Constantopulos and Doerr [1995] also support a user-centred approach to the design of effective image retrieval systems. They urge that attention be paid to the intentions and goals of the users, since this will help define the desirable descriptive structures and retrieval mechanisms, as well as clarify what is ‘out of the scope’ of an indexing system.

When it comes to describing the content of images, respondents in our own survey seem to include a wide range of descriptors including title, period, genre, subject headings, keywords, classification and captions (although there was some variation by format). Virtually all maintain some description of the subject content of their images. The majority of our respondents maintain manual collections of images, so it is not surprising that they also maintain manual indexes. Some 11% of respondents included their photographs and slides in the online catalogues, whilst more than half added their videos to their online catalogues. Standard text retrieval or database management systems were in use in a number of libraries (with textual descriptions only for their images). Three respondents used specific image management systems: Index+, iBase and a bespoke in-house system. Unsurprisingly, none currently use CBIR software.

1.2.6 Research Trends in the Image Database Systems

Most image database systems are products of research, and therefore emphasize only one aspect of content-based retrieval. Sometimes this is the sketching capability in the user interface; sometimes it is a new indexing data structure, etc. Some systems exist both as a research version and as a commercial product. The commercial version is usually less advanced and offers more standard searching capabilities. A number of systems provide a user interface that allows more powerful query formulation than is useful in a demo system. Most systems use color and texture features, few systems use shape features, and fewer still use spatial features. Retrieval by color usually yields images that have similar colors. The larger the collection of images, the greater the chance that it contains an image similar to the query image.
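To make retrieval by color concrete, the sketch below ranks a small collection by how closely each image's color histogram matches that of a query image. It is only an illustration under stated assumptions: histogram intersection is one common similarity measure and is not named in the text above, the function names are invented for this example, and the "images" are tiny synthetic lists of (R, G, B) pixels.

```python
from collections import Counter

def color_histogram(pixels, levels=4):
    """Normalized histogram over quantized (R, G, B) bins."""
    counts = Counter(
        tuple(channel * levels // 256 for channel in pixel) for pixel in pixels
    )
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bin-wise minima of two normalized histograms."""
    return sum(min(value, h2.get(color, 0.0)) for color, value in h1.items())

def rank_by_color(query, database, levels=4):
    """Return image names ordered from most to least similar to the query."""
    query_hist = color_histogram(query, levels)
    scores = {
        name: histogram_intersection(query_hist, color_histogram(pixels, levels))
        for name, pixels in database.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Synthetic images as flat lists of (R, G, B) pixels (hypothetical data).
database = {
    "sunset": [(250, 120, 30)] * 6 + [(40, 20, 90)] * 2,
    "forest": [(20, 140, 40)] * 7 + [(90, 60, 30)] * 1,
    "beach": [(240, 110, 20)] * 4 + [(30, 90, 200)] * 4,
}
query = [(245, 115, 25)] * 5 + [(35, 25, 95)] * 3

print(rank_by_color(query, database))  # ['sunset', 'beach', 'forest']
```

The ranking depends only on the color distribution, so two images with similar colors but unrelated content score as similar. This is the limitation that the shape and spatial features mentioned above are meant to address.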

 
