DKNY Multimedia Basics: Content-Based Information Retrieval (CBIR) and Multimedia Information Retrieval (MMIR)

To begin with, information retrieval (IR) is the area of study concerned with searching for documents, for information within documents, and for metadata about documents, as well as with searching relational databases and the World Wide Web. The terms data retrieval, document retrieval, information retrieval, and text retrieval overlap in usage, but each also has its own body of literature, theory, praxis, and technologies. IR is interdisciplinary, drawing on computer science, mathematics, library science, information science, information architecture, cognitive psychology, linguistics, and statistics.

*Definition

Multimedia information retrieval is a cross-cutting field. Extending beyond the borders of culture, art, and science, the search for digital information is one of the major challenges of our time. Digital libraries, bio-computing & medical science, the Internet and social networking sites, streaming video, multimedia databases, cultural heritage collections and P2P networks have created a worldwide need for new paradigms and techniques on how to browse, search and summarize multimedia collections and more generally how to afford efficient multimedia content consumption.

Examples include video search engines, image databases, spoken document retrieval, music retrieval, query languages and query mediation.

Content-based retrieval automatically retrieves images, video, and audio based on their visual and audio content.
(It is a general term for methods that use the information stored in image archives.)
There are three common types of content-based IR: image, video, and audio.

What is Multimedia Information Retrieval?
Multimedia information retrieval extends traditional information retrieval with four further media types: audio, animation, graphics, and video. Queries and results can involve any of these four, which is what is new compared with text-only retrieval. A multimedia information retrieval system is a system for the management (storage, retrieval and manipulation) of multiple media data, such as some combination of tabular administrative data, text documents, image, spatial, historical, audio, or video data.
At its core, multimedia information retrieval is the process of searching for and finding multimedia documents; the corresponding research field is concerned with building the best possible multimedia search engines for the modern world. The intriguing bit here is that the query itself can be a multimedia excerpt. For example, when you walk around in an unknown place and stumble across an interesting landmark, would it not be great if you could just take a picture with your mobile phone and send it to a service that finds a similar picture in a database and tells you more about the building and its significance?
This book goes further by examining the full matrix of query modes versus document types. How do you retrieve a music piece by humming? What if you want to find news video clips on forest fires using a still image? The text discusses the underlying techniques and common approaches behind multimedia search engines: metadata-driven retrieval, piggy-back text retrieval (where automated processes create text surrogates for multimedia), automated image annotation, and content-based retrieval. The last of these is studied in great depth, looking at features and distances and how to combine them effectively for efficient retrieval, to the point where readers have the ingredients and recipe in their hands for building their own multimedia search engines.
Supporting users in their resource discovery mission when hunting for multimedia material is not a technological indexing problem alone. We look at interactive ways of engaging with repositories through browsing and relevance feedback, roping in geographical context, and providing visual summaries for videos. The book concludes with an overview of state-of-the-art research projects in the area of multimedia information retrieval, which gives an indication of research and development trends and, thereby, a glimpse of the future world.
This talk gives an overview of recent and ongoing work at ORL in the area of multimedia information retrieval (MMIR). Our lab has a strong background in high-bandwidth multimedia systems, and it was in fact the real problem of dealing with our increasingly large archives of video mail that motivated our first MMIR project, Video Mail Retrieval (VMR), which we conducted jointly with the speech recognition group of the Cambridge University Engineering Department. The first version of VMR enabled the retrieval of video documents by scanning the audio track for keywords, and the final version used pre-computed phoneme lattices for open-vocabulary and speaker-independent retrieval. The system was also applied to the retrieval of items of interest from an archive of several months' worth of television broadcast news.
The VMR project finished this year, but we are now ramping up a follow-on MMIR project which will consider indexing on image content (using both still-image and, in the case of video, inter-frame analyses) as well as further improvements to audio indexing (using, for example, audio classification techniques in an initial phase). The goal is a system which supports multi-modal queries on large heterogeneous multimedia collections, including those drawn dynamically from the web.
MAKE A BETTER LIFE FOR FUTURE STUDENTS!!!

*Application of Multimedia Information Retrieval
Multimedia information retrieval is applied to financial and marketing data, scientific databases, medical databases, criminal investigation, and personal archives. Retrieval is based on an understanding of the content of documents and of their components. The goals of retrieval are accuracy and speed: a system has to retrieve the relevant documents with as few incorrect results as possible, and it has to do so fast enough for real-time use.
Humans retrieve information every time they ask a question and receive a response that addresses their question and adds to their knowledge about the queried topic. Information requests vary widely in their complexity and in the quantity of potentially relevant material that can be retrieved, as well as in the effort required to retrieve satisfactory information. The easiest information retrieval strategy is to ask someone with knowledge about the topic of the question, for example to ask a nearby person where one is. However, there are many information needs for which a knowledgeable person may not be available. For example, a student assignment may be to gather information about the most numerous of the 18 species of whales in the North Atlantic. In this case, the information seeker may first need to search for information about the population sizes of each of the North Atlantic whales, and then gather information about the largest of these populations. The student will most likely have to depend on information that has been previously recorded and stored in some accessible location.
Text-based applications were among the first computer applications, initiated by the need in the American intelligence agencies to analyze large quantities of text documents.
Other early applications included the establishment of information retrieval systems (IRS) to support management, analysis and retrieval of information from digitized medical journals and legal documents, public access catalogs, on-line catalogs (OPACs), digital libraries, insurance documents, and digitized literature, for example the works of major authors.
Image-based communication pre-dated written language communication. However, due to computer capacity limitations, the development of image-based computer applications followed that of text-based applications. Perhaps the first image-based computer applications were developed for weather forecasting, though other early applications included satellite image analysis for monitoring weather, climate, agricultural quality and human activity; analysis of medical images; navigation for submarines, ships and aircraft; and security (photo identity and fingerprints). As broadband Internet capacity has become more generally available, art, cultural, and natural science museums have found new audiences for their predominantly visual exhibits. Rich examples can be found in the virtual exhibits of the digital collection in the Russian Hermitage museum and the Maastricht museum in the Netherlands.
Image data consists of a string of picture elements (pixels), each of which describes the color and intensity of its location in the image. Since there is no common structure between images and no alphabet or word set that can be used as a basis for selecting images, it is common to add (manually) a set of descriptive attributes, or metadata, to each image.
Streamed media, or dynamic media, is composed of a series of media objects whose time relationship is necessary for proper communication. Audio, film (a series of images), video, and combinations of audio and film data belong to this class.
Misrepresentation of presentation time, or errors in time flow and sequencing, can distort the information that the data stream is meant to convey. For example, slowed presentation leads to lowered pitch or slow-motion film, while increased presentation speed leads to heightened pitch and fast-action film.
Strictly speaking, the applications above already use multiple media if we count the metadata that has been added to facilitate retrieval. However, what is commonly thought of as multiple media or multimedia applications are those in which several media types are central to communicating the meaning of the data. Typical multimedia applications include video, which combines synchronized audio, visual/image, and often text data streams; games, alone and as part of a larger presentation, as used for the Mesoamerican ball game; museum exhibits, for example the audio, film and animations of the voices of corlado; and illustrated text as found in advertisements, Web-books and newspapers.
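The accuracy goal stated at the start of this section (retrieving documents with as few incorrect results as possible) is commonly measured in information retrieval with precision and recall. The following is a minimal sketch of those two measures; the function name and the document identifiers are made up for illustration and do not come from any particular system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = share of retrieved documents that are relevant
    recall    = share of relevant documents that were retrieved
    """
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical query result: four documents returned, three truly relevant.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc1", "doc3", "doc5"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Here two of the four returned documents are relevant, and two of the three relevant documents were found. A system that returns fewer incorrect documents raises precision; one that misses fewer relevant documents raises recall.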

What is Content-Based Information Retrieval?
A content-based retrieval system processes the information contained in image data and creates an abstraction of its content in terms of visual attributes. Any query operations deal solely with this abstraction rather than with the image itself. Thus, every image inserted into the database is analyzed, and a compact representation of its content is stored in a feature vector, or signature.
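The idea of storing a signature and querying only against it can be sketched as follows. This is an illustrative toy, not the method of any specific product: it assumes that an image is a plain list of (r, g, b) pixels, that the signature is a coarse global color histogram, and that similarity is Euclidean distance. All function names are hypothetical.

```python
from math import sqrt


def color_signature(pixels, bins_per_channel=4):
    """Build a normalized color histogram (the 'signature') from (r, g, b) pixels."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each channel to a coarse bin and combine the three bins into one index.
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1.0
    total = len(pixels)
    return [count / total for count in hist]


def distance(sig_a, sig_b):
    """Euclidean distance between two signatures (smaller means more similar)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(sig_a, sig_b)))


def query(database, query_pixels, top_k=1):
    """Rank stored images by signature distance; raw pixels are never compared."""
    q = color_signature(query_pixels)
    ranked = sorted(database, key=lambda name: distance(database[name], q))
    return ranked[:top_k]


# Toy images: one mostly red, one entirely blue.
red_image = [(250, 10, 10)] * 90 + [(10, 10, 250)] * 10
blue_image = [(10, 10, 250)] * 100

# Only the signatures are stored at insertion time.
database = {
    "red": color_signature(red_image),
    "blue": color_signature(blue_image),
}

mostly_red_query = [(240, 20, 20)] * 80 + [(20, 20, 240)] * 20
best = query(database, mostly_red_query)
print(best)
```

A real system would add texture and shape features to the signature; the point of the sketch is only that the query works on the compact abstraction rather than on the image itself.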

The signature for the image in Figure 1.1 is extracted by segmenting the image into regions based on color as shown in Figure 1.2. Each region has associated with it color, texture, and shape information. The signature contains this region-based information along with global color, texture, and shape information to represent these attributes for the entire image. In Figure 1.2, there are a total of 55 shapes (patches of connected pixels with similar color) in this segmented image. In addition, there is also a "background" shape, which consists of small disjoint dark patches. These tiny patches (usually having distinct colors) do not belong to any of their adjacent shapes and are all classified into a single "background" shape. This background shape is also taken into consideration for image retrieval.
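One way to picture this kind of segmentation is the sketch below. It is an assumption-laden toy rather than the algorithm behind Figure 1.2: the image is a small grid of already quantized color names, "similar color" is simplified to "equal color", patches are 4-connected, and any patch smaller than a hypothetical `min_size` is folded into a single background shape with label 0.

```python
from collections import deque


def segment(grid, min_size=2):
    """Label 4-connected patches of equal color.

    Returns (labels, shape_count). Patches smaller than min_size all
    receive label 0, the shared 'background' shape.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[None] * cols for _ in range(rows)]
    next_label = 1
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0] is not None:
                continue  # pixel already belongs to a patch
            color = grid[r0][c0]
            patch = [(r0, c0)]
            labels[r0][c0] = -1  # temporary 'visited' mark
            queue = deque(patch)
            # Breadth-first flood fill over same-colored neighbours.
            while queue:
                r, c = queue.popleft()
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and labels[nr][nc] is None and grid[nr][nc] == color:
                        labels[nr][nc] = -1
                        patch.append((nr, nc))
                        queue.append((nr, nc))
            if len(patch) >= min_size:
                label = next_label
                next_label += 1
            else:
                label = 0  # tiny patch joins the background shape
            for r, c in patch:
                labels[r][c] = label
    return labels, next_label - 1


# Toy 3x3 image with quantized color names.
grid = [
    ["sky", "sky", "sky"],
    ["sky", "sun", "sky"],
    ["sea", "sea", "sea"],
]
labels, shape_count = segment(grid)
print(labels, shape_count)
```

In this toy image the single "sun" pixel is too small to count as a shape and ends up in the background, while the sky and the sea become two separate shapes.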

Figure 1.1 Unsegmented Image

Figure 1.2 Segmented Image

The video linked under Source below gives a survey of content-based video analysis.

*Applications of Content-Based Information Retrieval
>Financial, marketing: stock prices, sales etc. (find companies whose stock prices move similarly)
>Scientific databases: sensor data (weather, geological, environmental data)
>Office automation, electronic encyclopedias, electronic books
>Medical databases: X-rays, CT, MRI scans
>Criminal investigation: suspects, fingerprints
>Personal archives: text and color images
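
The financial example in the list above (finding companies whose stock prices move similarly) can be sketched as a similarity search over time series. The sketch below assumes Pearson correlation of day-to-day returns as the similarity measure, which is one common choice among several; the company names and prices are made up.

```python
from math import sqrt


def returns(prices):
    """Relative change from each day to the next."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def correlation(xs, ys):
    """Pearson correlation of two equally long sequences (assumes non-constant data)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def most_similar(target, prices_by_company):
    """Return the company whose returns correlate most strongly with the target's."""
    target_returns = returns(prices_by_company[target])
    scores = {
        name: correlation(target_returns, returns(prices))
        for name, prices in prices_by_company.items()
        if name != target
    }
    return max(scores, key=scores.get)


# Made-up closing prices for three hypothetical companies.
prices_by_company = {
    "A": [100, 102, 101, 105, 107],
    "B": [50, 51, 50.5, 52.5, 53.5],  # half of A's price, so it moves with A
    "C": [80, 78, 79, 76, 75],        # moves against A
}
match = most_similar("A", prices_by_company)
print(match)
```

Comparing returns rather than raw prices means that two stocks at very different price levels can still be recognised as moving together.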

*Examples of Content-Based Information Retrieval
To date, many CBIR systems have been built.

The growing list: ADL, AltaVista Photofinder, Amore, ASSERT, BDLP, Blobworld, CANDID, C-bird,
Chabot, CBVQ, DrawSearch, Excalibur Visual RetrievalWare, FIDS, FIR, FOCUS, ImageFinder,
ImageMiner, ImageRETRO, ImageRover, ImageScape, Jacob, LCPD, MARS, MetaSEEk, MIR, NETRA,
Photobook, Picasso, PicHunter, PicToSeek, QBIC, Quicklook2, SIMBA, SQUID, Surfimage, SYNAPSE,
TODAI, VIR Image Engine, VisualSEEk, VP Image Retrieval System, WebSEEk, WebSeer, WISE…

Source:
1. http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14302/ch_cbr.htm
2. http://www.youtube.com/watch?v=woj5QobXa40
3. http://nordbotten.com/ADM/ADM_book/MIRS-glossary.htm
4. http://www.intelligence.tuc.gr/~petrakis/courses/multimedia/retrieval.pdf

Monday, 3 October 2011
