  • Brno, Jihomoravsky kraj, Czech Republic
The Encyclopedia of Databases, a comprehensive work, provides easy access to relevant information on all aspects of very large databases. This encyclopedia features alphabetical organization of concepts covering main areas of very large databases. These 1000 entries offer convenient access to information in the field of databases with definitions and illustrations of basic terminology, concepts, methods, and algorithms, references to literature, and cross-references to other entries and journal articles. Topics for the encyclopedia were selected by a distinguished international advisory board, and written by world class experts in the field. The Encyclopedia of Databases is designed to meet the needs of research scientists, professors and graduate-level students in computer science and engineering. This encyclopedia is also suitable for practitioners in industry.
Abstract The main paradigm of similarity searching in metric spaces has remained mostly unchanged for decades – data objects are organized into a hierarchical structure according to their mutual distances, using representative pivots to reduce the number of distance computations needed to efficiently search the data. We propose an alternative to this paradigm, using machine learning models to replace pivots, thus posing similarity search as a classification problem, which stands in for numerous expensive distance computations. Even a relatively naive implementation of this idea is more than competitive with state-of-the-art methods in terms of speed and recall, proving the concept as viable and showing great potential for its future development.
Similarity searching has become increasingly popular, stimulated by the growth of diverse data archives available online that offer search services to users, and by the increasing complexity of the data that must be searched. This issue has also been recognized by major Internet search engines, exemplified by Google, which recently enriched their image search services by allowing users to search for images by similarity. They usually apply the following procedure. First, a candidate set of images is obtained by a regular text search in the images' file names and associated textual tags. Then this set is reordered by the images' content, expressed as color histograms, for example. Finally, the result is presented to the user. In this thesis, we focus on similarity searching, that is, content-based retrieval. In this area, data items are retrieved by their content rather than by textual information associated with them. For example, images are searched by comparing their color histogram...
A nozzle shut-off valve for an injection molding machine for plastic material, especially thermoplastic material, has two pneumatic cylinder-and-plunger units. One such unit has its plunger mounted to reciprocate so as to block the passageway of the plastic through the nozzle. This first plunger-and-cylinder unit is provided with a pilot passage in the valve nozzle so that the pressure of the molten plastic can be used to open it. The second unit is of much smaller diameter and has its plunger mounted to block the pilot passage, so that the second unit in effect becomes a pilot valve and controls the application of fluid pressure through the pilot passage to the plunger of the first unit, the blocking plunger.
We propose a self-organized content-based Image Retrieval Network (IRN) that is inspired by a Metric Social Network (MSN) search system. The proposed network model is strictly data-owner oriented, so no data redistribution among peers is needed in order to process queries efficiently. The result is a shared database in which each peer is fully in charge of its own data. The self-organization of the network is obtained by exploiting the social-network approach of the MSN: the connections between peers in the network are created as social-network relationships formed on the basis of a query-answer principle. The knowledge of answers to previous queries is used to navigate quickly to peers that possibly contain the best answers to new queries. Additionally, the network uses a randomized mechanism to explore new and unvisited parts of the network. In this way, the self-adaptability and robustness of the system are achieved. The proposed concepts are verified using a real network consisting of 2,...
This implementation framework, called MESSIF, eases the task of building metric-based similarity-searching prototypes. It provides a number of modules, from storage management to automatic collection of performance statistics. Due to its open and modular design, it is also easy to implement additional modules if necessary. MESSIF also offers several ready-to-use generic clients that make it possible to control and test the index structures and to measure their performance.
With the increasing number of applications that base searching on similarity rather than on exact matching, novel index structures are needed to speed up the execution of similarity queries. An important stream of research in this direction uses the metric space as a model of similarity. We explain the principles and survey the most important representatives of index structures. We put most emphasis on distributed similarity search architectures, which try to solve the difficult problem of scalability of similarity searching. The actual achievements are demonstrated by practical experiments. Future research directions are outlined in the conclusions.
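To make the metric-space model concrete, the following is a minimal sketch of pivot-based filtering for a range query. It is not taken from the surveyed index structures. It only illustrates the general principle that, in a metric space, the triangle inequality gives the lower bound |d(q, p) - d(o, p)| <= d(q, o) for any pivot p, so objects whose bound exceeds the query radius can be discarded without computing d(q, o). The names `PivotTable`, `range_query`, and `euclidean` are hypothetical and chosen for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance, used here as an example metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class PivotTable:
    """Hypothetical pivot table: stores object-to-pivot distances up front."""

    def __init__(self, data: Sequence[Point], pivots: Sequence[Point],
                 dist: Callable[[Point, Point], float] = euclidean) -> None:
        self.data = list(data)
        self.pivots = list(pivots)
        self.dist = dist
        # Precompute d(o, p) for every object o and pivot p.
        self.table = [[dist(o, p) for p in self.pivots] for o in self.data]

    def range_query(self, q: Point, radius: float) -> Tuple[List[Point], int]:
        """Return objects within `radius` of q and the number of
        object distances that actually had to be computed."""
        q_to_p = [self.dist(q, p) for p in self.pivots]
        result: List[Point] = []
        computed = 0
        for o, row in zip(self.data, self.table):
            # Triangle inequality: |d(q,p) - d(o,p)| <= d(q,o).
            # If the bound exceeds the radius for any pivot, o cannot qualify.
            if any(abs(qp - op) > radius for qp, op in zip(q_to_p, row)):
                continue
            computed += 1
            if self.dist(q, o) <= radius:
                result.append(o)
        return result, computed


data = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
index = PivotTable(data, pivots=[(0.0, 0.0), (10.0, 10.0)])
found, computed = index.range_query((0.5, 0.0), radius=1.0)
```

In this toy data set the two distant points are pruned by the pivot bound alone, so only the remaining candidates require a real distance computation. This matters when the distance function is expensive.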
We address the problem of organizing personal photo albums by assigning tags/names to people present in photographs. Our proposed framework improves similar systems such as Google+ Photos (Picasa) by incorporating not only a face detector but also a full-body detector. Both these modalities are combined together to provide the user with tags of people whose face has not been detected or is not even present in the photograph. An implementation of the proposed framework is evaluated on a sample of real life photographs.
Content-based retrieval in large collections of unstructured data is challenging not only because of the difficulty of defining similarity between data items such as images, where the phenomenon of the semantic gap appears, but also because of the efficiency of executing similarity queries. Search engines providing similarity search typically organize various multimedia data, e.g. images of a photo stock, and support the k-nearest neighbor (kNN) query. Users accessing such systems look for data items similar to their specific query object and refine the results by re-running the search with an object from the previous query results. This paper is motivated by the unsatisfactory query execution performance of indexing structures that use the metric space as a convenient data model. We present the performance behavior of two state-of-the-art representatives and propose a new universal technique for ordering the priority queue of data partitions to be accessed during kNN query evaluation. We verify it in experiments on real-life data sets.
The D-index is an indexing structure that organizes data modeled as a metric space, which enables similarity searching. The structure is static with respect to the number of buckets and levels defined by the hashing functions. These functions must be designed before the index is created and filled with data. The D-index is then able to store an arbitrary number of objects, since the capacity of the buckets is unlimited.
The problem of similarity searching is nowadays attracting a lot of attention, because upcoming applications process complex data and the traditional exact match searching is not sufficient. There are efficient solutions, but they are tailored for the needs of specific data domains. General solutions, based on the metric space abstraction, are extensible, but they are designed to operate on a single computer only. Therefore, their scalability is limited and they cannot adapt to different performance requirements. In this paper, we propose a distributed access structure which is fully dynamic and exploits a Grid infrastructure. We study properties of this structure in numerous experiments. Besides, the performance tuning is analyzed with respect to user-specific requirements which include the maximum response time and the number of queries executed concurrently.
Text collections need search support not only for identical objects; approximate matching is even more important. A suitable metric for such a task is the edit distance. However, the quadratic computational complexity of the edit distance prevents the use of naive storage organizations, such as sequential search, and more sophisticated search structures must be
ABSTRACT In this paper, we tackle the issues of analyzing the structural evolution of the Metric Social Network. The Metric Social Network operates in a P2P environment where peers maintain their own data and the relationships among them are formed on the basis of the processed similarity queries. The evolution is analyzed by traditional social networking tools: the characteristic path length and the clustering coefficient. Nonetheless, due to the special structure of the Metric Social Network, purpose-built gauges, the average overlap and the robustness of description coefficients, are presented to analyze the structure of emerging communities encompassing similar data.
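The two traditional measures named above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation. It assumes an undirected, unweighted graph given as a dictionary mapping each node to a set of neighbors, and it averages path lengths only over reachable pairs. The function names and the example graph are hypothetical.

```python
from collections import deque
from itertools import combinations
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def clustering_coefficient(graph: Graph) -> float:
    """Average local clustering coefficient (nodes of degree < 2 count as 0)."""
    total = 0.0
    for node, nbrs in graph.items():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist among the neighbors of `node`.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)


def characteristic_path_length(graph: Graph) -> float:
    """Mean shortest-path length over all reachable ordered node pairs."""
    total, pairs = 0, 0
    for source in graph:
        # Breadth-first search gives shortest hop counts from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical example: a triangle (a, b, c) with one pendant peer d.
peers: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cc = clustering_coefficient(peers)
cpl = characteristic_path_length(peers)
```

A high clustering coefficient combined with a short characteristic path length is the usual signature of a small-world network. Relationships in the Metric Social Network may be directed, in which case both definitions would need directed variants.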
... facial or iris recognition and retinal scanning to DNA testing, speech verification and gait recognition. ... As a result, a fingerprint is described as a sequence of points. ... Information Society Technologies (2007) 5. Batko, M., Kohoutková, P., Zezula, P.: Combining metric features in ...
The availability of various photo archives and photo sharing systems has made similarity searching much more important, because the photos are not usually conveniently tagged and therefore need to be searched by their content. Moreover, it is important not only to compare images with a query holistically but also to locate images that contain the query as a part. The query can be a picture of a person, a building, or an abstract object, and the task is to retrieve images of the query object taken from a different perspective, or images capturing a global scene that contains the query object. This kind of retrieval is called sub-image searching. In this paper, the authors propose an algorithm, called SASISA, for retrieving database images by their similarity to and containment of a query. Its novelty lies in the application of a sequence alignment algorithm, which is commonly used in text retrieval. This forms an orthogonal solution to currently used approaches based on inverted files....
ABSTRACT As the volume of non-textual data, such as images and other multimedia data, available on the Internet increases, the issue of identifying data items based on query containment rather than query equality is becoming more and more important. In this paper, we propose a solution to this problem. We assume local descriptors are extracted from data items, so the aforementioned problem reduces to finding data items that share as many local descriptors as possible with the query. In particular, we define a new ε-intersection for this purpose. Local descriptors usually carry their location within the data item, so the proposed solution takes these locations into account to increase the effectiveness of searching. We evaluate the ε-intersection on two real-life image collections using SIFT and SURF local descriptors from both the effectiveness and the efficiency points of view. Moreover, we study the influence of individual parameters of the ε-intersection on query results.
The metric space paradigm has recently received attention as an important model of similarity in the area of Bioinformatics. Numerous techniques have been proposed to solve similarity (range or nearest-neighbor) queries on collections of data from metric domains. Though important representatives are outlined, this chapter is not trying to substitute existing comprehensive surveys. The main objective is to explain and prove by experiments that similarity searching is typically an expensive process which does not easily scale to very ...
ABSTRACT We focus on content-based retrieval in unstructured P2P networks consisting of thousands of peers that unpredictably join and leave the network. Such environments with permanent churning of peers require self-organizing mechanisms that should deal with sudden peer failures, arrivals of new peers, and continual changes of data or network topology. In this paper, we build a self-organizing search system that operates in an unstructured P2P network and allows users to search for multimedia data by their content. In order to efficiently route queries to relevant peers, we define and evaluate several techniques for joining new peers to the existing network. These techniques create new relationships between peers - on the basis of answers returned to queries - so that a new peer will be able to efficiently forward queries and other peers will be immediately informed about its data. In addition, we demonstrate resilience of the system to sudden peer failures by studying system performance and quality of returned answers after a large number of peers is disconnected. The experiments, evaluated on a synthetic and real-life multimedia dataset, confirm that the proposed techniques are suitable for dynamic environments.
