For video images with complex actions, achieving accurate text detection and recognition is very challenging. This paper presents a hybrid model for classifying action-oriented video images, which reduces the complexity of the problem and thereby improves text detection and recognition performance. We consider five genre categories: concert, cooking, craft, teleshopping, and yoga. To classify action-oriented video images, we use ResNet50 to learn general pixel-distribution-level information, one VGG16 network to learn the features of Maximally Stable Extremal Regions, and a second VGG16 network to learn facial components obtained by a multitask cascaded convolutional network. The approach integrates the outputs of these three models using a fully connected neural network to classify the five action-oriented image classes. We demonstrated the efficacy of the proposed method by testi...
As more and more office documents are captured, stored, and shared in digital format, and as image editing software becomes increasingly powerful, there is growing concern about document authenticity. To prevent illicit activities, this paper presents a new method for detecting altered text in document images. The proposed method explores the relationship between the positive and negative coefficients of the DCT to extract the effect of distortions caused by tampering. It fuses the images reconstructed from the positive and the negative coefficients, which results in Positive-Negative DCT coefficients Fusion (PNDF). To take advantage of spatial information, we propose to fuse the R, G, and B color channels of the input image, which results in RGBF (RGB Fusion). Next, the same fusion operation is used to fuse PNDF and RGBF, which yields a single fused image for the original input. We compute a histogram of the fused image to obtain a feature vector, which is then fed to a deep neural network for classifying altered text images. The proposed method is tested on our own dataset and on standard datasets from the ICPR 2018 Fraud Contest, Altered Handwriting (AH), and faked IMEI number images. The results show that the proposed method is effective and outperforms existing methods irrespective of image type.
Daniel Lopresti and Andrew Tomkins, Matsushita Information Technology Laboratory, Panasonic Technologies, Inc., Two Research Way, Princeton, NJ 08540 USA. February 22, 1995. 1 Introduction. In this paper we discuss a new paradigm for pen computing based on the notion of deferring or even eliminating handwriting recognition (HWX) in many cases. In its place, key functionality is brought closer to the user by implementing it directly in the ink domain. The primary advantage of this approach is increased expressive power, but it also results in a different class of pattern matching problems, some of which may be more tractable and less intrusive than traditional HWX. For input and interaction, pens have many advantages: they are expressive [Mor95], lightweight, and familiar. It has been shown, for example, that a pen is better than a mouse or trackball for pointing tasks [MSB91]. But while pen-based computers have met with success in vertical market...
Certifiable Optical Character Recognition. Daniel P. Lopresti and Jonathan S. Sandberg, Matsushita Information Technology Laboratory, Two Research Way, Princeton, NJ 08540 USA. Abstract. In this paper we describe ...
Image defects and their effects on drawing analysis algorithms are investigated in this work. To study general drawing analysis systems, we use unconstrained, well-behaved random polygons as test inputs. We generate synthetic noisy samples through the use of image defect models. Image analysis algorithms are then applied to these samples, and the results are empirically evaluated by
Comparison of text-based methods for detecting duplication in document image databases. [Proceedings of SPIE 3967, 210 (1999)]. Daniel P. Lopresti. Abstract. This paper presents an experimental evaluation of several text-based ...
In concatenative Text-to-Speech, the size of the speech corpus is closely related to synthetic speech quality. In this paper, we describe our work on a new corpus-based Bell Labs' TTS system. This encompasses large acoustic inventories with a rich set of annotations, models and data structures for representing and managing such inventories, and an optimal unit selection algorithm that accommodates
In this paper, we show how cross-domain approximate string matching can be applied to searching a database of scanned typeset documents using handwritten queries without requiring the correction of recognition errors. We present preliminary experimental results that suggest this approach can significantly improve retrieval effectiveness.
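As a rough illustration of approximate string matching in general, the sketch below finds the substring of a text that is closest to a query under plain edit distance. It is not the cross-domain method of the paper, whose cost model the abstract does not describe; the function name `best_approximate_match` and the unit edit costs are assumptions made for this example.

```python
def best_approximate_match(pattern, text):
    """Return (distance, end_index) for the substring of `text` that is
    closest to `pattern` under unit-cost edit distance.

    Hypothetical helper for illustration only; unit costs are an assumption.
    """
    m = len(pattern)
    # prev[i] = cost of aligning pattern[:i] with a substring ending at the
    # previous text position. Row 0 is always 0, so a match may start anywhere.
    prev = list(range(m + 1))
    best = (prev[m], 0)
    for j, text_char in enumerate(text, start=1):
        curr = [0] * (m + 1)
        for i in range(1, m + 1):
            substitution = prev[i - 1] + (pattern[i - 1] != text_char)
            skip_text_char = prev[i] + 1
            skip_pattern_char = curr[i - 1] + 1
            curr[i] = min(substitution, skip_text_char, skip_pattern_char)
        if curr[m] < best[0]:
            best = (curr[m], j)  # j is the exclusive end of the match in text
        prev = curr
    return best
```

Because no error correction is applied to either string, a noisy query such as a misrecognized word can still be located in noisy text as long as the two are within a small edit distance of each other.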
In this paper, we introduce a method of extracting and comparing geometric structures within images for content-based query-by-sketch image retrieval. These structures are modeled by a set of curvelets, i.e., line segments, circular arcs, and higher ...
In this paper, we describe our first steps towards adapting a new approach for graph comparison known as graph probing to allow for the pre-computation of a compact, efficient probe set for databases of graph-structured documents (e.g., Web pages coded in ...
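The idea of comparing graphs through precomputed probes can be sketched as follows. The abstract does not say which probes are used, so this example assumes one simple probe family, the number of vertices of each degree; the names `probe_signature` and `probe_distance` are hypothetical.

```python
from collections import Counter


def probe_signature(edges):
    """Summarize an undirected graph as counts of vertices per degree.

    Assumed probe family for illustration: "how many vertices have degree d?".
    Isolated vertices do not appear in an edge list and are ignored here.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


def probe_distance(sig_a, sig_b):
    """L1 distance between two probe signatures (0 if they are identical)."""
    degrees = set(sig_a) | set(sig_b)
    return sum(abs(sig_a[d] - sig_b[d]) for d in degrees)
```

Under this assumption, a signature can be computed once per stored document and kept alongside it, so that a query graph only needs to be compared against the small signatures and not against the full graphs.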
Programmable mobile phones and personal digital assistants (PDAs) with microphones permit voice-driven user interfaces in which a user provides input by speaking. In this paper, we show how to exploit this capability to generate cryptographic keys on such devices. Specifically, we detail our implementation of a technique to generate repeatable cryptographic ...
