IJIGSP Vol. 11, No. 5, May 2019
Speech is the natural mode of communication between humans. Human-to-machine interaction has gained importance over the past few decades, which demands that machines be able to analyze, respond, and perform tasks at the same speed as humans. This task is achieved by an Automatic Speech Recognition (ASR) system, which is typically a speech-to-text converter. To identify areas for further research in ASR, one must be aware of the current approaches, the challenges faced by each, and the issues that need to be addressed. Therefore, this paper discusses the human speech production mechanism, addresses the various speech recognition techniques and models in detail, and describes the performance parameters that measure the accuracy of the system in recognizing the speech signal.[...]
Flowers are a blessing of nature. Classifying flowers in natural images is difficult because they are surrounded by background, so a segmentation phase is needed to separate the flower from the background as well as possible. Computer vision has gained much attention for classification tasks. This paper proposes a method to classify flowers using LBP and SURF as features and an SVM as the classifier. The input image is pre-processed to enhance image quality and then segmented using the active contour segmentation method. After segmentation, LBP and SURF features are extracted, with the SURF features taken from MSER regions. The two feature sets are concatenated and sent to a quadratic SVM classifier, which is trained and tested on these features. We also tried other classifiers, but they gave poorer results. The proposed quadratic SVM achieves an accuracy of 87.2%, which is significant and comparable for this classification task.[...]
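As a rough illustration of the feature step described above, the following sketch computes a basic 3x3 LBP histogram and concatenates it with a second feature vector. It is a minimal sketch, not the paper's implementation: the function names `lbp_histogram` and `fuse` are hypothetical, the 8-neighbor, 256-bin LBP variant is an assumption, and the second vector is only a placeholder for SURF descriptors, which would normally come from an image-processing library.

```python
def lbp_histogram(img):
    """Normalized 256-bin histogram of basic 3x3 LBP codes.

    img is a 2D list of grayscale values. Border pixels are skipped
    because they lack a full 8-pixel neighborhood.
    """
    # Neighbor offsets, clockwise from top-left; each contributes one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(img), len(img[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbor is at least as bright as the center.
                if img[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]


def fuse(lbp_features, other_features):
    """Concatenate two feature vectors into one classifier input."""
    return list(lbp_features) + list(other_features)


# Toy example: a flat 3x3 patch and a placeholder second descriptor
# (assumption: stands in for SURF features, which are not computed here).
flat_patch = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
placeholder_descriptor = [0.1, 0.2]
fused = fuse(lbp_histogram(flat_patch), placeholder_descriptor)
```

The fused vector would then be passed to the classifier; the abstract names a quadratic SVM for that stage.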
Objective: This paper presents an automated approach that combines Fisher ranking and a dimensionality reduction method, kernel principal component analysis (KPCA), with a support vector machine (SVM) to accurately classify defects of rolling element bearings used in induction motors.
Methodology: The vibration signal produced by the rolling element bearing was decomposed to four levels using the wavelet packet decomposition (WPD) method. Thirty-one logarithmic root mean square features (LRMSF) were extracted from the four-level decomposed vibration signals. The thirty-one features were first ranked by Fisher score, and the ten top-ranked features were selected. For effective detection, these ten features were reduced to a new feature using the dimension reduction methods KPCA and generalized discriminant analysis (GDA). The new feature was then applied to the SVM for binary classification of bearing defects. For this analysis, thirty-six standard vibration datasets were taken from the online bearing data center of Case Western Reserve University, covering bearing conditions such as healthy (NF), inner race defect (IR), and ball bearing (BB) defects at different loads.
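The Fisher ranking step in the methodology above can be sketched as follows. This is a minimal sketch under stated assumptions: it uses the common two-class Fisher score (squared difference of class means divided by the sum of class variances), which the abstract does not spell out, and the names `fisher_score` and `top_k_features` are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(class_a, class_b):
    """Two-class Fisher score of one feature (assumed definition):
    (mean_a - mean_b)^2 / (var_a + var_b)."""
    num = (mean(class_a) - mean(class_b)) ** 2
    den = pvariance(class_a) + pvariance(class_b)
    if den == 0:
        # Zero within-class spread: perfectly separable or uninformative.
        return float("inf") if num > 0 else 0.0
    return num / den


def top_k_features(samples_a, samples_b, k):
    """Rank features by Fisher score and return the k best indices.

    samples_a and samples_b are lists of feature vectors, one list per class.
    Returns (indices of the k top-ranked features, all scores).
    """
    n_features = len(samples_a[0])
    scores = []
    for j in range(n_features):
        col_a = [row[j] for row in samples_a]
        col_b = [row[j] for row in samples_b]
        scores.append(fisher_score(col_a, col_b))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy data (made up for illustration): feature 0 separates the classes,
# features 1 and 2 do not.
healthy = [[1.0, 5.0, 0.0], [1.2, 5.1, 4.0], [0.8, 4.9, 2.0]]
faulty = [[3.0, 5.0, 1.0], [3.2, 5.2, 3.0], [2.8, 4.8, 2.0]]
best, scores = top_k_features(healthy, faulty, k=1)
```

In the paper's setting, each feature vector would hold the thirty-one LRMSF values and `k` would be ten.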
Results: The simulated numerical results show that the proposed method, KPCA with an SVM classifier using a Gaussian kernel, achieved an accuracy (AC) of 100%, sensitivity (SE) of 100%, specificity (SP) of 99.3%, and positive predictive value (PPV) of 99.3% for the NF_IRB dataset, and an AC of 100%, SE of 99.8%, SP of 100%, and PPV of 100% for the NF_BBB dataset.
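For reference, the four measures reported above follow from the entries of a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts.

    tp/fn: defect samples classified correctly/incorrectly,
    tn/fp: healthy samples classified correctly/incorrectly.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
    }


# Made-up counts for illustration only.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```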
A number of researchers have used unimodal biometric traits for gender classification. This approach has many limitations, which can be mitigated by including multiple sources of biometric information to identify or classify a user's information. Intuitively, multimodal systems are a more reliable and viable solution because multiple independent characteristics of the modalities are fused together. The objective of this work is to infer gender by combining different biometric traits (face, iris, and fingerprints) of the same subject. In the proposed work, feature-level fusion is used to obtain robust gender determination, and an accuracy of 99.8% was achieved on the homologous multimodal biometric database SDUMLA-HMT (Group of Machine Learning and Applications, Shandong University). The results demonstrate that feature-level fusion in a multimodal biometric system greatly improves the performance of gender classification and that our approach outperforms the state-of-the-art techniques reported in the literature.[...]
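Feature-level fusion, as described above, combines the per-modality feature vectors before classification. The sketch below shows one simple way to do this: min-max normalize each modality and concatenate the results. The normalization choice and the names `min_max` and `fuse_features` are assumptions for illustration; the abstract does not state how the paper scales or combines its features.

```python
def min_max(vec):
    """Scale a feature vector to [0, 1]; a constant vector maps to zeros."""
    lo, hi = min(vec), max(vec)
    if hi == lo:
        return [0.0 for _ in vec]
    return [(v - lo) / (hi - lo) for v in vec]


def fuse_features(*modalities):
    """Normalize each modality separately, then concatenate into one vector."""
    fused = []
    for vec in modalities:
        fused.extend(min_max(vec))
    return fused


# Made-up feature vectors standing in for face, iris, and fingerprint features.
face = [0, 5, 10]
iris = [2, 4]
fingerprint = [7, 7]
fused = fuse_features(face, iris, fingerprint)
```

Normalizing per modality keeps one trait with a large numeric range from dominating the fused vector.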
Image inpainting of ruined historic monuments and heritage sites can help in visualizing how they may have existed in the past. An inpainted image of a monument can serve as a tool for physical reconstruction. The purpose of the proposed method is to fill cracks and gaps in selected damaged regions of heritage monuments by exploiting the statistical properties of the foreground and background along with the spatial location of the damage in the image. The patch-based image inpainting algorithm is improved by segmenting the image using K-means clustering so that candidate patches are searched only in the relevant source region. Segmentation improves patch searching in terms of both quality and time. The priority of the patch to fill is decided based on the standard deviation of the patch around the destination pixel. Kn similar patches are selected from the source region based on the minimum sum of squared distances. The selected patches are refined using an efficient patch refinement scheme based on higher-order singular value decomposition, which captures the underlying pattern among the candidate source patches. The threshold for refinement is selected using the minimum and maximum values of the standard deviation of the target patch. This eliminates random variation and unwanted artifacts. Experiments carried out on a large number of natural images and comparisons with well-known existing methods demonstrate the efficacy and superiority of the proposed method.[...]
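The candidate selection step described above can be sketched as a nearest-patch search by sum of squared distances. This is a minimal sketch under assumptions: patches are flattened to 1D lists, the distance is computed only over the already-known pixels of the target patch (a common convention in exemplar-based inpainting, not confirmed by the abstract), and the names `ssd` and `k_best_patches` are hypothetical.

```python
def ssd(target, candidate, known):
    """Sum of squared distances over pixels where the target is known."""
    return sum((t - c) ** 2
               for t, c, k in zip(target, candidate, known) if k)


def k_best_patches(target, known, candidates, k):
    """Indices of the k candidate patches with the smallest SSD to the target."""
    ranked = sorted(range(len(candidates)),
                    key=lambda i: ssd(target, candidates[i], known))
    return ranked[:k]


# Toy 2x2 patches, flattened. The third target pixel is missing (to be filled).
target = [10, 20, 0, 40]
known = [True, True, False, True]
candidates = [
    [10, 20, 99, 40],  # matches every known pixel
    [0, 0, 0, 0],      # poor match
    [11, 19, 5, 41],   # close match
]
best = k_best_patches(target, known, candidates, k=2)
```

In the method described above, the search would be restricted to the source region of the segment that contains the target patch, and the selected patches would then go through the refinement stage.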
This paper studies Multi Band Spectral Subtraction (MBSS) for speech enhancement based on the spectrum representation in the frequency domain with three different scales (linear, log, and mel) and their effect on performance measures in the presence of additive non-stationary noise at different ranges of input SNR. Since speech is a non-stationary signal, the noise distribution is non-uniform, i.e., a few frequency components are affected more severely than others. A common way to restore the original speech in the presence of noise is speech enhancement by suppressing the background noise. Multi Band Spectral Subtraction is one such speech enhancement technique: it divides the noisy speech spectrum into uniformly spaced, non-overlapping frequency bands and performs spectral over-subtraction in each band separately. The performance of this method is evaluated in terms of objective measures such as cepstrum distance, log likelihood ratio, weighted spectral slope distance, segmental SNR, and Perceptual Evaluation of Speech Quality.[...]
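The per-band over-subtraction described above can be sketched for a single frame as follows. This is a sketch under stated assumptions rather than the paper's algorithm: the piecewise-linear rule for the over-subtraction factor and the spectral floor of 0.002 are values often used with multi-band spectral subtraction, the bands are linearly spaced only (log or mel spacing would change the band edges), the additional per-band weighting used in some formulations is omitted, and the names `over_subtraction_factor` and `mbss_frame` are hypothetical.

```python
import math


def over_subtraction_factor(snr_db):
    """Assumed piecewise-linear rule: subtract more when the band SNR is low."""
    if snr_db < -5.0:
        return 4.75
    if snr_db > 20.0:
        return 1.0
    return 4.0 - (3.0 / 20.0) * snr_db


def mbss_frame(noisy_power, noise_power, n_bands, floor=0.002):
    """Multi-band spectral subtraction on one frame of power spectra.

    noisy_power: power spectrum of the noisy speech frame.
    noise_power: estimated noise power spectrum (same length).
    Returns the enhanced power spectrum.
    """
    eps = 1e-12  # guards against division by zero and log of zero
    n_bins = len(noisy_power)
    # Uniformly spaced, non-overlapping band edges on a linear frequency scale.
    edges = [round(b * n_bins / n_bands) for b in range(n_bands + 1)]
    enhanced = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_y = noisy_power[lo:hi]
        band_n = noise_power[lo:hi]
        snr_db = 10.0 * math.log10(max(sum(band_y), eps) / max(sum(band_n), eps))
        alpha = over_subtraction_factor(snr_db)
        for y, n in zip(band_y, band_n):
            clean = y - alpha * n
            # Spectral floor: never drop below a small fraction of the noisy power.
            enhanced.append(clean if clean > floor * y else floor * y)
    return enhanced


# Toy frame: a strong low band (20 dB band SNR) and a noise-dominated high band.
noisy = [100.0] * 4 + [1.0] * 4
noise = [1.0] * 8
out = mbss_frame(noisy, noise, n_bands=2)
```

In the toy frame, the low band is barely altered while the high band is driven down to the spectral floor, which is the band-dependent behavior that motivates the multi-band approach.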