RESEARCH ARTICLE

A Novel Hybrid Classification Model of Genetic Algorithms, Modified k-Nearest Neighbor and Developed Backpropagation Neural Network

Nader Salari 1,2 *, Shamarina Shohaimi 1, Farid Najafi 2, Meenakshii Nallappan 1, Isthrinayagy Karishnarajah 3

1. Department of Biology, Faculty of Science, University Putra Malaysia, Serdang, Selangor, Malaysia
2. Department of Biostatistics and Epidemiology, School of Public Health, Kermanshah University of Medical Sciences, Kermanshah, Iran
3. Department of Mathematics, Faculty of Science, University Putra Malaysia, Serdang, Selangor, Malaysia

* nader.salari.1344@gmail.com

Abstract

Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered the most common and effective methods for classification problems in numerous studies. In the present study, the results of the implementation of a novel hybrid feature selection-classification model using the above-mentioned methods are presented. The purpose is to benefit from the synergies obtained from combining these technologies for the development of classification models. Such a combination creates an opportunity to invest in the strength of each algorithm, and is an approach to make up for their deficiencies. To develop the proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results, which included arrays of the top-ranked features, were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on the optimum arrays of features selected by genetic algorithms.
The performance of the proposed model was compared with thirteen well-known classification models based on seven datasets. Furthermore, the statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance results of the proposed model were benchmarked against the best ones reported as the state-of-the-art classifiers in terms of classification accuracy for the same data sets. The substantial findings of the comprehensive comparative study revealed that the performance of the proposed model in terms of classification accuracy is desirable, promising, and competitive with the existing state-of-the-art classification models.

OPEN ACCESS

Citation: Salari N, Shohaimi S, Najafi F, Nallappan M, Karishnarajah I (2014) A Novel Hybrid Classification Model of Genetic Algorithms, Modified k-Nearest Neighbor and Developed Backpropagation Neural Network. PLoS ONE 9(11): e112987. doi:10.1371/journal.pone.0112987
Editor: Sergio Gómez, Universitat Rovira i Virgili, Spain
Received: July 22, 2013
Accepted: October 21, 2014
Published: November 24, 2014
Copyright: 2014 Salari et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: The authors have no support or funding to report.
Competing Interests: The authors have declared that no competing interests exist.

PLOS ONE | DOI:10.1371/journal.pone.0112987 | November 24, 2014

Introduction

In the last decade, the extensive effect of classification models on decision making in various scientific fields, including medicine, has attracted a lot of attention. Classification in the realm of research is the assignment of an individual or an item to one of a set of classes, so that decisions are made based on the characteristics of that individual or item. Successful classification depends on two major factors, "how to select the most informative features" and the "classifier method", especially in the field of medical classification. The widespread incongruity of features in this field has made the selection of a subset of the best features more significant, and has given it a more effective and valuable role in improving the performance of the classification model. Using a set of training patterns in which the correct classification is known (a subset of classified observations called the training set), the classifier function organizes the classification. Thus, it is expected that proper selection of features and of classification methods at each stage would lead to a classification model with successful performance.

Following the first classification rule presented in 1936 by Fisher in the statistical classification literature [1], various classification models have been proposed. Among them, k-Nearest Neighbor (k-NN), a non-parametric classification method that is simple and efficient to implement and understand, has been well received. For instance, in order to improve the classification accuracy, Weinberger and Saul [2] presented a developed k-NN algorithm. In their proposed model, they used the Mahalanobis distance as the criterion for distance determination. A developed hierarchical model of k-NN was introduced by Kubota et al. [3]. The high capability and sensitivity of this model in the fine discrimination of classes is noteworthy. Zeng et al.
[4] proposed a modified k-NN classification algorithm based on local averages and class statistics. That is, in addition to local information from the k nearest neighbors of new unclassified data, general information about the neighbors in each class is analyzed separately.

The artificial neural network is an efficient approach that in recent years has been considered by researchers as one of the most useful and applicable constructs in artificial intelligence. This is due to its numerous advantages, such as being non-parametric (no requirement for any prior assumption on the data), self-adaptiveness, the ability to generalize, and a high capacity for modeling non-linear patterns. This approach is a functional technology that gives the user the possibility to obtain the best linear combination of features in order to achieve his/her goals, including the classification of complex models, estimation of non-linear functions, and prediction [5]. In the medical field, Olmez and Dokur [6] proposed the use of an artificial neural network algorithm to classify heart beats. In their proposed model, they first selected the best features using dynamic programming; then, using artificial neural networks, they successfully classified heart beats into seven categories. To classify heart beat data, Rajendra A et al. [7] employed artificial neural networks and a Fuzzy equivalence classifier. Qiu et al. [8] presented a model for classification of cervical cancer risk using artificial neural networks. The findings indicated sensitivity and specificity of 98% and 97%, respectively. Salari et al. [9] used artificial neural network methods for prediction of late onset heart failure. In 2013, Salari et al. [10] used an integrated medical model based on an artificial intelligence approach.
The proposed model was put forward for medical data classification.

However, traditional methods based on a single technology have gradually been replaced by hybrid models. Hybrid models, which are increasingly being noticed by researchers, are a relatively new approach that involves an innovative, creative, and appropriate combination of several models to achieve a final common goal, with a performance far better than that of traditional models based on a single technology. The main idea behind these models is to benefit from the synergies among technologies. This characteristic provides the opportunity to exploit the exclusive strengths of each technology and can be used as a means of compensating for the deficiencies and overcoming the limitations of each technology [11,12]. A review of the medical literature indicates that research on the application of hybrid models in the field of artificial intelligence is growing. Chakraborty [13] proposed an integrated approach for cancer classification and simultaneous gene selection. He argues that, because only a small part of the large number of genes in this field is suitable for discriminating between different types of cancer, it is better if these two processes take place simultaneously. The application of this model is choosing among suitable genes and simultaneously developing a possible nearest neighbor model for cancer classification. Ostermark [14] proposed a hybrid classification model employing genetic algorithms, Fuzzy logic, and artificial neural networks. Aci et al. [15] presented a hybrid model combining genetic algorithms, Bayesian methods, and k-NN. Their goal was to eliminate the data that are a barrier to learning in order to achieve successful classification results. Khashei et al. [16] proposed a hybrid model combining artificial neural networks and multiple linear regression.
This model was proposed for classification purposes, and for achieving higher accuracy and a more generalized application than the traditional artificial neural network models.

In 2014, Seera and Lim [17] also put forward a hybrid intelligent system for medical data classification. The proposed system consisted of a hybrid of the Fuzzy Min-Max neural network, the classification and regression tree (CART), and the random forest model. They concluded that the domain users (i.e., medical practitioners) were able to comprehend the predictions given by the hybrid intelligent system, thus accepting its role as a useful and efficient medical decision support system. Again in 2014, Shao et al. [18] addressed the heart disease classification problem by combining multivariate adaptive regression splines (MARS), logistic regression, artificial neural network, and rough set (RS) techniques. In the initial step, the proposed hybrid model reduced the set of explanatory variables by using logistic regression, MARS, and RS techniques. Subsequently, the selected variables were employed as inputs for the artificial neural network method in the process of classifying heart disease patients. Experimental results have shown the effectiveness of the proposed hybrid model in classifying heart disease.

Forghani and Yazdi [19] came up with a hybrid model called "robust support vector machine-trained fuzzy system". The proposed hybrid classifier was established with a combination of a support vector machine and Fuzzy if-then rules. Experimental results have shown that the proposed approach results in very fast training and testing convergence times with a good overall classification accuracy rate. In effect, this model achieved a classification accuracy of 63% on the Cleveland multi-class data set. Zhang and Zhang [20] suggested a hybrid method employing Rotation Forest in conjunction with AdaBoost.
This model achieved 55.62% and 74.69% classification accuracies for the Cleveland multi-class and Pima data sets, respectively. A classification model entitled "Forest Optimization Algorithm" was proposed by Ghaemi and Feizi-Derakhshi [21]. It was established by incorporating a few trees into the forests to improve the predictive accuracy of classifiers. This classification model attained 58.14% and 71.11% accuracies for the Cleveland multi-class and Pima data sets, respectively. Zhang et al. [22] came up with a novel k-NN-based algorithm, 3N-Q, for enhancing the accuracy of k-NN classifiers. The reported experimental results demonstrated that 3N-Q is efficient and accurate for performing classification tasks.

The review of the literature indicates that models with diverse applications based on various combinations of k-NN, genetic algorithms, and artificial neural networks have been proposed for classification purposes. However, no attempt has been made to link all three of these methods in the literature on classification models. Therefore, it can be argued that such an action is a novel approach that adds to the body of literature in this field. The present study aims to present a new model that appropriately links the above-mentioned three methods. It is expected that the synergy resulting from the combination of these elements improves classification performance, especially in various medical fields.

This model begins with feature prioritization using ranking techniques that facilitate learning, such as Fisher's discriminant ratio and class separability criteria. Fisher's discriminant ratio is the criterion for ordering features in terms of how well they discriminate two classes from each other, whereas the class separability criteria order features in terms of how well they discriminate each class from all other classes.
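As an illustration of this ranking step, the following minimal sketch scores each feature with a two-class Fisher's discriminant ratio and sorts the features by that score. It is a sketch under stated assumptions, not the authors' implementation: this excerpt does not give the exact formula used in the study, so the commonly used form (mu_1 - mu_2)^2 / (sigma_1^2 + sigma_2^2) is assumed; the function names and the toy data are hypothetical.

```python
import numpy as np

def fisher_discriminant_ratio(X, y):
    """Per-feature two-class Fisher ratio (assumed form, see lead-in):
    (mean_a - mean_b)^2 / (var_a + var_b)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")
    a = X[y == classes[0]]
    b = X[y == classes[1]]
    numerator = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    denominator = a.var(axis=0) + b.var(axis=0)
    # Guard against division by zero for constant features.
    return numerator / np.maximum(denominator, 1e-12)

def rank_features(X, y):
    """Return feature indices ordered from most to least discriminative."""
    scores = fisher_discriminant_ratio(X, y)
    return np.argsort(-scores, kind="stable")

# Toy data (made up for illustration): feature 0 separates the two
# classes, feature 1 has the same mean in both classes.
X = np.array([[0.0, 5.0], [0.2, 1.0], [0.1, 3.0],
              [2.0, 4.0], [2.1, 2.0], [1.9, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])
ranking = rank_features(X, y)
print(ranking)
```

In a pipeline like the one described here, the leading entries of such a ranking would be the candidates for seeding the initial population of the genetic algorithm; a multi-class separability criterion would need a different score function.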
Then, using the high and unique optimization capabilities of the genetic algorithm, optimized arrays were produced, with the results of the feature ranking, including the previously ranked arrays, utilized as the initial population of the genetic algorithm. Afterwards, using the modified k-NN method in parallel with a Developed Back Propagation Neural Network (DBPNN) method, the classification process was carried out according to the optimized arrays of features selected by the genetic algorithm. Finally, a method of Fuzzy class membership was applied to integrate and finalize decision making from the proposed classes.

The new proposed model was tested with six data sets taken from the University of California Irvine (UCI) machine-learning repository as well as a real-world data set called Acute Coronary Syndrome Event in Kermanshah, Iran (ACSEKI). Of these data sets, four were on heart diseases, two on breast cancer, and one on diabetes. In addition, the performance of the proposed new hybrid model was compared with some of the well-known classification models.

The rest of this study is organized as follows: section two presents the materials and methods, including a brief explanation of each approach applied in the hybrid model; the framework and building process of the proposed hybrid model is described in detail; the model performance assessment process is presented; and a detailed plan for the statistical evaluation of the model is provided. The results of the performance evaluation of the proposed model are discussed in section three in comparison with some of the well-known classification models (based on seven different data sets), together with the statistical evaluation results. Finally, section four includes the conclusion.

Material and Methods

In this section, first the attributes of each dataset are explained.
Second, a brief review of the concepts and methods of Genetic Algorithms, fuzzy class memberships, BackPropagation Neural Networks (BPNNs), and k-NNs is presented. Finally, the proposed model is thoroughly described.

2.1 Data sets technical information

To test the proposed hybrid model in this study, widespread and different standard data sets from the real world were used. Among these data sets, four were on heart disease, two on breast cancer, and one on diabetes. These data sets, briefly discussed here, are similar in terms of number and type of features, number of classes, and number of missing values.

One of the data sets applied in the heart field is ACSEKI. Using the Euro Heart Survey on ACS, designed by the European Society of Cardiology, we registered all admitted patients referred to the Imam Ali hospital, the main center for cardiovascular care in Kermanshah, Iran. While the first Euro Heart Survey of ACS was conducted in 25 countries (in 2000-2001), the second survey involved 32 European countries. For the purpose of this registry, all hospitalized patients diagnosed with ACS during 2010-2011 were included. According to the standard