Co-evolutionary cloud-based attribute ensemble multi-agent reduction algorithm

Ding Weiping1,2,4 Wang Jiandong3 Zhang Xiaofeng2 Guan Zhijin2

(1State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China) (2School of Computer Science and Technology, Nantong University, Nantong 226019, China) (3College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China) (4Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Nanjing University of Science and Technology, Nanjing 210094, China)

Abstract: To improve the performance of attribute reduction algorithms on noisy and uncertain large-scale data, a novel co-evolutionary cloud-based attribute ensemble multi-agent reduction (CCAEMR) algorithm is proposed. First, a co-evolutionary cloud framework is designed under the MapReduce mechanism to divide the entire population into co-evolutionary subpopulations with a self-adaptive scale. These subpopulations share their rewards to accelerate the implementation of attribute reduction. Secondly, a multi-agent ensemble strategy of co-evolutionary elitist optimization is constructed to ensure that the subpopulations can exploit any correlation and interdependency between interacting attribute subsets while reinforcing noise tolerance, so that the agents are kept within the stable elitist region and achieve the optimal profit. The experimental results show that the proposed CCAEMR algorithm has better efficiency and feasibility in solving large-scale and uncertain dataset problems with complex noise.

Key words: co-evolutionary elitist optimization; attribute reduction; co-evolutionary cloud framework; multi-agent ensemble strategy; neonatal brain 3D-MRI

The aim of attribute reduction in rough set theory is to discover the minimum attribute set and induce the minimum-length decision rules inherent in an information system, while retaining high accuracy and efficiency[1-2]. Attribute reduction is a common technique for data preprocessing in data mining and knowledge discovery, and progress has been made on representative topics such as feature selection, multi-granulation analysis and noise processing[3-6]. However, real-life datasets, available everywhere from sensor networks, social networks and proprietary databases, often involve incompleteness, uncertainty and vagueness[7]. Most attribute reduction algorithms are unreliable when applied directly to extract knowledge from ever-larger numbers of samples with complex multiple relevant structures. Moreover, noise is one of the main sources of uncertainty in applications, and most traditional algorithms have been shown not to be robust to noise. Hence, there is an urgent need for novel and effective attribute reduction algorithms to deal with such large-scale complex datasets.

In recent years, various attribute reduction algorithms and models have been discussed. Zhang et al.[8] presented a parallel algorithm for computing equivalence classes and decision classes based on the MapReduce model, which can deal with massive data by updating rough set approximations. Zhang et al.[9] further enhanced parallel large-scale rough-set-based methods for knowledge acquisition and implemented them on representative MapReduce runtime systems to mine knowledge from big data. Qian et al.[10] explored a novel structure of 〈key, value〉 pairs to speed up the computation of equivalence classes and attribute significance, and parallelized the traditional attribute reduction process based on the MapReduce mechanism. Xu et al.[11] provided a new approach to approximate reduction of interval-valued multi-decision tables with the distributed data storage architecture of big data. It has been empirically and theoretically demonstrated that using the MapReduce technique can improve attribute reduction performance. Nevertheless, the results of the above algorithms are not always guaranteed to coincide with those obtained on the whole, non-separated large-scale dataset. Since most 〈key, value〉 pairs mainly focus on optimizing their respective objectives, they cause inefficient approximation without an information-sharing strategy. The optimality of cooperative co-evolution is best expressed as the Nash equilibrium of the self-adaptive balance achieved by the co-evolutionary populations[12], so combining the MapReduce framework with cooperative co-evolution to enhance attribute reduction performance on big datasets has motivated us to investigate a novel attribute reduction algorithm.

To address the above-mentioned problems, a novel co-evolutionary cloud-based attribute ensemble multi-agent reduction (CCAEMR) algorithm is proposed in this paper. A co-evolutionary cloud framework is designed to divide the entire population into many co-evolutionary subpopulations with a self-adaptive scale. Then, a multi-agent ensemble strategy of co-evolutionary elitist optimization is constructed to ensure that the subpopulations can exploit any correlation and interdependency between interacting attribute subsets. Therefore, the Pareto front region of the stable multi-agent co-evolutionary elitist can be achieved, and better attribute reduction knowledge can be selected from large datasets. Experimental results indicate that the proposed CCAEMR algorithm has better feasibility and effectiveness than traditional algorithms, and it clearly boosts attribute reduction performance.

1 Co-Evolutionary Cloud Framework

Since the MapReduce model provides a scalable and robust framework that automatically splits the whole dataset into many data subsets, we mainly focus on designing proper 〈key, value〉 pairs to implement the map and reduce functions for parallel attribute reduction in big datasets. In this section, the co-evolutionary cloud framework is designed under the MapReduce mechanism to divide the entire population into co-evolutionary subpopulations with a self-adaptive scale. These subpopulations share their rewards to accelerate the implementation of attribute reduction. This framework can exploit and explore the inherent parallelism of co-evolutionary populations for attribute reduction, and it is built up by the steps shown in Algorithm 1; the generic map-and-reduce pattern it relies on is illustrated by the sketch below.
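The following minimal, self-contained Python sketch illustrates the generic MapReduce pattern assumed throughout this section: a map phase emits 〈key, value〉 pairs per data subset, and a reduce phase aggregates the values sharing a key. It is a didactic stand-in, not the authors' Hadoop implementation; the example data and lambdas are hypothetical.

```python
from collections import defaultdict

def map_reduce(subsets, mapper, reducer):
    """Generic in-memory MapReduce over a dataset split into subsets."""
    groups = defaultdict(list)
    for subset in subsets:                  # the whole dataset split into subsets
        for key, value in mapper(subset):   # map phase: emit <key, value> pairs
            groups[key].append(value)
    # reduce phase: aggregate all values that share a key
    return {k: reducer(k, vs) for k, vs in groups.items()}

# Hypothetical example: count attribute-value occurrences across two subsets.
subsets = [[("a1", 0), ("a2", 1)], [("a1", 0), ("a1", 1)]]
counts = map_reduce(subsets,
                    mapper=lambda s: [(kv, 1) for kv in s],
                    reducer=lambda k, vs: sum(vs))
```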

Algorithm 1 Co-evolutionary cloud framework (CCF)

Input: Co-evolutionary Subpopulation_i.

Output: MapReduce probability 〈key_i, value_i〉.

Design the best Subpopulation_i with the self-adaptive probability p_i, which represents the probability of 〈key_i, value_i〉 in the MapReduce process. The probability p_i is formulated as follows:

(1)

where r is the number of co-evolutionary individuals in Subpopulation_i; f_j is the best fitness of the j-th individual; f_Subpop_i is the local best fitness of Subpopulation_i; and f_Pop is the global best fitness of the whole population.
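Eq.(1) itself is not reproduced above, so the following Python sketch illustrates only one plausible reading of this step: each subpopulation's probability combines its local best fitness f_Subpop_i and the mean of its r individual best fitnesses f_j, normalized by the global best f_Pop and across subpopulations. The concrete combination is an assumption; only the symbols come from the definitions above.

```python
# Hedged sketch of the self-adaptive MapReduce probability p_i.
# The exact form of Eq.(1) is assumed; only the symbols
# (r, f_j, f_Subpop_i, f_Pop) follow the definitions in the text.

def self_adaptive_probabilities(subpop_fitnesses, f_pop):
    """subpop_fitnesses: one list of individual best fitnesses f_j per
    Subpopulation_i; f_pop: global best fitness of the whole population."""
    scores = []
    for fitnesses in subpop_fitnesses:
        r = len(fitnesses)                # number of co-evolutionary individuals
        f_subpop = max(fitnesses)         # local best fitness f_Subpop_i
        mean_f = sum(fitnesses) / r       # mean of the f_j
        scores.append((f_subpop + mean_f) / (2.0 * f_pop))  # assumed Eq.(1)
    total = sum(scores)
    return [s / total for s in scores]    # normalize so the p_i sum to 1

# Example: two subpopulations under a global best fitness of 1.0.
p = self_adaptive_probabilities([[0.6, 0.8, 0.9], [0.4, 0.5]], f_pop=1.0)
```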

Partition the subpopulations. If the local best fitness of most subpopulations arrives in the same target area, Subpopulation_i learns from any co-evolutionary subpopulation Subpopulation_j.

Continue to partition the subpopulations, although this increases the consumption of computing resources.

Define the formulation cn(t) as follows:

(2)

(3)

where δ is the threshold between the local best fitness of the subpopulations in the t-th iteration and their average fitness from the 1st to the (t-1)-th iteration; this average fitness is computed as follows:

(4)

Merge some related subpopulations into a new subpopulation according to the local best fitness diversity of subpopulations, which is computed as

(5)

where cn(t) denotes the number of subpopulations whose fitness difference meets the threshold δ. Thus, the related subpopulations can be merged.
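Since Eqs.(2) to (5) are likewise not reproduced, the sketch below shows one consistent reading of these steps: cn(t) counts the subpopulations whose local best fitness stays within δ of their running average fitness (cf. Eq.(4)), diversity(t) is taken as cn(t) normalized by the number of subpopulations, and the flagged subpopulations are merged. The concrete formulas are assumptions guided by the surrounding definitions.

```python
# Hedged sketch of counting cn(t) and merging subpopulations.
# The concrete forms of cn(t) and diversity(t) are assumptions
# consistent with the prose around Eqs.(2)-(5).

def running_average(history):
    """Average fitness from the 1st to the (t-1)-th iteration (cf. Eq.(4))."""
    return sum(history) / len(history)

def merge_by_diversity(local_bests, histories, delta):
    """local_bests[i]: local best fitness of Subpopulation_i at iteration t;
    histories[i]: its fitness values over iterations 1..t-1;
    delta: threshold from the text."""
    n = len(local_bests)
    close = [abs(local_bests[i] - running_average(histories[i])) <= delta
             for i in range(n)]
    cn_t = sum(close)                  # cn(t): count meeting threshold delta
    diversity = cn_t / n               # assumed normalization (cf. Eq.(5))
    # Merge all subpopulations flagged as close into one new subpopulation.
    merged = [i for i in range(n) if close[i]]
    kept = [i for i in range(n) if not close[i]]
    return diversity, merged, kept

# Example with three subpopulations and threshold 0.10.
diversity, merged, kept = merge_by_diversity(
    local_bests=[0.90, 0.55, 0.88],
    histories=[[0.80, 0.85], [0.50, 0.52], [0.70, 0.86]],
    delta=0.10)
```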

Reconstruct the parallel operation 〈key_i, value_i〉 for each Subpopulation_i by the MapReduce framework and the local best fitness diversity value of the subpopulations as follows:

(6)

value_i = w_i × diversity(t) × key_i

(7)
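This last step can be sketched directly, since Eq.(7) is given in the text. Eq.(6) for key_i is not reproduced, so the sketch below substitutes the subpopulation's self-adaptive probability p_i for key_i, and the weights w_i are assumed to be supplied by the caller.

```python
# Sketch of rebuilding the <key_i, value_i> pairs per subpopulation.
# key_i here stands in for the missing Eq.(6); the weights w_i are assumed.

def rebuild_pairs(probabilities, weights, diversity):
    """probabilities: self-adaptive p_i per subpopulation;
    weights: assumed per-subpopulation weights w_i;
    diversity: the diversity(t) value from Eq.(5)."""
    pairs = []
    for p_i, w_i in zip(probabilities, weights):
        key_i = p_i                         # assumption standing in for Eq.(6)
        value_i = w_i * diversity * key_i   # Eq.(7) as given in the text
        pairs.append((key_i, value_i))
    return pairs

# Example: three subpopulations with hypothetical weights.
pairs = rebuild_pairs([0.5, 0.3, 0.2], [1.0, 0.8, 0.6], diversity=0.5)
```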

2 Multi-Agent Ensemble Strategy of Co-Evolutionary Elitist Optimization

Currently, most attribute reduction algorithms are sensitive to noisy big datasets, which results in inaccurate or unexpected reduction results. In order to enhance the robustness of attribute reduction, a multi-agent ensemble strategy of co-evolutionary elitist optimization (MESCEO) is constructed to deal with big datasets with complex noise. This strategy ensures that the co-evolutionary subpopulations maintain an easy balance between exploration and exploitation to improve the performance of attribute reduction. It also addresses the limitations of the existing MapReduce structure and interaction by using a meta-heuristic organization model and dynamic adaptation.

There are five agents: the main elitist agent (MEA), slave elitist agent (SEA), population agent (PA), subpopulation agent (SA) and individual agent (IA), as described in Fig.1. Each agent is formed from its individual agents and controlled by its own SA. The MEA is evaluated as MapReduce_i, and its fitness is then passed back to all participating agents. This strategy is illustrated in Fig.2, and its steps are detailed in Algorithm 2.

Fig.1 Organization representation of structural multi-agent co-evolution

Fig.2 Multi-agent ensemble strategy of co-evolutionary elitist optimization
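As a structural illustration of Fig.1, the following sketch encodes the five agent roles as plain Python dataclasses. The fields are illustrative assumptions; only the roles (MEA, SEA, PA, SA, IA) and their nesting come from the text.

```python
# Minimal sketch of the five-agent organization of Fig.1; field choices
# are assumptions, only the agent roles come from the text.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class IndividualAgent:                       # IA
    position: List[float]
    fitness: float = 0.0

@dataclass
class SubpopulationAgent:                    # SA: controls its own IAs
    members: List[IndividualAgent] = field(default_factory=list)

@dataclass
class PopulationAgent:                       # PA
    subpopulations: List[SubpopulationAgent] = field(default_factory=list)

@dataclass
class SlaveElitistAgent:                     # SEA
    elite: Optional[IndividualAgent] = None

@dataclass
class MainElitistAgent:                      # MEA: evaluated as MapReduce_i,
    population: Optional[PopulationAgent] = None  # fitness passed back to all
    slaves: List[SlaveElitistAgent] = field(default_factory=list)
```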

Algorithm 2 Multi-agent ensemble strategy of co-evolutionary elitist optimization (MESCEO)

Input: Multi-agent MEA, SEA, PA, SA and IA.

Output: Pareto front region of stable multi-agent co-evolutionary elitist.

Generate a random location for the MEA, which starts in a horizontal row at the bottom right corner of the MapReduce framework. All the other agents move simultaneously; the SEA can move into a position occupied by the two agents PA and SA, and the MEA catches the SEA when PA and SA move into the position occupied by an IA.

Design the forward extend range of elitist agents as

x_i ∈ [(θ_i - η_i), (θ_i + η_i)]

(8)

(9)

where α_i represents each extending step length.
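The forward extend move of Eq.(8) can be sketched as below. Eq.(9) is not reproduced, so the update of the half-width η_i from the step length α_i is an assumption for illustration.

```python
import random

# Sketch of the forward extend range of Eq.(8). The widening of eta_i by
# the step length alpha_i is an assumption standing in for Eq.(9).

def extend_position(theta, eta):
    """Sample x_i uniformly from [theta_i - eta_i, theta_i + eta_i] (Eq.(8))."""
    return [random.uniform(t - e, t + e) for t, e in zip(theta, eta)]

def widen_range(eta, alpha):
    """Assumed Eq.(9): grow each half-width eta_i by its step length alpha_i."""
    return [e + a for e, a in zip(eta, alpha)]

# Example: extend around a 2-D center with half-widths 0.1 and 0.2.
x = extend_position(theta=[0.5, 0.3], eta=[0.1, 0.2])
```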

Evaluate the MEA around the center of the Pareto front by using the agents of its subpopulation. An IA located on the edge of the Pareto front is preferable when the front shape is concave. These fitness values are sent to the same neighboring MEA to re-evaluate its own elitist agent. When an MEA is selected into the archive, a penalty is imposed on the SEA that has the same grid coordinates as the selected MEA. The penalty function is defined as

(10)

where ξ(x) is the attribute subset; γ_ξ(x)(D) is the reduction quality of the attribute subset ξ(x) relative to the decision attribute set D; and λ_i is the adaptive penalty factor,

(11)

According to the proximity degree λ_i of the approximation to the optimal solution, the penalty function Φ(x) can adaptively adjust the fitness value, which greatly improves the convergence of the multi-agent system in the attribute approximation space of large-scale datasets.
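The reduction quality γ_ξ(x)(D) referenced in Eq.(10) is the standard rough-set dependency degree, i.e., the fraction of objects in the positive region of the decision attribute D under the attribute subset ξ(x). The sketch below computes it; since Eqs.(10) and (11) are not reproduced, the penalty form Φ(x) = λ_i(1 − γ) is an assumption for illustration.

```python
from collections import defaultdict

# The reduction quality gamma_xi(D) = |POS_xi(D)| / |U| is the standard
# rough-set dependency degree; the penalty form below is an assumption,
# since Eqs.(10)-(11) are not reproduced in the text.

def reduction_quality(objects, attrs, decision):
    """objects: list of dicts mapping attribute -> value;
    attrs: candidate attribute subset xi(x); decision: decision attribute D."""
    blocks = defaultdict(set)
    for idx, obj in enumerate(objects):
        blocks[tuple(obj[a] for a in attrs)].add(idx)
    # An equivalence block lies in POS_xi(D) if it is decision-consistent.
    pos = sum(len(b) for b in blocks.values()
              if len({objects[i][decision] for i in b}) == 1)
    return pos / len(objects)

def penalty(objects, attrs, decision, lam):
    """Assumed penalty Phi(x): larger when the subset explains D poorly."""
    return lam * (1.0 - reduction_quality(objects, attrs, decision))

# Example: two objects are indiscernible but disagree on d, so gamma = 1/3.
objects = [{"a1": 0, "a2": 1, "d": "yes"},
           {"a1": 0, "a2": 1, "d": "no"},
           {"a1": 1, "a2": 0, "d": "no"}]
phi = penalty(objects, ["a1", "a2"], "d", lam=2.0)
```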

Consider a set of SAs and IAs that share the same grid coordinates. As the forward extend move is based on the fitness of solutions, the neighboring agents are eliminated simultaneously, and the MEA is preferred to the PA after the SEA has entered the archive.

To further prevent crowding of the PA, SA and IA agents, a forward re-extend move is used to improve the convergence of all agents. The MEA can occupy a better grid dominated by its neighborhood agents, and thus all agents obtain a well-approximated and well-distributed archive set. Hence, the Pareto front region of the stable multi-agent co-evolutionary elitist can be achieved.

By using the MESCEO strategy, the five kinds of agents are kept within the stable elitist region to achieve the optimal profit. Meanwhile, this strategy leads the elitists, together with the co-evolutionary cloud framework, to find the favorable region of the equilibrium optimal solution set. It thus improves the ability of attribute reduction to converge onto the Pareto front in noisy big datasets.

3 CCAEMR Algorithm

To improve the computing performance of approximations of massive complex attribute sets, we propose the co-evolutionary cloud-based attribute ensemble multi-agent reduction (CCAEMR) algorithm based on the CCF framework and the MESCEO strategy proposed above. The CCAEMR constructs a parallel framework to compute the positive region and the discernibility matrix using MapReduce, thereby avoiding the shortcoming that most traditional algorithms run on only a few computers when dealing with big datasets. The flowchart of the CCAEMR is illustrated in Fig.3.

Fig.3 Graph representation of CCAEMR flowchart

4 Experimental Study

In this experiment, we conduct performance studies of the proposed CCAEMR algorithm on representative datasets and compare it with traditional algorithms. These datasets are selected from protein datasets (LiverACO), biomedical datasets (Ovarian-cancer), the NIPS 2003 feature selection challenge datasets (Dorothea), and public microarray datasets (Prostate). Their dimensionalities (from 2×10³ to 1×10⁵) and sample sizes (from 500 to several thousand) are representative of real practical applications. Stratified ten-fold cross-validation (10-FCV) is employed for data validation. Our experiments run on the Apache Hadoop platform with Hadoop version 1.0.1 and Java 1.6.0_12. The software used is Microsoft Visual Studio 2005 and Visual C#. The operating system is Linux CentOS 5.2 with kernel 2.6.18.

The comparative results are used to evaluate the efficiency of the CCAEMR against such representative algorithms as FRRCUT[4] and GARIv[11]. We add random numbers to each attribute value at a given attribute noise rate, and observe the variation of the average misclassification cost on the four representative datasets. We define n_NP as the number of objects classified incorrectly; n_BP as the number of objects with deferred decisions; μ_NP as the cost of classifying an object into the negative region when it belongs to the positive region; and μ_BP as the cost of classifying an object into the boundary region. The misclassification cost is calculated as

MCF_cost = n_NP μ_NP + n_BP μ_BP

(12)
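As a worked example of Eq.(12), with hypothetical counts and unit costs:

```python
# Worked example of Eq.(12); the counts and unit costs are hypothetical.
def mcf_cost(n_np, mu_np, n_bp, mu_bp):
    return n_np * mu_np + n_bp * mu_bp        # Eq.(12)

# 12 misclassified objects at cost 4 each, 30 deferred objects at cost 1 each:
print(mcf_cost(12, 4, 30, 1))                 # -> 78
```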

Fig.4 presents the average comparison results of misclassification cost and the corresponding running time. This experiment is performed with noise rates from 6.0% to 14.0%. The x-axis shows the incremental noise rate, while the y-axis shows the variation of the misclassification cost and the CPU running time. It can be observed that both the misclassification cost and the running time rise as the additional noise rate increases, indicating that the noise rate has a strong impact on classification performance. However, Fig.4 shows that the variation of the CCAEMR is small in most cases. Taking the LiverACO dataset as an example, when the additional noise rate ranges from 10.0% to 12.0%, the variation of the misclassification cost of the CCAEMR is 2.12; when the noise rate rises from 12.0% to 14.0%, the variation is 1.87. Meanwhile, the CPU time curve of the CCAEMR rises only slightly and remains stable. Despite its appealing performance, the FRRCUT is dominated by the CCAEMR in most cases throughout our experiments. Furthermore, as the noise rate increases, the advantage of the CCAEMR becomes increasingly clear. Similar behavior also holds for the other datasets.

Fig.4 Performance comparisons of three algorithms with different levels of incremental noise rate. (a) LiverACO dataset; (b) Ovarian-cancer dataset; (c) Dorothea dataset; (d) Prostate dataset

The aforementioned analysis shows that the CCAEMR can delete many more unnecessary objects from different datasets, construct fewer solutions, and take far less running time for attribute reduction. Meanwhile, the CCAEMR is suitable for attribute reduction in big datasets with different levels of additional noise, overcoming the limitations of traditional attribute reduction algorithms in discovering and exploiting the attribute structures inherent in complex noise.

5 Application Performance for Neonatal Brain 3D-MRI

Since neonatal brain 3D-MRI has insufficient tissue contrast with heterogeneous and dynamically changing characteristics, removing non-brain tissue is the first module of most brain MRI studies. Because of the low spatial resolution and poor tissue contrast, the edges of non-brain regions may easily be mistaken for brain regions[13-14]. In the following experiment, we assess the performance of the CCAEMR on attribute reduction for multi-atlas-based simultaneous labeling of the inner and outer cortical surfaces of neonatal brain 3D-MRI. Fig.5 shows the surface distances from 4 to 16 months in two typical subjects after adding 5% and 8% Gaussian noise to the neonatal brain 3D-MRI, respectively. It can be noticed that the cortical thickness develops dynamically in neonates, particularly from 4 to 8 months. As the edges of different tissues are fuzzy and the 3D-MRI data is vulnerable to noise, non-brain regions may easily be mistaken for brain regions during tissue merging. However, after the CCAEMR is used to segment them, the watershed outline can readily be taken as the initial curve of the level set to realize the automatic segmentation of brain tissue. The CCAEMR substantially improves the accuracy and robustness of brain extraction while keeping the details of different brain regions.

Fig.5 Longitudinal cortical surface labeling results in two representative neonatal subjects. (a) With 5% Gaussian noise; (b) With 8% Gaussian noise

To quantitatively evaluate the consistency of the results, we compute the average symmetric distance between the boundaries of labeled regions for each pair of aligned longitudinal surfaces of 10 neonatal subjects, as shown in Fig.6. The average boundary distances are 0.63±0.02 mm (CCAEMR), 0.76±0.03 mm (the algorithm of Warfield et al.[15]), and 0.69±0.04 mm (the algorithm of Wang et al.[16]), respectively. The CCAEMR thus achieves a much lower boundary distance. It can adapt dynamically to atlas surfaces and cortical folding geometries, exhibiting a strong improvement for complex neonatal brain 3D-MRI. These results have important implications for the forecasting and diagnosis of brain morphometry and for cortical surface reconstruction.

Fig.6 Quantitative comparisons of distance of boundaries for labeling regions
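The evaluation metric can be read as the average symmetric distance between two labeled boundaries, each represented as a set of points. The sketch below is one standard formulation of this metric; the authors' exact implementation is not reproduced in the text.

```python
import math

# Standard average symmetric boundary distance between two point sets
# (in mm); a generic formulation, not the authors' exact implementation.

def _avg_nearest(a, b):
    """Directed average distance: each point of a to its nearest point of b."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)

def symmetric_boundary_distance(boundary_a, boundary_b):
    """Mean of the two directed average nearest-point distances."""
    return 0.5 * (_avg_nearest(boundary_a, boundary_b)
                  + _avg_nearest(boundary_b, boundary_a))

# Example: two parallel boundary segments 0.5 mm apart.
a = [(0.0, 0.0), (1.0, 0.0)]
b = [(0.0, 0.5), (1.0, 0.5)]
print(symmetric_boundary_distance(a, b))   # -> 0.5
```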

6 Conclusion

With ever-increasing dataset sizes, the performance of traditional attribute reduction algorithms deteriorates rapidly and the distribution of solutions may become non-uniform. In this paper, we introduce two contributions to the study of attribute reduction: a co-evolutionary cloud framework and a novel multi-agent ensemble strategy of co-evolutionary elitist optimization. We then propose the co-evolutionary cloud-based attribute ensemble multi-agent reduction (CCAEMR) algorithm and provide insight into the attribute reduction issues of current big data applications. Extensive experimental comparisons confirm that the CCAEMR outperforms several traditional algorithms in terms of accuracy and efficiency. Meanwhile, its successful application to neonatal brain 3D-MRI demonstrates that the CCAEMR has superior application performance in dealing with complex, noisy medical brain datasets.

References:

[1]Pawlak Z. Rough sets [J]. International Journal of Computer and Information Sciences, 1982, 11(5): 341-356. DOI:10.1007/bf01001956.

[2]Hassanien A. Fuzzy rough sets hybrid scheme for breast cancer detection [J]. Image and Vision Computing, 2007, 25(2): 172-183. DOI:10.1016/j.imavis.2006.01.026.

[3]Maji P, Garai P. IT2 fuzzy-rough sets and max relevance-max significance criterion for attribute selection [J]. IEEE Transactions on Cybernetics, 2015, 45(8): 1657-1668. DOI:10.1109/TCYB.2014.2357892.

[4]Zhao S Y, Chen H, Li C P, et al. RFRR: Robust fuzzy rough reduction [J]. IEEE Transactions on Fuzzy Systems, 2013, 21(5): 825-841. DOI:10.1109/tfuzz.2012.2231417.

[5]Zhao S Y, Chen H, Li C P, et al. A novel approach to building a robust fuzzy rough classifier [J]. IEEE Transactions on Fuzzy Systems, 2015, 23(4): 769-786. DOI:10.1109/tfuzz.2014.2327993.

[6]Yao Y, Zhao Y. Discernibility matrix simplification for constructing attribute reducts [J]. Information Sciences, 2009, 179(7): 867-882. DOI:10.1016/j.ins.2008.11.020.

[7]Wu X, Zhu X, Wu G Q, et al. Data mining with big data [J]. IEEE Transactions on Knowledge and Data Engineering, 2014, 26(1): 97-107. DOI: 10.1109/TKDE.2013.109.

[8]Zhang J B, Li T R, Ruan D, et al. A parallel method for computing rough set approximations [J]. Information Sciences, 2012, 194: 209-223. DOI:10.1016/j.ins.2011.12.036.

[9]Zhang J B, Wong J S, Li T R, et al. A comparison of parallel large-scale knowledge acquisition using rough set theory on different MapReduce runtime systems [J]. International Journal of Approximate Reasoning, 2014, 55(3): 896-907. DOI:10.1016/j.ijar.2013.08.003.

[10]Qian J, Miao D Q, Zhang Z H, et al. Parallel attribute reduction algorithms using MapReduce[J]. Information Sciences, 2014, 279: 671-690. DOI:10.1016/j.ins.2014.04.019.

[11]Xu F F, Lei J S, Bi Z Q, et al. Approaches to approximate reduction with interval-valued multi-decision tables in big data[J]. Journal of Software, 2014, 25(9): 2119-2135. DOI:10.13328/j.cnki.jos.004640. (in Chinese)

[12]You S, Baldick R. Hybrid coevolutionary programming for Nash equilibrium search in games with local optima [J]. IEEE Transactions on Evolutionary Computation, 2004, 8(4): 305-315. DOI:10.1109/tevc.2004.832862.

[13]Jiang T Z. Brainnetome: A new-ome to understand the brain and its disorders [J]. Neuroimage, 2013, 80: 263-272. DOI:10.1016/j.neuroimage.2013.04.002.

[14]Knickmeyer R C, Gouttard S, Kang C, et al. A structural MRI study of human brain development from birth to 2 years[J]. Journal of Neuroscience, 2008, 28(47): 12176-12182. DOI:10.1523/JNEUROSCI.3479-08.2008.

[15]Warfield S K, Zou K H, Wells W M. Simultaneous truth and performance level estimation (STAPLE): An algorithm for the validation of image segmentation [J]. IEEE Transactions on Medical Imaging, 2004, 23(7): 903-921. DOI:10.1109/TMI.2004.828354.

[16]Wang L, Shi F, Li G, et al. Segmentation of neonatal brain MR images using patch-driven level sets [J]. Neuroimage, 2014, 84: 141-158. DOI:10.1016/j.neuroimage.2013.08.008.

Foundation items: The National Natural Science Foundation of China (No.61300167), the Open Project Program of State Key Laboratory for Novel Software Technology of Nanjing University (No.KFKT2015B17), the Natural Science Foundation of Jiangsu Province (No.BK20151274), Qing Lan Project of Jiangsu Province, the Open Project Program of Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education (No.JYB201606), the Program for Special Talent in Six Fields of Jiangsu Province (No.XYDXXJS-048).

Citation: Ding Weiping, Wang Jiandong, Zhang Xiaofeng, et al. Co-evolutionary cloud-based attribute ensemble multi-agent reduction algorithm[J]. Journal of Southeast University (English Edition), 2016, 32(4): 432-438.

DOI:10.3969/j.issn.1003-7985.2016.04.007.

Received 2016-03-06.

Biography:Ding Weiping (1979—), male, doctor, associate professor, dwp9988@163.com.