Global Geology 2018, 21(1) 36-47 DOI: 10.3969/j.issn.1673-9736.2018.01.04 ISSN: 1673-9736 CN: 22-1371/P
Application of isolation forest to extract multivariate anomalies from geochemical exploration data
WU Wei¹, CHEN Yongliang²
1. Changchun Institute of Urban Planning and Design, Changchun 130033, China; 2. Institute of Mineral Resources Prognosis on Synthetic Information, Jilin University, Changchun 130026, China
Abstract: Constructing a statistical model that best fits the background is a key step in geochemical anomaly identification. However, such a model is hard to construct when the sample population has an unknown and/or complex distribution. Isolation forest is an outlier detection approach that explicitly isolates anomalous samples rather than modeling the population distribution. It can extract multivariate anomalies from very large, high-dimensional datasets with unknown population distributions. For this reason, we tentatively applied the method to identify multivariate anomalies in the stream sediment survey data of the Lalingzaohuo district, an area with a complex geological setting in Qinghai Province, China. The performance of the isolation forest algorithm in anomaly identification was compared with that of a continuous restricted Boltzmann machine. The results show that the isolation forest model outperforms the continuous restricted Boltzmann machine in multivariate anomaly identification in terms of the receiver operating characteristic curve, the area under the curve, and data-processing efficiency. The anomalies identified by the isolation forest model occupy 19% of the study area and contain 82% of the known mineral deposits, whereas the anomalies identified by the continuous restricted Boltzmann machine occupy 35% of the study area and contain 88% of the known mineral deposits. The two models take 4.07 and 279.36 seconds, respectively, to process the dataset. Therefore, isolation forest is a useful anomaly detection method that can quickly extract multivariate anomalies from geochemical exploration data.
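The isolation idea summarized in the abstract can be illustrated with a minimal pure-Python sketch. It follows the general scheme published by Liu, Ting and Zhou (2008): random axis-parallel splits on small subsamples, with samples that become isolated after few splits receiving scores close to 1. This is not the authors' implementation. The function names (`c_factor`, `build_tree`, `path_length`, `anomaly_scores`), the parameter defaults, and the synthetic two-variable data are all assumptions made for illustration, and no geochemical data are involved.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Split rows on a random feature at a random cut until isolated or depth-limited."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # nothing to split on for this feature; stop here (simplification)
        return {"size": len(rows)}
    cut = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < cut]
    right = [r for r in rows if r[feature] >= cut]
    return {"feature": feature, "cut": cut,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def path_length(row, node, depth=0):
    """Number of splits needed to reach a leaf, plus a correction for unsplit leaf points."""
    if "feature" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if row[node["feature"]] < node["cut"] else node["right"]
    return path_length(row, child, depth + 1)


def anomaly_scores(data, n_trees=100, sample_size=64, seed=0):
    """Return one score in (0, 1) per row; higher means easier to isolate."""
    if len(data) < 2:
        raise ValueError("need at least two samples")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    return [2.0 ** (-sum(path_length(row, t) for t in trees) / (n_trees * norm))
            for row in data]


# Synthetic stand-in data (hypothetical): 200 background samples plus 5 planted outliers.
demo_rng = random.Random(42)
background = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)) for _ in range(200)]
outliers = [(9.0, 9.0), (-9.0, 8.0), (10.0, -8.0), (-8.0, -10.0), (11.0, 10.0)]
data = background + outliers
scores = anomaly_scores(data)
ranked = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)
print(ranked[:10])  # indices 200-204 are the planted outliers
```

Because no distribution is fitted to the background, the only tuning choices in this sketch are the number of trees and the subsample size, which is consistent with the abstract's point that the method suits data with unknown population distributions.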
Keywords: isolation forest; continuous restricted Boltzmann machine; receiver operating characteristic curve; Youden index; geochemical anomaly identification
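As a rough worked example of the Youden index named in the keywords, the figures reported in the abstract can be plugged into J = TPR - FPR. This is only a sketch under a strong assumption: the fraction of known deposits captured is taken as the true-positive rate, and the fraction of the study area flagged as anomalous is used as a stand-in for the false-positive rate. The abstract does not say the index was computed this way, and `youden_index` is a hypothetical helper.

```python
def youden_index(true_positive_rate, false_positive_rate):
    """Youden's J statistic: sensitivity + specificity - 1, i.e. TPR - FPR."""
    return true_positive_rate - false_positive_rate


# (deposits captured, area flagged) as reported in the abstract.
# ASSUMPTION: area flagged is used as a proxy for the false-positive rate.
reported = {
    "isolation forest": (0.82, 0.19),
    "continuous restricted Boltzmann machine": (0.88, 0.35),
}

for name, (tpr, fpr) in reported.items():
    print(f"{name}: J = {youden_index(tpr, fpr):.2f}")

# Ratio of the reported processing times (279.36 s versus 4.07 s).
speedup = 279.36 / 4.07
print(f"speed ratio = {speedup:.1f}")
```

Under this proxy, the isolation forest result gives the larger J (about 0.63 versus 0.53) because it flags a much smaller area while capturing only slightly fewer deposits.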
Received: 2017-10-25 Revised: 2017-11-20
DOI: 10.3969/j.issn.1673-9736.2018.01.04
Funding: Supported by projects of the National Natural Science Foundation of China (Nos. 41272360, 41472299, 41672322).
Corresponding author: CHEN Yongliang
Email: chenyongliang2009@hotmail.com
Copyright by Global Geology