Global Geology 2018, 21(1): 36-47. DOI: 10.3969/j.issn.1673-9736.2018.01.04. ISSN: 1673-9736. CN: 22-1371/P

Application of isolation forest to extract multivariate anomalies from geochemical exploration data

WU Wei1, CHEN Yongliang2

1. Changchun Institute of Urban Planning and Design, Changchun 130033, China;
2. Institute of Mineral Resources Prognosis on Synthetic Information, Jilin University, Changchun 130026, China

Abstract:

Constructing a statistical model that best fits the background is a key step in geochemical anomaly identification. However, such a model is hard to construct when the sample population has an unknown or complex distribution. Isolation forest is an outlier detection approach that explicitly isolates anomalous samples rather than modeling the population distribution. It can extract multivariate anomalies from very large, high-dimensional datasets whose population distribution is unknown. For this reason, we tentatively applied the method to identify multivariate anomalies in the stream sediment survey data of the Lalingzaohuo district, an area with a complex geological setting in Qinghai Province, China. The performance of the isolation forest algorithm in anomaly identification was compared with that of a continuous restricted Boltzmann machine. The results show that the isolation forest model outperforms the continuous restricted Boltzmann machine in multivariate anomaly identification in terms of the receiver operating characteristic curve, the area under the curve, and data-processing efficiency. The anomalies identified by the isolation forest model occupy 19% of the study area and contain 82% of the known mineral deposits, whereas those identified by the continuous restricted Boltzmann machine occupy 35% of the study area and contain 88% of the known mineral deposits. The two models take 4.07 s and 279.36 s, respectively, to process the dataset. Therefore, isolation forest is a useful anomaly detection method that can quickly extract multivariate anomalies from geochemical exploration data.
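
The idea of isolating samples instead of modeling their distribution can be illustrated with a minimal sketch. It is not the authors' implementation: it follows the general isolation forest recipe (random feature, random cut value, anomaly score 2^(-E[h(x)]/c(psi))) using only the Python standard library, and it runs on synthetic two-dimensional data rather than the Lalingzaohuo survey data. All function names, the synthetic data, and the settings n_trees=100 and sample_size=256 are assumptions chosen for illustration. The sketch also assumes at least two samples per tree.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Split rows recursively on a random feature at a random cut value."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # simplification: stop here instead of trying another feature
        return {"size": len(rows)}
    cut = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < cut]
    right = [r for r in rows if r[feature] >= cut]
    return {
        "feature": feature,
        "cut": cut,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(x, node, depth=0):
    """Depth at which x reaches a leaf, plus a correction for points left unsplit there."""
    if "size" in node:
        return depth + c_factor(node["size"])
    child = node["left"] if x[node["feature"]] < node["cut"] else node["right"]
    return path_length(x, child, depth + 1)


def fit_forest(data, n_trees=100, sample_size=256, seed=0):
    """Grow n_trees isolation trees, each on a random subsample of the data."""
    rng = random.Random(seed)
    psi = min(sample_size, len(data))          # effective subsample size
    max_depth = math.ceil(math.log2(psi))      # trees are cut off at about average depth
    trees = [build_tree(rng.sample(data, psi), 0, max_depth, rng) for _ in range(n_trees)]
    return trees, psi


def anomaly_score(x, trees, psi):
    """Score in (0, 1]; short average paths (easy isolation) give scores near 1."""
    mean_depth = sum(path_length(x, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c_factor(psi))


# Synthetic demo: a Gaussian "background" cloud plus a few far-off samples.
demo_rng = random.Random(42)
background = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)) for _ in range(500)]
outliers = [(8.0, 8.0), (9.0, -7.5), (-8.5, 9.0)]
trees, psi = fit_forest(background + outliers)
print("centre :", round(anomaly_score((0.0, 0.0), trees, psi), 3))
print("outlier:", round(anomaly_score((8.0, 8.0), trees, psi), 3))
```

In this sketch a sample far from the background cloud is separated after only a few random cuts, so it receives a higher score than a sample in the dense centre. Turning the scores into an anomaly map still requires a threshold, which this example does not choose.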

Keywords: isolation forest; continuous restricted Boltzmann machine; receiver operating characteristic curve; Youden index; geochemical anomaly identification
Received: 2017-10-25; Revised: 2017-11-20
DOI: 10.3969/j.issn.1673-9736.2018.01.04
Fund: Supported by projects of the National Natural Science Foundation of China (Nos. 41272360, 41472299, 41672322).
Corresponding Authors: CHEN Yongliang
Email: chenyongliang2009@hotmail.com

Copyright by Global Geology