Minestis “automatic domaining” has proven to be efficient with big datasets
Minestis offers a unique feature called Automatic Domaining, which quickly and automatically groups borehole samples into domains. Those who have seen the tool run describe it as impressive.
Classifying samples into geological domains is a tedious and fairly subjective step in geological modeling. Geovariances and Mines Paris Tech have combined Geostatistical Hierarchical Clustering (‘GHC’) and Support Vector Machine (‘SVM’) in Minestis to reduce the subjectivity and improve the productivity of this crucial step of Mineral Resource Estimation (MRE). In particular, updating an existing sample classification takes less time than with traditional methods and is more flexible.
GHC is a clustering algorithm that respects the spatial connectivity of the data: it forms subsets according to the degree of similarity between samples and eventually assigns a domain to each sample. SVM is a supervised machine learning algorithm for classification that is well suited to large data sets. In a first step, a fraction of the samples is classified using GHC. In a second step, the remaining samples are classified using SVM, trained on the result of the first classification. This hybrid approach speeds up the classification procedure.
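The two-step workflow described above can be illustrated with a minimal Python sketch. It is not the Minestis implementation: the GHC step is replaced by a simple agglomerative clustering that only merges spatially adjacent clusters, the SVM is a basic linear, two-class version trained by subgradient descent, and the drill hole data, the function names and all parameters (radius, lam, lr, epochs) are hypothetical.

```python
def constrained_clustering(depths, values, n_clusters, radius):
    """Agglomerative clustering that only merges spatially connected clusters.

    Two clusters are connected if any pair of their samples lies within
    `radius` of each other; similarity is the gap between cluster means.
    This is a simplified stand-in for GHC, not the GHC algorithm itself.
    """
    clusters = [[i] for i in range(len(depths))]

    def connected(a, b):
        return any(abs(depths[i] - depths[j]) <= radius for i in a for j in b)

    def mean(c):
        return sum(values[i] for i in c) / len(c)

    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                if connected(clusters[p], clusters[q]):
                    d = abs(mean(clusters[p]) - mean(clusters[q]))
                    if best is None or d < best[0]:
                        best = (d, p, q)
        if best is None:
            break  # no spatially connected pair left to merge
        _, p, q = best
        clusters[p].extend(clusters.pop(q))

    labels = [0] * len(depths)
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Batch subgradient descent on the regularised hinge loss, y in {-1, +1}."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        gw, gb = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1:
                gw = [g - yi * xj / len(X) for g, xj in zip(gw, xi)]
                gb -= yi / len(X)
        w = [wj - lr * g for wj, g in zip(w, gw)]
        b -= lr * gb
    return w, b


# Hypothetical drill hole: one sample per metre, with its Fe grade (%).
depths = list(range(12))
fe = [61, 60, 59, 58, 60, 62, 31, 32, 29, 30, 30, 28]

seed_idx = list(range(0, 12, 2))  # fraction of samples clustered in step 1
rest_idx = list(range(1, 12, 2))  # remaining samples classified in step 2

# Step 1: spatially constrained clustering of the seed samples.
seed_labels = constrained_clustering(
    [depths[i] for i in seed_idx], [fe[i] for i in seed_idx],
    n_clusters=2, radius=2)

# Step 2: train the classifier on the step 1 labels, then classify the rest.
mean_fe = sum(fe[i] for i in seed_idx) / len(seed_idx)
std_fe = (sum((fe[i] - mean_fe) ** 2 for i in seed_idx) / len(seed_idx)) ** 0.5


def scale(v):
    return [(v - mean_fe) / std_fe]


w, b = train_linear_svm([scale(fe[i]) for i in seed_idx],
                        [1 if lab == 1 else -1 for lab in seed_labels])
rest_labels = [1 if w[0] * scale(fe[i])[0] + b > 0 else 0 for i in rest_idx]

domains = [None] * len(fe)
for i, lab in list(zip(seed_idx, seed_labels)) + list(zip(rest_idx, rest_labels)):
    domains[i] = lab
print(domains)
```

The clustering step compares every pair of clusters at each merge, so it becomes expensive as the number of samples grows. The trained classifier only needs one evaluation per remaining sample, which is why clustering a fraction of the data and classifying the rest can be faster than clustering everything.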
Geovariances has tested the GHC and SVM algorithms for fast, flexible and updatable sample classification on a real 3D data set kindly provided by BHP Billiton. The case study dataset consists of about 2,120 vertical drill holes, with 114,842 samples and 45 variables.
To begin, a first drill hole campaign (65,468 samples) was taken into account. The dissimilarity between samples was based on 5 numerical variables (Fe, Al2O3, SiO2, and spectral measurements of hematite and goethite) and one categorical variable (weathering). Each variable was given a weight reflecting its relevance to the overall domaining rationale. Post-processing tools were then used to smooth the output, a variable holding the domain assigned to each sample.
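As an illustration of how numerical and categorical variables can be combined into one weighted dissimilarity, here is a minimal Python sketch. The formula actually used in Minestis is not given here, so the sketch assumes a Gower-style measure: range-scaled absolute differences for numerical variables and a simple match or mismatch for the categorical one. The weights, value ranges and sample values are hypothetical.

```python
def weighted_dissimilarity(a, b, numeric_ranges, weights):
    """Gower-style weighted dissimilarity between two samples (dicts).

    Assumed formula for illustration only, not the one used by Minestis.
    """
    total = 0.0
    for name, weight in weights.items():
        if name in numeric_ranges:
            # Numerical variable: absolute difference scaled by its range.
            d = abs(a[name] - b[name]) / numeric_ranges[name]
        else:
            # Categorical variable: 0 if the categories match, 1 otherwise.
            d = 0.0 if a[name] == b[name] else 1.0
        total += weight * d
    return total / sum(weights.values())


# Hypothetical ranges (max minus min over the data set) and weights.
numeric_ranges = {"Fe": 40.0, "Al2O3": 10.0, "SiO2": 20.0,
                  "hematite": 1.0, "goethite": 1.0}
weights = {"Fe": 2.0, "Al2O3": 1.0, "SiO2": 1.0,
           "hematite": 1.0, "goethite": 1.0, "weathering": 2.0}

# Two hypothetical samples.
s1 = {"Fe": 60.0, "Al2O3": 2.0, "SiO2": 4.0,
      "hematite": 0.75, "goethite": 0.25, "weathering": "fresh"}
s2 = {"Fe": 40.0, "Al2O3": 7.0, "SiO2": 14.0,
      "hematite": 0.25, "goethite": 0.75, "weathering": "weathered"}

d12 = weighted_dissimilarity(s1, s2, numeric_ranges, weights)
print(d12)
```

Dividing by the sum of the weights keeps the result between 0 and 1, so changing the weight of one variable shifts its relative influence without changing the overall scale.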
In a second step, a second campaign (49,374 samples) with additional data was added, and the classification was updated using SVM with the same weights.