Sample clustering in Isatis.neo has proven to be efficient with big datasets
Posted on Isatis.neo quickly groups borehole samples into homogeneous classes (e.g., facies, geological or mining domains) in an automatic way. Those who have seen the tool run qualifies it as impressive.

The classification of samples into geological domains is a fastidious and fairly subjective step in geological modeling. Geovariances and Mines Paris Tech have developed a combination of Geostatistical Hierarchical Clustering (‘GHC’) and Support Vector Machine (‘SVM’) in order to reduce the subjectivity and improve the productivity of that crucial step of MRE. In particular, the updating of an existing sample classification occurs in a shorter time than traditional methods with more flexibility and dynamism.
GHC is a clustering algorithm that respects the spatial connectivity of data, forming subsets according to the degree of similarity between samples, eventually assigning a domain to each sample. SVM is a machine learning algorithm used when working with big data sets. In a first step, a fraction of the samples are classified using GHC and in a second step, the remaining samples are classified using SVM supervised by the result of the first classification. Using hybrid classification speeds up the classification procedure.
Geovariances has tested GHC and SVM algorithm for fast and flexible sample classification with updating capacities with a real 3D data set which was kindly provided by BHP Billiton. The case study dataset consists of about 2.120 vertical drill holes, with 114.842 samples and 45 variables.
To begin, a first drill hole campaign (65.468 samples) was taken into account. The dissimilarity between the samples is based on 5 numerical variables (Fe, Al2O3, SiO2, and spectral measurements of hematite and goethite) and a categorical variable (weathering). A weight was attributed to each variable based on the relevance of each variable to the overall domaining rationale. Post-processing tools were used to smooth the output (variable with a given domain assigned to each sample).
In a second time, a second campaign (49.374 samples) with additional data information was added and classification was updated using SVM with the same weights.
Geostatistical Hierarchical Clustering and Support Vector Machine techniques give similar results when compared to manual classification for both campaigns, but they bring the added benefits of being able to integrate easily many variables (such grade, structural data, lithology, etc.) and to update quickly and flawlessly with new data.
Mining (14)
Nuclear Decommissioning (9)
Contaminated sites (7)
Oil & Gas (6)
Hydrogeology (5)
TAGS:
2D/3D (2)
Background images (1)
Big data (1)
Conditional simulations (7)
Contaminated sites (2)
Contamination (2)
Drill Hole Spacing Analysis DHSA (3)
Excavation (2)
Facies modeling (2)
Flow modeling (1)
Geological modeling (3)
Gestion des sites pollués (2)
H2020 INSIDER (1)
Horizon mapping (1)
Ice content evaluation (1)
Isatis (11)
Isatis.neo (16)
Kartotrak (8)
Machine Learning (2)
Mapping (3)
MIK (2)
Mineral resource estimation (7)
Monitoring network optimization (1)
MPS (2)
Ore Control (1)
Pareto (2)
Pollution (2)
Post-accidental situation (2)
Recoverable resource estimation (3)
Resource classification (2)
Resources workflow (1)
Resource workflow (3)
Risk analysis (3)
Sample clustering (1)
Sampling optimization (3)
Scripting procedures (3)
Simulation post-processing (1)
Site characterization (2)
Soil contamination mapping (4)
Time-to-Depth conversion (1)
Uncertainty analysis (2)
Uniform Conditioning (5)
Variography (2)
Volumes (2)
Water quality modeling (1)
AUTHORS:
David Barry (3)
Pedram Masoudi (2)
Yvon Desnoyers (2)
Pedro Correia (1)
Catherine BLEINES (1)
DATES:
2023 (2)
2022 (3)
2021 (2)
2020 (2)
2019 (8)
2018 (4)
2017 (3)
2016 (3)