Panoramic of the Musgrave Ranges

Blogs (3)

Geology and data-science blogs

Rian Dutch

Using Python to handle large geochemical datasets

The Geological Survey of South Australia (GSSA) holds a wealth of data in its geoscientific database, SA Geodata. SA Geodata is the primary repository for geoscientific information in South Australia and contains open source data collected from a variety of sources. The SA Geodata database contains over 10 Gb of…

Rian Dutch

What are explorers looking for in S.A.? Part 1

Finding topics in exploration reports using natural language processing. When people think about data, they often think of tabular data sets of numbers. But there is a wealth of data and knowledge hiding in unstructured textual data sets such as company exploration reports. Natural Language Processing (NLP) is the field…

Rian Dutch

What are explorers looking for in S.A.? Part 2

Spatially assessing the distribution of exploration topics using Python. In the previous article, I demonstrated an application of NLP topic modelling using Latent Dirichlet Allocation and the Gensim library to identify themes in South Australian exploration report summaries. That ML model identified eight coherent topics in the report data set…
