Benefits of using convolutional neural networks for seismic data quality analysis
Abstract. Seismic data represent an excellent source of information and can be used to investigate several phenomena such as earthquake nature, fault geometry, tomography, etc. These data are affected by several types of noise that are often grouped into two main classes: anthropogenic and environmental. Nevertheless, detecting instrumental noise or malfunctioning stations is also a relevant step for data quality control and for the efficiency of a seismic network. As we will show, visual inspection of seismic spectral diagrams allows us to detect problems that can compromise data quality, for example by invalidating subsequent calculations such as magnitude or Peak Ground Acceleration (PGA). However, such visual inspection requires human experience (due to the complexity of the diagrams) and demands considerable time and effort, as there are too many stations to be checked. For this reason, in this paper we explore the possibility of “transferring” such human experience into an artificial intelligence system in order to perform this detection automatically and quickly. The results have been very encouraging, as the automatic system we have set up shows a detection accuracy of over 90 % on a set of 840 noise spectral diagrams obtained from seismic station records.
Paolo Casale and Alessandro Pignatelli
Status: open (until 13 Jun 2023)
- RC1: 'Comment on gi-2023-4', Anonymous Referee #1, 31 May 2023
- RC2: 'Comment on gi-2023-4', Anonymous Referee #2, 02 Jun 2023
The paper discusses the applicability of AlexNet to seismic spectral plots to detect data quality problems. The authors combined existing diagrams from different sources into a new database containing 840 diagrams. By labeling (classifying) the diagrams using expert knowledge, they created a new training dataset for CNNs. In general, the approach was successful and AlexNet achieved an accuracy of 90% on this dataset.
In my opinion, data quality is one of the most important issues in any domain. Therefore, the authors have addressed a very important problem. They show a possible solution to automate a pre-analysis to identify erroneous stations or stations that should be treated with caution.
1 Does the paper address relevant scientific questions within the scope of GI:
Yes. Data quality is important in any scientific field. In this case, the authors focus on seismic data, so this paper is relevant to GI.
2 Does the paper present novel concepts, ideas, tools, or data:
The authors have applied an existing and well-known concept to a new dataset. It is obvious that a CNN approach to classification will work. However, the authors noted that there are no large databases on which to train the neural networks (they “focused on the entire noise spectra to detect acceptable signals from anomalous ones”). As datasets are rare and very time-consuming to create, this contribution is relevant to the scientific community if the data are made (“correctly”) publicly available (see 6).
3 Are substantial conclusions reached:
As stated above, it is obvious that the concept will work. The authors have proven this once again. Therefore, the paper does not provide any substantial new conclusions.
4 Are the scientific methods and assumptions valid and clearly outlined:
The methods are mostly clear, but there are strong weaknesses regarding section “4. Machine learning general description”. The heading is not consistent with the following subsections; it should be “4. Deep learning”, with 4.1 renamed “4.1 General description”.
Some assumptions are mixed with results, such as the data splits in sections 5.1, 5.2, 5.3 and 5.4 and their explanations.
5 Are the results sufficient to support the interpretations and conclusions:
Partially. A comparison of the four tests is not possible because the test sets change between them.
6 Is the description of experiments and calculations sufficiently complete and precise to allow their reproduction by fellow scientists (traceability of results):
No. The data provided are only some diagrams without labels. What is missing:
Furthermore, the images (which are of no use on their own) are published only on Google Drive, which is a bad decision. I suggest sharing the data through an appropriate platform such as https://www.kaggle.com/.
7 Do the authors give proper credit to related work and clearly indicate their own new/original contribution:
8 Does the title clearly reflect the contents of the paper:
Yes. However, in my opinion, the title suggests that a CNN is the best approach, which seems plausible. The paper uses only a 2D CNN, though, and a comparison with a 1D CNN is missing. I would at least expect a section discussing the advantages and disadvantages of a 1D and a 2D approach.
9 Does the abstract provide a concise and complete summary:
Partially. In my opinion, the results of the large multiclass experiment need to be added.
10 Is the overall presentation well structured and clear:
The general presentation of the introduction and theory up to section 3 is OK. Sections 4 and 5 are not. The experiments are mixed with results and discussion, which makes it difficult to follow the idea. Regarding the general structure, I suggest following common headings such as Introduction, (materials/data and) Methods, Experiments, Results and Discussion, Conclusions. See also the suggestions in 4 and the general comments below.
11 Is the language fluent and precise:
No. Some sentences are difficult to read because they are unnecessarily nested. Proofreading is needed.
There are many typing errors such as:
12 Are mathematical formulae, symbols, abbreviations, and units correctly defined and used:
13 Should any parts of the paper (text, formulae, figures, tables) be clarified, reduced, combined, or eliminated:
Please add a section: "Dataset" and move relevant parts to this section, as in my eyes this is an important contribution. This will make it easier to extract relevant information (see also general comments). I have major concerns with sections 4 and 5. See general comments.
14 Are the number and quality of references appropriate:
The references in section 4 are in some cases outdated or missing altogether. Please add more, and more recent, references to section 4, as the field of deep learning is evolving rapidly. AlexNet should not be used.
15 Is the amount and quality of supplementary material appropriate:
Yes. To make it even more precise, I suggest adding a table with all four tests to make the comparison easier.
The authors have used several datasets in the manuscript, which is confusing. I do not agree with the division into smaller datasets (of 224, 447 and 840 plots) and one large dataset (1865 PDF spectral plots). It is obvious that more data will improve training accuracy. Removing uncertain diagrams from the training set is not good practice: the results are then not useful in real-world applications and the conclusions are no longer representative. The authors could simply have used the large dataset, as it is the most general dataset in use.
Weighting strategies were not covered, for example to address the class imbalance problem faced in tests 1 and 2. Furthermore, the authors used different test sets to validate their approaches, which makes it impossible to compare the results.
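To make the weighting suggestion concrete, a minimal sketch of inverse-frequency class weighting, assuming scikit-learn is available; the label vector and the 80/20 split are purely illustrative, not taken from the paper:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical imbalanced label vector: 80% "acceptable", 20% "anomalous".
y_train = np.array([0] * 80 + [1] * 20)

# "balanced" weights = n_samples / (n_classes * class_count):
# the rare class gets a larger weight, so the loss function penalises
# its misclassification more strongly.
weights = compute_class_weight(class_weight="balanced",
                               classes=np.unique(y_train),
                               y=y_train)
print(dict(zip(np.unique(y_train).tolist(), weights.tolist())))
```

The resulting weight vector would then be passed to the loss function of whichever framework is used for training (e.g. as the `weight` argument of a cross-entropy loss).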
In the whole manuscript there is no information about hyperparameters such as batch size, epochs trained, duration of training and many more.
A very important topic to discuss is overfitting, which has not been considered at all. The authors do not provide any information about the training and validation loss, which is common practice in applied deep learning. The lack of this information makes it very difficult to interpret the results.
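The standard safeguard implied here is to monitor the validation loss and stop when it starts rising. A framework-free sketch of that logic, with a hypothetical loss curve (not the authors' actual training history):

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the epoch at which training would stop, i.e. the first
    epoch after the validation loss has failed to improve for
    `patience` consecutive epochs; None if it never stops."""
    best = float("inf")
    since_best = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, since_best = loss, 0
        else:
            since_best += 1
            if since_best >= patience:
                return epoch
    return None

# Illustrative curve: validation loss reaches its minimum at epoch 4
# and then rises -- the classic signature of overfitting.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.48, 0.53, 0.60, 0.71, 0.85]
print(early_stop_epoch(val_losses))  # stops at epoch 7, 3 epochs past the minimum
```

Reporting both curves (training and validation loss per epoch) would let readers judge whether the reported 90% accuracy comes from a well-regularised model or an overfitted one.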
Since reproducibility was not addressed, the results will vary between two training runs. The authors should retrain the model several times and report the average accuracy and its standard deviation.
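The multi-run protocol suggested above can be sketched as follows; `train_once` is a stand-in for a fully seeded training-plus-evaluation run, and the returned accuracies are placeholders, not results from the paper:

```python
import random
import statistics

def train_once(seed):
    """Placeholder for one seeded training run returning test accuracy.
    In a real pipeline, every RNG involved (framework, data shuffling,
    augmentation) must be seeded for the run to be reproducible."""
    random.seed(seed)
    return 0.90 + random.uniform(-0.02, 0.02)  # hypothetical accuracy

accuracies = [train_once(seed) for seed in range(5)]
mean_acc = statistics.mean(accuracies)
std_acc = statistics.stdev(accuracies)
print(f"accuracy = {mean_acc:.3f} +/- {std_acc:.3f}")
```

Reporting `mean +/- std` over several seeds is a small amount of extra compute but makes the accuracy claim far more credible.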
In conclusion: The experimental design is bad. Too many mistakes have been made.
There is no explanation as to why a 2D CNN approach was chosen in the first place. The authors claim that “In our study we find that just images of data fulfil our goal” (line 336), but proof is lacking; in fact, they have shown that uncertain graphs are likely to be misclassified. A 1D CNN can be applied directly to the data without any loss of information due to encoding into an image. As the seismic information is stored in vectors, 1D CNNs are more efficient, easier to train and less complex: far fewer parameters are needed, the information loss from encoding the data into an image is avoided, and the only loss is introduced by subsampling the spectral data itself. Did the authors take this into account in their research? For example, the images could be used by the experts to classify the data, while the 1D data could be used for training. Why is it advantageous to use a 2D CNN? I suggest adding a section on this. I also refer to: https://doi.org/10.1016/j.ymssp.2020.107398.
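The parameter-count argument is easy to verify with the standard convolution formulas; the layer shapes below are illustrative (AlexNet's actual first layer versus a hypothetical 1D counterpart on a raw spectrum), not a design from the paper:

```python
def conv2d_params(in_ch, out_ch, k):
    # k*k weights per (input channel, filter) pair, plus one bias per filter.
    return in_ch * out_ch * k * k + out_ch

def conv1d_params(in_ch, out_ch, k):
    # k weights per (input channel, filter) pair, plus one bias per filter.
    return in_ch * out_ch * k + out_ch

# AlexNet's first layer: 3 RGB input channels, 96 filters, 11x11 kernels.
p2d = conv2d_params(3, 96, 11)
# A comparable 1D layer on a single-channel spectrum: 96 filters, kernel size 11.
p1d = conv1d_params(1, 96, 11)
print(p2d, p1d)  # 34944 vs 1152
```

Even in this first layer alone, the 1D variant needs roughly 30x fewer parameters, which supports the claim that 1D CNNs on the raw spectral vectors would be cheaper to train.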
Did the authors consider data augmentation? I would like to see a section discussing the possibility of data augmentation in such a case as it can improve accuracy a lot. I refer to: https://doi.org/10.1088/1741-2552/ac4430.
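For 1D spectral data, simple augmentations are cheap to implement. A minimal sketch with NumPy; the specific transformations (additive noise, amplitude scaling, small circular frequency shift) and all magnitudes are hypothetical choices for illustration, not taken from the cited reference:

```python
import numpy as np

def augment_spectrum(spectrum, rng):
    """Return a randomly perturbed copy of a 1D noise spectrum."""
    noisy = spectrum + rng.normal(0.0, 0.01, size=spectrum.shape)  # additive noise
    scaled = noisy * rng.uniform(0.95, 1.05)                       # amplitude jitter
    shifted = np.roll(scaled, rng.integers(-2, 3))                 # small freq. shift
    return shifted

rng = np.random.default_rng(0)
spectrum = np.linspace(-120.0, -160.0, 256)  # fake PSD values in dB, 256 bins
augmented = augment_spectrum(spectrum, rng)
print(augmented.shape)
```

Whether each transformation preserves the class label (e.g. whether a frequency shift could turn an "acceptable" spectrum into an "anomalous" one) would need to be argued from the physics, which is exactly the discussion the requested section should contain.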
I suggest restructuring section 4 into:
- CNNs (reduce the general description, remove ANNs and focus on the CNN),
- Training (a short overview, perhaps supported by an enumeration of important training steps),
- Metrics (accuracy, confusion matrix, F1 score (all results also class-based for multiclass problems), Precision, Recall) and
- Hyperparameters (training - validation - test split, batch size, epochs, loss function, optimizer, did the authors use dropout? did the authors use regularization techniques? did the authors use weighting to deal with class imbalances?)
Comment: I recommend not to go too deep into the topic, because many mistakes can be avoided. Instead, explain the general concept of CNNs and cite accordingly.
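The per-class metrics requested above take only a few lines to compute; a sketch assuming scikit-learn is available, with purely illustrative label vectors for a 3-class problem:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

# Hypothetical ground truth and predictions (not results from the paper).
y_true = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 0, 1, 1, 1, 0, 2, 2, 2, 1])

acc = accuracy_score(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred)
# average=None yields per-class precision, recall and F1,
# which is what a multiclass evaluation should report.
prec, rec, f1, support = precision_recall_fscore_support(
    y_true, y_pred, average=None)

print(acc)   # overall accuracy
print(cm)    # rows = true class, columns = predicted class
print(f1)    # one F1 score per class
```

Reporting the confusion matrix and per-class F1 alongside overall accuracy would make the class-imbalance effects in tests 1 and 2 directly visible.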