Knowledge Discovery in Databases (KDD)
By: reqz1234 • 15/4/2020 • Abstract
Knowledge Discovery in Databases (KDD) is a process that includes all the steps needed to extract useful knowledge from databases. It covers a range of methods for data analysis, data cleaning, data mining, and data interpretation. In the geotechnical domain, many studies have focused on developing data mining systems for soil classification, aiming to assist and automate this task. However, few proposals use all the KDD steps in this context, and there is no indication of a framework that performs soil determination based on KDD. In this paper, we therefore present a KDD-based approach to predict subsoil layers using Cone Penetration Test data. The method was evaluated using three data sets containing more than ten thousand soil records. Through its preparation module, our approach was able to detect and adjust inconsistencies in the data, and it also identified a class imbalance problem in the data. Using the trained knowledge base, we obtained an accuracy of over 90% in the soil prediction task, even with a small number of training samples. This shows that our approach can provide good soil classification results, which can improve further with more data.
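To make the sequence of KDD steps described above more concrete, the following is a minimal sketch in Python, not the method used in the paper. Everything in it is an assumption made for illustration: the feature names (cone resistance `qc`, friction ratio `Rf`), the sample values, the soil labels, and the choice of a nearest-centroid classifier. It also simply drops inconsistent records, whereas the abstract describes a preparation module that adjusts them.

```python
from collections import Counter
from statistics import mean

# Hypothetical CPT records: (cone resistance qc in MPa, friction ratio Rf in %, soil label).
# Values and labels are invented for illustration only.
RAW = [
    (12.0, 0.6, "sand"),
    (15.0, 0.5, "sand"),
    (10.5, 0.8, "sand"),
    (14.0, 0.7, "sand"),
    (1.2, 4.5, "clay"),
    (0.9, 5.0, "clay"),
    (-3.0, 0.6, "sand"),   # inconsistent: negative resistance
    (1.0, None, "clay"),   # inconsistent: missing friction ratio
]


def prepare(records):
    """Preparation step: drop records with missing or physically impossible values."""
    clean = []
    for qc, rf, label in records:
        if qc is None or rf is None or qc <= 0 or rf < 0:
            continue
        clean.append((qc, rf, label))
    return clean


def imbalance_ratio(records):
    """Ratio of the largest class count to the smallest; 1.0 means balanced."""
    counts = Counter(label for _, _, label in records)
    return max(counts.values()) / min(counts.values())


def train_centroids(records):
    """Mining step: nearest-centroid model, one mean feature vector per class."""
    by_label = {}
    for qc, rf, label in records:
        by_label.setdefault(label, []).append((qc, rf))
    return {
        label: (mean(p[0] for p in pts), mean(p[1] for p in pts))
        for label, pts in by_label.items()
    }


def predict(centroids, qc, rf):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda lab: (centroids[lab][0] - qc) ** 2 + (centroids[lab][1] - rf) ** 2,
    )


clean = prepare(RAW)                 # data cleaning
ratio = imbalance_ratio(clean)       # class imbalance check
centroids = train_centroids(clean)   # data mining
```

With the model built, `predict(centroids, 13.0, 0.6)` classifies a new, equally hypothetical reading. On real CPT data the two features would likely need scaling before distances are compared, and a stronger classifier and a held-out evaluation set would be needed to support any accuracy claim.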
...