A Comparison among Data Mining Algorithms for Outlier Detection using Flow Pattern Experiments

Document Type: Article


1 Civil Engineering Department, Persian Gulf University, Bushehr, Iran

2 Faculty of Marine Engineering, Amirkabir University of Technology, Tehran, Iran

3 Persian Gulf University


Accurate outlier detection is an important step before measured data are used to predict flow patterns. Identifying outliers and reducing their impact on the measurements can help present the authentic flow pattern. This paper aims to detect outliers in flow pattern experiments along a 180-degree sharp bend channel with and without a T-shaped spur dike. Velocity components were collected using a 3D velocimeter called Vectrino in order to determine the flow pattern. The outlier detection methods employed in the paper include the Z-score test, sum of sine curve fitting, Mahalanobis distance, hierarchical clustering, LSC-Mine, self-organizing map, fuzzy C-means clustering, and voting. In the experiments carried out, all of the methods were efficient in outlier detection; however, the voting method appeared to be the most efficient. The paper also calculates different hydraulic parameters in the sharp bend and compares them in order to study how applying the voting method affects the mean and turbulent flow patterns. The results indicated that applying the voting method to the flow pattern experiments in the bend decreased the Reynolds shear stress by 36%, while the mean velocities were not significantly influenced by the method.
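As an illustration of how a voting scheme can combine individual detectors, the following minimal sketch applies two of the methods named above (the Z-score test and the Mahalanobis distance) to a set of three-component velocity samples and flags a sample only when enough detectors agree. It is not the paper's implementation: the function names, the thresholds (3 standard deviations; a squared distance of 16.27, roughly the 99.9% chi-square quantile for three degrees of freedom), the requirement of two votes, and the synthetic data are all assumptions chosen for the example. The remaining methods would each contribute one more boolean mask to the vote.

```python
import numpy as np


def zscore_flags(samples, threshold=3.0):
    """Flag rows where any velocity component is more than
    `threshold` sample standard deviations from its mean."""
    z = (samples - samples.mean(axis=0)) / samples.std(axis=0, ddof=1)
    return (np.abs(z) > threshold).any(axis=1)


def mahalanobis_flags(samples, threshold_sq=16.27):
    """Flag rows whose squared Mahalanobis distance from the sample
    mean exceeds `threshold_sq` (assumed cut-off, see lead-in)."""
    centered = samples - samples.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(samples, rowvar=False))
    # Squared distance d_i^2 = c_i^T * cov_inv * c_i for every row i.
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return d2 > threshold_sq


def vote(masks, min_votes):
    """Count how many detectors flag each sample and keep those
    flagged by at least `min_votes` of them."""
    votes = np.sum(masks, axis=0)
    return votes >= min_votes, votes


# Synthetic stand-in for (u, v, w) velocity samples; not measured data.
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=(200, 3))
samples[10] = [10.0, 10.0, 10.0]  # one artificial spike

masks = [zscore_flags(samples), mahalanobis_flags(samples)]
flagged, votes = vote(masks, min_votes=2)
print("flagged sample indices:", np.flatnonzero(flagged))
```

Requiring agreement between detectors makes the combined decision more conservative than any single method, which is one plausible reason a voting approach might remove spikes without disturbing the mean velocities.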


