A Comparison among Data Mining Algorithms for Outlier Detection using Flow Pattern Experiments

Document Type : Article


1 Civil Engineering Department, Persian Gulf University, Bushehr, Iran

2 Faculty of Marine Engineering, Amirkabir University of Technology, Tehran, Iran

3 Persian Gulf University


Accurate outlier detection is an important step before measured data are used to predict a flow pattern. Identifying outliers and reducing their impact on the measurements helps present the authentic flow pattern. This paper aims to detect outliers in flow pattern experiments along a 180-degree sharp bend channel with and without a T-shaped spur dike. Velocity components were collected with a 3D velocimeter called Vectrino in order to determine the flow pattern. The outlier detection methods employed in the paper include the Z-score test, sum-of-sines curve fitting, Mahalanobis distance, hierarchical clustering, LSC-Mine, self-organizing map, fuzzy C-means clustering, and voting. In the experiments carried out, the methods were efficient in outlier detection; however, the voting method appeared to be the most efficient one. The paper also calculates different hydraulic parameters in the sharp bend and compares them in order to study how applying the voting method affects mean and turbulent flow pattern variations. The results indicated that applying the voting method to the flow pattern experiments in the bend decreased the Reynolds shear stress by 36%, while the mean velocities were not significantly influenced by the method.
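To make the idea of voting among detectors concrete, the following is a minimal sketch, not the authors' implementation. It combines only three of the listed detectors (a Z-score test on each of two velocity components and a two-dimensional Mahalanobis distance) and flags a sample when at least two detectors agree. The function names, the threshold of 2.5, the two-vote rule, and the velocity samples are all assumptions chosen for illustration.

```python
import math
from statistics import mean, pstdev

def zscore_flags(x, threshold=2.5):
    """Flag samples whose |z-score| exceeds the threshold (assumed value)."""
    mu, sigma = mean(x), pstdev(x)
    if sigma == 0:
        return [False] * len(x)
    return [abs((xi - mu) / sigma) > threshold for xi in x]

def mahalanobis_flags(u, v, threshold=2.5):
    """Flag (u, v) pairs far from the sample mean in Mahalanobis distance."""
    n = len(u)
    mu_u, mu_v = mean(u), mean(v)
    # Population covariance matrix entries of the two components.
    suu = sum((a - mu_u) ** 2 for a in u) / n
    svv = sum((b - mu_v) ** 2 for b in v) / n
    suv = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v)) / n
    det = suu * svv - suv ** 2
    if det <= 0:  # degenerate covariance: no reliable distance
        return [False] * n
    flags = []
    for a, b in zip(u, v):
        da, db = a - mu_u, b - mu_v
        # Squared distance using the closed-form inverse of a 2x2 matrix.
        d2 = (svv * da * da - 2 * suv * da * db + suu * db * db) / det
        flags.append(math.sqrt(d2) > threshold)
    return flags

def vote(flag_lists, min_votes=2):
    """Mark a sample as an outlier when at least min_votes detectors flag it."""
    return [sum(votes) >= min_votes for votes in zip(*flag_lists)]

# Hypothetical velocity samples (m/s) with one injected spike at index 7.
u = [0.30, 0.31, 0.29, 0.30, 0.32, 0.28, 0.30, 0.90, 0.31, 0.29,
     0.30, 0.31, 0.29, 0.30, 0.32, 0.28, 0.30, 0.31, 0.29, 0.30]
v = [0.02, 0.03, 0.01, 0.02, 0.01, 0.03, 0.02, -0.40, 0.02, 0.01,
     0.03, 0.02, 0.03, 0.01, 0.02, 0.02, 0.01, 0.03, 0.02, 0.02]

voted = vote([zscore_flags(u), zscore_flags(v), mahalanobis_flags(u, v)])
outliers = [i for i, flagged in enumerate(voted) if flagged]
print(outliers)  # indices flagged by at least two detectors
```

Requiring agreement between detectors is one simple way to reduce false alarms from any single method; the number of detectors and the vote threshold would need to be tuned to the measurements at hand.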



M. Vaghefi et al./Scientia Iranica, Transactions A: Civil Engineering 25 (2018) 590-605