Increasing the Efficacy of Umbilical Cord Blood Banking Using Machine Learning Algorithms: A Case Study from Royan Cord Blood Bank

Document Type: Article


1 Department of Industrial Engineering and Management Systems, Amirkabir University of Technology, Tehran, Iran

2 Department of Industrial Engineering, Faculty of Engineering, Kharazmi University, Tehran, Iran

3 Royan Stem Cell Technology Company, Cord Blood Bank, Tehran, Iran


Cord blood is the blood obtained from the umbilical cord after the birth of a baby. It is rich in stem cells, which are used to treat a variety of diseases, including cancers and immune disorders. The effectiveness of these treatments depends on the quantity of total nucleated cells (TNCs) in cord blood units (CBUs). Both public and private cord blood banks store these CBUs. Public banks rely on government funding to cover the cost of testing, storing, and maintaining CBUs. Moreover, the quantity of TNCs in each CBU remains uncertain until the TNC test is conducted. This study aims to use ensemble learning algorithms to help public banks identify and collect potentially valuable CBUs prior to TNC testing, thereby avoiding the cost of TNC testing on CBUs that are not valuable. This study makes three main contributions. First, it demonstrates that the XGBoost and LightGBM algorithms can identify CBUs with TNC counts of more than 0.7×10^9, 1×10^9, and 1.5×10^9. Second, it combines the SMOTE-NC method with the XGBoost and LightGBM algorithms and evaluates each algorithm's ability to identify high-TNC samples. Third, it considers the effect of phlebotomist experience on identifying high-TNC samples, a variable overlooked in other studies.
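
As a rough illustration of how the three TNC cut-offs named in the abstract could be turned into binary classification targets, the following Python sketch labels each unit as high-TNC or not at each threshold. The function name `label_high_tnc` and the sample values in `example_tnc` are hypothetical and chosen only for illustration; they are not data or code from the study, and the study's actual preprocessing may differ.

```python
# Thresholds taken from the abstract (TNC count per cord blood unit).
TNC_THRESHOLDS = (0.7e9, 1.0e9, 1.5e9)


def label_high_tnc(tnc_counts, threshold):
    """Return 1 for each unit whose TNC count exceeds the threshold, else 0."""
    return [1 if tnc > threshold else 0 for tnc in tnc_counts]


# Hypothetical TNC counts, for illustration only (not data from the study).
example_tnc = [0.5e9, 0.8e9, 1.2e9, 1.6e9, 0.9e9, 2.0e9]

# One binary target vector per threshold; a separate classifier could be
# trained against each of these targets.
labels_by_threshold = {
    threshold: label_high_tnc(example_tnc, threshold)
    for threshold in TNC_THRESHOLDS
}

for threshold, labels in labels_by_threshold.items():
    print(f"TNC > {threshold:.1e}: {sum(labels)} of {len(labels)} units positive")
```

In this toy example the share of positive units shrinks as the threshold rises, which is the kind of class imbalance that an oversampling method such as SMOTE-NC is intended to address.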


Main Subjects