Comparative Analysis of Machine Learning Techniques for Predicting Air Pollution
Modern, motorized lifestyles have intensified air pollution, which has become the greatest threat to healthy living. The situation is especially severe in developing countries, including Pakistan. This study therefore proposes an architecture design for real-time prediction of air pollution. It also surveys which algorithms were most frequently adopted in previous investigations and summarizes the toxic effects of air pollution on human health. Because an adequate dataset for Pakistan was not available, the study used a large dataset from Seoul covering three years (2017-2019), with 647,512 instances and 11 attributes. Four algorithms were employed: Random Forest, Linear Regression, Decision Tree and XGBoost (XGB). XGB proved the most promising and feasible for predicting the concentration levels of NO2, O3, SO2, PM10, PM2.5 and CO, achieving the lowest RMSE values of 0.0111, 0.0262, 0.0168, 49.64, 41.68 and 0.1856 and the lowest MAE values of 0.0067, 0.0096, 0.0017, 12.28, 7.63 and 0.0982, respectively. Furthermore, Random Forest was found to be the algorithm most often preferred in previous studies of air pollution prediction, and many studies supported the conclusion that air pollution is highly detrimental to human health; in particular, long-term exposure causes lung cancer and respiratory and cardiovascular diseases.
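The two error metrics reported above, RMSE and MAE, can be illustrated with a minimal sketch. The function names and the sample values below are hypothetical and are not taken from the Seoul dataset or from the study's code; they only show how each metric is computed from paired observed and predicted concentrations.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute differences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical observed and predicted pollutant concentrations (illustration only).
observed = [0.030, 0.025, 0.040]
forecast = [0.028, 0.027, 0.035]

print("RMSE:", rmse(observed, forecast))
print("MAE:", mae(observed, forecast))
```

Because RMSE squares each error before averaging, it penalizes large deviations more heavily than MAE and is never smaller than MAE on the same data, which is consistent with the RMSE values above exceeding the corresponding MAE values for every pollutant.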