Abstract: The main ecological patterns and functioning of estuarine ecosystems are difficult to evaluate owing to the natural and human-induced complexity and variability of their biodiversity. There is therefore an increasing demand to analyze and predict the relationships between the environment and the distribution of biota in estuarine ecosystems. Biodiversity is viewed as the variety of life, encompassing variation from the gene to the ecosystem level, and is commonly expressed as species richness. The patterns of biodiversity in the Yangtze River Estuary have remained largely unexplored, despite the growing recognition of the importance of estuarine ecosystems and the existing knowledge of the variability of fish communities within estuaries and of their environmental drivers. The Yangtze River Estuary, a transitional system and typical ecotone, is the largest estuarine ecosystem in the western Pacific Ocean. It links the marine and freshwater ecosystems of the East China Sea, and its persistent environmental fluctuations create considerable physiological demands on the species that inhabit it. Predictive modelling techniques are increasingly used to determine the major habitat requirements that affect species distribution. Technological advances have benefited predictive distribution modelling, and new and sophisticated methods have been developed for the statistical models applied in ecology. The prediction of fish biodiversity has important scientific implications for evaluating the Yangtze River Estuary ecosystem. Based on fishery and environmental data collected in 2012-2013, a regression tree model was built to predict fish species richness in the Yangtze River Estuary. The node structure of the optimal decision tree model indicated that salinity, dissolved oxygen, and month (i.e., season) were three factors affecting fish biodiversity in the Yangtze River Estuary.
In addition, the data observed in 2014 were used to validate the predictive performance of the tree-based model by calculating the root mean square error (RMSE), average absolute error (AAE), and average relative error (ARE), which are often used in modelling studies as statistical indicators for comparing fitted and observed values. The results showed that prediction performance was better in spring and summer than in autumn and winter. Overall, the model showed fair predictive ability, indicating that it is feasible to predict fish species richness with a classification and regression trees (CART) algorithm. Estuarine ecosystems are often considered a complex mosaic of habitat types, and a CART algorithm is well suited to predicting their fish biodiversity. In the present study, in terms of predictive performance, CART could be viewed as an appropriate technique for predicting fish species richness in the Yangtze River Estuary.
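The three validation indicators named above can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so the definitions below are assumptions based on common usage: RMSE as the square root of the mean squared difference, AAE as the mean absolute difference, and ARE as the mean absolute difference divided by the observed value. The function names and the sample richness values are hypothetical and are not data from the study.

```python
import math


def _check(observed, predicted):
    # Both series must be non-empty and paired one-to-one.
    if len(observed) == 0 or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")


def rmse(observed, predicted):
    """Root mean square error (assumed definition)."""
    _check(observed, predicted)
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def aae(observed, predicted):
    """Average absolute error (assumed definition)."""
    _check(observed, predicted)
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def are(observed, predicted):
    """Average relative error as a fraction of the observed value (assumed definition)."""
    _check(observed, predicted)
    if any(o == 0 for o in observed):
        # Relative error is undefined where the observed value is zero.
        raise ValueError("observed values must be non-zero for ARE")
    return sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical species richness values, for illustration only.
observed_richness = [10, 20, 40]
predicted_richness = [12, 18, 44]

print(f"RMSE = {rmse(observed_richness, predicted_richness):.3f}")  # RMSE = 2.828
print(f"AAE  = {aae(observed_richness, predicted_richness):.3f}")   # AAE  = 2.667
print(f"ARE  = {are(observed_richness, predicted_richness):.3f}")   # ARE  = 0.133
```

Under these assumed definitions, RMSE weights large misses more heavily than AAE does, while ARE scales each miss by the observed richness, so the three indicators can rank seasons differently when richness varies strongly between them.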