Mapping small watercourses with deep learning – impact of training watercourse types separately

Deep learning methods for semantic segmentation have shown great potential for automating the mapping of geospatial features, including small watercourses such as streams and ditches. Small watercourses come in a variety of types, and in many use cases users are interested only in specific ones. However, it is not well known how results from neural networks trained with only some types of small watercourses compare with results from networks trained with all types. We trained four deep learning models to semantically segment watercourses from an elevation model. One model was trained with all small watercourses in the labels as a single class, while each of the other three was trained with a single type of watercourse in the label data. The results show that training the network with a single type of watercourse gives worse recall for all three watercourse types than training with all of them together. This indicates that if the goal is to obtain as complete a set of features as possible, it is better to include all watercourse types in the training data. Future studies could use multi-class output from a neural network to determine how well networks can automatically classify features when trained with all small watercourses in an area.


Introduction
Small watercourses, such as ditches and narrow streams, have previously been mapped through fieldwork or from stereo images in Finland. With high-resolution light detection and ranging (LIDAR) digital elevation models (DEMs) now available, it is possible to map these features with greater accuracy. However, because small watercourses are narrow and numerous, mapping them manually is very laborious. There are many types of small watercourses, for example natural rivulets, roadside ditches, and drainage ditches in both agricultural and forested areas. Deep learning methods, particularly convolutional neural networks (CNNs) for semantic segmentation, have shown great potential in automating the extraction of small watercourses from DEMs (e.g., Xu et al. 2021; Paul, Ågren and Lidberg 2021). While the goal may be to map only certain types of watercourses, it is unclear how leaving other types out of the training data affects the results of CNN models, compared to training with all small watercourses found in an area.
In this poster we present a study in which we compared the results of four deep learning models trained to semantically segment small watercourses (less than 5 m wide). One model was trained with training data containing all watercourses, while three models were each trained with only one of the three watercourse types in the training data. The aim of the study was to understand how training with only some small watercourse types affects the results.

Data and methods
The study area covers 36 km² in Finland and includes a variety of small watercourse types. For input data, a DEM with 0.5 m cell size was produced from a 5 points/m² LIDAR point cloud, using the "demTINgridding" tool in WhiteBoxTools. Training data was manually digitised using DEM-derived hillshading, topographic position index, and flow accumulation layers, together with a topographic map in the background to support the visual interpretation. All small watercourses (less than 5 m wide) in the study area were first digitised by a single person and then checked manually by another person. The digitised vectors were given a "type" attribute describing the watercourse type of the segment. The types are ditch (excl. roadside), roadside ditch, and natural stream, and together they cover all watercourse types found in the area. The digitised dataset was expanded with a 1.5 m buffer and rasterised into the same 0.5 m grid as the DEM.
AGILE: GIScience Series, 3, 43, 2022. https://doi.org/10.5194/agile-giss-3-43-2022 Proceedings of the 25th AGILE Conference on Geographic Information Science, 2022. Editors: E. Parseliunas, A. Mansourian, P. Partsinevelos, and J. Suziedelyte-Visockiene. This contribution underwent peer review based on a full paper submission. © Author(s) 2022. This work is distributed under the Creative Commons Attribution 4.0 License.
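As a rough illustration of the label preparation described in this section, the sketch below buffers a digitised centreline by 1.5 m and rasterises it into a 0.5 m grid by testing the distance from each cell centre to the line. It is not the workflow used in the study, which would normally rely on a GIS library. The function names, the grid origin at (0, 0), and the example ditch geometry are all assumptions made for the example.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest distance from point P to the line segment AB."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: A and B are the same point.
        return math.hypot(px - ax, py - ay)
    # Project P onto AB and clamp the projection to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def rasterise_buffered_line(vertices, n_rows, n_cols, cell_size=0.5, buffer_m=1.5):
    """Return a 0/1 grid with 1 where the cell centre is within buffer_m of the polyline.

    Assumption: the grid origin is at (0, 0) and row/column indices grow with y/x.
    """
    segments = list(zip(vertices[:-1], vertices[1:]))
    grid = [[0] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        cy = (row + 0.5) * cell_size
        for col in range(n_cols):
            cx = (col + 0.5) * cell_size
            if any(
                point_segment_distance(cx, cy, ax, ay, bx, by) <= buffer_m
                for (ax, ay), (bx, by) in segments
            ):
                grid[row][col] = 1
    return grid


# Hypothetical straight ditch crossing a 10 m x 10 m tile from west to east.
ditch = [(0.0, 5.0), (10.0, 5.0)]
labels = rasterise_buffered_line(ditch, n_rows=20, n_cols=20)
labelled_cells = sum(map(sum, labels))
```

With a 1.5 m buffer on each side, the label band around a centreline is 3 m wide, which corresponds to six rows of 0.5 m cells in this example.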
A CNN model based on U-Net (Ronneberger, Fischer and Brox 2015) was developed using the PyTorch Python library. U-Net was chosen as the base architecture because it has previously been used for hydrographic feature extraction from DEMs (e.g., Stanislawski et al. 2021).
The input and label datasets were divided into training data (80%) and validation data (20%). Four models were trained: one with all feature types in the training data, and three with a single feature type each. Training ran for 250 epochs with 6882 DEM pieces of 128 × 128 px, cut from random locations in the training set. DEM pieces were augmented using fully random rotation and mirroring (50% chance). The results were evaluated using recall, precision, and F1-score. In addition, for the models trained with only one type of watercourse, the percentage of false positive predictions matching labels of other watercourse types was calculated. Visual inspection was also used to further assess the model predictions.
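The evaluation measures named above can be sketched as follows for binary prediction and label rasters. This is a minimal illustration under the assumption that the metrics are computed per pixel. The function names and the toy rasters are hypothetical and do not come from the study.

```python
def precision_recall_f1(pred, label):
    """Pixel-wise precision, recall and F1-score for two 0/1 rasters of equal shape."""
    tp = fp = fn = 0
    for pred_row, label_row in zip(pred, label):
        for p, l in zip(pred_row, label_row):
            if p and l:
                tp += 1
            elif p and not l:
                fp += 1
            elif l and not p:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def other_type_share(pred, target_label, other_label):
    """Share of false positives, and of all positive predictions,
    that coincide with labels of the other watercourse types."""
    positives = false_positives = fp_on_other = 0
    for pred_row, target_row, other_row in zip(pred, target_label, other_label):
        for p, t, o in zip(pred_row, target_row, other_row):
            if not p:
                continue
            positives += 1
            if not t:
                false_positives += 1
                if o:
                    fp_on_other += 1
    share_of_fp = fp_on_other / false_positives if false_positives else 0.0
    share_of_pred = fp_on_other / positives if positives else 0.0
    return share_of_fp, share_of_pred


# Toy rasters (hypothetical): prediction of a single-type model, labels of the
# trained type, and labels of the other watercourse types.
pred = [[1, 1, 0, 0], [0, 1, 1, 0]]
target = [[1, 0, 0, 0], [0, 1, 0, 1]]
other = [[0, 1, 0, 0], [0, 0, 0, 0]]

precision, recall, f1 = precision_recall_f1(pred, target)
share_of_fp, share_of_pred = other_type_share(pred, target, other)
```

In the toy example, two of the four predicted pixels are false positives, and one of those lies on a label of another watercourse type, so half of the false positives and a quarter of all positive predictions match other types.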

Results
When training with the full dataset, roadside ditches had the best recall, followed by other ditches (Table 1).
Table 1. Training with all watercourse types in the training set as one class.
When training with the watercourse types separately, ditches (excl. roadside) had the highest F1-score, followed by roadside ditches (Table 2). All types had worse recall than when training with the types together. When training with ditches (excl. roadside) and roadside ditches, most of the false positives were something other than watercourses (Table 3), whereas when training with natural streams, most false positives were other watercourses.
Table 3. Other watercourse types in model prediction, percentage of false positives and percentage of all predictions.
Visual inspection showed that the models trained with a single watercourse type also included other types of watercourses in their positive predictions. It also appeared that training with all features together resulted in more complete features.

Discussion and conclusions
The results indicate that if the goal is to find small watercourses of a particular type, the network will find a more complete set of features of that type when trained with all data. Separating watercourses by type results in lower recall values, which means that fewer features of each type are found. For ditches (excl. roadside) and roadside ditches, most false positives do not coincide with labels of other watercourse types, which indicates that the models can distinguish between these types to some degree. Visual inspection supports the numerical findings and shows that training with watercourse types separately more frequently results in features with missing parts.
More accurate results could possibly be achieved with a more extensive training dataset that has a more balanced distribution of watercourse types, because it is not known whether the current training dataset is large enough for the task. Future studies could also use multi-class output from a neural network to determine how well networks can automatically classify features when trained with all small watercourses in an area.