Building Change Detection of Airborne Laser Scanning and Dense Image Matching Point Clouds using Height and Class Information

Detecting changes is an important task to update databases and find irregularities in spatial data. Every couple of years, national mapping agencies (NMAs) acquire nation-wide point cloud data from Airborne Laser Scanning (ALS) as well as from Dense Image Matching (DIM) using aerial images. Besides deriving several other products such as Digital Elevation Models (DEMs) from them, those point clouds also offer the chance to detect changes between two points in time on a large scale. Buildings are an important object class in the context of change detection to update cadastre data. As detecting changes manually is very time consuming, the aim of this study is to provide reliable change detections for different building sizes in order to support NMAs in their task to update their databases. As datasets of different times may have varying point densities due to technological advancements or different sensors, we propose a rasterbased approach, which is independent of the point density altogether. Within a raster cell, our approach considers the height distribution of all points for two points in time by exploiting the Jensen-Shannon distance to measure their similarity. Our proposed method outperforms simple threshold methods on detecting building changes with respect to the same or different point cloud types. In combination with our proposed class change detection approach, we achieve a change detection performance measured by the mean F1-Score of about 71% between two ALS and about 60% between ALS and DIM point clouds acquired at different times.


Introduction
Knowing when and where a building has been newly built or has been demolished and consequently keeping the cadastre data up-to-date is a challenging, but only one of many tasks and responsibilities of NMAs to accomplish. If the building owners do not report finished changes on their buildings, the NMA or some local surveying agencies have to send out teams to check the status of running building projects, what often causes additional work and resources. By automatically analysing and detecting changes, more coordinated work trips are possible. Due to their 3D information, point clouds have a high potential to be used for object change detection.
Besides ALS point clouds, which are acquired by the NMA every couple of years, DIM point clouds are often derived as a secondary product conducted by Orthophoto flight missions using a semi-global matching algorithm to generate a dense point cloud. As already discussed in (Mandlburger et al., 2017), ALS and DIM point clouds possess very unique characteristics concerning different point accuracies, point densities and especially their behaviour towards vegetation and texture lacking planes. Despite all these differences, both point cloud types are acquired in a regular interval by the NMAs and cover objects at a wide scale. As such, they offer a way to detect changes automatically, in a reasonable time interval and for many places at the same time.
To support the workflow of the NMAs, we propose a change detection algorithm, which detects changed building objects independent of the given point cloud type, which consequently ensures maximal practical usability.
The contributions of this paper are summarized as follows:  The proposed algorithm detects height changes based on the height distribution within raster cells. Compared to thresholdbased approaches, ours is more flexible and does not require normalized heights.
 We combine the analysis of height changes with simple class changes and explore the advantages and disadvantages towards the detection results.  We test our method to detect building changes on ALS and ALS as well as on ALS and DIM point cloud constellations to show its generalization potential.
The remainder of the paper is structured as follows. In section 2, related work regarding change detection on airborne point clouds is discussed. We describe our method, evaluation methods, the used datasets and software in section 3. We present and discuss our results in section 4 and 5 and conclude this paper in section 6.

Related Work
Change detection on ALS point clouds is often working on either 2.5D raster or 3D voxel representations of point clouds or on the point clouds itself. 2D raster methods are often simple, easy to calculate and work on different point densities. Difference of Digital Elevation Models (DoD) subtract normalized DEMs from two points in time and then use a mostly manually set threshold to generate a binary detection result (Murakami et al., 1999;Teo and Shih, 2013). This result is then combined with a classification into non-ground and building for all detected changes to filter out non building objects. On a local level using point clouds from mobile mapping, 3D voxel grids track the occupancy of a voxel. From these occupancies, short and long term changes can be derived (Schachtschneider et al., 2017).
Change detection methods on the point cloud level often measure the cloud to cloud (C2C) or cloud to plane distance of point pairs (Richter et al, 2012). Alternatively, an Iterative Closest Point algorithm is used, where the determined registration parameter are interpreted as movement between two point clouds (Matikainen et al., 2010;Scott et al., 2018). More complex variants such as the Model to Model Cloud Comparison algorithm (M3C2) also consider the normal vectors for their change calculations (Lague et al., 2013). Recently, even multi-directional change detection is used to detect the dominant movement direction of the ground (Williams et al., 2021). Other than only focusing on the geometric differences between point clouds, (Tran et al., 2018)  In our approach, we also rasterize the ALS and DIM point cloud to integrate both into a joint resolution. Instead of detecting changes according to coarse patches like in (Zhang et al., 2019) however, we calculate our change detection on each raster cell independently similar to the DoD methods. Rather than a simple threshold, we consider the complete height distribution within a raster cell to capture the similarity of two distributions using the Jensen-Shannon distance, which results in a likelihood of a change rather than a binary output.

Method
Change detection probabilities are calculated on a raster level with a cell size of 1m² and contain information about height and class change scores. As we calculate class changes, we expect the point clouds to be already classified. For our experiments, we used the classification method by (Politz et al., 2020).
First, the points of each point cloud are mapped into their respective raster cell. Second, the height and class change scores of section 3.1 and 3.2 are calculated and later combined to one joint change detection in section 3.3. Third, the quality of the detected changes is evaluated using manually controlled reference data as explained in section 3.4. An overview of the method is shown in Fig. 1.

Height change (HC)
The height change records the change probability concerning the variation in point height distribution within a raster cell between two different points in time.

Threshold
In order to compare our method to the state-of-the-art, we implemented a version of the threshold method used by (Murakami et al, 1999) using a threshold of 2 . Instead of a normalized point cloud however, we used the minimal point height , of time within a raster cell as an approximation of a DEM.
This approximation is sufficient, as the threshold method should still detect those height jumps. The height change using the threshold method is defined as 1, , , 0, !" . (1)

Jensen-Shannon Distance
The Jensen-Shannon Distance (JSD) is the square root of the Jensen-Shannon divergence, which is the symmetrized version of the Kullback-Leibler divergence $%•'. It is calculated using Eq.
Instead of comparing the distributionsand /, which are the respective height distributions of time and within a raster cell, the Jensen-Shannon divergence comparesand / with the mean distribution , where %-4 /'. As such, the order ofand / does not change the results, which is the main advantage of JSD compared to the Kullback-Leibler divergence. As discussed in (Lin, 1991), the divergence of two probability distributions such as +,$%-||/' in Eq.
(2) vary in range between 0 and 1 using a base 2 logarithm. As output values close to 0 represent similar and output values close to 1 represent differences in the distributions, we can interpret the result of +,$%-||/' as a height change probability, which is called ()* for the remainder of this paper.
As the calculation of +,$%-||/' requiresand / to be vectorised to same length, the for time and are transformed into a joint histogram format using a bin size of 5. The range of the histogram is determined by the extreme height values between both point clouds within a tile's extent. The calculated range is applied to each raster cell individually. Based on extensive experiments as shown in Fig. 14 -17 in the Appendix, we chose 5 to be 0.5m, which is still small enough to detect real building changes while not being too oversensitive for smaller changes due to point accuracy of either point cloud type. To avoid small shifts betweenand / for similar distributed raster cells due to the binning process, we calculate the JSD three times, where / is moved by one bin in the range of 6 1, 17. The minimal JSD is set as final JSD value for the raster cell.

Class changes (CC)
After mapping the points into their appropriate raster cell, the class labels {9 , … , 9 } for all available < ∈ > classes in the dataset, which are given from a prior classification process, are utilized to calculate the class change probability. The class probability ?% | ' is defined by the class labels of all points within a raster cell given height distribution at time . The probability of class is then defined as ?% , | ' with ∑ ?% , | ' 1 . Additionally, we define the probability of the majority class , AB within a raster cell at time as ?C , AB | D as denoted by Eq. (3).

XOR Changes
The XOR class change HIJ simply checks if the majority class , AB has changed between the two points in time within a raster cell. It is defined as For building change detection, one of , AB or , AB must be from type building.

Class Probabilities
The class change method L M considers the frequency of the transition probability ?% | ' for the whole dataset (see. Eq. (5)). For each raster cell, the value in ?% | ' is used for calculation, which matches the majority class combination of , AB and , AB . This value is further noted as ?C , AB | , AB D . As class changes are rare, ?C , AB | , AB D on the minor diagonal is small. As such, it is subtracted from 1 to get the class change probability. In addition, we add a filter to isolate changes regarding buildings, since ?C , AB | , AB D per se does not restrict either , AB or , AB to belong to the building class itself. A simple function %N' filters the results on each raster cell, which returns the value of N , where N 1 ?C , AB | , AB D , if at least one building point is contained in the raster cell in either point cloud at time . If that is not the case, %N' will return 0. %N' will also return 0, if , AB , AB as there should not be any class change, if the majority class stays the same in both points in time. %N' differs from HIJ in a way that it does not require building points to be the majority in a cell, but still filters out any irrelevant raster cells from the detection. L M O1 ?C , AB | , AB DP (5)

Change Combinations
Different combination strategies are applied to evaluate the influence of the height and class probabilities of section 3.1 and 3.2 towards the change detection result. We experimented with different operations combining HC and CC and decided to use the element-wise multiplication as it yielded the best results in our experiments. Additionally, HC and CC are evaluated individually to determine their influence of the change detection results (see Eq. (6-8)).
Independent of the chosen combination method, a threshold generates a binary output from the detected changes, where all changes are set as foreground. Since the detected change values are between 0 and 1, where values closer to 1 indicate a change, any values below 0.5 are not likely to contain changes anymore. Similarly, assuming that both HC and CC are very close to 1 for all real changes, which is required to have the combined detection to be higher than 0.9, is also not very likely. Consequently, we experimented with threshold ∈ 60.5, 0.97 to figure out the best splitting point for the proposed changes.

Evaluation
The building objects are evaluated on an object-based level. As a first step, a connected component algorithm with an 8-way neighbourhood extracts the detected building objects in the detected changes and reference image, respectively. Second, a sparse transition matrix is built that connects the building IDs from the detection with those from the reference. If a building consists of several small regions in the predicted image, all building parts are matched to the corresponding reference building ID in that matrix. Third, for each building pair, the amount of true positive (TP), false positive (FP) and false negative (FN) pixels is calculated.
Finally, the F1-Score for each building is calculated separately following Eq. (9).
We exclude cells that contain no points in either point cloud from the evaluation process. As our method only considers the changes cell by cell, this would otherwise bias the evaluation metrics negatively as empty cells are not able to detect a change without any underlying data to support it. Therefore, only raster cells with at least one point in either point cloud are considered. To evaluate and compare our methods using different values for threshold , we calculate the mean F1-Score (mF1-Score) for all buildings within the dataset.

Data and Software Availability
The point cloud data and code are not available due to licence issues with the cooperating parties. The workflow underlying this paper was partially reproduced by an independent reviewer during the AGILE reproducibility review and a reproducibility report was published at https://doi.org/10.17605/OSF.IO/rsf4m.

ALS and DIM point clouds
We use three point cloud datasets from 2012 and 2016 to evaluate our proposed method. The point clouds are provided by the NMA of Mecklenburg-Vorpommern (AFGVK), Germany. All three point clouds cover an area of 15 km² in southern Rostock, Germany. The region is characterized by high and low density, urban buildings. There are two ALS point clouds, which were acquired in national aerial flight missions in the autumn of 2012 and 2016 and which have approximate point densities of 5 and 12 points/m², respectively. The absolute point accuracy is 30cm horizontally and 15cm vertically for both ALS point clouds. The image data for the third point cloud was acquired in the summer of 2016 using a Vexcel camera. The AFGVK used the software SURE to derive the DIM point cloud (Rothermel et al., 2012). The point cloud has an average point density of 96 points/m² with a horizontal and vertical absolute point accuracy of 20cm and 30cm, respectively. All point clouds have been classified into the classes' ground, building, water, non-ground and bridge using the method of (Politz et al., 2020). The classification using the original point height concluded with an overall accuracy of 92% for the ALS 2012 and with 97% for the ALS 2016 datasets, where the building class achieved a 91% and 94% F1-Score, respectively. The classification of the DIM point cloud achieved an overall F1-Score of 92% with a F1-Score for buildings of 90%.
Changes between both point clouds have been manually digitalized as reference in vector format using GIS software. A changed building is included in the reference as long as point clouds from both points in time covered most of the buildings and were actually classified as such. For evaluation purposes, the vector data is transformed into a raster format using a 1m² resolution, which is also used for the change detection. The reference data contains a total of 723 changed building objects. The amount of changed buildings depending on their ground area are shown in Fig. 2

Changes in ALS-ALS
The mF1-Scores for all buildings within ALS-ALS are listed in Fig. 3 -5.
The mF1-Scores of ℎT<U VW using yield a stable value of 49.89% (see Fig. 3). This is due to its calculation as shown in Eq. (1), which turns to a binary detection and is consequently indifferent to the changing values of . In comparison, our proposed height change method ()* exhibits a parabolic behaviour with a maximal mF1-Score of 60.54% at 0.8, which is roughly 10% higher than .
The two nearly horizontal lines for ℎT<U WW for different as depicted in Fig. 4 are not surprising as HIJ only outputs binarized values, which are indifferent to varying (see. Eq. (4)). As L M only changes its value concerning %N' and ?C , AB | , AB D, it only outputs a few differentiated values. As such, the mF1-Score is mostly indifferent towards and only changes with 0.9 as building objects with a L M smaller than that drop out of the change detection. Until ≤ 0.8 , HIJ and L M result in similar mF1-Scores for ℎT<U WW within a 1% difference.
The combined method embraces the characteristics of height and class changes as shown in Fig. 5. While the mF1-Scores of the threshold combinations stay around 52%, the JSD combinations have a decreasing pattern with increasing . As the mF1-Scores peak at 0.8 for ℎT<U VW and ℎT<U WW , it is not surprising that ℎT<U W M has its maximal mF1-Score at 0.6 as their combined detection value. Overall, the combined change detection performs more than 10% better than ℎT<U VW and by around 3% better than ℎT<U WW using ≤ 0.7 and ()* .

Comparing ALS-ALS with ALS-DIM
The mF1-Scores for all tests concerning ALS-DIM can be found in Fig. 6 -8.
In contrast to ALS-ALS, the results for ALS-DIM yield higher mF1-Scores for larger values in ℎT<U VW and ℎT<U WW . JSD still outperforms the threshold method by 5-7% for ℎT<U VW using 0.9, which is slightly smaller than the ALS-ALS results. Both variants for ℎT<U WW result for ALS-DIM in similar mF1-Scores of about 56% for ≤ 0.7, which is 12% lower than for ALS-ALS. At 0.7, the mF1-Score for L M increases to 56.28% as falsely detected objects drop out of the detection. Since ℎT<U VW and ℎT<U WW require higher thresholds to achieve their best values, the peak for ℎT<U W M at 0.7 is also higher than the one using ALS-ALS at 0.6.
Similar to the results for ALS-ALS, the combined approach for ALS-DIM with ()* achieves more than 15-20% higher mF1-Scores than with .
For the remainder of this paper, figures and tables show the results using ℎT<U W M with ()* and L M unless noted otherwise. is set to 0.6 for ALS-ALS and to 0.7 for ALS-DIM.        Fig. 9c and 9d even surpass the reference, since they contain an additional building (red circle), which was missed during the reference labelling process. Similarly, the ALS-ALS detection in Fig. 9c reveals an additional building within the blue triangle, which has been built during the summer and autumn acquisition of the DIM and ALS data, respectively. Other examples of change detection results for ALS-ALS and ALS-DIM can be found in Fig. 18 and 19 in the Appendix.
However, there are also some misjudgements in the detection results, which generate FPs and FNs. Most of these originate from the discretisation within our method or from the differences in ALS and DIM characteristics. The overall TP, FP and FN values are listed in Tab. 2. The number of raster cells, which return as FP, are 0.29% or 0.46% higher than the number of TP raster cells for the ALS-ALS and ALS-DIM dataset, respectively. Most of the FPs are covering single, isolated raster cells at the border of a roof as shown in Fig. 9 (yellow rectangles). Due to some remaining misalignments or simply due to the coarse point accuracy between the point clouds, the rasterization process sometimes splits points at the border of a roof into two different raster cells. Likewise, the 17% more FPs for ALS-DIM compared to ALS-ALS in Tab. 2 indicate, that the distinct characteristics in the two point cloud types are another reason for the high FP number (see Fig. 10). This is especially the case for vegetation. While the laser beam in ALS can penetrate the foliage and return points on the tree and the ground below, DIM point clouds mostly contain points on the treetop. Consequently, this difference in height distributions causes incorrect height changes (see Fig. 10c). In addition, the transition from building to close vegetation is often fluent due to the smoothing term in the DIM process, which causes misclassifications on the border and trigger incorrect class changes as visible in Fig. 10d. Contrarily, FNs often occur on buildings with translucent glass roofs as they have the same issue as vegetation (see Fig. 11). In ALS, the laser beam often penetrates the glass and returns a signal from the ground as well as some roof parts, while in DIM only the roof is reconstructed.

Results depending on building size
The achieved F1-Scores of our method demonstrate an increasing trend with respect to area size. This is to be expected as the evaluation takes place on a raster level. If the same number of pixels are wrongly detected, this will affect the ratio between TP, FP and FN for smaller buildings more drastically than for larger ones. The F1-Scores for the ALS-ALS and ALS-DIM detections are plotted depending on their ground area in Fig. 12 and 13, respectively. Especially the median F1-Scores for building areas less than 10m² are around 10-20% for ALS-ALS and around 30-40% for ALS-DIM lower than the groups with larger ground areas. Contrarily, the 5% percentiles remain at 30% for buildings until 30m² in ALS-ALS (see Fig. 12) and even as low as 10% until a 50m² area in ALS-DIM (see. Fig. 13). These are caused by the volatile FP, TP and FN ratios for smaller buildings, which are also mostly affected by the FPs and FNs as discussed in section 4.3. At even larger areas, the majority of the detected buildings has a F1-Score in a concise area around 80-90% for both datasets.

Discussion
The results of the proposed change detection method show median F1-Scores for ALS-ALS and ALS-DIM of more than 70% for buildings larger than 10m² (see Fig. 12 and 13), which indicates a reliable support to detect changed buildings for practical applications. However, as discussed in section 4.3 and 4.4, some issues regarding the proposed method remain. The discretisation on behalf of the rasterization and the binning for the height distributions cause FPs. Using simple raster-based post-processing or incorporating a hierarchical approach, which assumes that errors may not occur in a different resolution, could reduce small errors at the border of roofs as visible in Fig. 9, 11, 18 and 19. Equally, using exclusively the points from the first or last pulse in ALS point clouds may reduce the amount of incorrectly detected height changes caused by vegetation for ALS-DIM and ALS-ALS, respectively. Similarly, the filter %N' from Eq. (5) permits class changes for any arbitrary class in the datasets, but it is too simple to remove misclassifications as shown in Fig. 10d. In addition, the restriction , AB , AB of %N' was supposed to remove unchanged buildings from the class change result.
However, this restriction also prohibits detecting buildings, which are under construction or built-over, since one building is changing to another (see Fig. 19). A more sophisticated class change calculation such as a Bayes Network or Deep Learning may improve these shortcomings.
There are multiple strategies to enhance the proposed method and its results. As neighbouring cells should have a similar behaviour as long as they belong to the same object, enhancing the detection method with an additional object level should improve the detection integrity of a changed building as a whole. Equally, using the class information as a prior or by non-linear weighting of height and class changes, the overall detection results may improve. Finally, introducing different types of changes such as new, demolished, under construction or built-over may enhance readability and understanding of the resulting detections for practical applications.

Conclusions
We introduced a new method to detect height changes in point clouds using the Jensen-Shannon distance on a rasterized grid, which yielded superior results by 10 -20% in the mean F1-Score in all experiments compared to the state-of-the-art threshold method. Introducing simple class change methods and combining them with our height changes achieved mean F1-Scores about 71% for changes between two ALS point clouds and about 60% for changes between an ALS and DIM point cloud. Even though there are still some issues concerning incorrectly detected buildings by our method, the presented results suggest that it is able to detect changed building objects quite effectively with median F1-Score of more than 70% for any point cloud type and for all buildings with a ground area larger than 10m². This is crucial for practical approaches as it allows detecting changed building objects automatically and in an efficient manner.
In future work, the issues and improvements discussed in section 5 will be investigated. For example, incorporating a more complex class change calculation or combination approach such as using a Bayesian network or by utilizing a Deep Learning model as nonlinear weighting method may solves the discussed weighting issues, improves the building detection, and replaces the need to set parameters manually.  Similarly, an object-based approach using a Conditional Random Field or Deep Learning may improve the results as neighbouring raster cells are likely to behave similarly. Finally, more sophisticated methods will be explored to cope with class changes, data gaps in general or more complex building change situations.