-
-
Notifications
You must be signed in to change notification settings - Fork 11
/
Copy pathengine2.qmd
307 lines (236 loc) · 14.8 KB
/
engine2.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
```{r,echo=FALSE,message=FALSE,warning=FALSE}
library(lidR)
library(stars)
r3dDefaults = rgl::r3dDefaults
m = structure(c(0.921, -0.146, 0.362, 0, 0.386, 0.482, -0.787, 0,
-0.06, 0.864, 0.5, 0, 0, 0, 0, 1), .Dim = c(4L, 4L))
r3dDefaults$FOV = 50
r3dDefaults$userMatrix = m
r3dDefaults$zoom = 0.75
knitr::opts_chunk$set(
comment = "#>",
collapse = TRUE,
fig.align = "center")
rgl::setupKnitr(autoprint = TRUE)
```
# LAScatalog processing engine (2/2) {#sec-engine2}
@sec-engine showed how to use the `LAScatalog` processing engine to apply `lidR` functions on a collection of files. This included how to process acquisitions in user defined chunk sizes, loading buffers on-the-fly, saving the output to disk, and parallelizing tasks. These examples all used existing functions provided with the `lidR` package. The power of the engine goes beyond limited used cases, allowing developers to design their own applications.
Let's imagine we want to normalize a data set **and** colorize it using RGB aerial data. According to the previous sections we could do something like this:
``` r
opt_output_files(ctg) <- paste0(tempdir(), "/{*}_normalized") # specify output normalized las disk save location
ctg_norm <- normalize_height(ctg, tin()) # normalize
# loop through catalog files and apply rgb colour
rgb <- terra::rast("rgb.tif")
for (file in ctg$filename)
{
newname <- tools::file_path_sans_ext(basename(file))
newname <- paste0("folder/", newname, "_colored.laz")
las <- readLAS(file)
las <- merge_spatial(las, rgb)
writeLAS(las, newname)
}
```
This method is however far from optimal. First, **the whole** point cloud is read two times:
1. To normalize
2. To colorize
This takes a significant amount of time. It also implies the need to create copies of the acquisition after each processing step, taking up a lot of disk storage. Then, to finish, a custom `for` loop is used because `merge_spatial()` cannot be applied on a `LAScatalog` natively.
**What if we were able to do that in a single run?**
## `catalog_apply()` {#sec-engine-catalog-apply}
In this section we outline the `catalog_apply()` function, which is used subliminally in every function seen in the @sec-engine. Here we show how to leverage its versatility to design tools that do not yet exist, like the normalization + colorization example from above, but also more innovative processes.
Again, the `LAScatalog` class and the `LAScatalog` engine are deeply documented in a two dedicated vignettes available [here](https://cran.r-project.org/web/packages/lidR/vignettes/lidR-LAScatalog-class.html) and [here](https://cran.r-project.org/web/packages/lidR/vignettes/lidR-LAScatalog-engine.html). The purpose of these examples is to provide alternative, hands-on documentation with more applied examples and real use cases.
In this section we will use the same 9 files used in @sec-engine and an RGB raster that encompasses the collection of files.
```{r plot-ctg-engine2, fig.height=4, fig.width=4}
ctg <- readLAScatalog("data/ENGINE/catalog/")
plot(ctg)
```
```{r, echo=FALSE}
opt_progress(ctg) <- FALSE
```
The RGB image is available for download [here](data/ENGINE/catalog/rgb.tif).
```{r, echo=FALSE, eval=FALSE}
url="https://servicesmatriciels.mern.gouv.qc.ca:443/erdas-iws/ogc/wmts/Inventaire_Ecoforestier/Inventaire_Ecoforestier/default/GoogleMapsCompatibleExt2:epsg:3857/{z}/{y}/{x}.jpg"
mffp = list(src="Aerial Photography", q=url, sub=NA, cit='')
bbox <- sf::st_buffer(sf::st_as_sfc(sf::st_bbox(ctg)), 300)
bbox <- sf::st_bbox(sf::st_transform(bbox, "EPSG:4326"))
rgb <- maptiles::get_tiles(bbox, crop = TRUE, provider = mffp, zoom = 16, verbose = T)
rgb <- terra::project(rgb, st_crs(ctg)$wkt)
terra::plotRGB(rgb)
plot(ctg, add = T)
bb = ext(ctg)
bb = bb + 100
rgb = terra::crop(rgb, bb)
terra::plotRGB(rgb)
plot(ctg, add = T)
terra::writeRaster(rgb, "data/ENGINE/catalog/rgb.tif", overwrite=TRUE)
```
```{r plot-rgbmap, fig.height=8, fig.width=8, warning=FALSE}
rgbmap <- terra::rast("data/ENGINE/catalog/rgb.tif")
terra::plotRGB(rgbmap)
```
## Create user-defined function {#sec-engine-user-function-generic}
First we need to create a function that combines the normalization + colorization steps. Let's call it `norm_color()`. It's simple and uses only 2 functions.
```{r}
norm_color <- function(las, rgbmap) # create user-defined function
{
nlas <- normalize_height(las, tin()) # normalize
colorized <- merge_spatial(nlas, rgbmap) # colorize
return(colorized) # output
}
```
Let's try it on a sample plot.
```{r plot-las-rgb-colored, rgl = TRUE}
las <- clip_circle(ctg, 338800, 5238500, 40)
nlasrgb <- norm_color(las, rgbmap) # apply user defined function
plot(nlasrgb, color = "RGB", bg = "white", size = 6) # some plotting
```
It works! We have a workable function that performs on a point cloud. Remember that `norm_color()` is just an example. You could instead choose to make an algorithm for individual tree detection or a method to sample point of interest, *let your imagination run wild!*
So far it works only when using a `LAS` object. Now let's make it working with a `LAScatalog`.
## Create an intermediate function for `catalog_apply()` {#sec-engine-user-function-extended}
The simple way is:
``` r
new_ctg <- catalog_map(ctg, norm_color, rgbmap = rgbmap)
```
However, here we will learn the hard way in order to understand better how it works internally and understand the full potential of `catalog_apply()` for which `catalog_map()` is only a simplified version.
The core engine is the function `catalog_apply()` but this function does not work with any user-defined function. User-defined functions must respect a specific template (take a look at the documentation in R) and is expected to perform some specific tasks. First, the primary variable must be a chunk from the catalog (see `chunk` below), the function must read the chunk, check that it is not empty, then perform the computation.
A valid function is the following:
```{r}
norm_color_chunk <- function(chunk, rgbmap) # user defined function
{
las <- readLAS(chunk) # read the chunk
if (is.empty(las)) return(NULL) # check if it actually contain points
nlasrgb <- norm_color(las, rgbmap) # apply computation of interest
return(nlasrgb) # output
}
```
This introduces an additional level of complexity that is crucial. First, catalog chunks will be sequentially fed into the user's function. Each chunk is read inside the user-defined function with all processing options being automatically respected (`filter`, `select`, `chunk_size`, `chunk_buffer` etc.). The `las` variable in the code snippet above is a point cloud extracted from the catalog that contains the chunks + a buffer. It's important to check that the loaded portion of the collection is not empty or subsequent code will likely fail. This may happen depending on the size of the chunk and the `filter` options chosen.
This function is now `catalog_apply()` compatible and can be applied over the entire ALS acquisition. The output will be a point cloud, so we need to pay attention mitigate memory issues and save to disk. At this stage, we haven't mitigated that problem yet so we add it below.
```{r, echo=FALSE}
# I'm adding that just to compute faster in the example where we do not look at the point cloud anyway
opt_filter(ctg) <- "-keep_class 2"
```
```{r}
opt_output_files(ctg) <- paste0(tempdir(), "/{*}_norm_rgb") # write to disk
output <- catalog_apply(ctg, norm_color_chunk, rgbmap = rgbmap) # implement user-defined function using catalog_apply
str(output)
```
We can see that the output is a `list` with the name of each file written to disk. This is the default behavior of the engine. It returns a `list` with one element per chunk. This isn't really convenient. We have seen in @sec-engine that in the case of `LAS` files we can return a `LAScatalog`, which is far more convenient.
We can use the option `automerge = TRUE` so `catalog_apply()` will automatically merge the list into something more user-friendly.
```{r plot-ctg-colored-overlaps, fig.height=4, fig.width=4}
opt_output_files(ctg) <- paste0(tempdir(), "/{*}_norm_rgb")
options <- list(automerge = TRUE) # merge all the outputs
output <- catalog_apply(ctg, norm_color_chunk, rgbmap = rgbmap, .options = options)
output
```
We now have a `LAscatalog`. It is much better but:
```{r}
plot(output)
```
We still have a problem here. The files are overlapping because we read each file with a buffer and then the outputs have been written **with their buffer**. This is bad practice. It's important to always remove the buffer after the computation. In this specific case, the output is a `LAS` file. When `readLAS()` reads a chunk from the catalog the points in the buffer are flagged so they can be easily manipulated. Since they are flagged, we can easily remove these points by adding this step to our user-defined function. It is always the role of the user to remove the buffer of the output.
```{r plot-ctg-colored, fig.height=4, fig.width=4}
norm_color_chunk <- function(chunk, rgbmap)
{
las <- readLAS(chunk)
if (is.empty(las)) return(NULL)
nlasrgb <- norm_color(las, rgbmap)
nlasrgb <- filter_poi(nlasrgb, buffer == 0) # remove buffer
return(nlasrgb)
}
opt_output_files(ctg) <- paste0(tempdir(), "/{*}_norm_rgb")
options <- list(automerge = TRUE)
output <- catalog_apply(ctg, norm_color_chunk, rgbmap = rgbmap, .options = options)
plot(output)
```
When the outputs are spatial vectors or spatial rasters they can be cropped using the bounding box of the chunk accessible via `extent(chunk)` or `bbox(chunk)` or `st_bbox(chunk)` or `ext(chunk)` that match respectively with `raster`, `sp`, `sf/stars` and `terra`.
Nice! We created a custom function -- `norm_color()` -- and we upscale its definition to capably and efficiently process an entire ALS acquisition made up of hundreds of files. Using options described in @sec-engine, we can also introduce parallelization, chunk size control, and buffer size control.
``` r
library(future)
plan(multisession)
opt_filter(ctg) <- "-keep_class 2"
opt_chunk_size(ctg) <- 300
opt_chunk_buffer(ctg) <- 40
opt_output_files(ctg) <- paste0(tempdir(), "/{*}_norm_rgb")
options <- list(automerge = TRUE)
output <- catalog_apply(ctg, norm_color_chunk, rgbmap = rgbmap, .options = options)
```
We can check that it worked by loading a sample from somewhere in our new catalog. According to the options put above, we are expecting to get a normalized (we normalized), colored (we merged RGB image), ground points (we used `-keep_class 2`).
```{r plot-cliped-gnd-colored, rgl = TRUE}
opt_output_files(output) <- ""
las <- clip_circle(output, 338800, 5238500, 60)
plot(las, color = "RGB", size = 6, bg = "white")
```
## Make a user-friendly function for third party users {#sec-engine-user-function-lascatalog}
When designing tools that will be used by third parties, we would like to hide `catalog_apply()` and intermediate functions inside a more high level and user-friendly function similarly to `lidR` functions that work the same both with a `LAS` or a `LAScatalog`. Moreover we would like to make the function foolproof for users. To do this we can create wrap our codes if `if-else` statements to apply to correct code to the correct object.
```{r}
# Create a generic function
norm_color <- function(las, rgbmap)
{
if (is(las, "LAScatalog")) {
options <- list(automerge = TRUE)
output <- catalog_apply(ctg, norm_color, rgbmap = rgbmap, .options = options)
return(output)
} else if (is(las, "LAScluster")) {
x <- readLAS(las)
if (is.empty(x)) return(NULL)
nlasrgb <- norm_color(x, rgbmap)
nlasrgb <- filter_poi(nlasrgb, buffer == 0)
return(nlasrgb)
} else if (is(las, "LAS")) {
nlas <- normalize_height(las, tin())
colorized <- merge_spatial(nlas, rgbmap)
return(colorized)
} else {
stop("Not supported input")
}
}
```
We now have a single function that can be used seamlessly on a `LAS` or a `LAScatalog` object.
``` r
las_norm_colored <- norm_color(las, rgbmap)
opt_output_files(ctg) <- paste0(tempdir(), "/{*}_norm_rgb")
ctg_norm_colored <- norm_color(ctg, rgbmap)
```
The different steps to apply to a `LAScluster` (i.e. `readLAS` + remove buffer of the output) are, in this case, very common. It is possible to use `catalog_map()` instead, that is capable of doing this tasks for you. But in some cases the output may be not supported, because the cropping step is more complex. In those cases, `catalog_apply()` must be used because it allows for a more accurate control.
```{r}
# Create a generic function
norm_color <- function(las, rgbmap)
{
if (is(las, "LAScatalog")) {
output <- catalog_map(ctg, norm_color, rgbmap = rgbmap, .options = options)
return(output)
} else if (is(las, "LAS")) {
nlas <- normalize_height(las, tin())
colorized <- merge_spatial(nlas, rgbmap)
return(colorized)
} else {
stop("Not supported input")
}
}
```
## Make a safe function for third party users {#sec-engine-user-function-safe}
At this stage our function is almost finished. However it is not foolproof. What if the user of this new function does not provide any output path template? The point cloud in each chunk will be loaded in memory and will be retained in memory until it eventually becomes full and R crashes. We must prevent the use of the function if no path to the disk is given.
Also, the computation of a DTM requires a buffer. If the user sets a 0 m buffer the output of this function will be incorrect. We must prevent the use of the function if a buffer of 0 is given. These two cases are covered by the engine with the options `need_output_file = TRUE` and `need_buffer = TRUE`. To finish we would like to disable the ability to tune the `select` option to ensure no attribute is lost.
```{r}
norm_color <- function(las, rgbmap)
{
if (is(las, "LAScatalog")) {
opt_select(las) <- "*" # disable select tuning
options <- list(automerge = TRUE, need_output_file = TRUE, need_buffer = TRUE) # require output path & buffer size
output <- catalog_map(ctg, norm_color, rgbmap = rgbmap, .options = options)
return(output)
} else if (is(las, "LAS")) {
nlas <- normalize_height(las, tin())
colorized <- merge_spatial(nlas, rgbmap)
return(colorized)
} else {
stop("Not supported input")
}
}
```
We can test what happens if we use this function naively.
```{r, error = TRUE}
opt_output_files(ctg) <- ""
ctg_norm_colored <- norm_color(ctg, rgbmap)
```
It failed with an informative message!
We are done. This is exactly how the `lidR` package works and we have presented all the tools needed to extend it with new applications.
More examples can be found further along in this book. For example, @sec-outbox-distance-returns presents how to compute a raster of the average distance between first and last returns, and @sec-outbox-rumple-index presents how to compute a rumple index of the canopy from the point cloud and, of course, the documentation of the package itself is more comprehensive than this book.