Scripting using a text editor

As I mentioned in the first section, you can use a code editor to string together a series of lastools commands into a processing workflow. I have prepared a simple, standardized lidar workflow that I use often and modify to fit my project needs. I've pasted that script in the Example Script section.

Starting your own script

Lastools command-line code is executed using .bat files. These are executable files that are run by double clicking on them. To make your own .bat file, simply open a new document in Notepad++ and save it as example.bat. You will notice that the saved version of the file has an icon with 2 gears on it.

.bat files are scripted in the Microsoft DOS (batch) language, though we need only a very basic understanding of a few commands to make things work.
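For reference, a brand-new example.bat does not need to contain anything fancy. A minimal sketch might look like this:

:: example.bat - a minimal batch file that prints a message and waits for a key press
echo Hello from my first batch script
pause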

If you copy and paste code into the file you will also notice that Notepad++ automatically colour codes the contents.

Setting global variables and folder hierarchy

The first step in processing your data should be specifying where the initial input data are located, and where you want your output directory to be located. See the following piece of the example script.

:: Set global environments
set f_in=D:\lasin
set f_out=D:\lasout

set cores=4

In this script we use the set command, which assigns a value to a variable. In this case we are assigning the path D:\lasin to the variable f_in. Consider this the same as f_in <- "D:\lasin" in R. We do the same with f_out. We also set cores = 4, which we will use for multi-core processing via the -cores switch.

In the DOS language, whenever we want to call a variable that has been defined using set we need to use the %<VARIABLE>% format. This saves us from having to write the file path every time (which we will do often). You’ll also notice that :: denotes a code comment.
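A quick way to convince yourself that the variables are set correctly is to echo them back to the console (a trivial sketch):

:: print the variable values to the console
echo Input folder:  %f_in%
echo Output folder: %f_out%
echo Cores:         %cores%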

Output folder hierarchy

Now that we have defined our root output folder %f_out%, we should think about what we actually want to do with our lidar files - i.e. how we are going to process them and what outputs we will make. This is important to think about because we want to produce an organized directory of processed data.

In the example script we do the following:

  1. Determine information about each file - lasinfo
  2. Tile data - lastile
  3. Optimize lidar data - lasoptimize
  4. Compute boundary of lidar files - lasboundary - CAN BE VERY SLOW
  5. Filter data for noise - lasnoise
  6. Classify data - lasground
  7. Normalize data - lasheight
  8. Make DTM and DSM - blast2dem & lasgrid
  9. Make CHM - lasgrid
  10. Make an image of point density - lasgrid

See the following piece of code:

:: processing stream
:: if a folder doesn't already exist -> make that folder
IF NOT EXIST %f_out% MKDIR %f_out%
IF NOT EXIST %f_out%\~_reports MKDIR %f_out%\~_reports
IF NOT EXIST %f_out%\01_tile MKDIR %f_out%\01_tile
IF NOT EXIST %f_out%\02_optimized MKDIR %f_out%\02_optimized
IF NOT EXIST %f_out%\03_class MKDIR %f_out%\03_class
IF NOT EXIST %f_out%\04_norm MKDIR %f_out%\04_norm
IF NOT EXIST %f_out%\05_raster\dtm MKDIR %f_out%\05_raster\dtm
IF NOT EXIST %f_out%\05_raster\dsm MKDIR %f_out%\05_raster\dsm
IF NOT EXIST %f_out%\05_raster\chm MKDIR %f_out%\05_raster\chm
IF NOT EXIST %f_out%\05_raster\den MKDIR %f_out%\05_raster\den

IF NOT EXIST checks whether the folder that follows has already been created; MKDIR makes the specified folder.

This means that if the folder defined by %f_out% doesn't exist, IF NOT EXIST %f_out% MKDIR %f_out% will make our root output directory.

This also works for sub-directories. When I process data I want each stage of processing outputs to be located in its own appropriately named folder within the root directory. This means that if I am tiling the data I want the outputs in a folder whose name contains tile, normalized data in norm, and so on.

Running the script for the first time

Now that we understand the folder hierarchy that will be created we can start to process some data. Before we do that however, paste the word pause into the example script after the IF NOT EXIST block.

The pause command tells the program to… pause processing. This makes it easy for us to stop processing at particular points to check whether the outputs are correct. You can resume processing by pressing any key in the command prompt window. The code will run until the next pause command, until the script is finished, or until it hits an error.
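For example, with pause added after the folder-creation block, the end of that part of the script might look like this:

IF NOT EXIST %f_out%\05_raster\den MKDIR %f_out%\05_raster\den

:: stop here so we can check that the folders were created
pause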

After entering the pause command, save your example script and double click on the batch file. The command prompt will open, you will see the executed code in the window, and at the bottom you will see Press any key to continue . . . with a blinking cursor.

If this is what you see, great (if not, go back and check your script). Navigate to the output folder location you specified with %f_out% and see whether the folders you specified have been created (they should all be there!).

1. lasinfo

Now that your output folders are created you are ready to start processing your data using the lastools commands. The first command we see is lasinfo. As we said earlier, we see a number of switches denoted with -, but we also see ^, which is the line-continuation character in batch scripts (loosely comparable to how %>% from the tidyverse lets you spread R code over several lines). You need to use ^ if you want to split your code across multiple lines.

::Info about data
lasinfo -i %f_in%\*.las ^
    -cd ^
    -stdout ^
    -odir %f_out%\~_reports ^
    -otxt ^
    -cores %cores% 

::The code above is the same as the code below
lasinfo -i %f_in%\*.las -cd -stdout -odir %f_out%\~_reports -otxt -cores %cores% 
    

In this command line example we do the following:

  1. Specify we want to use the lasinfo program
  2. Specify the input data location: -i %f_in%\*.las
  3. -cd specifies that we want to compute the point density of the files
  4. -stdout specifies that the report is printed to standard output (the console)
  5. Specify the output directory -odir as %f_out%\~_reports
  6. Specify that we want these outputs to be text files -otxt
  7. Specify the number of cores we want to use -cores %cores%

This is the first time we encounter the *.las phrasing. The * is a wild-card character, which means that all files with a .las extension will be included in processing. In other words, we are processing all .las files within the specified folder.

Now that we know what this code does, let's try running it. Again, put the pause command after the code in your example, then save and double click. You should see the script processing your files, and a text file named after each input file should be written to the %f_out%\~_reports folder.

After it runs and you're satisfied with the output, remove the pause command and highlight the lasinfo code. Press Ctrl-Q to comment it out so it is not run again later.

2. lastile

We made some summary reports about our data using lasinfo. Now let's start manipulating the lidar data itself. We will start with tiling the data - a very important step for efficient processing in later stages.

::Tile 500m - flag buffered points as withheld for simple drop later
lastile -i %f_in%\*.las ^
    -tile_size 500 ^
    -buffer 10 ^
    -flag_as_withheld ^
    -odir %f_out%\01_tile ^
    -olaz ^
    -cores %cores%

In this command line example we do the following:

  1. Specify we want to use the lastile program
  2. Specify the input data location: -i %f_in%\*.las
  3. -tile_size 500 specifies that we want to tile our data into 500 m tiles
  4. -buffer 10 specifies that we want each 500 m tile to include a 10 m overlap with adjacent tiles
  5. -flag_as_withheld specifies that we want all buffer points to be flagged as withheld in the point records
  6. -odir %f_out%\01_tile specifies the output folder location. This becomes the input for the next step.
  7. -olaz specifies that we want these outputs to be .laz files
  8. Specify the number of cores we want to use -cores %cores%

Why tile?

Tiling is very important for a number of reasons. Firstly, it means we can subdivide our data into smaller sections so that processing is less memory intensive. Lidar data files can be fairly large, so it's good to reduce their size by tiling. I tend to think that files up to ~80 MB are good for processing, though if you have a number of files larger than 100 MB it's not a big deal. File size isn't really a rule I follow, but it is something to be mindful of when processing.

Why buffer?

We specified that we wanted a tile buffer of 10 m. The buffer is especially important when processing in lastools when making rasterized outputs like the DEM, DSM, CHM etc. The buffer creates overlap between adjacent tiles, which means that we can make seamless rasters. If we do not buffer the tiles before creating these rasters we often see artifacts in a grid across the image outlining the tile extents.

lidR vs lastools — buffering is a fundamental difference in processing methodology

You might be asking:

“OK, so I’ve tiled and buffered my data. Now I can process it in lidR if I want to, right?”

The short answer is yes. But it's important to know that the lidR package automatically buffers your data for you when working with a LAScatalog. That means that if you buffer your data in lastools there will be a lot of duplicate points that need to be removed. This is why we use the -flag_as_withheld switch.

When/If you decide to use lidR after lastools, you need to make sure to include the -drop_withheld switch like so: lidR::readLAS(<DATA>,filter="-drop_withheld") or lidR::catalog(<DATA>,filter="-drop_withheld").

These lidR calls cause the flagged buffer points to be ignored, eliminating the redundancy.

To cut to the chase, buffer your data but use -flag_as_withheld so you can easily remove the buffer if you want to process in lidR.

Before running the code enter a pause command on a new line. After it runs and you're satisfied with the output, remove the pause command and highlight your code. Press Ctrl-Q to comment it out so it is not run again later.
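As an optional aside, if you would rather strip the buffer points out on the lastools side instead of filtering them in lidR, something like the following las2las call should do it. This is just a sketch - the 01_tile_unbuffered folder name is my own example and not part of the standard script:

:: drop the withheld (buffer) points and write unbuffered copies of the tiles
IF NOT EXIST %f_out%\01_tile_unbuffered MKDIR %f_out%\01_tile_unbuffered
las2las -i %f_out%\01_tile\*.laz ^
    -drop_withheld ^
    -odir %f_out%\01_tile_unbuffered ^
    -olaz ^
    -cores %cores%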

3. lasoptimize

This command is fairly new to lastools. It optimizes the data to make it more storage efficient.

:: Optimize las
lasoptimize -i %f_out%\01_tile\*.laz ^
    -odir %f_out%\02_optimized ^
    -cores %cores%

These “optimized” .laz data become the input data for the next step.

Before running the code enter a pause command on a new line. After it runs and you're satisfied with the output, remove the pause command and highlight your code. Press Ctrl-Q to comment it out so it is not run again later.

4. lasboundary

You will immediately notice that the next line of code is commented out. This is largely because lasboundary can be exceptionally slow. The program detects the extent of the data within each lidar file specified by -i and constructs a boundary shapefile.

::Generate boundary shapefile
REM lasboundary -i %f_out%\02_optimized\*.laz -drop_withheld -merged -o %f_out%\boundary.shp

The code does the following

  1. sets input
  2. drops withheld points (buffered points)
  3. -merged specifies that the output will be a merged file
  4. -o %f_out%\boundary.shp specifies that the output will be a shapefile called boundary.shp

Before running the code enter a pause command on a new line. After it runs and you're satisfied with the output, remove the pause command and highlight your code. Press Ctrl-Q to comment it out so it is not run again later.

Generating a boundary shapefile using lidR

If you need a very exact outline of your data you can use the code above, but if a more generalized extent of your data will do just fine, you can use the spatial delineation generated from LAScatalog objects in lidR. See the example R code below.

## processing to get a rough extent of your lidar coverage
library(lidR)
library(sf)

ctg <- readLAScatalog("PATH_TO_FILES") #read catalog
cs <- as.spatial(ctg) #convert to spatial polygon dataframe of tile extents

plot(cs) #plot shapes
st_write(st_as_sf(cs), "boundary.shp") #write shapefile (st_as_sf/st_write come from sf, loaded above)


5. lasnoise

Filtering for noise points is standard practice with lidar data. We want to measure height accurately, and sometimes the sensor records noise points from atmospheric interactions, birds flying, flying saucers etc. We want to remove these noise points to make sure our data are quality controlled.

::Denoise
lasnoise -i %f_out%\02_optimized\*.laz ^
    -step 2 ^
    -isolated 3 ^
    -odix _denoised ^
    -olaz ^
    -cores %cores% 

To filter noise I do the following:

  1. Set input directory
  2. -step 2 -isolated 3 specify the size of the 3D search cells (2 m) and the isolation threshold: points whose cell neighbourhood contains 3 or fewer points are classified as noise
  3. -odix _denoised appends "_denoised" to each file name
  4. -olaz specify output as .laz
  5. Specify the number of cores we want to use -cores %cores%

Before running the code enter a pause command on a new line. After it runs and you're satisfied with the output, remove the pause command and highlight your code. Press Ctrl-Q to comment it out so it is not run again later.

6. lasground

A fundamental processing step. lasground classifies each point within the files as ground (class 2) or not. This is very important for the subsequent height normalization and for creating a terrain model.

::Classify ground
lasground -i %f_out%\02_optimized\*_denoised.laz ^
    -odir %f_out%\03_class ^
    -olaz ^
    -cores %cores%  

This is pretty straight forward:

  1. Set input
  2. Set output location
  3. specify .laz output
  4. specify number of cores to use

Before running the code enter a pause command on a new line. After it runs and you're satisfied with the output, remove the pause command and highlight your code. Press Ctrl-Q to comment it out so it is not run again later.

7. lasheight

In order for us to generate wall-to-wall structural metrics from our lidar data we need to remove the influence of terrain. This is known as height normalization, where, simply described, all points classified as ground in the previous step are set to a height of 0 and every other point becomes its height above that ground surface, so that all points are relative to a flat surface. This means that we can compare the height of vegetation on the top of a mountain to a tree in a deep valley without the influence of terrain.

::Normalize height
lasheight -i %f_out%\03_class\*_denoised.laz ^
    -replace_z ^
    -odir %f_out%\04_norm ^
    -olaz ^
    -cores %cores%

Here we:

  1. Set input
  2. -replace_z - specifies that each point's Z value is replaced by its height above the ground surface (so ground points become 0)
  3. Set output directory
  4. Specify .laz output
  5. Specify number of cores

Before running the code enter a pause command on a new line. After it runs and you're satisfied with the output, remove the pause command and highlight your code. Press Ctrl-Q to comment it out so it is not run again later.

Rasters

There are a number of standard raster products generated from lidar data. The most common ones include a digital terrain model, digital surface model, and canopy height model. Lets make those now.

Digital Terrain Model & Surface model

Creation of the terrain and surface models both use the non-normalized, classified point cloud. To start, we use the blast2dem program to create rasters of each file. The program las2dem also exists, however blast2dem was made to process very large datasets extremely efficiently.

We specify the points we want to use based on classification code (ground = 2), and the spatial resolution we want our raster to be (in this case I used 1 m).

:: make DTM
blast2dem -i %f_out%\03_class\*_denoised.laz ^
    -keep_class 2 ^
    -step 1 ^
    -use_tile_bb ^
    -odix _dtm ^
    -obil ^
    -odir %f_out%\05_raster\dtm ^
    -kill 200 ^
    -cores %cores%

Here we:

  1. Specify input data (denoised, classified data)
  2. -keep_class 2 specify that we want to use only the points classified as ground (class 2)
  3. -step 1 specify we want a 1 m spatial resolution on outputs
  4. -use_tile_bb specify that the algorithm should use the bounding box of individual tiles for outputs
  5. -odix _dtm appends "_dtm" to outputs
  6. -obil specify we want output files to be .bil (a raster format that is very storage efficient)
  7. specify output directory
  8. -kill 200 specify max distance for interpolation (200 m).
  9. Specify number of cores

Before running the code enter a pause command on a new line. After it runs and you're satisfied with the output, remove the pause command and highlight your code. Press Ctrl-Q to comment it out so it is not run again later.

After this runs, you will have a folder full of .bil files for each individual lidar file. While this is great, ideally we want the files to be merged together so we have a seamless DTM. To do that we use lasgrid to merge all the .bil files together.

::merge DTM .bil files
lasgrid -i %f_out%\05_raster\dtm\*.bil -merged ^
    -step 1 ^
    -o %f_out%\05_raster\DTM_.tif

Here we:

  1. Specify the input folder with .bil files
  2. Specify that all those .bil files need to be merged together
  3. Specify a single output file named DTM_.tif

Note that we output a single file using -o and multiple files using -odir

Before running the code enter a pause command on a new line. After it runs and you're satisfied with the output, remove the pause command and highlight your code. Press Ctrl-Q to comment it out so it is not run again later.

After this finishes, open the file up in ArcMap, or in an image viewer like IrfanView, and see what it looks like. Instead of .tif you can also choose other output file types like .png or .bil.

Digital Surface model

The code for making the digital surface model is very similar to that for the terrain model, so I paste it here with few details. The only real difference is that we do not use -keep_class, which defaults the point selection to the maximum.

:: make DSM
blast2dem -i %f_out%\03_class\*_denoised.laz ^
    -step 1 ^
    -kill 200 ^
    -use_tile_bb ^
    -odix _dsm  ^
    -obil ^
    -odir %f_out%\05_raster\dsm ^
    -cores %cores%

::merge DSM .bil files
lasgrid -i %f_out%\05_raster\dsm\*.bil -merged ^
    -step 1 ^
    -o %f_out%\05_raster\DSM_.tif

Before running the code enter a pause command on a new line. After it runs and you're satisfied with the output, remove the pause command and highlight your code. Press Ctrl-Q to comment it out so it is not run again later.

Canopy Height Model

The code for making a canopy height model is very similar to that of the DTM and DSM. The major difference is that we need to use the normalized point cloud data. The CHM is a rasterization of the maximum height of the normalized point cloud, so we need to specify that we want the max height. It is also common practice to use only the first returns of the data, so we use a switch to specify that as well.

:: make CHM
lasgrid -i %f_out%\04_norm\*.laz ^
    -max ^
    -first_only ^
    -step 1 ^
    -odir %f_out%\05_raster\chm ^
    -odix _chm ^
    -obil ^
    -cores %cores%
    
lasgrid -i %f_out%\05_raster\chm\*.bil -merged ^
    -step 1 ^
    -o %f_out%\05_raster\CHM_.tif

Here we:

  1. Specify normalized input data
  2. -max specify that we want to rasterize the maximum height of the data
  3. -first_only specify that we want to use only first returns
  4. -step 1 specify that we want a spatial resolution of 1 m for our rasters
  5. Specify output directory
  6. -odix _chm append "_chm" to file names
  7. -obil specify output as .bil
  8. Specify number of cores to use
  9. Merge all created .bil files into a seamless raster with 1 m spatial resolution called CHM_.tif

Before running the code enter a pause command on a new line. After it runs and you're satisfied with the output, remove the pause command and highlight your code. Press Ctrl-Q to comment it out so it is not run again later.

Point Density Image

Using the details we learned from making the DTM, DSM, and CHM, we can also make rasters of other metrics like point density or structural metrics. I don't describe these in detail as we have covered a lot of this already, but run the code and see what the outputs look like.

:: Compute density image
lasgrid -i %f_out%\04_norm\*.laz ^
    -step 1 ^
    -point_density_16bit ^
    -odir %f_out%\05_raster\den ^
    -odix _den  -obil ^
    -cores %cores%

:: tif
lasgrid -i %f_out%\05_raster\den\*.bil -merged ^
    -step 1 ^
    -o %f_out%\05_raster\density_allreturns_.tif

:: png      
lasgrid -i %f_out%\05_raster\den\*.bil -merged ^
    -step 1 ^
    -false -set_min_max 0 5 ^
    -o %f_out%\05_raster\density_allreturns_0_5_.png

Computing structural metrics in lastools is straightforward. We use the lascanopy program and specify the percentiles we want to calculate using the -p switch. Calculating other metrics is possible, just consult the README.

:: Compute structural metrics
lascanopy -i %f_out%\04_norm\*.laz ^
    -merged ^
    -p 10 25 50 75 99 ^
    -step 20 ^
    -o %f_out%\05_raster\struc.tif