Using R and Python to get forest resource data through the EVALIDator API

October 25, 2023
forest inventory forest inventory and analysis EVALIDator analytics

If you work with forestry data across the United States, you’ve probably had to turn to Forest Inventory and Analysis (FIA) data at one time or another to look at the status and trends in our forest resources. The FIA database has information contained from over 130,000 forest inventory plots across the country.

One of the tools that analysts often use to access FIA data is the the EVALIDator web interface. EVALIDator is a web-based Application Programming Interface (API) that generates population estimates for core forestry metrics. It provides the ability to query FIA information at the state level or a circular retrieval of information around a fixed geographic point.

There was an excellent webinar offered in August hosted by the FIA National User Group that highlighted some new features available through EVALIDator. For a long while, analysts have used it’s web interface to point and click their way to estimating forest resources after selecting the forestry metrics and geographic scale they’re interested in. As a long-time EVALIDator user, this was a great way to access FIA data. But dealing with the output in an HTML format and getting it into different software to analyze was always an additional step.

I was happy to learn in the webinar about new ways to directly access FIA estimates through R and Python using the EVALIDator API. For me, this can save time and effort given most of my analyses, graphing, and reporting are done through R.

All of the documentation for these steps can be found in the FIADB-API & EVALIDator user documentation. I’ll go through a few examples where I access data from the state of Maine.

Using R to access FIA data at the state level

This first example will access the total amount of carbon stored in the live aboveground portions of trees growing on forestland in Maine, measured in metric tonnes. I’ll ask for estimates to be provided by forest type group and ownership.

First, I’ll use R and will load a few packages to access the data:

library(tidyverse)
library(httr)
library(jsonlite)
library(rlist)
library(knitr)

If you look at the bottom of the documentation webpage, you’ll see example use cases for extracting FIA estimates using R and Python. There are two examples for each that use GET and POST scripts. The GET scripts allow you to enter a complete URL if you know the attributes you’re interested in. The POST script, which I copy here with the fiadb_api_POST() function, allows you to directly input the variables you’re interested in in R:

# fiadb_api_POST() will accept a FIADB-API full report URL and return data frames
# See descriptor: https://apps.fs.usda.gov/fiadb-api/

fiadb_api_POST <- function(argList){
  # make request
  resp <- POST(url = "https://apps.fs.usda.gov/fiadb-api/fullreport", 
               body = argList, 
               encode = "form")
  # parse response from JSON to R list
  respObj <- content(resp, "parsed", encoding = "ISO-8859-1")
  # create empty output list
  outputList <- list()
  # add estimates data frame to output list
  outputList[['estimates']] <- as.data.frame(do.call(rbind,respObj$estimates))

  # if estimate includes totals and subtotals, add those data frames to output list
  if ('subtotals' %in% names(respObj)){
    subtotals <- list()
    # one subtotal data frame for each grouping variable
    for (i in names(respObj$subtotals)){
      subtotals[[i]] <- as.data.frame(do.call(rbind,respObj$subtotals[[i]]))
    }
    outputList[['subtotals']] <- subtotals

    # totals data frame
    outputList[['totals']] <- as.data.frame(do.call(rbind,respObj$totals))
  }

  # add estimate metadata
  outputList[['metadata']] <- respObj$metadata

  return(outputList)
}

The first item to know before accessing FIA data is the numeric code corresponding to the variable you’re interested in. In my case, this is snum = 98, corresponding to the variable that represents “Forest carbon pool 1: live aboveground, in metric tonnes, on forest land.” A note of caution: there are thousands of variables to choose from, but I suspect the most popular ones are listed toward the top of the page.

The second item to know is which wc code you’re interested in. I have no idea why it’s abbreviated “wc”, but it connects the state FIPS code with the 4-digit FIA inventory year. For example, I’m interested in wc = 232021 to obtain data from Maine’s (FIPS code 23) most recent inventory, collected in 2021. You could go back to inventories collected decades ago if you were interested in looking at trends, and those codes are described here.

Finally, you can use the rselected and cselected statements to identify the variables you’d like to group by in rows and columns. In our case this is forest type group and ownership group. I’ll obtain the data in an NJSON format, but you can also obtain the data in HTML, XML, or other formats. You can specify these parameters in arg_list and then store the data in a data frame called all_rows. I love how the data are presented in a long and “tidy” format:

arg_list_maine <- list(snum = 98,
               wc = 232021,
               rselected = 'Forest type group', 
               cselected = 'Ownership group - Major',
               outputFormat = 'NJSON')

# submit list to POST request function
post_data_maine <- fiadb_api_POST(arg_list_maine)

# estimate data frame
all_rows_maine <- post_data_maine[['estimates']]

print(all_rows_maine)
##     ESTIMATE                                    GRP1          GRP2 PLOT_COUNT
## 1    3907321 `0001 White \\/ red \\/ jack pine group  `0001 Public         22
## 2   34794633 `0001 White \\/ red \\/ jack pine group `0002 Private        248
## 3   14315831              `0002 Spruce \\/ fir group  `0001 Public        110
## 4  105408801              `0002 Spruce \\/ fir group `0002 Private       1022
## 5   62430.73 `0005 Loblolly \\/ shortleaf pine group `0002 Private          1
## 6   200511.4            `0018 Exotic softwoods group  `0001 Public          1
## 7   530919.7            `0018 Exotic softwoods group `0002 Private          5
## 8   823248.1                `0020 Oak \\/ pine group  `0001 Public          5
## 9   10605584                `0020 Oak \\/ pine group `0002 Private         73
## 10  992879.7             `0021 Oak \\/ hickory group  `0001 Public          8
## 11  11742843             `0021 Oak \\/ hickory group `0002 Private         85
## 12  409025.6  `0023 Elm \\/ ash \\/ cottonwood group  `0001 Public          7
## 13   6008678  `0023 Elm \\/ ash \\/ cottonwood group `0002 Private         85
## 14  14742242   `0024 Maple \\/ beech \\/ birch group  `0001 Public        111
## 15 136478067   `0024 Maple \\/ beech \\/ birch group `0002 Private       1244
## 16   3260893             `0025 Aspen \\/ birch group  `0001 Public         28
## 17  29090608             `0025 Aspen \\/ birch group `0002 Private        330
## 18  5419.386             `0030 Other hardwoods group  `0001 Public          2
## 19  734723.1             `0030 Other hardwoods group `0002 Private         26
## 20  2668.444                        `0034 Nonstocked  `0001 Public          1
## 21  56632.09                        `0034 Nonstocked `0002 Private         10
##          SE SE_PERCENT     VARIANCE
## 1  990560.1   25.35138 981209246740
## 2   2439110   7.010017 5.949256e+12
## 3   1347882   9.415325 1.816786e+12
## 4   3094724   2.935926 9.577318e+12
## 5  69482.77   111.2958   4827855237
## 6  197334.1   98.41544  38940765038
## 7    322112   60.67057 103756168849
## 8  383432.3   46.57554 147020293831
## 9   1405341   13.25095 1.974982e+12
## 10   376683   37.93843 141890069265
## 11  1392590   11.85905 1.939306e+12
## 12 210157.7   51.38008  44166257399
## 13 792561.9   13.19029 628154421011
## 14  1316509   8.930182 1.733196e+12
## 15  3598029   2.636342 1.294581e+13
## 16   719189   22.05497 517232753389
## 17  1931400   6.639256 3.730306e+12
## 18 4325.627   79.81766     18711045
## 19   191230   26.02749  36568920234
## 20 2380.922    89.2251      5668789
## 21 31032.09   54.79594    962990504

This makes the output easy to plot immediately. Here’s a graph of the output, where you can see the vast amount of carbon stored on private land in Maine, mostly in the spruce/fir and maple/beech/birch forest type groups:

ggplot(all_rows_maine, aes(x = as.character(GRP2), 
                           y = as.numeric(ESTIMATE))) +
  geom_bar(stat = "identity") +
  facet_wrap(~as.character(GRP1)) +
  labs(x = "Ownership",
       y = "Forest carbon (metric tonnes)")

Note you’ll need to play with the names of variables a bit to tidy them up, e.g., turning “`0001 Public” to simply “Public”. But the R functions allow you to quickly obtain the data of interest. You can also grab the subtotals of the output to sum all values within each forest type or ownership category:

subtotals_maine <- post_data_maine[['subtotals']]

print(subtotals_maine)
## $GRP1
##     ESTIMATE                                    GRP1 PLOT_COUNT       SE
## 1   38701954 `0001 White \\/ red \\/ jack pine group        269  2619361
## 2  119724633              `0002 Spruce \\/ fir group       1127  3346654
## 3   62430.73 `0005 Loblolly \\/ shortleaf pine group          1 69482.77
## 4   731431.1            `0018 Exotic softwoods group          6 377163.5
## 5   11428833                `0020 Oak \\/ pine group         78  1452041
## 6   12735722             `0021 Oak \\/ hickory group         92  1432194
## 7    6417703  `0023 Elm \\/ ash \\/ cottonwood group         92 815615.3
## 8  151220309   `0024 Maple \\/ beech \\/ birch group       1348  3791562
## 9   32351501             `0025 Aspen \\/ birch group        357  2057666
## 10  740142.5             `0030 Other hardwoods group         28 191278.9
## 11  59300.54                        `0034 Nonstocked         11 31123.29
##    SE_PERCENT     VARIANCE
## 1    6.768034 6.861054e+12
## 2    2.795292 1.120009e+13
## 3    111.2958   4827855237
## 4    51.56515 142252326043
## 5    12.70507 2.108422e+12
## 6    11.24549 2.051181e+12
## 7    12.70884 665228392598
## 8     2.50731 1.437594e+13
## 9    6.360341 4.233989e+12
## 10   25.84353  36587631279
## 11     52.484    968659293
## 
## $GRP2
##    ESTIMATE          GRP2 PLOT_COUNT      SE SE_PERCENT     VARIANCE
## 1  38660040  `0001 Public        274 1660378   4.294817 2.756855e+12
## 2 335513920 `0002 Private       2891 3375482   1.006063 1.139388e+13

Finally, you can grab the total number that sums all values. In this case, we learn that there’s about 374 million metric tonnes of carbon stored in the live aboveground portions of trees in Maine forests:

totals_maine <- post_data_maine[['totals']]
print(totals_maine)
##    ESTIMATE PLOT_COUNT      SE SE_PERCENT     VARIANCE
## 1 374173960       3143 3432307  0.9173025 1.178073e+13

Using R to access FIA data around a fixed geographic point

A favorite use of EVALIDator by many is the ability to query FIA data around a specific location. For example, a user can generate population estimates for a 50-mile radius around a proposed mill that uses wood.

Here’s an example that queries FIA data in a 25-mile radius surrounding Bangor, Maine. It estimates the total forestland area (snum = 2) by stand age class in 20-year increments and stand size class (large-, medium-, or small-diameter trees). The latitude and longitude coordinates are specified in the function:

arg_list_bangor <- list(lat = 44.809,
                        lon = -68.771, 
                        radius = 25,
                        wc = 232021,
                        snum = 2,
                        rselected = 'Stand-size class', 
                        cselected = 'Stand age 20 yr classes (0 to 100 plus)',
                        outputFormat = 'NJSON')

# submit list to POST request function
post_data_bangor <- fiadb_api_POST(arg_list_bangor)

# estimate data frame
all_rows_bangor <- post_data_bangor[['estimates']]

print(all_rows_bangor)
##    ESTIMATE                  GRP1               GRP2 PLOT_COUNT       SE
## 1   14155.6  `0001 Large diameter  `0002 21-40 years          4 7338.611
## 2  45473.05  `0001 Large diameter  `0003 41-60 years         10    15372
## 3  90436.76  `0001 Large diameter  `0004 61-80 years         18 22385.23
## 4    107109  `0001 Large diameter `0005 81-100 years         23 24368.78
## 5  41243.83  `0001 Large diameter   `0006 100+ years          7  15696.3
## 6  51611.41 `0002 Medium diameter  `0002 21-40 years         12 16379.57
## 7  113830.6 `0002 Medium diameter  `0003 41-60 years         25 24352.08
## 8  199754.6 `0002 Medium diameter  `0004 61-80 years         37 33455.07
## 9  103521.8 `0002 Medium diameter `0005 81-100 years         21 24195.48
## 10   5758.7 `0002 Medium diameter   `0006 100+ years          1 6011.665
## 11  19655.8  `0003 Small diameter   `0001 0-20 years          4 10104.88
## 12 90910.76  `0003 Small diameter  `0002 21-40 years         16 23393.71
## 13    40205  `0003 Small diameter  `0003 41-60 years          7 15304.59
## 14 13209.02  `0003 Small diameter  `0004 61-80 years          3 8759.869
## 15 21404.85  `0003 Small diameter `0005 81-100 years          3  11056.3
## 16 5305.999  `0003 Small diameter   `0006 100+ years          1 5045.363
##    SE_PERCENT   VARIANCE
## 1    51.84246   53855209
## 2    33.80465  236298512
## 3    24.75236  501098555
## 4    22.75138  593837455
## 5    38.05733  246373784
## 6    31.73633  268290228
## 7    21.39326  593023809
## 8    16.74808 1119241412
## 9    23.37235  585421221
## 10   104.3927   36140117
## 11   51.40919  102108689
## 12   25.73261  547265614
## 13   38.06637  234230329
## 14   66.31735   76735306
## 15   51.65323  122241669
## 16    95.0879   25455691

You can plot these data directly to visualize the trends within the 25-mile radius:

ggplot(all_rows_bangor, aes(x = as.character(GRP2), 
                            y = as.numeric(ESTIMATE),
                            fill = as.character(GRP1))) +
  geom_bar(stat = "identity") +
  labs(x = "Stand age",
       y = "Forestland area (acres)") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Conclusion

The new features available in the EVALIDator API that allow you to access data through R or Python can help to speed up data analysis using FIA data. You can access the entire history of FIA data and analyze forest resources data by state or a circular retrieval from a fixed geographic point. Special thanks to the USDA Forest Service staff that have made this available!

By Matt Russell. Email Matt with any questions or comments.

Preliminary thoughts on using the new National Scale Volume and Biomass Estimators

February 26, 2024
forest inventory and analysis forest carbon forest inventory forest measurements NSVB FIA analytics

A quick and easy way to estimate forestland area change across US states

February 16, 2024
forest inventory forest ownership forest inventory and analysis forestland area FIA

A list of R packages for forestry applications

November 24, 2023
analytics R R packages statistics data science forestry