Ask how you want your data - and be specific

October 7, 2020
analytics data management Communicating data forest measurements

I have been analyzing the submissions to the University of Minnesota Extension’s Tree and Woodland Carbon Capture Challenge. The Challenge is an activity that asked Minnesotans to measure trees and woodlands to observe how much carbon they sequester over a year.

The audience that participated in this activity included homeowners and private woodland owners with a love for trees. In total, 26 people measured an individual tree and typed their measurements into a Google Form answering the question “What is the diameter of your tree, measured in inches?”

Here are the responses I received:

[Figure: Entries in response to the question “What is the diameter of your tree?”]

I received answers that mixed decimals and fractions, had different significant digits, and mixed character and numeric values. Needless to say, the data were messy and difficult to immediately analyze.
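Cleaning entries like these can be sketched in a few lines of Python. This is a minimal illustration and not the workflow used for the Challenge data. The function name `parse_diameter`, the unit suffixes it strips, and the sample strings in the usage note are all assumptions made up for the example. Anything the function cannot read comes back as `None` so it can be reviewed by hand instead of being guessed at.

```python
import re
from fractions import Fraction
from typing import Optional

# Trailing unit text to strip (an assumed list; extend it to match real entries).
_UNITS = re.compile(r'\s*(?:inches|inch|in\.?|")\s*$', re.IGNORECASE)
# Plain numbers such as "14", "10.7", or ".5".
_DECIMAL = re.compile(r"\d+(?:\.\d*)?|\.\d+")
# Fractions with an optional whole part, such as "3/4", "12 1/2", or "10-3/4".
_MIXED = re.compile(r"(?:(\d+)[\s-]+)?(\d+)\s*/\s*(\d+)")


def parse_diameter(entry: str) -> Optional[float]:
    """Return a diameter in inches, or None if the entry needs manual review."""
    text = _UNITS.sub("", entry.strip())
    if _DECIMAL.fullmatch(text):
        return float(text)
    match = _MIXED.fullmatch(text)
    if match and int(match.group(3)) != 0:
        whole, numerator, denominator = match.groups()
        return float(int(whole or 0) + Fraction(int(numerator), int(denominator)))
    return None
```

With hypothetical inputs, `parse_diameter("12 1/2 inches")` gives `12.5` and `parse_diameter("10.7")` gives `10.7`, while a free-text entry such as `"about 14"` returns `None` and is left for a person to interpret.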

The take-home message is to be specific about how you ask for data and to give good examples of what you expect. I did not do this for the activity, and while the quality of the data was good, it is taking more time to clean and reorganize the data.

I could have used the data validation features in the Google Form I provided. I could also have given an example like “Measure and enter data for your tree to the nearest tenth of an inch. For example, enter ‘10.7’ for a tree that is 10.7 inches in diameter at breast height.”
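A "nearest tenth of an inch" rule can be written as a short regular expression. The Python sketch below shows what such a rule accepts and rejects. The pattern, the three-digit limit on the whole-number part, and the name `is_valid_entry` are assumptions for illustration. A similar pattern could be tried in a form's regular-expression validation setting, though it should be tested there before relying on it.

```python
import re

# One to three digits, a decimal point, then exactly one digit (e.g. "10.7").
# The three-digit cap is an assumed sanity limit, not a forestry standard.
TENTH_OF_INCH = re.compile(r"\d{1,3}\.\d")


def is_valid_entry(entry: str) -> bool:
    """Return True if the entry is a number recorded to one decimal place."""
    return TENTH_OF_INCH.fullmatch(entry.strip()) is not None
```

Under this rule, `is_valid_entry("10.7")` is `True`, while `"10"`, `"10.75"`, and `"10 3/4"` are all rejected, which prompts the respondent to re-enter the value in the requested format.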

Audience matters too. If I had asked professional foresters the same question, I likely would have received a data set of diameters measured to the nearest tenth of an inch, without my asking specifically for it.

This activity is a reminder to know your audience when asking for data. Automating a data analysis workflow by using data validations can benefit the data analyst and lead to fewer problems when data are ultimately used to make decisions.

By Matt Russell.
