workshop


Nieves Response

In reading Nieves et al., we gather that the authors use the random forest machine learning technique (as previously mentioned by Stevens) to predict human population densities in thirty-two low- and middle-income countries. The random forest method works by taking “individual decision trees that are considered ‘weak learners’ and combin[ing] them to create a ‘strong learner’”. The technique uses ancillary and census data to predict human population densities, and combining many trees in this way produces more accurate predictions than a single tree would. Once the decision trees have grown to the desired size, their predictions are used and measured for error. Additionally, dasymetric population allocation takes in geospatial information and uses it to distribute census counts accurately across grid cells within selected boundaries. This goes hand in hand with the random forest technique, since the densities the forest predicts can serve as the weights for that distribution.

After the data were gathered and examined, the topographical covariates were shown to be the most important in determining human population density in the thirty-two low- and middle-income countries studied. More specifically, the top five most important covariates were “urban/suburban extents (0.32), built environment and urban/suburban proxies (0.35), climatic/environmental variables (0.37), populated place covariates (0.42) and transportation networks (0.50).” Across the globe, the more important covariates pertained to the built environment and urban/suburban extents. Overall, these covariates were shown to be the most beneficial for predicting population density.
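To make the two ideas above more concrete, here is a minimal sketch in Python. It is not the authors' code. The three "trees" are hand-written rules standing in for decision trees that a real random forest would grow from census and ancillary data, and the covariate names (`urban`, `road_km`, `lights`), the cell values, and the census total of 1000 are all made up for illustration. The sketch only shows the shape of the workflow: average the weak learners to get a density weight for each grid cell, then split a census count across the cells in proportion to those weights.

```python
from statistics import mean


def forest_predict(trees, features):
    """Average the predictions of several weak learners (the 'trees')."""
    return mean(tree(features) for tree in trees)


def dasymetric_allocate(census_total, cell_weights):
    """Split one census unit's population across its grid cells by weight."""
    total_weight = sum(cell_weights)
    if total_weight == 0:
        # No information in the weights: fall back to an even split.
        return [census_total / len(cell_weights)] * len(cell_weights)
    return [census_total * w / total_weight for w in cell_weights]


# Hypothetical weak learners. A real forest would learn these splits from data.
trees = [
    lambda f: 50.0 if f["urban"] else 5.0,
    lambda f: 40.0 if f["road_km"] > 1.0 else 10.0,
    lambda f: 30.0 if f["lights"] > 0.5 else 3.0,
]

# Made-up covariate values for three grid cells inside one census unit.
cells = [
    {"urban": True, "road_km": 2.5, "lights": 0.9},
    {"urban": False, "road_km": 1.2, "lights": 0.2},
    {"urban": False, "road_km": 0.1, "lights": 0.0},
]

# Predicted density weight per cell, then the dasymetric redistribution.
weights = [forest_predict(trees, cell) for cell in cells]
population = dasymetric_allocate(1000, weights)

for cell_weight, people in zip(weights, population):
    print(f"weight={cell_weight:.1f}  allocated={people:.1f}")
```

The useful property to notice is that the allocated values always add back up to the census total, so the forest only decides where people are placed within the unit, not how many people there are.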