-------------------
GENERAL INFORMATION
-------------------


1. Title of Dataset 
Data for: Tree-planting programs in Himachal Pradesh India 2019

2. Author Information


  Principal Investigator Contact Information
        Name: Pushpendra Rana
           Institution: Indian Forest Service
           Address: Himachal Pradesh Forest Department
           Email: pranaifs27@gmail.com
	   ORCID: 0000-0001-8626-3351

  Associate or Co-investigator Contact Information
        Name: Forrest Fleischman
           Institution: University of Minnesota	
           Address: Department of Forest Resources
           Email: ffleisch@umn.edu
	   ORCID: 0000-0001-6060-4031

  Associate or Co-investigator Contact Information
           Name: Vijay Ramprasad
           Institution: University of Minnesota	
           Address: Department of Forest Resources
           Email: vrampras@umn.edu
	   ORCID: 0000-0003-2636-0090

 Associate or Co-investigator Contact Information
           Name: Kangjae Lee
           Institution: University of Seoul
           Address: 163, Seoulsiripdae-ro, Dongdaemun-gu
           Email: kasbiss@gmail.com
	   ORCID: 0000-0002-2857-6496
 
3. Date of data collection (single date, range, approximate date) <suggested format YYYYMMDD>

June 2019 to Oct 2019

4. Geographic location of data collection (where was data collected?): 

Himachal Pradesh, India. Plantation and forest polygon data were requested from, and publicly released by, the Himachal Pradesh Forest Department. All other data are open-access public data.

5. Information about funding sources that supported the collection of the data:
 
The participation of PR (a portion of his time), FF and VR on this project was funded by a grant from the NASA LCLUC program (NNX17AK14G). 



--------------------------
SHARING/ACCESS INFORMATION
-------------------------- 


1. Licenses/restrictions placed on the data:
CC0 1.0 Universal (CC0 1.0): Public Domain Dedication 

2. Links to publications that cite or use the data:
Publication under review

3. Links to other publicly accessible locations of the data:
NA

4. Links/relationships to ancillary data sets:
NA

5. Was data derived from another source?
           If yes, list source(s): Part of this data is derived from the Himachal Pradesh Forest Department's plantation database


6. Recommended citation for the data:

Rana, Pushpendra; Fleischman, Forrest; Ramprasad, Vijay; Lee, Kangjae. (2020). Data for: Tree-planting programs in Himachal Pradesh India 2019. Retrieved from the Data Repository for the University of Minnesota, https://doi.org/10.13020/8x0d-gb23.


---------------------
DATA & FILE OVERVIEW
---------------------


1. File List
   A. Filename:     forest_polygons_data_2019   
      Short description:        Comprises 16,674 forest polygons, and covers 33 forest divisions of Himachal Pradesh

   B. Filename:     Test_data_2147plantations_2019  
      Short description:        Comprises 2,147 forest polygons where plantations were established between 1 January 2016 and 31 July 2019
        
   C. Filename:     plantation_prediction_RcodeSubmitted.R
      Short description:        R code for data analysis and processing

--------------------------
METHODOLOGICAL INFORMATION
--------------------------


1. Description of methods used for collection/generation of data: 
<Include links or references to publications or other documentation containing experimental design or protocols used in data collection>

 Overview of the prediction procedure
We develop a predictive algorithm that forecasts three quantities: [1] the probability of tree cover loss, based on [2] the fit of an area for plantation activity, both of which assist in estimating [3] wasteful expenditure. We build this algorithm using data on government tree plantations in the western Indian Himalayan state of Himachal Pradesh, which has experienced decades of afforestation programs within a heavily forested landscape. This region provides a wide range of plantation areas in varying biophysical contexts in which to test the efficacy of the algorithm for predicting tree cover loss and fit. For example, elevations range from 350 to 6975 meters, and rainfall varies from highs of 1035 mm on lower slopes and plains to lows of 395 mm in high-altitude deserts in the rain shadow of the Great Himalaya range (ref. 1). Himachal Pradesh has also spent an estimated US $248.24 million on afforestation since 2002, covering an area of 236,686 ha (ref. 2), making it an excellent location to study the effectiveness of afforestation expenditures (Supplementary Fig. 3 and 4). Specifically, we apply machine learning to 16,674 georeferenced forest polygons to predict probable tree cover loss for each polygon.
We use tree cover loss as a proxy measure for plantation survival potential. We find tree cover loss a useful measure because it (1) can be used in more generalized contexts worldwide, (2) reflects the presence of enabling site conditions that support tree establishment, and (3) captures management practices and human use effectively on a large scale.

Data and Variables
We apply machine learning to 16,674 georeferenced forest polygons to predict probable tree mortality for each forest polygon, and we operationalize tree cover loss as the decline in tree canopy cover observed between 2003 and 2015 using Forest Survey of India data (ref. 3). The Himachal Pradesh Forest Department GIS Lab provided 18,672 forest polygons belonging to 33 of the state's 43 forest divisions. We removed 1,998 polygons due to missing data, leaving 16,674 polygons for analysis.
According to Himachal Pradesh Forest Department records, 2,809 plantations were planted between 2016 and 2019. Of these, 785 plantations have missing data on afforestation spending. We could therefore use only the 2,024 plantations with complete budget information to compare predicted plantation mortality with afforestation spending. We note that budgetary documents of this kind are normally kept confidential and are beyond the reach of researchers. The expenditure data on the studied plantations became public only when a local Member of the Legislative Assembly (MLA) asked a question on the floor of the Himachal Pradesh Legislative Assembly. Our study shows how the availability of similar data in other states and in other parts of the world can make analysis of tree-planting efforts more effective. We call for such data to be made available more widely, in India and across the world.
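The sample sizes quoted above follow from simple subtraction. The short Python check below reproduces them from the counts stated in the text; the variable names are illustrative only and do not come from the authors' R script.

```python
# Counts as reported in the text above (names are illustrative).
POLYGONS_PROVIDED = 18_672         # polygons supplied by the Forest Department GIS Lab
POLYGONS_DROPPED_MISSING = 1_998   # polygons removed due to missing data
PLANTATIONS_RECORDED = 2_809       # plantations planted during 2016 to 2019
PLANTATIONS_NO_BUDGET = 785        # plantations lacking afforestation spending data

# Polygons used to train and test the prediction model.
polygons_used = POLYGONS_PROVIDED - POLYGONS_DROPPED_MISSING

# Plantations that can be compared against afforestation spending.
plantations_used = PLANTATIONS_RECORDED - PLANTATIONS_NO_BUDGET

print(polygons_used)     # 16674
print(plantations_used)  # 2024
```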
This dataset does not contain any plantations carried out in the cold and dry desert regions of Himachal Pradesh, although we know from other data that such plantations were carried out. For example, trees were planted in Loser beat (8 ha), Pagma beat (3 ha) and Kee beat (4.41 ha) in Spiti Valley, a cold desert mountain valley of Himachal Pradesh, in 2016 (ref. 4). No exact boundaries exist for the 2,024 tree plantations, which can introduce some error into the analysis. We do, however, have spatial boundaries of the Forest Department forest polygons within which these plantations occur. We believe that predicting tree cover loss for the entire forest polygon reasonably predicts tree cover loss in the plantation area (that is, plantation mortality).
Tree cover loss estimates for the studied forest polygons are constructed based on the fit of plantation activity to areas that vary in forest dependence, soil and biophysical characteristics, canopy cover before planting activity, and management practices. In the model, we included data on population, forest dependents, farmers, literates, road density, grazing density and economic activity as indicative of higher forest dependence. Values for these social indicators were calculated from the census villages that fall within each forest polygon under study. Values for population, forest dependents, farmers and literates were summed, whereas values for road density, grazing density and economic activity were averaged across villages within each forest polygon. Baseline data on forest cover, cropland, grassland and bare-land area within each forest polygon were also included. Soil quality factors included in the model are soil depth, soil carbon, soil organic carbon, bulk density, cation exchange capacity, soil pH and available soil water capacity. In addition, we included information on altitude, slope, area, precipitation, temperature and forest fires in the predictive model. More details about the model predictors are provided in Supplementary Table 1.
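The village-to-polygon aggregation described above (counts summed, densities averaged) can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the field names, the polygon IDs and the example values are hypothetical, and only a subset of the indicators is shown.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical field names. Counts are summed across villages,
# densities are averaged across villages, as described in the text.
SUMMED = ("population", "farmers", "literates")
AVERAGED = ("road_density", "grazing_density")

def aggregate_villages(villages):
    """Collapse village records into one record per forest polygon."""
    grouped = defaultdict(list)
    for village in villages:
        grouped[village["polygon_id"]].append(village)

    polygons = {}
    for polygon_id, rows in grouped.items():
        record = {name: sum(row[name] for row in rows) for name in SUMMED}
        record.update({name: mean(row[name] for row in rows) for name in AVERAGED})
        polygons[polygon_id] = record
    return polygons

# Made-up example: two villages fall inside polygon C1, one inside C2.
villages = [
    {"polygon_id": "C1", "population": 120, "farmers": 40, "literates": 70,
     "road_density": 0.5, "grazing_density": 1.0},
    {"polygon_id": "C1", "population": 80, "farmers": 25, "literates": 50,
     "road_density": 1.5, "grazing_density": 3.0},
    {"polygon_id": "C2", "population": 300, "farmers": 90, "literates": 210,
     "road_density": 2.0, "grazing_density": 0.4},
]
polygons = aggregate_villages(villages)
print(polygons["C1"])
```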
We find that our predicted tree cover loss varies in expected ways with individual forest polygon characteristics. For example, we found our predicted tree cover loss probabilities to vary linearly with the proportion of area under southern aspect in each plantation polygon (Supplementary Fig. 5). Areas with a southern aspect are directly exposed to the sun and therefore lack adequate moisture to support long-term tree growth. Moreover, the performance of our algorithm is comparable to that of other recent prediction algorithms that explain social-ecological phenomena such as poverty (refs. 5, 6).

Fitting the ensemble predictor
We use an ensemble of Extreme Gradient Boosting, Random Forest and Naïve Bayes to generate tree cover loss predictions for the studied forests (n = 16,674). In the model, we code tree cover loss as the positive class and tree cover gain as the negative class, and then randomly split the data into a “training” dataset (70%) and a “test” dataset (30%). We develop the predictive algorithm on the training dataset and then use the resulting algorithm to generate tree cover loss predictions for the test dataset.
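A random 70/30 split of the kind described above can be sketched as follows. This is an illustration in Python, not the authors' R code: the seed, the function name and the rounding of the cut point are assumptions, and the sketch uses a plain shuffle, whereas the original analysis may have stratified the split by outcome class.

```python
import random

def train_test_split(ids, train_fraction=0.7, seed=42):
    """Shuffle the IDs reproducibly and cut them into training and test sets.

    The seed and the rounding rule are illustrative assumptions.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# One integer ID per forest polygon in forest_polygons_data_2019.
train_ids, test_ids = train_test_split(range(16_674))
print(len(train_ids), len(test_ids))
```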
In the model, we use 10-fold cross-validation on the training dataset with three different models (Extreme Gradient Boosting, Random Forest and Naïve Bayes). To enhance the performance of the algorithm, we center and scale the variables, reduce dimensionality using principal component analysis (PCA), and exclude near-zero-variance and highly correlated predictors. We also optimize ROC (Receiver Operating Characteristic) for our three machine-learning models.
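For readers unfamiliar with 10-fold cross-validation, the sketch below shows how row indices can be partitioned so that every row is held out for validation exactly once. It is a generic Python illustration under assumed names and an assumed seed; the fold assignment used in the original R analysis may differ.

```python
import random

def k_fold_indices(n_rows, k=10, seed=1):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation

# Small example: 100 rows, 10 folds of 10 rows each.
splits = list(k_fold_indices(100, k=10))
for training, validation in splits:
    print(len(training), len(validation))
```

Each model would be fitted on the training indices and scored (here, by ROC) on the validation indices, with the scores averaged over the ten folds.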
Our chosen parameters for each model are:
(i)	eXtreme Gradient Boosting: We use 10-fold cross-validation, and ROC is used to select the optimal model using the largest value (0.64). Sensitivity of the model is 0.78. The final values for the selected model are: nrounds = 100; max_depth = 2; eta = 0.3; colsample_bytree = 0.8; min_child_weight = 1; and subsample = 0.75.
(ii)	Random forest: The model includes 10-fold cross-validation. ROC was used to select the optimal model using the largest value (0.64). Sensitivity is 0.75. Final mtry = 2.
(iii)	Naïve Bayes: We use 10-fold cross-validation. ROC is used to select the model with the largest value (0.62). Sensitivity is 0.73. The final values of the model are: laplace = 0; usekernel = TRUE; and adjust = 1.
Then, we train a stacked ensemble model on these three base models with a boosted decision-tree algorithm, with the objective of maximizing recall. Our model places more value on recall because missing a true positive (tree cover loss) may have serious ramifications for biodiversity and forest cover in the area. Our chosen stacked ensemble model resulted in higher values for balanced accuracy (relevant given the unbalanced nature of our test set), recall and specificity.

The chosen model parameters include:
Stacked Ensemble Model: Predictive accuracy is 64% (95% Confidence intervals: 62 to 65%). Kappa = 0.24; Sensitivity = 0.74; Specificity = 0.50; Precision = 0.66; Recall = 0.74; F1 = 0.69. 
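
The reported metrics are related by standard formulas, which the Python sketch below applies to the rounded values listed above. Balanced accuracy is not reported in the text; the value computed here is derived from the reported sensitivity and specificity. F1 recomputed from the rounded precision and recall comes out at about 0.70, close to the reported 0.69; the small gap is presumably due to rounding of the inputs.

```python
# Values reported above for the stacked ensemble (rounded to two decimals).
sensitivity = 0.74   # identical to recall
specificity = 0.50
precision = 0.66
recall = sensitivity

# Balanced accuracy: mean of sensitivity and specificity (derived, not reported).
balanced_accuracy = (sensitivity + specificity) / 2

# F1: harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(round(balanced_accuracy, 2))
print(round(f1, 3))
```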

Finally, we use our selected ensemble model to estimate predicted tree cover loss probabilities for a new set of 2024 plantation polygons (planted between January, 2016 and July, 2019) and compare these predicted probabilities with afforestation spending and tree canopy densities. 

We also created an interpolated surface of tree cover loss probabilities from the predicted tree cover loss probabilities of the 2,024 plantation polygons using Kriging. We used Ordinary Kriging with a stable prediction model in the Geostatistical Analyst tool in ArcMap (10.7.1) to generate the interpolated tree cover loss shown in Fig. 2 (b). We chose the model on the basis of normality and anisotropy parameters. Our Kriging model semivariogram has 12 lags with a lag size of 651.27 meters and a standard neighborhood type (maximum neighbors = 4, minimum neighbors = 2). The prediction model has a root mean square error of 0.14 and a root mean square standardized error of 1.01.
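
To illustrate what the lag settings mean, the Python sketch below computes an empirical semivariogram with 12 lags of 651.27 m, so that point pairs up to 12 x 651.27 = 7,815.24 m apart are binned. It covers only this binning step: fitting the stable model and the Ordinary Kriging itself were done in ArcMap and are not reproduced here. The function name, the input layout and the example points are hypothetical.

```python
import math

def empirical_semivariogram(points, lag_size=651.27, n_lags=12):
    """Bin point pairs by separation distance and average their semivariance.

    points: list of (x, y, value) tuples with x and y in metres.
    Returns one (mean_distance, semivariance, pair_count) tuple per non-empty lag.
    """
    # Per lag: [sum of distances, sum of squared value differences, pair count].
    bins = [[0.0, 0.0, 0] for _ in range(n_lags)]
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            distance = math.hypot(xi - xj, yi - yj)
            lag = int(distance // lag_size)
            if lag < n_lags:  # pairs farther apart than the last lag are ignored
                bins[lag][0] += distance
                bins[lag][1] += (zi - zj) ** 2
                bins[lag][2] += 1
    return [(d / n, sq / (2 * n), n) for d, sq, n in bins if n > 0]

# Made-up example: three locations 500 m apart on a line,
# each with a predicted tree cover loss probability.
points = [(0.0, 0.0, 0.2), (500.0, 0.0, 0.4), (1000.0, 0.0, 0.6)]
variogram = empirical_semivariogram(points)
for mean_distance, semivariance, pair_count in variogram:
    print(mean_distance, semivariance, pair_count)
```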

More details about the model predictors are provided below, and this information is also included in the Supplementary Table, which is attached to the DRUM record.


Variable	Description	Unit	Source (a dash marks a field not specified)

Number of households	Total number of households (HHs) in villages that fall inside a forest polygon	Number	Census (2001), India, http://censusindia.gov.in/
Total population	Total population of the villages that fall inside a forest polygon	Number	Census (2001), India, http://censusindia.gov.in/
Number of cultivators (farmers)	Total number of farmers in villages that fall inside a forest polygon	Number	Census (2001), India, http://censusindia.gov.in/
Scheduled caste population	Total scheduled caste (SC) population in villages that fall inside a forest polygon	Number	Census (2001), India, http://censusindia.gov.in/
Total number of literates	Total number of literates in villages that fall inside a forest polygon	Number	Census (2001), India, http://censusindia.gov.in/
Total marginal workers	Total number of marginal workers in villages that fall inside a forest polygon	Number	Census (2001), India, http://censusindia.gov.in/
Economic activity (nighttime lights)	2003–2008, 0.56 km spatial resolution (average for villages that fall inside a forest polygon)	Values 1 to 63	Version 4 DMSP-OLS Nighttime Lights Time Series
Road density	Average for villages that fall inside a forest polygon	km/km2	CIESIN (Data Center in NASA's Earth Observing System Data and Information System (EOSDIS)) (https://sedac.ciesin.columbia.edu/data/sets/browse)
Number of small land-holdings less than 0.5 ha	Number of smallholdings less than 0.5 ha in Census Tehsils where the forest polygon falls	Number	Agricultural census (2005), India
Grazing density	Number of grazing animals (buffaloes, goats, sheep, cattle)/area of the tehsil in ha (average for villages that fall inside a forest polygon)	Number/ha	Livestock census (2007), India
Area of the forest polygon	-	ha	Forest records, HP Forest Department, India
Area under crop acreage	2000, 30 m resolution	ha	J. Chen, J. Chen, A. Liao, X. Cao, L. Chen, X. Chen, C. He, G. Han, S. Peng, M. Lu, Global land cover mapping at 30 m resolution: A POK-based operational approach. ISPRS Journal of Photogrammetry and Remote Sensing. 103, 7–27 (2015).
Area under grass coverage	2000, 30 m resolution	ha	W. R. Wieder, J. Boehnert, G. B. Bonan, M. Langseth, Regridded harmonized world soil database v1.2. Data set. Available on-line [http://daac.ornl.gov] from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, USA (2014).
Area under bare land acreage	2000, 30 m resolution	ha	(Wieder et al. 2014)
Soil depth	2000, reference soil depth, average	cm	(Wieder et al. 2014)
Available soil water capacity	2000, available soil water storage capacity, average	Coded values 1 to 7; 1 = 15 cm water per m of the soil unit, 2 = 12.5 cm, 3 = 10 cm, 4 = 7.5 cm, 5 = 5 cm, 6 = 1.5 cm, 7 = 0 cm	(Wieder et al. 2014)
Topsoil Carbon Content	Topsoil and subsoil carbon content (T_C and S_C) are based on the carbon content of the dominant soil type in each regridded cell rather than a weighted average	kg C m-2	(Wieder et al. 2014)
Subsoil Carbon Content	See Topsoil Carbon Content	kg C m-2	(Wieder et al. 2014)
Topsoil Organic Carbon	-	% weight	(Wieder et al. 2014)
Subsoil Organic Carbon	-	% weight	(Wieder et al. 2014)
pH (Top Soil)	Topsoil pH (in H2O)	-log(H+)	(Wieder et al. 2014)
Top Soil Bulk Density	Reference bulk density values are calculated from equations developed by Saxton et al. (1986) that relate to the texture of the soil only	kg dm-3	(Wieder et al. 2014)
Top Soil Cation Exchange Capacity	Cation exchange capacity of the clay fraction in the topsoil	cmol per kg	(Wieder et al. 2014)
Sub Soil Cation Exchange Capacity	Cation exchange capacity of the clay fraction in the subsoil	cmol per kg	(Wieder et al. 2014)
Location (altitude)	2000, 90 m resolution	m	SRTM (Shuttle Radar Topography Mission), 90 m resolution, 2000 SRTM 90m Digital Elevation Database v4.1. CGIAR-CSI (2017), (available at https://cgiarcsi.community/data/srtm-90m-digital-elevation-database-v4-1/).
Slope	2000, 90 m resolution	degree	SRTM (Shuttle Radar Topography Mission), 90 m resolution, 2000 SRTM 90m Digital Elevation Database v4.1. CGIAR-CSI (2017), (available at https://cgiarcsi.community/data/srtm-90m-digital-elevation-database-v4-1/).
Baseline forest cover	2003, 24 m resolution; Forest cover = Open forest + Moderately dense forest + Very dense forest	-	Forest Survey of India, 2005 http://www.fsi.nic.in/publications
Number of forest fires	2003–2008	Number	NASA, active fire data, MODIS C6 FIRMS, (available at https://firms.modaps.eosdis.nasa.gov/map).
Temperature	2001–2008, 30 km resolution, average	°C	CRU (Climatic Research Unit) TS dataset, version 4.0, gridded dataset of monthly terrestrial surface climate, http://www.cru.uea.ac.uk/
Precipitation	2001–2008, 30 km resolution, average	mm	CRU (Climatic Research Unit) TS dataset, version 4.0, gridded dataset of monthly terrestrial surface climate, http://www.cru.uea.ac.uk/; I. Harris, P. D. Jones, T. J. Osborn, D. H. Lister, Updated high-resolution grids of monthly climatic observations – the CRU TS3.10 Dataset. International Journal of Climatology. 34, 623–642 (2014)
Land surface temperature	2001–2008, 5.5 km spatial resolution, average	K	MODIS/Aqua Land Surface Temperature/Emissivity Monthly L3 Global CMG V005 Global Change Master Directory (GCMD), (available at https://gcmd.gsfc.nasa.gov/).

Outcomes (O)
Tree cover loss/Mortality	24 m resolution; FC_CHANGE15_03 = FC_2015HA - FC_2003HA; if FC_CHANGE15_03 < 0, MORTALITY = 1, otherwise = 0	Binary (0/1)	Forest Survey of India (2005); Forest Survey of India (2017)


2. Methods for processing the data: <describe how the submitted data were generated from the raw or collected data>



3. Instrument- or software-specific information needed to interpret the data:

R statistical software, Excel

4. Standards and calibration information, if appropriate:


5. Environmental/experimental conditions:


6. Describe any quality-assurance procedures performed on the data:


7. People involved with sample collection, processing, analysis and/or submission:
All coauthors and no one else.



-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: forest_polygons_data_2019
-----------------------------------------
<data belong to 16,674 georeferenced forest polygons, 33 forest divisions in Himachal Pradesh, India>


1. Number of variables: 33


2. Number of cases/rows: 16,674


3. Missing data codes:
        Code/symbol        Definition
        Code/symbol        Definition


4. Variable List
<Note: Sources for each of the variables listed below are given in the publication under review - link...>


Column: Variable Name, description, unit of measurement

A: COMPT, forest polygon ID, 
Unique ID number assigned to each forest polygon	
				
B: CompAreaGISha
Area of forest polygon, ha

C: Number_of_Households
Number of households, Total number of households (HHs) in villages that fall inside a forest polygon

D: Total_Population
Total population, Total population of the villages that fall inside a forest polygon

E: Forest_Dependents
Number of marginal people (scheduled caste population), Total scheduled caste (SC) population in villages that fall inside a forest polygon

F: Literates
Number of literates, Total number of literates in villages that fall inside a forest polygon

G: Number_of_Farmers
Number of cultivators (farmers), Total number of farmers in villages that fall inside a forest polygon

H: Number_of_Unemployed_Persons
Total marginal workers, Total number of marginal workers in villages that fall inside a forest polygon

I: Grazing_animals_density
Number of grazing animals (buffaloes, goats, sheep, cattle)/area of the tehsil in ha, Number/ha (Average for villages that fall inside a forest polygon)

J: Number_of_Smallholdings
Number of small land-holdings less than 0.5 ha, Number of smallholdings less than 0.5 ha in Census Tehsils where the forest polygon falls
				
K: Altitude	
Location (altitude), 2000, 90 m resolution, m
				
L: Slope	
Slope, 2000, 90 m resolution, degree
				
M: Av_temp03_08
Temperature, 2001–2008, 30 km resolution, average, °C 
				
N: Av_preci03_08	
Precipitation, 2001–2008, 30 km resolution, average, mm 
				
O: Av_lst03_08
Land surface temperature, 2001–2008, 5.5 km spatial resolution, average, K 
				
P: Av_nit03_08	
Economic activity, 2003–2008, 0.56 km spatial resolution, 1 to 63 (values) (Average for villages that fall inside a forest polygon)
				
Q: AvailableSWC	
Available soil water capacity, 2000, average, Coded values 1 to 7; 1 = 15 cm water per m of the soil unit, 2 = 12.5 cm, 3 = 10 cm, 4 = 7.5 cm, 5 = 5 cm, 6 = 1.5 cm, 7 = 0 cm.
				
R: Soil_depth	
Soil depth, 2000, reference soil depth, average, cm
				
S: TopSoil_Carbon	
Topsoil Carbon Content, based on the carbon content of the dominant soil type in each regridded cell rather than a weighted average, kg C m-2
				
T: SubSoil_Carbon	
Subsoil Carbon Content, based on the carbon content of the dominant soil type in each regridded cell rather than a weighted average, kg C m-2
				
U: TopSoil_OC	
Topsoil Organic Carbon, % weight
				
V: SubSoil_OC	
Subsoil Organic Carbon, % weight
				
W: TopSoil_PH
pH (Top Soil), Topsoil pH (in H2O), -log(H+)
				
X: TopSoil_BulkDen	
Top Soil Bulk Density, Reference bulk density values are calculated from equations developed by Saxton et al. (1986) that relate to the texture of the soil only, kg dm-3
				
Y: TopSoil_CEC	
Top Soil Cation Exchange Capacity, Cation exchange capacity of the clay fraction in the topsoil, cmol per kg
				
Z: SubSoil_CEC	
Sub Soil Cation Exchange Capacity, Cation exchange capacity of the clay fraction in the subsoil, cmol per kg
				
AA: number_of_fires	
Number of forest fires, 2003–2008, Number
				
AB: Road_Density
Road density, km/km2 (Average for villages that fall inside a forest polygon)

AC: GL2000_Crop_area
Area under crop acreage, 2000, 30 m resolution, ha
				
AD: GL2000_Grass_area	
Area under grass coverage, 2000, 30 m resolution, ha
				
AE: GL2000_Bareland_area	
Area under bare land acreage, 2000, 30 m resolution, ha
				
AF: FC_2003HA	
Baseline forest cover, 2003, 24 m resolution, Forest cover = Open forest + Moderately dense forest + Very dense forest
				
AG: MORTALITY	
Tree cover loss/Mortality, 24 m resolution, If FC_CHANGE15_03< 0, MORTALITY = 1, OTHERWISE = 0; FC_CHANGE15_03 = FC_2015HA – FC_2003HA
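
The MORTALITY rule above can be written as a small function. The Python sketch below is illustrative only: the function and argument names are hypothetical, and FC_2015HA is not one of the columns listed in this file, so the 2015 values in the example are made up.

```python
def mortality_flag(fc_2003_ha, fc_2015_ha):
    """Return 1 if forest cover declined between 2003 and 2015, otherwise 0."""
    fc_change15_03 = fc_2015_ha - fc_2003_ha   # FC_CHANGE15_03 = FC_2015HA - FC_2003HA
    return 1 if fc_change15_03 < 0 else 0

print(mortality_flag(12.0, 9.5))    # cover fell: 1
print(mortality_flag(12.0, 12.0))   # no change: 0
print(mortality_flag(12.0, 14.2))   # cover rose: 0
```

Note that polygons with unchanged forest cover are coded 0, together with polygons that gained cover.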



----------------------------------------
DATA-SPECIFIC INFORMATION FOR: Test_data_2147plantations_2019
-----------------------------------------
<Plantation test data: 2,147 plantation polygons (planted between January 2016 and July 2019). Budget data were matched for 2,024 plantations for the analysis in the paper.>


1. Number of variables: 32


2. Number of cases/rows: 2147




3. Missing data codes:
        Code/symbol        Definition
        Code/symbol        Definition


4. Variable List
<Note: Sources for each of the variables listed below are given in the publication under review - link...>


Column: Variable Name, description, unit of measurement

A: PlantationID
Unique ID number assigned to each forest plantation (n=2024)	
				
B: CompAreaGISha
Area of forest polygon, ha

C: Number_of_Households
Number of households, Total number of households (HHs) in villages that fall inside a forest polygon

D: Total_Population
Total population, Total population of the villages that fall inside a forest polygon

E: Forest_Dependents
Number of marginal people (scheduled caste population), Total scheduled caste (SC) population in villages that fall inside a forest polygon

F: Literates
Number of literates, Total number of literates in villages that fall inside a forest polygon

G: Number_of_Farmers
Number of cultivators (farmers), Total number of farmers in villages that fall inside a forest polygon

H: Number_of_Unemployed_Persons
Total marginal workers, Total number of marginal workers in villages that fall inside a forest polygon

I: Grazing_animals_density
Number of grazing animals (buffaloes, goats, sheep, cattle)/area of the tehsil in ha, Number/ha (Average for villages that fall inside a forest polygon)

J: Number_of_Smallholdings
Number of small land-holdings less than 0.5 ha, Number of smallholdings less than 0.5 ha in Census Tehsils where the forest polygon falls
				
K: Altitude	
Location (altitude), 2000, 90 m resolution, m
				
L: Slope	
Slope, 2000, 90 m resolution, degree
				
M: Av_temp03_08
Temperature, 2001–2008, 30 km resolution, average, °C 
				
N: Av_preci03_08	
Precipitation, 2001–2008, 30 km resolution, average, mm 
				
O: Av_lst03_08
Land surface temperature, 2001–2008, 5.5 km spatial resolution, average, K 
				
P: Av_nit03_08	
Economic activity, 2003–2008, 0.56 km spatial resolution, 1 to 63 (values) (Average for villages that fall inside a forest polygon)
				
Q: AvailableSWC	
Available soil water capacity, 2000, average, Coded values 1 to 7; 1 = 15 cm water per m of the soil unit, 2 = 12.5 cm, 3 = 10 cm, 4 = 7.5 cm, 5 = 5 cm, 6 = 1.5 cm, 7 = 0 cm.
				
R: Soil_depth	
Soil depth, 2000, reference soil depth, average, cm
				
S: TopSoil_Carbon	
Topsoil Carbon Content, based on the carbon content of the dominant soil type in each regridded cell rather than a weighted average, kg C m-2
				
T: SubSoil_Carbon	
Subsoil Carbon Content, based on the carbon content of the dominant soil type in each regridded cell rather than a weighted average, kg C m-2
				
U: TopSoil_OC	
Topsoil Organic Carbon, % weight
				
V: SubSoil_OC	
Subsoil Organic Carbon, % weight
				
W: TopSoil_PH
pH (Top Soil), Topsoil pH (in H2O), -log(H+)
				
X: TopSoil_BulkDen	
Top Soil Bulk Density, Reference bulk density values are calculated from equations developed by Saxton et al. (1986) that relate to the texture of the soil only, kg dm-3
				
Y: TopSoil_CEC	
Top Soil Cation Exchange Capacity, Cation exchange capacity of the clay fraction in the topsoil, cmol per kg
				
Z: SubSoil_CEC	
Sub Soil Cation Exchange Capacity, Cation exchange capacity of the clay fraction in the subsoil, cmol per kg
				
AA: number_of_fires	
Number of forest fires, 2003–2008, Number
				
AB: Road_Density
Road density, km/km2 (Average for villages that fall inside a forest polygon)

AC: GL2000_Crop_area
Area under crop acreage, 2000, 30 m resolution, ha
				
AD: GL2000_Grass_area	
Area under grass coverage, 2000, 30 m resolution, ha
				
AE: GL2000_Bareland_area	
Area under bare land acreage, 2000, 30 m resolution, ha
				
AF: FC_2003HA	
Baseline forest cover, 2003, 24 m resolution, Forest cover = Open forest + Moderately dense forest + Very dense forest
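
The coded values of AvailableSWC (column Q) can be translated back to water storage capacity with a lookup table, as in the Python sketch below. The mapping is taken from the variable description above; the function name is hypothetical. Because the column holds polygon averages, values may fall between the integer codes, in which case this lookup does not apply directly.

```python
# Code -> available water capacity, in cm of water per m of the soil unit.
AWC_CM_PER_M = {1: 15.0, 2: 12.5, 3: 10.0, 4: 7.5, 5: 5.0, 6: 1.5, 7: 0.0}

def decode_awc(code):
    """Translate an integer AvailableSWC code (1 to 7) into cm of water per m."""
    if code not in AWC_CM_PER_M:
        raise ValueError(f"unexpected AvailableSWC code: {code!r}")
    return AWC_CM_PER_M[code]

print(decode_awc(1))   # 15.0
print(decode_awc(6))   # 1.5
```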
			
----------------------------------------
DATA-SPECIFIC INFORMATION FOR: plantation_prediction_Rcode
-----------------------------------------
<R code used in the analysis>