- Find resources on random forests
- Dive into data
- Set up weekly meetings
- Create weekly work schedule (i.e. timing of work sessions)
- Address expected data preprocessing challenges
- Pick a city to start with: Hong Kong.
- *Using contest data*
Get Landsat images from USGS Earth Explorer for that city on its two dates - Using contest data.
Clip image to cover city
Figure out what ENVI FLAASH is - Only for curiosity's sake, using contest data.
Read about bilinear resampling with spatial data
- Solidify plan of attack for data preprocessing.
- Schedule Oral Exam
- Migrated to week 3.
Start writing background info/introduction
Go through 2+ random forest resources
- Data preprocessing steps
- Determine how you get raster data into R.
- How can you get the raster data to be a data frame (& does it even make sense to do that?). It's seriously just `as.data.frame(dat, xy=TRUE)`
- Determine how to use only certain bands as input data. Can you just filter it? Yes
- How do I know that the columns of my dataframe are actually what I'm looking for? Where are the bands? See next primary goal below (from meeting notes issue) for strategy
- Migrated to week 4.
How did they randomly and evenly divide the polygons? Test if time allows
- Attempt to get from TIFFs to a data frame appropriate for feeding into the random forest function.
- Start references
- Finish data preprocessing
- Combine bands and get rid of ones not used in paper
- Migrated to week 4.
Figure out how to/how they randomly and evenly divide the polygons.
- Go through 2+ random forest resources
- What are the things/arguments that I control when I run that randomForest function?
- Start writing background info/introduction
- Fit Different Classification Schemes
- How did they randomly and evenly divide the polygons? Test.
- Go through 2+ random forest resources
- Start writing methods
- Randomly and Evenly Divide the Polygons.
- Fit Different Classification Schemes
- At least set up accuracy metrics (add calculation formats into R somewhere)
- Organize repo, especially docs section
- Cite some of the claims in the introduction
- Finish Methods Section
- Description of data (actual volume of data involved + how this data relates to the model)
- Explanation of how a typical decision tree is built
- Explain Gini impurity & information gain, and what "best" means when choosing a split
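
For the Gini impurity and information gain item above, a minimal base R sketch may help when drafting the explanation. The function names (`class_props`, `gini_impurity`, `entropy`, `split_gain`) and the toy labels and band values are hypothetical, made up for illustration; this is not how `randomForest` computes splits internally, only the textbook definitions under the assumption of a single binary split.

```r
# Class proportions for a vector of labels (empty input gives numeric(0)).
class_props <- function(labels) {
  if (length(labels) == 0) return(numeric(0))
  as.numeric(table(labels)) / length(labels)
}

# Gini impurity: 1 - sum(p_k^2). Zero for a pure node.
gini_impurity <- function(labels) {
  p <- class_props(labels)
  1 - sum(p^2)
}

# Shannon entropy in bits: -sum(p_k * log2(p_k)), skipping empty classes.
entropy <- function(labels) {
  p <- class_props(labels)
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Impurity reduction from splitting `labels` by the logical vector `go_left`.
# With impurity = entropy this is information gain;
# with impurity = gini_impurity it is the decrease in Gini impurity.
split_gain <- function(labels, go_left, impurity = entropy) {
  left <- labels[go_left]
  right <- labels[!go_left]
  n <- length(labels)
  weighted_child <- (length(left) / n) * impurity(left) +
    (length(right) / n) * impurity(right)
  impurity(labels) - weighted_child
}

# Toy example (hypothetical values): four pixels, one band, two classes.
labels <- c("water", "water", "urban", "urban")
nir <- c(0.1, 0.2, 0.6, 0.7)

gini_impurity(labels)                          # 0.5
split_gain(labels, nir < 0.4)                  # 1 bit: the split is perfect
split_gain(labels, nir < 0.4, gini_impurity)   # 0.5
```

In this framing, the "best" split at a node is the candidate threshold with the largest impurity reduction.
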
- Start preparing results figures
- Table/Plot that shows the overall accuracy (OA) over the tuning parameters tried
- Variable importance plot
- One complete map of predictions
- Which models look the best? Why?
- Create F1 metric function
- Write Results
- Write conclusion
- Final evaluation of where things are at and setting what "finished" means by Thursday Meeting (18th)
- Finish writing by Thursday Meeting (25th)
- Finish polishing figures
- Prepare Presentation
- Submit paper March 2
- Oral Exam
- All degree requirements submitted
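
For the accuracy-metric items in the list above (overall accuracy and the F1 metric function), one possible base R sketch is shown here. The function names, the rows-as-predicted / columns-as-reference layout, the choice to score F1 as 0 when it is undefined, and the toy labels are all assumptions for illustration, not the conventions of the paper being replicated.

```r
# Confusion matrix with rows = predicted class, columns = reference class
# (assumed layout; some papers use the transpose).
confusion_matrix <- function(predicted, reference) {
  classes <- sort(unique(c(as.character(predicted), as.character(reference))))
  table(
    predicted = factor(predicted, levels = classes),
    reference = factor(reference, levels = classes)
  )
}

# Overall accuracy (OA): share of samples on the diagonal.
overall_accuracy <- function(cm) {
  sum(diag(cm)) / sum(cm)
}

# Per-class F1: harmonic mean of precision and recall.
f1_by_class <- function(cm) {
  tp <- diag(cm)
  precision <- tp / rowSums(cm)  # of samples predicted as k, share truly k
  recall <- tp / colSums(cm)     # of samples truly k, share predicted as k
  f1 <- 2 * precision * recall / (precision + recall)
  f1[is.nan(f1)] <- 0            # assumption: undefined F1 is scored as 0
  f1
}

# Toy example with hypothetical labels.
reference <- c("urban", "urban", "urban", "water", "water", "veg")
predicted <- c("urban", "urban", "water", "water", "water", "veg")

cm <- confusion_matrix(predicted, reference)
overall_accuracy(cm)    # 5/6
f1_by_class(cm)         # urban 0.8, veg 1.0, water 0.8
mean(f1_by_class(cm))   # macro-averaged F1
```

Building both metrics from one confusion matrix keeps the OA table and the per-class F1 values consistent with each other across tuning runs.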