NYGeog

Machine Learning Course Paper: Predicting Coreference Labels from Spoken Dialogue Tasks with Machine Learning: One-Hot Encoding and Feature Importances Using Extremely Randomized Trees

May 18, 2016

I recently competed in a Kaggle competition as part of our Machine Learning for Data Science course at Columbia University. I implemented my algorithm in Python with scikit-learn. I thought I'd share a diagram illustrating the steps, along with the paper write-up.

The flow: [diagram: ml]
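For readers who want a feel for the two techniques named in the title, here is a minimal sketch of one-hot encoding followed by feature importances from extremely randomized trees in scikit-learn. It is not the code from the paper: the toy data, the column names (`speaker`, `pos_tag`), the label rule, and the hyperparameters are all made up for illustration.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: two categorical columns standing in for
# dialogue features. These are NOT the competition features.
rng = np.random.RandomState(0)
n_samples = 200
speaker = rng.choice(["user", "system"], size=n_samples)
pos_tag = rng.choice(["NN", "PRP", "DT"], size=n_samples)
X_raw = np.column_stack([speaker, pos_tag])

# Made-up label that depends only on pos_tag, so the importances
# have something to find.
y = (pos_tag == "PRP").astype(int)

# One-hot encode: each category becomes its own 0/1 column.
encoder = OneHotEncoder(handle_unknown="ignore")
X = encoder.fit_transform(X_raw).toarray()

# Fit an ensemble of extremely randomized trees.
model = ExtraTreesClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Build readable names for the encoded columns from the fitted categories.
columns = ["speaker", "pos_tag"]
feature_names = [
    f"{col}={cat}"
    for col, cats in zip(columns, encoder.categories_)
    for cat in cats
]

# Rank the encoded columns by impurity-based importance.
importances = model.feature_importances_
ranked = sorted(zip(feature_names, importances), key=lambda p: p[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Because the importances are reported per encoded column, a categorical variable's overall contribution is the sum over its one-hot columns.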

The full paper text:


Geographer/GIS Analyst/Solutions Engineer/Data Scientist working at Carto (CartoDB). Formerly of Columbia University and AECOM. Studied at the Institute for Data Science and Engineering at Columbia University. Geospatial Python programmer. Geoprocessing, data work, and application development with ArcPy, Shapely, GDAL/OGR, FOSS4G, PostGIS, Pandas, Flask, AWS, CartoDB, JavaScript, and online/desktop mapping/GIS tools.