Project 4 at Metis focuses on unsupervised machine learning and
Natural Language Processing. But we also incorporate NoSQL for our data storage,
Flask as a Python-based web framework, and visualization with D3.
But what topic should I work on? I really enjoy doing crossword puzzles and I’ve
been working on them in the mornings before Metis to get my brain started.
Maybe I can combine crossword puzzles with data science?!
So this is the final week of Metis and I’m busy working on the final project
with crossword puzzles, but I also had some time to reflect on my journey so far.
I’ve put in a lot over these 12 weeks, worked really hard, learned a lot, and
have some great projects to demonstrate. But at the same time I feel like I
know less about data science today than I did coming in. Sounds bad but
it’s good. Here’s what I mean.
Another 2 weeks have passed swiftly at Metis. We worked on and presented our 4th
project around natural language processing and unsupervised machine learning, and
we learned the basics of distributed computing with Hadoop, MapReduce, Hive, and
Spark. Here’s a summary of what I worked on and some of my thoughts:
On Wednesday, I went to NYC Open Data Meetup where Dr. Kirk Borne spoke. This
guy’s literally a genius. Right now he’s the Principal Data Scientist at Booz
Allen Hamilton. He has a PhD in Astrophysics, taught at George Mason University
and worked with large-scale data systems for decades, including 18 years at NASA. He
talked extensively about data science, but here’s a snippet of my notes: