David Robinson of Stack Overflow came to speak at Metis on Thursday night. He was candid and shared extensively about his experiences. I know most programmers find Stack Overflow essential when they hit a problem, so it was cool to hear his insights about the company. Here is some of what he shared.
Favorite data science package? ggplot2 in R.
Advice for budding data scientists? Make public artifacts.
David did a PhD in Computational Biology at Princeton and was actually recruited by Stack Overflow because he was actively participating and answering questions on the site. The question that got Stack Overflow’s attention was a stats question: “what is the intuition behind beta distribution”. David was able to explain it clearly with a baseball example.
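For anyone curious what that intuition looks like in practice, here is a minimal Python sketch of the idea as I understand it: treat a player's batting average as a beta distribution and update it as hits and misses come in. The prior values and the season numbers below are my own illustrative assumptions, not figures from the talk, and the function names are made up for this example.

```python
def update_beta(alpha, beta, hits, misses):
    """Return the updated (alpha, beta) after observing hits and misses.

    For a beta prior on a success probability, each hit adds 1 to alpha
    and each miss adds 1 to beta.
    """
    return alpha + hits, beta + misses


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical prior: we expect a batting average of about 0.27
# before seeing the player bat (assumed values, for illustration only).
prior_alpha, prior_beta = 81, 219

# Hypothetical season: 100 hits out of 300 at-bats.
post_alpha, post_beta = update_beta(prior_alpha, prior_beta, hits=100, misses=200)

print("prior mean:", beta_mean(prior_alpha, prior_beta))
print("posterior mean:", beta_mean(post_alpha, post_beta))
```

The posterior mean lands between the prior guess and the raw season average, which is the part of the intuition I found most helpful: the prior acts like a head start of imaginary hits and misses.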
Some other blogs that he recommended:
And some data scientists on Twitter:
- Jeff Leek, Roger Peng and the Simply Statistics Blog
- Hadley Wickham, creator of ggplot2 in R
- Wes McKinney, creator of Pandas in Python
- Hilary Parker of Etsy
- And David’s own handle @drob
His advice for steps to take on a data science project?
- Start with a specific question
- Find appropriate data
- Look at a few examples
- Then scale
And here are some of my personal takeaways from this excellent talk:
- Set up a profile on Stack Overflow’s career/job site
- Start contributing on Stack Overflow (starting from smaller tags)
- Keep writing this blog
- Fix up my GitHub and annotate my repeatable code clearly
- Make as many projects and as much code public as I can