The Yhat Blog


machine learning, data science, engineering


Machine Learning and Data Science Resources You Should Know About

by Elise



Mental Kaleidoscope


If you're reading this, you already know (or could reasonably conclude by powers of deduction) that we (Yhat) have a blog. The tagline of our blog is simple: machine learning, data science, engineering. Those are the things our team writes about a few times a week.

We like to think we have some pretty good ideas (flying a semi-autonomous drone around the office, for example), but those ideas are really just some combination of the team's thoughts, our readers' ideas, and what we read and steal from all of our favorite bookmarked sites, newsletters, blogs and community forums.

In the words of Austin Kleon (a very cool writer/artist), "Every new idea is just a mashup or a remix of one or more previous ideas."

Not surprisingly, someone clever said something similar over a century ago. "There is no such thing as a new idea. It is impossible. We simply take a lot of old ideas and put them into a sort of mental kaleidoscope..."

All that to say, here are some of the places we go to fill our mental kaleidoscopes.

Community Forums


Reddit

The first place I go to get a pulse on what's happening in ML and data science is Reddit. Some subreddits (how Reddit entries are organized) have a bad reputation for being troll-y or heartless, but IMO the community is honest and engaged, at least on the pages that I frequent.

Yes, users get rather opinionated, but good posts make their way to the top via upvotes, and the comment threads often have a lot of really good feedback.

  • Python Subreddit: Great for staying up to date on new libraries and releases. Tutorials and blog posts make their way here, and there's lots of community support and active discussion if you have a Python question.

  • Machine Learning Subreddit: Smaller audience, with a slightly more academic feel than the Python subreddit. Lots of good arXiv papers. TensorFlow and deep learning are all the rage right now.

  • R Subreddit: For whatever reason, the R community on Reddit is smaller and less engaged than Python's. TBH I don't check this one too often, though you can find the occasional gem.

Upside: Fastest moving channel I've found. Very specific thanks to good subreddits. Downside: If you post poo, you will be made aware of it, rather publicly.

DataTau

Obvious one to bookmark. DataTau is a Hacker News for data scientists, started in 2013 by then-grad student Rohit Sivaprasad.

In his words, "I want people — if they work on something cool in data science, I want them to post it there."

Upside: Short titles. Quick scan of what's happening in data science. Downside: It hurts my eyes. Not very discussion oriented.

Hacker News

Oldie, but goodie. Hacker News has been around since 2007. It's run by Paul Graham's startup incubator, Y Combinator (Yhat was YC Class of Winter 2015), and is a social news website focusing on computer science and entrepreneurship. Well, sort of, anyway. "Anything that gratifies one's intellectual curiosity" can be submitted.

Upside: Broad and interesting. Downside: Easy to lose an afternoon to it. Less focused than the other sources I've mentioned.

Newsletters


DataScience Weekly

I love Thursdays. DataScience Weekly is a no-fluff newsletter with the editors' (Hannah & Sebastian, lovely folks) picks for data science articles & videos that appeared on the internet that week.

They also include a few job postings, training and resources, plus the occasional O'Reilly book.

Upside: Succinct and well curated. They've been at it for a while (Issue 142 tomorrow). Downside: You have to wait till Thursday to get it!

Python Weekly

Thursday is actually a double-whammy. Python Weekly also arrives that day. Pretty similar layout to DataScience Weekly, except it also includes a nice roundup of interesting Python projects, tools and libraries that came out that week.

There's also a nice section for meetups/events/webinars for folks looking to get plugged in to the Python community.

Upside: Excellent descriptions of new Python tools/libraries. Downside: Not all article/tutorial links have descriptions, so it takes a little longer to figure out which ones are worth checking out.

Blogs


  • FastML: Machine learning made easy. FastML probably grew out of a frustration with papers you need a PhD in math to understand and with either no code or half-baked Matlab implementation of homework-assignment quality. Also, every blog should have a contents page like his. So. Helpful.

  • Airbnb Engineering: The Airbnb engineering team has an awesome rep for being awesome nerds, and rightfully so. The blog is organized into Code | Tech Talks | Open Source | News | Data, all of which are worth checking out.

  • DataCamp: A blog to show you how to do data analysis like a pro. Great for folks learning R & Python. The blog also links to DataCamp's "Open Courses", mini courses that are free to make and take.

  • Google Research: Google is awesome about pushing machine learning forward, open sourcing their stuff, and releasing really good content (videos, papers, tutorials) about actually putting that stuff to use.

  • Dataquest: Nice mix of code-based posts and practical how-tos for budding data scientists (e.g. how to build a data science portfolio).

  • Springboard: Collection of articles focused mostly on launching a data science career (data science interview advice, what factors affect data scientists' salaries, and the like).

  • Yhat Blog: That hyperlink is a little inception-y/repetitive. But if you liked what you read, subscribe to our blog up at the top of the page to get notified when we publish a new post.


