The Yhat Blog


machine learning, data science, engineering




  • ML Pitfalls: Measuring Performance (Part 1)

    by Eric | Mar 03 2015

    Common machine learning pitfalls and how to avoid them.


  • Base R Plots

    by Greg | Feb 23 2015

    Introduction to plotting and graphics in R (without ggplot2).


  • What is Linear Regression? A Qualitative Exploration

    by Greg | Feb 19 2015

    A high-level introduction to what linear regression is and how it works.


  • 11 Python Libraries You Might Not Know

    by Greg | Jan 20 2015

    A highlight of 11 lesser-known Python libraries that even experienced Pythonistas may not have seen!


  • Running R in Parallel (the easy way)

    by Greg | Jan 14 2015

    Running code in parallel is tricky. This post shows how to quickly (and easily) parallelize your R code.


  • Currency Portfolio Optimization Using ScienceOps

    by Ryan J. O'Neil | Jan 05 2015

    Create a currency portfolio optimization algorithm and deploy it to ScienceOps.


  • Scraping and Analyzing Baseball Data with R

    by Greg | Dec 23 2014

    A quick how-to on scraping and analyzing MLB data using R.


  • Reducing your R memory footprint by 7000x

    by Greg | Dec 17 2014

    R can be a bit bloated sometimes. Learn how to make your R models more efficient.


  • Naive Bayes in Python

    by Greg | Dec 11 2014

    How to implement your own Naive Bayes classifier in Python, with a detailed explanation of how it all works.


  • Introducing db.r

    by Greg | Dec 04 2014

    db.py but for R. A database library that makes working with SQL in R a little more enjoyable.