The Yhat Blog


machine learning, data science, engineering


ScienceOps Spotlight Pt. 1: Systems Monitoring

by Elise


Intro

First of all, for those of you who aren’t familiar with Yhat: our mission is to make it easy for data scientists to deploy predictive models.

Our flagship product, ScienceOps, enables data scientists to put predictive models into production applications without the hassle of translating code. The secret? ScienceOps makes R and Python models immediately accessible via standard REST API requests.

In the next two blog posts, we’ll spotlight two new additions to ScienceOps: systems monitoring and predictive analytics. Today we’ll focus on systems monitoring: why it matters and what we built.

Without further ado...

Why it matters

Data science teams and admins need to know whether the machines ScienceOps is running on are healthy so they can detect and remedy failed services or processes immediately and keep their applications running smoothly. Simple, yet critical.

Not what you want your sys admin to be asking.

What we built

System Overview

The System Overview page is a convenient dashboard that shows the overall health of your ScienceOps system and how resources are being utilized by your Master and Worker nodes (more on ScienceOps architecture here).

Each metric is in either a green or a red state. Green means that utilization is low or that a system check has passed. Red means that a system check has failed or that resource utilization is above 80% of the total available to the ScienceOps process. If a check transitions to a red state, you should contact your sys admin to investigate further.
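To make the green/red rule concrete, here is a minimal Python sketch of how such a classification could work. It is an illustration only, not ScienceOps code: the function names and the `RED_THRESHOLD` constant are hypothetical, and the only detail taken from the post is the 80% cutoff.

```python
# Hypothetical illustration of the green/red rule described above.
# None of these names come from ScienceOps itself.

RED_THRESHOLD = 0.80  # from the post: red when utilization is above 80%


def utilization_state(used: float, available: float) -> str:
    """Classify a resource metric as 'green' or 'red' by its utilization."""
    if available <= 0:
        raise ValueError("available must be positive")
    return "red" if used / available > RED_THRESHOLD else "green"


def check_state(passed: bool) -> str:
    """Classify a pass/fail system check as 'green' or 'red'."""
    return "green" if passed else "red"
```

Under these assumptions, a worker using 6.5 GB of the 8 GB available to it (about 81%) would show red, while one using exactly 80% would still show green, since the rule as stated is "above 80%".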

Green is good; red is bad. We were feeling innovative.

Graphite Integration

In addition to the monitoring built into our software, ScienceOps ships with a built-in Graphite integration for tracking server-side metrics.

If you aren’t familiar with Graphite, it’s a tool for monitoring and graphing the performance of computer systems. Lots of big companies like Etsy, Google and Orbitz use it because it’s enterprise-scale, runs well on cheap hardware, and is super simple to send your data to. Also, it’s free under the open source Apache 2.0 license.
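To give a feel for how simple that is, here is a rough Python sketch of Graphite's plaintext protocol, in which each data point is a single line of the form "metric.path value timestamp" sent over TCP (port 2003 by default). The helper names and the metric path are made up for illustration; this is not how ScienceOps itself reports metrics.

```python
import socket
import time
from typing import Optional


def format_metric(path: str, value: float, timestamp: Optional[int] = None) -> str:
    # One plaintext-protocol line: "<metric.path> <value> <unix timestamp>" plus newline.
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"


def send_metric(host: str, path: str, value: float, port: int = 2003) -> None:
    # Open a TCP connection to the Carbon plaintext listener and send one data point.
    line = format_metric(path, value)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
```

With a Graphite server of your own, a call like send_metric("graphite.example.com", "demo.worker1.cpu", 0.42) would record one data point; both the host and the metric path here are placeholders.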

Once you configure ScienceOps to point to your own Graphite server, the master node collects and sends a variety of system, network and prediction metrics from each worker node to the specified server. You can check out the full list of metrics ScienceOps tracks here.

ScienceOps users at National Funding, a fintech company specializing in small business loans, use these Graphite metrics to gain greater visibility into the load on their virtualized cluster’s resources.

“Using the Grafana dashboard [to visualize Graphite data] we’re able to easily tell if our master node or worker nodes are having issues by their utilization of memory and CPU,” explains Abe Burnett, Data Scientist.

National Funding monitors the load on their virtualized cluster's resources using a Grafana dashboard.

Debug Server

Still haven’t monitored to your vigilant heart's content?

Last but not least, ScienceOps also ships with a debug server that serves even more extensive metrics for each worker. For a full list of the System and Docker checks included in the debugger’s JSON response object, check out our docs here.

More?

Like what you see and interested in deploying and managing predictive models with ScienceOps? Check out our site to get more info, download the data sheet or request a live demo.

Also, be sure to snag an invite to our Applied Data Science + ScienceOps Demo Webinar tomorrow, Thursday, April 28th, at 1 PM EST with Yhat cofounders Austin Ogilvie and Greg Lamp.

Finally, stay tuned for next week’s post about Predictive Analytics in ScienceOps!


