The Yhat Blog


machine learning, data science, engineering


Rodeo v2.3.0: Ridiculous speed and pop-out plots

by Dane



So what's new?!

  • Major performance improvements for how tabs work (the long technical section below explains how I increased browser rendering performance by about 400%, if you're into that sort of thing)
  • "Pop-out" the plots tab (e.g. view plots alongside Rodeo on another monitor or split screen)
  • Change the working directory of the Console by clicking on the bottom gray bar beneath
  • Save empty files
  • New style of tabs
  • Added Noto fonts to better represent any unicode characters that are not in Roboto

The stuff above, but in a video

Summary of bug fixes

  • Rodeo used to look for "python" on startup even if a different path was set. Now it uses the configured path if available
  • New buttons to clear, interrupt and restart the Console (although there were menu and keyboard shortcuts before)
  • Removed the serif fonts that everyone (including me) hated, and replaced them with Roboto (or Helvetica Neue if available)
  • "Run Script" code now appears in the console history, for consistency

The Highly Technical Long Story

Performance

I should start out with an apology. I disappeared from the forums to focus on a particular bit of restructuring that, as I mentioned above, gives us some ridiculous speed improvements for how Rodeo renders in the browser. As people know, Rodeo runs in Electron, which is built on Chromium, the open-source core of Google Chrome. As such, performance can be roughly measured by the number of DOM elements changed after a user action: since browsers repaint changed regions with what is essentially a painter's algorithm, that count is a reasonable proxy for rendering cost.

React, the rendering library I'm using, assumes by default that when a component's state or props change, that component and everything nested inside it should be re-rendered. As a result, almost every user action in Rodeo was causing a nearly complete re-render of the entire screen, because Rodeo is a very complex application with a lot of elements containing other elements. You could see this when resizing the window or dragging the gray bar between the tab sections.

The only way to improve the situation is to override React's default behavior. If I could tell it exactly what had changed we would get a ridiculous performance improvement, because only a small section of the screen (and the DOM tree) would be updated after a user action.
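To make the payoff concrete, here is a toy model of that idea. It is not React and not Rodeo's code: `makeNode`, `countRenders` and the pane names are hypothetical, and the `shouldUpdate` hook only stands in for a `shouldComponentUpdate`-style check. It counts how many nodes get re-rendered when a single plot changes.

```javascript
// Toy component tree: NOT React, just a way to count rendering work.
function makeNode(name, data, children = []) {
  return { name, data, children };
}

// Walk the new tree against the previous one and count "renders".
// `shouldUpdate(prev, next)` is a hypothetical hook that decides whether
// a node (and therefore its whole subtree) needs to be rendered again.
function countRenders(prev, next, shouldUpdate) {
  if (prev && !shouldUpdate(prev, next)) {
    return 0; // skip this node and everything below it
  }
  let count = 1;
  next.children.forEach((child, i) => {
    const prevChild = prev ? prev.children[i] : undefined;
    count += countRenders(prevChild, child, shouldUpdate);
  });
  return count;
}

const editor = makeNode('editor', 'print(1)');
const consolePane = makeNode('console', '>>>');
const plotA = makeNode('plotA', 'a.png');
const plotB = makeNode('plotB', 'b.png');
const before = makeNode('root', null, [
  editor,
  consolePane,
  makeNode('plots', null, [plotA, plotB]),
]);

// One plot changes: only the path from the root down to that plot is rebuilt,
// every other node object is reused as-is.
const after = makeNode('root', null, [
  editor,
  consolePane,
  makeNode('plots', null, [plotA, makeNode('plotB', 'b2.png')]),
]);

const always = () => true;                          // re-render everything
const byReference = (prev, next) => prev !== next;  // skip untouched subtrees

const renderedByDefault = countRenders(before, after, always);      // 6
const renderedWithCheck = countRenders(before, after, byReference); // 3
```

In this toy tree the check halves the work. In a real application with hundreds of nested elements, the untouched subtrees are a much larger share of the tree.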

However, before this update Rodeo used a structure that made it difficult to map the DOM elements being rendered to the part of the app state that had changed. It grouped all the types of components together into lists, and then the tabs remembered which component they held. As I continued to add features to this pattern, the complexity skyrocketed and new features were taking longer and longer to build. No good.

This new update (2.3.0) resolves this problem by introducing an immutable data structure for application state. There are plenty of very interesting articles about the benefits of immutable data, but in our case the benefit is that I could arrange the state of the application to match the structure of the DOM tree. Then, to determine if a component should update, I only have to check whether the object references attached to that particular component have changed. Using a library called seamless-immutable, I threw away the old application state structure and rebuilt it completely -- you can check the changelogs, I've been very busy.
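A minimal sketch of why immutable state makes that reference check cheap. It uses plain `Object.freeze` and object spread rather than seamless-immutable, and the state shape and the `deepFreeze` and `replaceAt` helpers are hypothetical, not Rodeo's actual code.

```javascript
// Recursively freeze a plain object or array so it cannot be mutated in place.
function deepFreeze(value) {
  if (value && typeof value === 'object' && !Object.isFrozen(value)) {
    Object.values(value).forEach(deepFreeze);
    Object.freeze(value);
  }
  return value;
}

// Return a new state with `value` placed at `path`, copying only the objects
// along that path. Every other branch is shared with the old state.
function replaceAt(state, path, value) {
  if (path.length === 0) {
    return value;
  }
  const [key, ...rest] = path;
  const copy = Array.isArray(state) ? state.slice() : { ...state };
  copy[key] = replaceAt(state[key], rest, value);
  return Object.freeze(copy);
}

// Hypothetical state, arranged roughly like the panes on screen.
const state = deepFreeze({
  editorTabs: { active: 0, files: ['a.py', 'b.py'] },
  plots: { active: 0, items: ['a.png'] },
});

const next = replaceAt(state, ['plots', 'active'], 1);

const rootChanged = next !== state;                        // true
const plotsChanged = next.plots !== state.plots;           // true
const editorUntouched = next.editorTabs === state.editorTabs; // true
```

Because the editor branch keeps the same reference, a component attached to it can decide to skip its update with a single `===` comparison instead of a deep diff.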

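// True when neither the props nor the state differ from the new values,
// meaning the component can skip its re-render.
function shallowEqual(instance, newProps, newState) {
  return equal(instance.props, newProps) && equal(instance.state, newState);
}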

If the object references of the properties or the state of a component have changed, then that component should redraw.

React has a built-in component that I could have inherited from to do this -- however, the reference to the children of a component changes on every render and that built-in component does not take that into account, so I had to do something different.

const key = aKeys[i],
  aValue = a[key],
  // the 'children' key doesn't matter; it's not in our control
  isChildren = key === 'children',
  // functions don't matter either
  isFunction = typeof aValue === 'function',
  isValueEqual = aValue === b[key];

if (!isChildren && !isFunction && !isValueEqual) {
  return false;
}
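The excerpt above is the body of a loop. A self-contained sketch of how the pieces might fit together could look like the following. The surrounding loop, the key-count check and the `PlotPane` class are assumptions added for illustration, and `PlotPane` is a plain class standing in for a React component.

```javascript
// Shallow comparison that ignores `children` and function-valued keys.
// The loop and the key-count check are assumptions around the excerpt.
function equal(a, b) {
  if (a === b) return true;
  if (!a || !b) return false;

  const aKeys = Object.keys(a);
  const bKeys = Object.keys(b);
  if (aKeys.length !== bKeys.length) return false;

  for (let i = 0; i < aKeys.length; i++) {
    const key = aKeys[i];
    const aValue = a[key];
    const isChildren = key === 'children';
    const isFunction = typeof aValue === 'function';
    const isValueEqual = aValue === b[key];

    if (!isChildren && !isFunction && !isValueEqual) {
      return false;
    }
  }
  return true;
}

function shallowEqual(instance, newProps, newState) {
  return equal(instance.props, newProps) && equal(instance.state, newState);
}

// Hypothetical stand-in for a React component; only the update check matters.
class PlotPane {
  constructor(props) {
    this.props = props;
    this.state = null;
  }

  shouldComponentUpdate(nextProps, nextState) {
    return !shallowEqual(this, nextProps, nextState);
  }
}

const plots = Object.freeze(['a.png']);
const pane = new PlotPane({ plots, children: [1], onClick: () => {} });

// Same `plots` reference; new `children` and a new callback are ignored.
const samePlots = pane.shouldComponentUpdate(
  { plots, children: [2], onClick: () => {} },
  null
); // false

// A new `plots` reference triggers an update.
const newPlots = pane.shouldComponentUpdate(
  { plots: Object.freeze(['a.png', 'b.png']), children: [1], onClick: () => {} },
  null
); // true
```

One trade-off of skipping functions: if a callback prop is replaced with one that behaves differently, this check alone will not trigger a re-render, so the approach fits best when handlers are stable.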

Anyway, hopefully the app feels better now. 😅

Pop-out Plots

One of the major new features of this version of Rodeo is the ability to pop out a new window. This feature is highly experimental, but it takes advantage of my use of Redux for changes to application state. I'm copying a server architecture pattern by using the Redux dispatcher to share all new actions between all browser windows, as if they were just a stream of events that components can subscribe to with their reducers. If that sentence didn't make sense, just imagine every action that a user takes or that emerges from Python as an event flowing through a pipe that feeds into each window. All the parts of the window are listening for the events that they're interested in.

This means that I should be able to pop out any part of the app into its own window, and everything will continue to work as if it were all in the same window.
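The pipe idea can be sketched in a few lines of plain JavaScript. This is an illustration under assumptions, not Rodeo's implementation: `createStore` here is a tiny hand-rolled stand-in for a Redux store, `createActionPipe` and the action names are hypothetical, and in a real Electron app the broadcast would presumably travel over inter-process messaging between windows rather than a local array.

```javascript
// Tiny Redux-style store: state only changes by dispatching actions to a reducer.
function createStore(reducer, initialState) {
  let state = initialState;
  return {
    getState: () => state,
    dispatch: (action) => {
      state = reducer(state, action);
    },
  };
}

// Hypothetical hub that forwards every action to every attached window.
function createActionPipe() {
  const stores = [];
  return {
    attach: (store) => {
      stores.push(store);
    },
    broadcast: (action) => {
      stores.forEach((store) => store.dispatch(action));
    },
  };
}

// Each reducer reacts only to the actions it is interested in.
function plotsReducer(state, action) {
  return action.type === 'PLOT_ADDED' ? [...state, action.url] : state;
}

function outputReducer(state, action) {
  return action.type === 'CONSOLE_OUTPUT' ? [...state, action.text] : state;
}

// The main window holds everything; the popped-out window holds only plots.
const mainWindow = createStore(
  (state, action) => ({
    plots: plotsReducer(state.plots, action),
    output: outputReducer(state.output, action),
  }),
  { plots: [], output: [] }
);

const plotWindow = createStore(
  (state, action) => ({ plots: plotsReducer(state.plots, action) }),
  { plots: [] }
);

const pipe = createActionPipe();
pipe.attach(mainWindow);
pipe.attach(plotWindow);

pipe.broadcast({ type: 'PLOT_ADDED', url: 'plot-1.png' });
pipe.broadcast({ type: 'CONSOLE_OUTPUT', text: 'hello' });

// Both windows now list plot-1.png; only the main window recorded 'hello'.
```

Since every window sees the same stream of actions, a popped-out window needs no special synchronization logic beyond running the reducers for the parts it displays.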

So far I've enabled this feature with plots, but there is enough code that is making strange assumptions about various things that I want to roll out this feature slowly, testing everything along the way. Next week, I'll add the environment variables as something we can pop out, and the week after maybe something else.


