TL;DR
Machine Learning empowered Node Autocomplete is a feature in Dynamo that takes Dynamo’s existing Node Autocomplete (Node Type Match) and supercharges it by recommending nodes. Using hierarchically ranked results, trained on real Dynamo graphs, it will give you much more relevant options to build your graphs at speed, with precision. Over time the data that trains these models will get better and better, meaning higher and higher precision. #rockOn
What is Autocomplete?
Imagine you’re writing a story or a sentence, but you don’t want to type the entire word every time. Luckily, your word processor has a feature that predicts the word you’re going for and fills it in for you. When you’re using a scripting language like Python or C#, you write lines of code to tell the computer what to do. But sometimes, these languages have a lot of words, commands, and functions that you need to remember and that gets overwhelming fast! #cognitiveOverload
So, Autocomplete is a special feature that helps you write code faster and with fewer mistakes. When you start typing something, like a variable name or a function or the beginning of a word, the programming environment or text editor you’re using will suggest possible options based on what you’ve written so far. It looks at the words you’re typing and tries to guess what you might want to write next. Pretty cool eh!
You can see how Dynamo uses Autocomplete features below in DesignScript (By using a period, or dot, after a Method name) and in Python in much the same way.
Node Autocomplete is a feature of Dynamo that uses this same type of intelligent code completion, but in a Nodal form. When you double click a port, Node Autocomplete will show you a list of nodes to choose from to continue building your graph. This is super cool because, unlike traditional scripting Autocompletes, you can do this in Dynamo in both directions: Upstream to start at the end of your graph and work backwards, or Downstream in the traditional way, and you can trigger it from both input and output ports. This unlocks the ability for you to start your graph from anywhere! Just starting on your Dynamo journey, and know you want to place a Wall in Revit? Then start with a Search for “Wall”, place the node that looks best and use Node Autocomplete to see where it takes you!
And, what is Machine Learning?
Machine Learning (Commonly referred to as ML) is seen as a subset of Artificial Intelligence (AI) that specifically seeks to use computer algorithms that improve automatically through experience. These ML algorithms build a mathematical model using training data (The ground truth), which then makes predictions on new, previously unseen data. In contrast to traditional algorithmic practice (Deterministic, rule-based approach), ML excels in two primary areas: where you cannot code the rules, or where you cannot scale using the traditional approach. In these cases, the machine can be more effective if it creates its own algorithm, one that allows the ML model to learn from existing data and discover patterns to then apply to new data. In short, ML is teaching computers to learn and make decisions by themselves. #ohHelloFutureISeeYou
But how do Node Autocomplete and Machine Learning intersect?
Up until now, Node Autocomplete in Dynamo was built to match Node Types (Or more accurately, port data types): it truncated the vast number of Nodes in the library to only those that matched the correct object type, or had a variable input (Which means they could accept anything). Instead of trying to understand the relationships between 1000-2000 nodes, you only needed to understand a more reasonable subset: those nodes whose object types match (For example string to string, or number to number). Yay for simplifying things.
But this approach was also problematic. It started to address issues of scale and knowledge overwhelm (Awesome), but had countless edge cases where expected nodes wouldn’t show up (Such as All Elements of Category after a Category node) or a somewhat random list showed up under variable input ports that could take anything (Not quite as awesome). Enter the Recommended option, an approach that leans into ML to use real-world behaviors to predict which nodes you may want and ranks them accordingly. This gets around the scale problem, as we will always have a hierarchically ranked list of relevant nodes, and the edge case problem, as nodes being used in the real world by users such as you will show up in these ranked lists via the training data sets.
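To make the Node Type Match idea concrete, here is a minimal Python sketch of that kind of filter. It is only an illustration: the `NodeInfo` class, the tiny `LIBRARY` list, the simplified type strings, and the `"var"` marker are all assumptions made for this example, not Dynamo's actual data model or code.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a library node; not Dynamo's real data model.
@dataclass(frozen=True)
class NodeInfo:
    name: str
    input_type: str  # "var" marks a variable input that accepts anything


# Tiny made-up library with simplified types, for illustration only.
LIBRARY = [
    NodeInfo("String.Length", "string"),
    NodeInfo("String.ToUpper", "string"),
    NodeInfo("Math.Sqrt", "number"),
    NodeInfo("List.Count", "var"),
]


def type_match(output_type, library):
    """Keep nodes whose input type equals the output type, or that accept anything."""
    return [n.name for n in library if n.input_type in (output_type, "var")]


matches = type_match("string", LIBRARY)
print(matches)
```

Starting from a string output, the number-only node drops out while the variable-input node stays, which is also why variable inputs can end up with long, loosely related lists.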
OK, but why the Recommended option in Node Autocomplete?
Traditional content systems, such as Libraries, are fantastic in certain cases, but don’t scale very well. A library system of a few hundred nodes is fine, but what happens if a future Dynamo has 10,000 nodes? How can any one person understand the entirety of that list not just by name, but by purpose and how to use them? Quite the tall order for the old brain.
The Recommended option in Node Autocomplete is an alternative method to graph building that leans on communal experience to help people build Dynamo graphs that are sensible and useful in real-world use cases. As data is the lifeblood of ML models, they will only get better over time and safeguard against mental overwhelm as Dynamo expresses more and more nodes or connects to more and more APIs (Things you can do). #workingBetterTogether.
All data used to train the ML model are ethically sourced, from opted in data-sets, sample graphs, the Dynamo Dictionary, a bunch of Jacob Small’s and my play graphs, graphs from the Dynamo Office Hours series etc. It does not look at general user data, which also means it will not take into account what you, specifically, do most often in your own graphs. One point I would highlight is that the recommendations represent the data distribution. In other words, if there are holes in the ML-based recommendations, they reflect holes in our data. There are cases where we expect this will happen (such as new nodes, nodes that have been renamed etc.). In other cases it is a matter of not having enough representation for certain workflows/packages/Dynamo integrators etc. Any feedback regarding identified holes would be welcome to help us plug them.
And why the Recommended option over Node Type Match?
They both do work well, for different scenarios, but the static Node Type Match approach will always require more brainpower than the ML version, as the ML version will get smarter and smarter over time. They each do have their place, but we see much more potential to empower you with the ML approach. Take for example, a node that is higher in the object inheritance tree (For example a Curve.Length node which works on any type of Curve) that works at lower levels (Such as a Line, Arc, or PolyCurve). In the ML approach this is in your list of options, but in the Node Type Match, it is not because it technically doesn’t match type.
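The inheritance point can be sketched in a few lines of Python. This is an assumed, simplified picture: the `Curve`, `Line` and `Arc` classes and the `NODE_INPUTS` table are made up for the example, and the Recommended option surfaces a node like Curve.Length from usage data rather than from a rule like the second function below. The sketch only shows why a strict type comparison misses it.

```python
# Hypothetical miniature type tree; not Dynamo's real geometry classes.
class Curve:
    pass


class Line(Curve):
    pass


class Arc(Curve):
    pass


# Assumed mapping of node name -> input type, for illustration only.
NODE_INPUTS = {
    "Curve.Length": Curve,
    "Line.Direction": Line,
    "Arc.Radius": Arc,
}


def exact_match(output_type):
    """Strict comparison: only nodes declared on exactly this type."""
    return [name for name, t in NODE_INPUTS.items() if t is output_type]


def inheritance_aware(output_type):
    """Also accept nodes declared on any parent of this type."""
    return [name for name, t in NODE_INPUTS.items() if issubclass(output_type, t)]


exact = exact_match(Line)
aware = inheritance_aware(Line)
print(exact, aware)
```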
You should use the Node Type Match approach when discovering possibilities within a single shelf in the Library (Or more technically accurate, within the same tier of inheritance), and you should use the Recommended approach by default unless it doesn’t offer suggestions you want. In time, the Recommended approach will get more and more precise, allowing for swifter and more accurate graph creation.
So, how can I use the Recommended option?
Just like the Node Type Match, you trigger Node Autocomplete in the same way, and have much the same features. The major difference is the hierarchical ranking based on confidence, which represents the probability that the node is the correct choice for you:
- You trigger Node Autocomplete by double clicking on any port, input or output, to pop up the Node Autocomplete dialog.
- You can switch between both methods: Node Type Match or Recommended, and Dynamo will remember your choice between sessions.
- Reading the Extended Help by clicking on the information icon (i) allows you to go deeper to explore the meaning behind the How and the What of Node Autocomplete.
- Hovering over the confidence pops up an in-depth tooltip.
- ML uses this notion of Confidence, which involves the ML model estimating the result of an ML algorithm on previously unseen data. The confidence is a way of quantifying the uncertainty of an outcome. In the context of our model it represents how likely it is for an event (next node) to happen among all other possible events (all other nodes). Here, all results add up to near 100% and are hierarchical in nature. A set of 5 results with the highest being 45% is no better nor worse than a set of 20 results with the highest being 17%.
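As a rough illustration of how a set of confidences can add up to 100% and be ranked, here is a small Python sketch. The post does not say how the model computes its confidence values, so the softmax normalisation used here is an assumption, and the scores and the `to_confidences` helper are made up for the example.

```python
import math


def to_confidences(scores):
    """Turn raw scores into ranked percentages that sum to 100.

    Uses softmax, which is an assumed normalisation for this sketch.
    """
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(s - peak) for name, s in scores.items()}
    total = sum(exps.values())
    ranked = sorted(exps.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, 100.0 * e / total) for name, e in ranked]


# Made-up scores for three candidate next nodes.
scores = {"All Elements of Category": 2.0, "List.Count": 0.5, "Watch": 0.1}
confidences = to_confidences(scores)
for name, pct in confidences:
    print(f"{name}: {pct:.1f}%")
```

Because the percentages are shared out among all candidates, adding more candidates lowers the top value without making the top suggestion any less useful, which is the point of the 45% versus 17% comparison.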
Cool, can you tell me what is going on Technically?
The Node Autocomplete feature ranks the predictions based on the probability number, so the most likely node-port combinations appear at the top of the recommendation list. To build the current model, we focused on usage patterns at the level of pairs of nodes. The probability of a node coming up next depends only on the node and port that triggered a Node Autocomplete query, not considering the rest of the graph. This stage of work provided more insight into our data and, through user feedback, will hopefully help us identify holes in our data that we can address in our next steps.
The ML Node Autocomplete service is hosted, requiring network traffic to work, and thus will not work when disconnected from the internet. Upon triggering the service on any given port for the first time, there will be a slight delay while it queries the results, but subsequent triggering should be swift as we cache those results temporarily.
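A pair-level model of this kind can be sketched with simple counting. The Python below is a toy stand-in, not the actual model: the `CONNECTIONS` data, the `train` and `recommend` helpers, and the choice of plain frequency counts are all assumptions for illustration. What it does share with the description above is that the prediction depends only on the triggering node and port.

```python
from collections import Counter, defaultdict

# Made-up training connections: (source node, source port, next node).
CONNECTIONS = [
    ("Categories", "Category", "All Elements of Category"),
    ("Categories", "Category", "All Elements of Category"),
    ("Categories", "Category", "All Elements of Category"),
    ("Categories", "Category", "Watch"),
]


def train(connections):
    """Count how often each next node follows a given (node, port) pair."""
    counts = defaultdict(Counter)
    for node, port, next_node in connections:
        counts[(node, port)][next_node] += 1
    return counts


def recommend(counts, node, port):
    """Rank next nodes by observed frequency for this (node, port) only."""
    seen = counts.get((node, port), Counter())
    total = sum(seen.values())
    return [(name, n / total) for name, n in seen.most_common()]


model = train(CONNECTIONS)
ranked = recommend(model, "Categories", "Category")
print(ranked)
```

A pair that never appears in the training data returns an empty list here, which mirrors the holes in recommendations that come from holes in the data.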
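The caching behaviour described above might look something like the following Python sketch. Everything here is hypothetical: the `CachedRecommender` class, the five-minute expiry, and the `fake_service` function standing in for the hosted service are assumptions, since the post does not describe how the real cache is implemented.

```python
import time


class CachedRecommender:
    """Wrap a slow lookup and reuse its results for a limited time."""

    def __init__(self, fetch, ttl_seconds=300.0):
        self._fetch = fetch      # function that calls the (slow) service
        self._ttl = ttl_seconds  # assumed expiry; the real value is not stated
        self._cache = {}         # (node, port) -> (timestamp, results)

    def query(self, node, port):
        key = (node, port)
        now = time.monotonic()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]        # fast path: served from the cache
        results = self._fetch(node, port)
        self._cache[key] = (now, results)
        return results


calls = []


def fake_service(node, port):
    """Stand-in for the hosted service; records each call it receives."""
    calls.append((node, port))
    return ["All Elements of Category"]


recommender = CachedRecommender(fake_service)
first = recommender.query("Categories", "Category")
second = recommender.query("Categories", "Category")
print(len(calls))
```

Only the first query for a given port reaches the service; the second is answered from the cache until the entry expires.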
Do you have next steps?
Noting that the scaling of a library won’t meet the demands of a more complex world, and that Dynamo is connecting to more and more things, we have big plans to push this approach further to dramatically reduce the burden on you, the end user of Dynamo. We are looking at ways to make it more contextually aware, to have a continuous training pipeline so the model stays up to date with new nodes, package nodes and changing usage trends, to explore how to place not just one, but multiple nodes at once, and much more! #theFutureIsBright
Call to Action!
So how can you help? Please do test Recommended Node Autocomplete (ML Empowered) in your work and give us any feedback in this forum thread. This will allow us to identify real-world scenarios where we have missing holes, that we need to patch up in our data-set. This model will get better with time, and we can do that even faster with help from all of you. #buildingABetterDynamoTogether