Paranet Actor Versioning
A key goal of the paranet is to make distributed programming as easy as writing monolithic programs. A critical part of writing any software is upgrading to newer versions of existing software. Paraflow has special handling for upgrades, and that handling must be extended to the paranet as well. This document details how paraflow and the paranet handle upgrades.
Paraflow
Paraflow actors are capable of running multiple versions of themselves. The paranet distinguishes between a “base actor” and an “actor” to handle this versioning split. In the context of paraflow, the “base actor” corresponds to the single running paraflow runtime instance for the actor, and an “actor” corresponds to a specific version of that base actor; all versions run in the same paraflow runtime. Specifically, versioned actors are defined with the ActorModel, and paraflow “base actors” are defined using an internal representation that is common to all actor versions. This internal model includes things like the common ID, public key, and connection details that are used by the runtime.
Paraflow is a language for planning and executing workflows. Workflow state is stored in a database local to the runtime, and when a workflow is created, the version of the actor that created it is stored in the workflow state. If a new actor version is later deployed and an old workflow then makes progress, the runtime downloads the actor version recorded in that workflow’s state and executes it to continue the workflow. For a concrete example, consider:
Actor A has versions: 1.1.1
- An event creates new goal with workflow w1
- 1.1.1 is latest available, it is used to run w1
- Insert state: w1, A, 1.1.1, state data…
- w1 is paused waiting for new data
- A version 1.2.0 is deployed, runtime is restarted
- An event creates new goal with workflow w2
- 1.2.0 is latest available, it is used to run w2
- Insert state: w2, A, 1.2.0, state data…
- New data for w1 arrives, triggering w1 to make progress
- Load state for w1
- See version 1.1.1
- Runtime downloads A@1.1.1
- w1 is executed using A@1.1.1
Because each workflow stays pinned to the actor version that created it, breaking changes can be deployed while existing workflows are still midway through. This matters because workflows in paraflow can be very long running (real-world examples have run for many months).
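The example sequence above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the `WorkflowRuntime` class, its method names, and the idea that a restart leaves only the latest version loaded are hypothetical and are not the paraflow runtime's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowRuntime:
    """Toy stand-in for the paraflow runtime of one base actor (hypothetical)."""
    actor: str
    deployed: list = field(default_factory=list)  # versions as (major, minor, patch)
    state_db: dict = field(default_factory=dict)  # workflow id -> pinned version
    loaded: set = field(default_factory=set)      # versions currently in memory

    def deploy(self, version):
        self.deployed.append(version)

    def restart(self):
        # Assumption: after a restart only the latest version is loaded.
        self.loaded = {max(self.deployed)}

    def create_workflow(self, workflow_id):
        # A new workflow is pinned to the latest version available right now.
        version = max(self.deployed)
        self.loaded.add(version)
        self.state_db[workflow_id] = version
        return version

    def resume_workflow(self, workflow_id):
        # Resuming reads the pinned version and fetches that model if missing.
        version = self.state_db[workflow_id]
        if version not in self.loaded:
            self.loaded.add(version)  # stands in for "runtime downloads A@version"
        return version


runtime = WorkflowRuntime(actor="A")
runtime.deploy((1, 1, 1))
w1_version = runtime.create_workflow("w1")       # pinned to 1.1.1
runtime.deploy((1, 2, 0))
runtime.restart()
w2_version = runtime.create_workflow("w2")       # pinned to 1.2.0
resumed_version = runtime.resume_workflow("w1")  # still runs on 1.1.1
```

The point of the sketch is that the version lives in the workflow state, not in the runtime, so deploying 1.2.0 does not change which code continues w1.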
Upgrading Workflows
Consider the example above with A@1.1.1 and A@1.2.0. Suppose that when 1.2.0 is created, the developer knows that there are no breaking changes. In this case, when A@1.2.0 is deployed, the developer can specify an upgrade from 1.1.x to 1.2.0. The paraflow runtime will then scan all existing incomplete workflows and update the state database to point them to 1.2.0 instead of 1.1.1.
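A minimal sketch of that scan, assuming the state database can be modeled as a dictionary of rows and that an upgrade is declared as a pattern such as `1.1.x` plus a target version. The row layout and the `matches` and `upgrade_workflows` helpers are hypothetical names for illustration.

```python
def matches(pattern, version):
    """True if a version such as '1.1.1' matches a pattern such as '1.1.x'."""
    p_parts = pattern.split(".")
    v_parts = version.split(".")
    return len(p_parts) == len(v_parts) and all(
        p == "x" or p == v for p, v in zip(p_parts, v_parts)
    )


def upgrade_workflows(state_db, actor, from_pattern, to_version):
    """Repoint every incomplete workflow of `actor` that matches the pattern."""
    upgraded = []
    for workflow_id, row in state_db.items():
        if row["complete"] or row["actor"] != actor:
            continue  # completed workflows and other actors are left alone
        if matches(from_pattern, row["version"]):
            row["version"] = to_version
            upgraded.append(workflow_id)
    return upgraded


state_db = {
    "w1": {"actor": "A", "version": "1.1.1", "complete": False},
    "w2": {"actor": "A", "version": "1.2.0", "complete": False},
    "w0": {"actor": "A", "version": "1.1.0", "complete": True},
}
upgraded = upgrade_workflows(state_db, "A", "1.1.x", "1.2.0")
```

Here only `w1` is repointed: `w2` is already on 1.2.0, and `w0` is complete, so its recorded version is left untouched.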
Distributed Workflow
Paraflow actors can distribute their workflow to other actors via PNCP delegation. As such, the paranet needs to be aware of paraflow actor versions and the skill requests they can issue, in order to keep this behavior consistent across all distributed workflows. This is detailed below in the Paranet section.
PNCP Callbacks
When a skill request, other PNCP message, or observation is sent from the broker to a paraflow runtime, the callback request includes the version number of the actor that the broker expects to execute the request. It is up to the paraflow runtime to determine how to handle this request. It has two possible branches:
- Non-upgrade: Download/load the model of the requested version and use it to execute the corresponding event/workflow.
- Upgrade: Look for an upgrade rule from the requested version number to a newer version number. That is, the runtime looks up the version in a table of version upgrades to see if it can be automatically upgraded to a newer model version. This corresponds to the state upgrade case described above. This functionality allows one actor to be upgraded without forcing all other actors to upgrade their lock files.
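The two branches can be illustrated with a short sketch. It assumes the table of version upgrades is a simple mapping from an old version to its replacement; the `handle_callback` function and the table shape are hypothetical, not the runtime's real interface.

```python
def handle_callback(requested_version, upgrade_table):
    """Decide which actor version should execute an incoming callback."""
    target = upgrade_table.get(requested_version)
    if target is not None:
        # Upgrade branch: a rule exists, so run the newer version instead.
        return ("upgrade", target)
    # Non-upgrade branch: load exactly the version the broker asked for.
    return ("non-upgrade", requested_version)


# Hypothetical rule: callbacks addressed to 1.1.1 are served by 1.2.0.
upgrade_table = {"1.1.1": "1.2.0"}

upgraded_result = handle_callback("1.1.1", upgrade_table)
pinned_result = handle_callback("1.0.0", upgrade_table)
```

Because the decision is made inside the receiving runtime, callers can keep sending the version recorded in their lock while the implementor moves forward.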
Paranet
Most paranet models are versioned, most notably the ActorModel and the SkillSetModel. The ActorModel is used for each of the three types of paranet actors: software, human, and node actors. The broker’s skill matching algorithm is version aware, which is the key feature that enables version upgrades with distributed workflows.
All skills are represented in a SkillSetModel, even if they are generated from actor skills.
Skill Matching
The ActorModel in the paranet broker has a lock field, which is computed by the broker itself. When para registers a package, it registers all of the actor models, and the broker uses each actor’s list of skill requests to store the currently compatible versions in the lock. When an actor is configured to use its lock and it makes a request, the lock is used to select the right version of the actor to call on the other end.
Algorithm example:
Suppose we have two actors, A and B, with these configurations:
A has version 1.1.1, it calls skill a/b locked to version 1.1.1
B has versions 1.2.3 and 1.3.0 and implements skill a/b in both versions
Then consider the following sequence:
- Actor A@1.1.1 issues skill request a/b to broker
- A@1.1.1 is configured to be locked
- Broker looks up lock for A@1.1.1 to find the version number on the SkillSetModel for a/b
- Suppose it is a/b@1.2.3
- Broker ensures data matches the spec of a/b@1.2.3
- Broker finds all possible implementors of a/b@1.2.3
- B@1.2.3 is an implementor (note: B@1.3.0 is not an implementor of this version)
- Broker selects B@1.2.3 as it is the only valid candidate
- Broker sends the callback request to runtime of B with the request and version 1.2.3
- B runtime receives the callback and processes it, applying its upgrade rules if any match
This sequence could be carried out in the middle of an existing workflow of A, even after a new version of A has been deployed. This allows us to fulfill our requirement of processing existing workflows after upgrade in a distributed manner.
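The lookup steps in the sequence above can be sketched as follows. The `LOCKS` and `IMPLEMENTORS` tables and the `match_skill` function are hypothetical stand-ins for broker state. The sketch skips validating the request data against the skill spec, and it assumes for illustration that B@1.3.0 implements a/b@1.3.0, which the example does not state.

```python
# Lock computed by the broker at registration time:
# (calling actor, actor version) -> {skill name: locked SkillSetModel version}
LOCKS = {("A", "1.1.1"): {"a/b": "1.2.3"}}

# (skill name, SkillSetModel version) -> actor versions implementing it.
# That B@1.3.0 implements a/b@1.3.0 is an assumption made for illustration.
IMPLEMENTORS = {
    ("a/b", "1.2.3"): [("B", "1.2.3")],
    ("a/b", "1.3.0"): [("B", "1.3.0")],
}


def match_skill(caller, caller_version, skill):
    """Return the callback the broker would send for a locked skill request."""
    skill_version = LOCKS[(caller, caller_version)][skill]
    candidates = IMPLEMENTORS.get((skill, skill_version), [])
    if not candidates:
        raise LookupError(f"no implementor for {skill}@{skill_version}")
    # How the broker chooses among several candidates is not specified here;
    # this sketch simply takes the first one.
    actor, actor_version = candidates[0]
    return {
        "skill": f"{skill}@{skill_version}",
        "actor": actor,
        "version": actor_version,
    }


callback = match_skill("A", "1.1.1", "a/b")
```

The lock is keyed by the caller's version, so a workflow still running on A@1.1.1 keeps resolving a/b to the same skill set version even after newer versions of A or B are registered.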
Cross Node
Cross-node locks are not yet implemented; however, we can detail an implementation here.
Cross-node actors have a regular ActorModel, but the registration process requires special handling to account for version upgrades.
Given two nodes, A and B, which are affiliating skills between each other, the process begins with one node sending the other a request to reconcile skills. Suppose A affiliates with B; it would send B a request with the following fields:
- Skills implemented by A for B to use, and their versions. These can be “internal” (i.e. defined by A itself) or a common set of named skillsets used by both A and B.
- Skills that A calls on B.
Once B receives this request, verifies that it understands these skills, and confirms that A is allowed to call the skills in B, it will generate a lock file of compatible versions and send it back to A. A will then store this lock on its end.
Whenever any actors that call skills in B are upgraded, A will need to reinitiate this process in order to regenerate its lock.
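Since this part is a proposal, the following is only one possible shape for B's side of the exchange. Everything here is an assumption: the request layout, the skill names, the `reconcile` function, and in particular the rule of locking each called skill to the latest version B offers, since how compatibility is determined is not specified. Verification of the skills A offers to B is omitted.

```python
def reconcile(request, allowed_callers, local_skills):
    """Node B's side: validate A's request and build the lock to send back."""
    if request["node"] not in allowed_callers:
        raise PermissionError(f"{request['node']} may not call skills on this node")
    lock = {}
    for skill in request["calls"]:
        if skill not in local_skills:
            raise LookupError(f"skill not available on this node: {skill}")
        # Assumption: "compatible" is approximated by the latest local version.
        lock[skill] = max(local_skills[skill])
    return lock


# Hypothetical request from node A with the two fields described above.
request = {
    "node": "A",
    "implements": {"a/notify": (1, 1, 1)},  # skills A offers to B, with versions
    "calls": ["b/lookup"],                  # skills A wants to call on B
}
local_skills = {"b/lookup": [(1, 2, 3), (1, 3, 0)]}

lock = reconcile(request, allowed_callers={"A"}, local_skills=local_skills)
```

A would store the returned lock and send a fresh request after any relevant upgrade, which regenerates the lock from B's current set of versions.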
Human Actors
Human actors also have versions, however semantically unusual that seems. This is required to simplify the model of lock generation in the paranet. However, we have suggested more than once that we should introduce the concept of a Role, which is fulfilled by different human actors over time. This role would be versioned and would contain the skills it implements as well as the skills it is expected to call in order to fulfill the role. This corresponds nicely to the current ActorModel definition, and would allow us to define an “abstract human actor” in a paranet which can be fulfilled by actual human actors at runtime. Versioning a role is more natural than versioning a person, and the role is a generally useful abstraction for us to have. The details of its implementation can be explored at a later time.