Compute API
The Compute API is a library for working with DCP, the Distributed Compute Protocol, to perform arbitrary computations on the Distributed Computer.
Record of Issue

Date | Author | Change
---|---|---
May 06 2020 | Wes Garland | Stamp Compute API 1.5.2
May 05 2020 | Nazila Akhavan | Add getJobInfo, getSliceInfo
Feb 24 2020 | Ryan Rossiter | Add job.status, job.runStatus; clarify marketRate
Dec 09 2019 | Ryan Rossiter | Update requirements object property descriptions
Oct 08 2019 | Jason Erb | Clarified distinction between Worker and Sandbox
Sep 23 2019 | Wes Garland | Compute API 1.51 release; improved language
Sep 20 2019 | Wes Garland | Compute API 1.5 release
Jul 15 2019 | Wes Garland | Glossary update: Generator -> Job, Worker Thread -> Sandbox, Miner -> Worker
Feb 13 2019 | Wes Garland | Added offscreenCanvas and sandboxing info to Worker Environment
Nov 23 2018 | Wes Garland | Deprecated Application
Oct 31 2018 | Wes Garland | Initial release
Oct 29 2018 | Wes Garland | Second draft review
Oct 23 2018 | Wes Garland | Moved toward generator-oriented syntax
Oct 19 2018 | Wes Garland | First draft review
Intended Audience
This document is intended for software developers working with DCP. It is organized as a reference manual / functional specification; introductory documentation is in the DCP-Client document.
Overview
This API focuses on jobs (both ad-hoc and from published appliances) built around some form of iteration over a common Work Function, and on events. The API entry points are all exports of the DCP compute module.
See Also
DCP-Client package
Wallet API
Implementation Status
As of this writing (September 2019), the Compute API is very much a “work in progress”, with core functionality finished and well-tested, but finer details unfinished or omitted entirely. This document intends to document the final product, so that early-access developers can write future-proof or future-capable code today.
Note: The list below is not complete, and may not be up to date. Caveat Developer!
Implemented
compute.do
compute.for
compute.resume
Range Objects
Distribution Objects
Partially Implemented
Module system; public modules inside sandboxes used by work functions are complete; however, there is no automatic bundling of private modules, no access restrictions, no automatic version maintenance, etc.
Wallet API is well-implemented on NodeJS but browser support falls back to the keychain API (asking users to upload keystore files all the time)
Not Implemented
Appliances
Proxy Keys
Identity Keys
Consensus Jobs
Reputation
console events’ “same” counter
Definitions
Task - A unit of work which is composed of one or more slices, which can be executed by a single worker.
Slice - A unit of work, represented as source code plus data and meta data, which has a single entry point and return type.
Job - The collection consisting of an input set, Work Function, arguments, and result set.
Module - A unit of source code which can be used by, but addressed independently of, a job. Compute API modules are similar to CommonJS modules.
Package - A group of related modules
Scheduler - The daemon which connects both workers and clients together, doles out work, and sends information about DCC movement to the Bank. The scheduler groups slices into tasks based on a variety of criteria such as slice cost, leaf-node cache locality, and sandbox capabilities.
Worker - A software program having one supervisor and one or more sandboxes. This program communicates with the scheduler and module server to retrieve tasks and modules, which are sent to and executed on the worker.
Sandbox - An ES execution environment in the worker which executes modules and tasks supplied by the supervisor.
Supervisor - The part of the worker which manages sandboxes, mediates communication with sandboxes, and so on.
Client - A program which uses DCP that is executed/used by the end-user.
Appliance - a named collection of well-characterized Work Functions
Distributed Computer - A distributed supercomputer consisting of one or more Schedulers and Workers.
About the Compute API
The compute module is the module holding the classes and configuration options (especially default options) related to this API. Throughout this document, it is assumed that this module’s exports are available as the global symbol compute.
Most computations on the Distributed Computer operate by mapping an input set to an output set by applying a work function to each element in the input set. Input sets can be arbitrary collections of data, but are frequently easily-described number ranges or distributions.
Work functions can be supplied directly as arguments in calls to API functions like compute.do and compute.for.
note - When work is a function, it is turned into a string with Function.prototype.toString before being transmitted to the scheduler. This means that work cannot close over local variables, as these local variables will not be defined in the worker’s sandbox. When work is a string, it is evaluated in the sandbox, and is expected to evaluate to a single function.
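For example, a minimal sketch of the difference (using compute.for, described below; the arguments parameter delivers extra values to the work function after the iterator value):

const scale = 10;

// Broken: `scale` is captured by closure, but the closure is lost when the
// function is serialized with Function.prototype.toString.
let broken = compute.for(1, 5, (i) => { progress(1); return i * scale; });

// Works: pass `scale` through the arguments parameter; it arrives after the
// iterator value in the work function's parameter list.
let works = compute.for(1, 5, (i, s) => { progress(1); return i * s; }, [scale]);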
Jobs
Jobs associate work with input sets and arguments, and enable their mapping and distribution on the Distributed Computer. Jobs with ad-hoc Work Functions can be created with the compute.do and compute.for APIs.
Fundamentally, a Job is:
an input set,
a work function,
arguments to the work function, and
an output set, which is the result of applying the work function and its arguments to each element of the input set.
The Scheduler on the Distributed Computer is responsible for moving work and data around the network, in a way which is cost-efficient; costs are measured in terms of CPU time, GPU time, and network traffic (input and output bytes).
Clients specify the price they are willing to pay to have their work done; Workers specify the minimum wage they are willing to work for. The scheduler connects both parties in a way that allows Workers to maximize their income, while still serving the needs of all users on the Distributed Computer. Essentially, the higher the wage the client is willing to pay, the more Workers will be assigned to compute the job.
Work Characterization
Work is characterized throughout the lifetime of the job. CPU time and GPU time are constantly normalized against benchmark specifications, and network traffic is measured directly. Work starts out as uncharacterized.
Uncharacterized Work
Uncharacterized work is released slowly into the network for measurement. Funds are escrowed during deploy to cover the static offer price for all slices.
Well-Characterized Work
Well-characterized work can be deployed on the network more quickly, skipping the estimation phase. Client developers can specify work characterization (sliceProfile objects) while submitting work; these can be either directly specified or calculated locally via the job estimation facilities. Additionally, work originating from Appliances is always well-characterized.
Static Methods
compute.cancel
This function allows the client to cancel a running job. This function takes as its sole argument a Job id and tells the scheduler to cancel the job. This method returns a promise which is resolved once the scheduler acknowledges the cancellation and has transitioned to a state where no further costs will be incurred as a result of the job.
compute.do
This function returns a JobHandle (an object which corresponds to a job), and accepts one or more arguments, depending on form.
Argument | Type | Description
---|---|---
n | Number | number of times to run the work function
work | String | the work function to run. If it is not a string, the toString() method will be invoked on this argument.
arguments | Object | an optional Array-like object containing arguments which are passed to the work function
form 1: compute.do(work, arguments)

This form returns a JobHandle and accepts one argument, work. Executing this job will cause work to run as a single task on the Distributed Computer. This interface is in place primarily to enable DCP-related testing during software development. When it is executed, the returned promise will be resolved with a single-element array containing the value returned by work.

form 2: compute.do(n, work, arguments)

This form returns a JobHandle which, when executed, causes work to run n times and resolves the returned promise with an array of values returned by work, in no particular order. For each invocation of the work function, work will receive as its sole argument a unique number which is greater than or equal to zero and less than n.
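For example, a minimal sketch of form 2 (client initialization and payment setup are elided; see Example Programs below):

let job = compute.do(5, function (n) {
  progress(1); // every work function must emit at least one progress event
  return n * n;
});
let results = await job.exec(compute.marketValue);
console.log(results); // e.g. [ 0, 1, 4, 9, 16 ], in no particular order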
compute.for
This function returns a JobHandle (an object which corresponds to a job), and accepts two or more arguments, depending on form. The work is scheduled for execution with one slice of the input set, for each element in the set. It is expected that work could be executed multiple times in the same sandbox, so care should be taken not to write functions which depend on uninitialized global state and so on.
Every form of this function returns a handle for a job which, when executed, causes work to run n times and resolves the returned promise with an array of values returned by work, indexed by slice number (position within the set), where n is the number of elements in the input set.
When the input set is composed of unique primitive values, the array which resolves the promise will also have an own entries method which returns an array, indexed by slice number, of [key, value] pairs, where key is the input to work and value is the return value of work for that input. This array will be compatible with functions accepting the output of Object.entries() as their input.
The for method executes a function, work, in the worker by iterating over a series of values. Each iteration is run as a separate task, and each receives a single value as its first argument. This is an overloaded function, accepting iteration information in a variety of ways. When work returns, the return value is treated as the result, which is eventually used as part of the array or object which resolves the returned promise.

note - When work is a function, it is turned into a string with Function.prototype.toString before being transmitted to the scheduler. This means that work cannot close over local variables, as these local variables will not be defined in the worker’s sandbox. Such values can be provided via the arguments argument and will be passed to the work function after the iterator value. When work is a string, it is evaluated in the sandbox, and is expected to evaluate to a single function.
Argument | Type | Description
---|---|---
rangeObject | Object | see Range Objects, below
start | Number | the first number in a range
end | Number | the last number in a range
step | Number | optional; the space between numbers in a range
work | String | the work function to run. If it is not a string, the toString() method will be invoked on this argument.
arguments | Object | an optional Array-like object containing arguments which are passed to the work function
form 1: compute.for(rangeObject, work, arguments)

This form accepts a range object, rangeObject (see below), which is stored as part of the job on the scheduler. This form is very efficient, particularly for large ranges, as the iteration through the set happens on the scheduler, one item at a time. When the range has { start: 0, step: 1 }, the returned promise is resolved with an array of resultant values. Otherwise, the returned promise is resolved with an object whose keys are the values in the range.

form 2a: compute.for(start, end, step, work, arguments)

start, end, and step are numbers used to create a range object. Otherwise, this is the same as form 1.

form 2b: compute.for(start, end, work, arguments)

Exactly the same as form 2a, except step is always 1.

form 3: compute.for({ ranges: [rangeObject, rangeObject...] }, work, arguments)

Similar to form 1, except with a multi-range object containing an array of range objects in the key ranges. These are used to create multi-dimensional ranges, like nested loops. If they were written as traditional loops, the outermost loop would be the leftmost range object, and the innermost loop would be the rightmost range object.

The promise is resolved following the same rules as in form 1, except the arrays/objects nest with each range object. (See examples for more clarity.)

form 4: compute.for(iterableObject, work, arguments)

iterableObject can be an Array, an ES6 function* generator, or any other type of iterable object.
Future Note - form 4 with an ES6 function* generator argument presents the possibility that, in a future version of DCP, the protocol will support extremely large input sets without transferring the sets to the scheduler in their entirety. Since these are ES6 function generators, the scheduler could request blocks of data from the client even while the client is ‘blocked’ in an await call, without altering the API. This means DCP could process, for example, jobs where the input set is a very long list of video frames and each slice represents one frame.
Result Handles
Result handles act as a proxy to access the results (a mapping from input set to output set) of a job. The result handle for a job is returned by exec, returned by the complete event, and located in job.results.

The result handle is an Array-like object which represents a mapping between slice number (index) and a result. Additional, non-enumerable methods are available on this object to make marrying the two sets together more straightforward. These methods are based on methods of Object.
entries() - returns an Array of [input, output] pairs, in the same order that the data appear in the input set, where work(input) yields output. If work was invoked with more than one argument, input will be an Array equivalent to the arguments vector that work was called with.

fromEntries() - returns an Object associating entries()’s input with output, for all elements in the input set where input is an ES5 primitive type. In the case where there are key collisions, the key closest to the end of the input set will be used. Equivalent to Object.fromEntries(results.entries()) (see https://tc39.github.io/proposal-object-from-entries/).

keys() - returns the entire input set, in the order specified when the job was created.

values() - returns the entire output set, in the same order as the input set.

key(n) - returns the nth item in the input set (e.g. results.keys()[n]), taking steps to avoid re-generating the entire input set if not necessary.

lookupValue(input) - returns the value which the job returned for the slice matching input. For example, if a work function was invoked with the input “blue”, then lookupValue(“blue”) would return the value which that slice returned. If the argument is not part of the input set, ERANGE is thrown.
When job.collateResults is set to true (the default), the result handle will be automatically populated with the results as they are computed. Otherwise, when job.collateResults is false, the fetch method must be used before results are available in the handle. Attempting to access a result before it is in memory will cause an error to be thrown.
There are 4 methods provided for accessing or manipulating results that are stored on the scheduler:

list(rangeObject) - returns a Range Object for the slices whose results have been computed and whose slice numbers match the provided range object. If no range object is provided, all slices whose results have been computed will be returned.

delete(rangeObject) - deletes the results currently on the scheduler corresponding to the slices whose slice numbers match the provided range object. If no range object is provided, all results which have been computed will be deleted from the scheduler.

fetch(rangeObject, emitEvents) - fetches results from the scheduler corresponding to the slices whose slice numbers match the provided range object (or all results if no range object is provided). The fetched results will be populated onto the result handle. If emitEvents is true, result events will be emitted for every slice that wasn’t already present in the result handle; if emitEvents is “all”, a result event will be emitted for every slice.

stat(rangeObject) - returns a set of objects for the slices on the scheduler whose slice numbers match the provided range object. The set will have the same API as an output set, but rather than the results of a work function, each element of the set will be an object with the following properties:

distributedTime: the timestamp documenting when the slice was initially distributed to a Worker
computedTime: the timestamp documenting when the scheduler received the result for this slice
payment: the amount of DCC which was charged to the payment account to compute this slice
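A sketch of using these methods with a deployed job whose collateResults property was set to false (the range argument is omitted to operate on all slices):

job.collateResults = false;
await job.exec(compute.marketValue);

// Pull every computed result into the local result handle, emitting a
// 'result' event for each slice that was not already present.
await job.results.fetch(undefined, true);

// Inspect per-slice metadata (distributedTime, computedTime, payment).
let sliceStats = await job.results.stat();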
compute.status
This function allows the client to query the status of jobs which have been deployed to the scheduler.
form 1: compute.status(job)

Returns a status object describing a given job. The argument can be either a job Id or a Job Handle.

form 2: compute.status(paymentAccount, startTime, endTime)

Returns an array of status objects corresponding to the status of jobs deployed by the given payment account. If startTime is specified, jobs older than this will not be listed. If endTime is specified, jobs newer than this will not be listed. startTime and endTime can be either the number of milliseconds since the epoch, or instances of Date. The paymentAccount argument can be a Keystore object, or any valid argument to the Keystore constructor (see: Wallet API).
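For example (a sketch; assuming, like the other static methods, the status is retrieved asynchronously):

// form 1: status of a single job, by handle or by id
let status = await compute.status(job);

// form 2: all jobs paid for by this account in the last 24 hours
let dayAgo = Date.now() - 24 * 60 * 60 * 1000;
let statuses = await compute.status(paymentAccount, dayAgo, Date.now());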
compute.getJobInfo
This async function accepts a job Id as its argument and returns information about, and the status of, the job specified by jobId.
compute.getSliceInfo
This async function accepts a job Id as its argument and returns the status and history of the slices of the job specified by jobId.
compute.marketRate
This function allows the client to specify the “going rate” for a job, which will be dynamically set for each slice as it is paid out.
NYI: For now the scheduler will use 0.0001465376 DCC as the market rate, multiplied by the factor if provided.
form 1: compute.marketRate

Using marketRate as a property will use the most recent market rate for each slice.

form 2: compute.marketRate(factor = 1.0)

Calling marketRate as a function will cause the job to use the daily calculated rate, multiplied by the provided factor.
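For example (a sketch):

// pay the most recent market rate for each slice
let results = await job.exec(compute.marketRate);

// pay 25% above the daily calculated rate
let results2 = await anotherJob.exec(compute.marketRate(1.25));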
compute.getMarketValue
This function returns a promise which is resolved with a signed WorkValueQuote object. This object contains a digital signature which allows it to be used as a firm price quote during the characterization phase of the job lifecycle, provided the job is deployed before the quoteExpiry and the CPUHour, GPUHour, InputMByte and OutputMByte fields are not modified. This function ensures the client developer’s ability to control costs during job characterization, rather than being completely at the mercy of the market. Note: Market rates are treated as spot prices, but are calculated as running averages.
compute.calculateSlicePayment
This function accepts as its arguments a SliceProfile object and a WorkValue object, returning a number which describes the payment required to compute such a slice on a worker or in a market working at the rates described in the WorkValue object. This function does not take into account job-related overhead. For example:
job.setSlicePaymentOffer(1.0001 * compute.calculateSlicePayment(job.initialSliceProfile, await job.scheduler.getMarketRate()))
Data Types
Range Objects
Range objects are vanilla ES objects used to describe value range sets for use by compute.for(). Calculations made to derive the set of numbers in a range are carried out with BigNumber (i.e. arbitrary-precision) support. The numbers Infinity and -Infinity are not supported, and the API does not differentiate between +0 and -0.
Describing value range sets, rather than simply enumerating ranges, is important because of the need to schedule very large sets without the overhead of transmitting them to the scheduler, storing them, and so on.
Range Objects are plain JavaScript objects with the following properties:
start: The lowest number in the range.
end: The highest number in the range.
step: The increment used between iterations, to get from start to end.
group: The number of consecutive elements in the range which will be processed by each slice. If this value is specified, even if it is specified as 1, the function called in the sandbox (i.e. work) will receive as its input an array of elements in the set. When this value is not specified, the function called in the worker sandbox will receive a single datum as its input.
When end - start is not an exact multiple of step, the job will behave as though end were the nearest number in the range which is an exact multiple of step, offset by start. For example, the highest number generated by the range object {start: 0, end: 1000, step: 3} would be 999.
Sparse Range Objects
Range Objects whose values are not contiguous are said to be sparse. The syntax for specifying a sparse range object is
{ sparse: [range object, range object...]}
Any range object can be used in the specification of a sparse range object, except for a sparse range object.
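For example, a sketch of a sparse range covering two disjoint intervals:

let sparse = { sparse: [ { start: 1, end: 10 }, { start: 100, end: 110 } ] };
let job = compute.for(sparse, (i) => progress(1) && i * 2);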
Distribution Objects
Distribution objects are used with compute.for, much like range objects. They are created by methods of the set export of the stats module, and are used to describe input sets which follow common distributions used in the field of statistics. The following methods are exported:
normalRNG(n, x̄, σ) – generates a set of numbers arranged in a normal distribution, where – n represents the size of the set – x̄ represents the mean of the distribution – σ represents the standard deviation of the distribution
random(n, min, max) – generates a set of randomly-distributed numbers, where – n represents the size of the set – min represents the smallest number in the set – max represents the smallest number which is greater than min but not in the set

randomInt(n, min, max) – generates a set of randomly-distributed integers, where – n represents the size of the set – min represents the smallest number in the set – max represents the smallest number which is greater than min but not in the set
let stats = require('stats')
let job = compute.for(stats.set.normalRNG(10, 100, 0.2), (i) => i)
Job Handles
Job handles are objects which correspond to jobs, and are instances of compute.Job. They are created by some exports of the compute module, such as compute.do and compute.for.
Properties
requirements - An object describing the requirements that workers must meet to be eligible for this job. See section Requirements Objects.

initialSliceProfile - an object describing the cost the user believes the average slice will incur, in terms of CPU/GPU and I/O (see: SliceProfile Objects). If defined, this object is used to provide initial scheduling hints and to calculate escrow amounts.

slicePaymentOffer - an object describing the payment the user is willing to make to execute one slice of the job. See section PaymentDescriptor Objects.

paymentAccount - a handle to a Keystore Object that describes a bank account which is used to pay for executing the job.

work - an EventEmitter object which emits arbitrary events that are triggered by the work (user code running in a sandbox).
requirePath - initially an empty array, this object is captured during deployment and is pushed onto require.path in the worker before work is evaluated in the main module.

modulePath - initially an empty array, this object is captured during deployment and is pushed onto module.path in the sandbox’s main module before work is evaluated in the main module.

collateResults - when this property is false, the job will be deployed in such a way that results are not returned to the client unless explicitly requested (e.g. via the result event listener or the results method of the job handle). Changing this property after the job has been deployed has no effect.

status - an object having the properties runStatus, total, distributed, and computed. This object reflects a “live status” while the job handle is awaiting the exec Promise resolution/rejection.
  runStatus: The current run status of the job.
  total: Once the job is ready (payment authorized, etc.), this value represents the total number of slices; until the job is ready, this value will be undefined.
  distributed: The total number of unique slices that have been sent to Workers for computation.
  computed: The total number of slices in this job for which the scheduler has results.

public - an object containing properties used to describe and label the work when it is deployed:
  name: A string containing the name of the job; this is displayed on the progress bar in the browser worker.
  description: A string containing a description of the job; this is displayed when a browser worker user hovers over the progress bar for a job.
  link: A string containing a URL providing more information about the job; browser worker users can visit the URL by clicking on the job row. (May be deprecated)

contextId - An optional string identifier. This can be used to indicate to caching mechanisms that keystores with the same name are different. For example, if several “default” keystores must be used for different jobs, contextId should be set to a different string on each job to prevent incorrect caching.

scheduler - an instance of URL defining the location of the scheduler on which the job is to be deployed. If not specified, the value used will be dcpConfig.scheduler.location.

bank - an instance of URL defining the location of the bank. If not specified, the value used will be dcpConfig.bank.location. Note that this must correspond to the bank used by the scheduler or the job will be rejected.
Present once job has been deployed
id - a unique string assigned by the scheduler which can be used to resume or cancel the execution of a previously-deployed job.
receipt - a cryptographic receipt indicating deployment of the job on the scheduler
meanSliceProfile - a SliceProfile object which contains the average costs for the slices which have been computed to date. Until the first result is returned, this property is undefined.
results - a Result Handle object used to query and manipulate the output set. (See Result Handles.)
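A sketch of configuring a job handle before deployment (the paymentAccount value is a placeholder; see the Wallet API):

let job = compute.for(0, 1000, (i) => progress(1) && i * i);

job.public.name = 'square numbers';
job.public.description = 'computes i*i for 0 <= i < 1000';
job.requirePath.push('my-module-path'); // hypothetical path, made available via require()

let results = await job.exec(compute.marketValue, paymentAccount);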
Methods
cancel - Cancel the job. This method returns a promise which is resolved once the scheduler acknowledges the cancellation and has transitioned to a state where no further costs will be incurred as a result of this job.
resume - Ask the scheduler to resume distributing slices for the job. This method returns a Promise which will resolve once the scheduler has resumed distributing slices, or may reject with an error indicating why it was unable to resume (eg. ENOFUNDS if there are not enough credits escrowed).
exec - tells the scheduler to deploy the job. This method accepts arguments slicePaymentOffer, paymentAccount, and initialSliceProfile, which, when not undefined, update the related properties of the JobHandle. paymentAccount can be any valid argument to the wallet.Keystore constructor.

This method returns a promise which is resolved with a result handle once the scheduler notifies the client that the job is done via the complete event.

Additionally, calling this on a job that is already deployed will fetch the partially or fully computed results if collateResults is set to true. Otherwise, job.results.fetch will need to be called to access the results stored on the Distributed Computer.

This method can reject the promise with errors from the scheduler. Any of the errors below imply that the scheduler has paused the job:

ENOFUNDS - insufficient credit in the account associated with the payment account. This error can happen immediately upon deployment, or partway through the job if the scheduler discovers that tasks are using more compute than estimated and it needs to re-escrow. Calling exec on this JobHandle after an ENOFUNDS error will cause the scheduler to attempt to resume the job as described above.

If the promise is rejected with the errors below, the scheduler has cancelled the job:

ENOPROGRESS - the scheduler has determined that too many sandboxes are not receiving regular progress update messages, are receiving invalid progress messages, or that tasks are completing without emitting at least one progress update message.

ETOOMANYTASKS - the scheduler has determined that this job exceeds the maximum number of allowable tasks per job.

EWORKTOOBIG - the scheduler has determined that the combined size of the work function and local modules associated with this job exceeds the maximum allowable size.

ETOOBIG - the scheduler has determined that this job exceeds the maximum allowable amount of work.

ESLICETOOSLOW - the scheduler has determined that individual tasks in this job exceed the maximum allowable execution time on the reference core.

ETOOMANYERRORS - the scheduler has determined that too many work functions are terminating with uncaught exceptions for this job.

Any other rejections should be treated as transient errors, and client developers should assume that the results could be retrieved eventually by calling await compute.resume(job.id).exec(). These transient errors include, but are not limited to:

EPERM - client tried to do something prohibited, such as updating the payment account of a job.

localExec - This function is identical to exec, except that the job is executed locally, in the client. It accepts one argument, cores. If cores is false, the execution will happen in the same JavaScript context as the client. Otherwise, cores is interpreted as the number of local cores in which to execute the job. Local modules will be loaded using regular filesystem semantics; modules which originate in packages will be loaded from the module server associated with the current scheduler.

requires - This function specifies a module dependency (when the argument is a string) or a list of dependencies (when the argument is an array) of the work function. This function can be invoked multiple times before deployment. See the Modules and Packages section for more information.

estimate - This function returns a promise which is resolved with a SliceProfile object describing the resources consumed to run one slice of the job. The function accepts as its sole argument a sample datum, which may or may not be actual data from the input set. If no argument is specified, the first element of the input set will be used for the calculation.

setSlicePaymentOffer - Set the payment offer; i.e. the number of DCC the user is willing to pay to compute one slice of the job. This is equivalent to the first argument to exec. When this method is called for a job which is already deployed, any changes to the slice payment offer will be transmitted to the scheduler, and the scheduler will apply them on a best-effort basis to slices dispatched after the change request. Note that a new payment offer will generally cause the scheduler to alter the quantity of DCC currently in escrow for that job, which can cause the job to emit the ENOFUNDS event. There is no guarantee that changes to the slice payment offer will occur immediately.

setPaymentAccountKeystore - Set the payment account. This is equivalent to the second argument to exec. Setting the payment account after the initial call to exec will throw EPERM in the setter.
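A sketch of deploying a job and handling a paused-job rejection (this assumes the rejection exposes its code via a code property; the exact error shape is not specified above):

let results;
try {
  results = await job.exec(compute.marketValue, paymentAccount);
} catch (error) {
  if (error.code === 'ENOFUNDS') {
    // The scheduler paused the job; after topping up the payment account,
    // calling exec again asks the scheduler to attempt to resume it.
    results = await job.exec();
  } else {
    throw error; // cancelled-job or transient errors
  }
}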
Events
The JobHandle is an EventEmitter (see EventEmitters, below), and it can emit the following events:
accepted - fired when job is deployed. Returns:
{
job // original object that was delivered to the scheduler for deployment
}
cancel - fired when job is cancelled
result - fired when a result is returned. Returns:
{
address, // address (id) of the job
task, // the address of the task (slice) that the result came from
sort, // the index of the slice
result: {
request: 'main',
result, // value returned
}
}
resultsUpdated - fired when the result handle is modified, either when a new result event is fired or when results are populated with resultHandle.fetch()
complete - fired when the job has finished running. Returns: The result handle for the job.
status - fired when a status update is received. Returns:
{
address, // address (id) of the job
total, // total number of slices
distributed, // number of slices that have been distributed
computed, // number of slices that have returned a result
runStatus, // the job's run status before any updates from a status event
}
error - fired when a slice throws an error and fails to complete. Returns:
{
address, // address (id) of the job
sliceIndex, // the index of the slice that threw the error
message, // the error message
stack, // the error stacktrace
name, // the error name
}
console - fired when console.log (or info, debug, warn, error) is called from within the work function. Returns:
{
address, // address (id) of the job
sliceIndex, // the index of the slice that produced this event
level, // the log level, one of 'debug', 'info', 'log', 'warn', or 'error'
message, // the console log message
}
Note: In the case where the most recent message is identical to the message that is about to be emitted, the message is not emitted; instead, a “same” counter is incremented. Eventually, a console event on the JobHandle will be emitted; this event will have the sole property same, having the value of the “same” counter. This will happen when a new, different message is logged, the worker terminates, or a progress update event is emitted, whichever comes first.
noProgress - fired when a slice is stopped for not calling progress. Contains information about how long the slice ran for, and about the last reported progress calls. Returns:
{
address, // address (id) of the job
sliceIndex, // the index of the slice that failed due to no progress
timestamp, // how long the slice ran before failing
progressReports: {
last: { // The last progress report received from the worker
timestamp, // time since start of slice
progress, // progress value reported
value, // last value that was passed to the progress function
throttledReports, // number of calls to progress that were throttled since last report
},
lastUpdate: { // The last determinate (update to the progress param) progress report received from the worker
timestamp,
progress,
value,
throttledReports,
}
}
}
noProgressData - Identical to noProgress, except that it also contains the data that the slice was executed with. Returns:
{
...{ noProgress event },
data, // the data that the slice was executed with
}
Error Statuses - Any of the scheduler errors listed above in Job Methods will be emitted from the job handle.
The jobHandle’s work property is also an EventEmitter, which will emit custom events from the work function.
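A sketch of attaching listeners for several of the events above, including a custom event relayed from the work function through the work EventEmitter (the 'checkpoint' event name is illustrative):

job.on('accepted', () => console.log('job deployed with id', job.id));
job.on('result', (ev) => console.log('slice', ev.sort, 'returned', ev.result.result));
job.on('console', (ev) => console.log('slice', ev.sliceIndex, ev.level + ':', ev.message));
job.on('error', (ev) => console.error('slice', ev.sliceIndex, 'failed:', ev.message));
job.work.on('checkpoint', (value) => console.log('work emitted checkpoint:', value));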
Worker Environment
Work functions (i.e. the final argument to compute.for()) are generally executed in sandboxes inside workers. These are the functions which map the input set to the output set.
Each work function receives as its input one element in the input set. Multi-dimensional elements, such as those defined in compute.for() form 3
, will be passed as multiple arguments to the function. The function returns the corresponding output set element, and must emit progress events.
The execution environment is based on CommonJS, providing access to the familiar require() function, user-defined modules, and modules in packages deployed on Distributed Compute Labs’ module server. Global symbols which are not part of the ECMA-262 specification (such as XMLHttpRequest and fetch) are not available.
Global Symbols
OffscreenCanvas – As defined in the HTML Standard, provides a canvas which can be rendered off-screen. If this interface is not available in a given worker, the worker will not report capability “offscreenCanvas”.
progress(n) – a function that returns true and which accepts as its sole argument a number between 0 and 1 (inclusive) that represents a best guess at the completed portion of the task, as a ratio of completed work to total work. If the argument is a string ending in the % symbol, it will be interpreted as a percentage in the usual mathematical sense. If the argument is undefined, the slice progress is noted but the amount of progress is considered to be indeterminate.

This function emits a progress event. Progress events should be emitted approximately once per second; a task which fails to emit a progress event for a certain period of time will be cancelled by the supervisor. This period of time will be at least 30 wall-clock seconds and at least 30 benchmark-adjusted seconds. The argument to this function is interpreted to six significant digits, and must increase with every call. All work functions must emit at least one progress event; this requirement will be enforced by the estimator. (A sketch of a work function using this and the other globals follows this list.)
console.log() – a function which emits console events on the JobHandle. See Events: console.

console.debug() – exactly the same as console.log(), except with level = “debug”.

console.info() – exactly the same as console.log(), except with level = “info”.

console.warn() – exactly the same as console.log(), except with level = “warn”.

console.error() – exactly the same as console.log(), except with level = “error”. note - some ES environments (Chrome, Firefox) implement C-style print formatting in this method; this is currently not supported.

work – an object that contains the following properties:
  emit() – emit an arbitrary event, which will be fired by the work object on the JobHandle in the client. The first argument to this function is the event name; the second argument becomes the value passed to the event handler in the client. Note - only values which can be represented in JSON are supported.

job - an object with additional data about the job:
  public - The public data object that the job was created with; has properties name, description, and link.
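A minimal sketch of a work function using these globals (the 'checkpoint' event name is illustrative):

let job = compute.for(1, 100, function (n) {
  let total = 0;
  for (let k = 1; k <= n; k++) {
    total += k;
    progress(k / n); // strictly increasing, between 0 and 1
  }
  work.emit('checkpoint', { n, total }); // fired on job.work in the client
  console.log('slice for input', n, 'done'); // emits a console event on the JobHandle
  return total;
});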
Requirements Objects
Requirements objects are used to inform the scheduler about specific execution requirements, which are in turn used as part of the capabilities exchange portion of the scheduler-to-worker interaction.
Boolean requirements are interpreted as such:
if true, that capability must be present in the worker
if false, that capability must not be present in the worker
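A sketch of a requirements object exercising these rules (reconstructed; property names are taken from Requirements Object Properties and the GPUs section below):

let requirements = {
  environment: {
    fdlibm: true
  },
  engine: {
    es7: true,
    spidermonkey: true
  },
  gpu: true
}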
In the example above, only workers with a GPU, running ES7 on SpiderMonkey using the fdlibm library, would match. In the example below, any worker which can interpret ES7 but is not SpiderMonkey will match:
let requirements = {
  engine: {
    es7: true,
    spidermonkey: false
  }
}
Requirements Object Properties
environment.fdlibm: Workers express capability ‘fdlibm’ so that client applications can have confidence that results will be bitwise identical across workers. This library is recommended, but not required, for implementation. Google Chrome, Mozilla Firefox, and the DCP standalone worker use fdlibm, but Microsoft Edge and Apple Safari do not.

environment.offscreenCanvas: When present, this capability indicates that the worker environment has the OffscreenCanvas symbol defined.

details.offscreenCanvas.bigTexture4096: This capability indicates that the worker’s WebGL MAX_TEXTURE_SIZE is at least 4096.

details.offscreenCanvas.bigTexture8192: This capability indicates that the worker’s WebGL MAX_TEXTURE_SIZE is at least 8192.

details.offscreenCanvas.bigTexture16384: This capability indicates that the worker’s WebGL MAX_TEXTURE_SIZE is at least 16384.

details.offscreenCanvas.bigTexture32768: This capability indicates that the worker’s WebGL MAX_TEXTURE_SIZE is at least 32768.

engine.es7: This capability enforces that the worker is running an ES7-compliant JavaScript engine.

engine.spidermonkey: This capability enforces that the worker will run in the SpiderMonkey JS engine.
GPUs
DCP supports WebGL and plans on supporting WebGPU. This functionality can be used by setting the gpu flag listed under Requirements Object Properties.
EventEmitters
All EventEmitters defined in this API will be bound (i.e. have this set) to the relevant job when the event handler is invoked, unless the event handler has previously been bound to something else with bind or an arrow function.
The EventEmitters have the following methods:

on(eventName, listener): execute the function listener when the event eventName is triggered.
addListener(eventName, listener): same as on.
addEventListener(eventName, listener): same as on.
removeListener(eventName, listener): remove an event listener which has been attached with on or one of its synonyms.
removeEventListener(eventName, callback): same as removeListener.
listenerCount(eventName): returns a count of the number of listeners registered for eventName.
Modules
The work specified by the JobHandle.exec method can depend on modules being available in the sandbox. This will be handled by automatically publishing all of the modules which are listed as relative dependencies of the job. Client developers can assume that dependencies loaded from the require.path are part of pre-published packages.
The DCP developer ecosystem offers the ability to run CommonJS-style modules and pure-ES NPM packages seamlessly and without transpilation steps on the following host environments:
arbitrary web document
arbitrary web worker
DCP sandbox
NodeJS
For more information, see the DCP Modules document.
Appliances
Future versions of DCP will expand the Appliance concept to include the ability to support licensing and royalties. For example, developers will be able to publish appliances which
may only be executed by users holding the correct (chainable, revocable) cryptographic key
cause a payment to the Appliance author for each job deployed
cause a payment to the Appliance author for each slice computed
Appliance Handles
Appliance handles are objects which correspond to appliances. They are created by instantiating the compute.Appliance class.
Constructor
The Appliance constructor is an overloaded object which is used for defining/publishing appliances and referencing appliances on the scheduler.
new Appliance() - form 1 - preparing to publish
This form of the constructor is used for creating/publishing appliances. It accepts three arguments: applianceName, version, and publishKeystore.
applianceName - this is a user-defined string, such as “ffmpeg-util”, which is used to reference the appliance on the Distributed Computer. It must be unique across the network.
version - this is the version number of the appliance that is being defined, represented as a string (e.g. “1.0.0”). The Compute API uses the same version number scheme as npm. It is an error to use the same version number twice for the same appliance under any circumstance.
publishKeystore - this is the keystore whose key is used to sign the appliance. If any version of the appliance has been published on a scheduler, only new versions of that appliance signed by the same key will be accepted for publication.
new Appliance() - form 2 - published appliance
This form of the constructor is used to access functions which have already been published. It accepts two arguments: applianceName, and version.
applianceName - this is a user-defined string, such as “ffmpeg-util”, which is used to reference the appliance on the Distributed Computer.
version - this is a string describing the version of the appliance that the client application developer wants to use. The Compute API supports the same syntax as npm uses in package.json, including the "^1.0.0" and "#versionIdHexString" forms.
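For example (a sketch; the appliance name, version, and keystore are placeholders):

// form 1: preparing to publish a new appliance version
let newAppliance = new compute.Appliance('ffmpeg-util', '1.0.0', publishKeystore);

// form 2: referencing an appliance already published on the scheduler
let appliance = new compute.Appliance('ffmpeg-util', '^1.0.0');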
Methods
For new appliances
publish - Request publication of an appliance on the connected scheduler. Returns a promise which is resolved with an object having the following properties:
versionId - a unique hex string assigned by the scheduler which uniquely identifies this version of this appliance. Prepending this string with an octothorpe will make it a valid argument for the version property of the Appliance constructor, referencing exactly this version of the code.
receipt - a cryptographic receipt indicating that the appliance has been submitted for publication to the scheduler.
cost - the number of DCCs the publish wallet was debited for this transaction
All properties of an appliance must be redefined when a new version is published. Publish may reject the promise with the following errors:
ENAMESPACE - The appliance name has already been reserved
EEXIST - This version has already been published
EWORKTOOBIG - the combined size of the functions and local modules associated with this appliance exceeds the maximum allowable size
ENOFUNDS - insufficient credit in the account associated with the publish wallet
refund - accepts as its sole argument the object that publish resolved its promise with. This causes the scheduler to un-publish the appliance and refund the DCCs used to publish it. This action must be performed before the appliance has been used by a job, or within 24 hours of publication, whichever comes first.
defineFunction - Define a function in the appliance. Returns an instance of JobHandle and accepts four arguments:
the function’s name
its implementation, with identical semantics to the work function in a job (see: compute.for())

an Estimation Function (see below)

a slice profile object, used as the baseSliceProfile for the Estimation Function. Any changes to the requirePath or modulePath properties on this object will be stored during publish() and will be reflected automatically when the function is used in a job.
requirePath - initially an empty array, this object is captured during publish() and is pushed onto require.path in the worker before requirePath is adjusted for the function (see requirePath in the Job Handles section).
For deployed appliances
do - same as compute.do, except that instead of a work function, the name of a function within the appliance is specified. Returns an instance of JobHandle.

for - same as compute.for, except that instead of a work function, the name of a function within the appliance is specified. Returns an instance of JobHandle.
Properties
id - a unique string assigned by the scheduler which uniquely identifies this appliance.
versionId - a unique string assigned by the scheduler which uniquely identifies this version of this appliance; same as the string returned during the publish() phase.
These properties are optional and public, and may be displayed or used by workers:

hashTags - metadata about the Appliance; comma-separated strings

name - human-readable appliance name
description - human-readable description of the appliance
author - author name
organization - author’s organization (e.g. university, employer, etc)
email - author’s email address
website - website related to the appliance
Estimation Functions
Estimation functions are used by the scheduler to characterize slices derived from Appliances based on knowledge of the input set, without actually performing any work.
An estimation function receives as its arguments the first element in the input set for a given job, followed by any work function arguments. The function must return an object whose properties are numbers which represent linear scaling factors for the various resources (cpuHours, gpuHours and outputBytes) as defined in the baseSliceProfile. The inputBytes element is not used here as the Scheduler has the means to calculate that directly on a per-slice basis.
An undefined estimation function, or an undefined estimation function result, causes the work to be deployed as though it came from an ad-hoc job.
Estimation functions which routinely yield slice characterizations bearing no resemblance to reality will eventually be blacklisted from the network; if the publishKeystore happens to be an Identity on the Distributed Computer, that person will be notified when the blacklisting happens.
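A sketch of an estimation function for slices whose cost scales linearly with a frameCount property on each input datum (the property name and the 100-frame baseline are illustrative; the baseSliceProfile is assumed to describe a 100-frame slice):

function estimateSlice(datum) {
  const factor = datum.frameCount / 100;
  // Linear scaling factors relative to the baseSliceProfile; inputBytes is
  // omitted because the scheduler measures it directly per slice.
  return { cpuHours: factor, gpuHours: factor, outputBytes: factor };
}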
SlicePaymentDescriptor Objects
SlicePaymentDescriptor objects are used to describe the payment that the user is offering to compute one (each) slice of the job. The Compute API defines three fixed value descriptors for use by DCP users; other descriptors can be specified as SlicePaymentDescriptor objects. The fixed value descriptors are:
Number literal - request computation at that value per slice; i.e. 0 requests free computation and 100 offers to pay 100 DCC per slice.

compute.marketValue - request computation of the entire job at market value (as of the time when the job is executed), with no upper limit whatsoever.

compute.marketValue(ratio, max) - request computation at market value multiplied by ratio, with an upper limit per slice of max DCC. In this case, the market value used is from a snapshot taken at a regular interval; this is to prevent runaway valuation when ratio != 1 and the job dominates the scheduler’s work.
SlicePaymentDescriptor objects have the following properties:
offerPerSlice – the payment a user is offering to compute one (each) slice of the job.
Any interface which accepts a SlicePaymentDescriptor object (e.g. exec()) must also handle literal numbers, instances of Number, and BigNums. When a number is used, it is equivalent to an object which specifies offerPerSlice; i.e., .exec(123) is the same as .exec({ offerPerSlice: 123 }).
Distributed Computer Wallet
The Distributed Computer acts as a wallet for two types of keys: “Identity” and “Bank Account”. Identity keys identify an individual; Bank Account keys identify accounts within the DCP bank and the DCC contract on the Ethereum network.

The preferred way to exchange keys between DCP client applications, configuration files, end users, etc., is to use encrypted keystores. Distributed Compute Labs strongly discourages developers from writing code which requires users to possess private keys, or to enter passphrases to unlock non-proxy keystores.
For more information, see the Wallet API and Proxy Keys documents.
Example Programs
All example programs are written for operation within any of the environments supported by DCP-Client, provided they are surrounded by appropriate initialization code for their respective environments.
NodeJS
async function main() {
const compute = require('dcp/compute');
const wallet = require('dcp/wallet');
/* example code goes here */
}
require('dcp-client').init().then(main).finally(() => setImmediate(process.exit));
Vanilla Web
<html><head>
<script src="https://scheduler.distributed.computer/dcp-client.js"></script>
</head>
<body onload="main();">
<script>
async function main() {
const { compute, wallet } = dcp;
/* example code goes here */
}
</script>
</body>
</html>
1. compute.for() form 2b
let job = compute.for(1, 3, function (i) {
progress('100%');
return i*10;
})
let results = await job.exec(compute.marketValue);
console.log('results: ', results);
console.log('entries: ', results.entries());
console.log('fromEntries:', results.fromEntries());
console.log('keys: ', results.keys());
console.log('values: ', results.values());
console.log('key(2): ', results.key(2));
Output:
results: [ 10, 20, 30 ]
entries: [ [ '1', 10 ], [ '2', 20 ], [ '3', 30 ] ]
fromEntries: { '1': 10, '2': 20, '3': 30 }
keys: [ '1', '2', '3' ]
values: [ 10, 20, 30 ]
key(2): '3'
2. compute.for() form 1, step overflow
let job = compute.for({start: 10, end: 13, step: 2}, (i) => progress(1) && i);
let results = await job.exec();
console.log(results)
Output: [ 10, 12 ]
3. compute.for() form 1 with group
let job = compute.for({start: 10, end: 13, group: 2}, (i) => progress(1) && i[1]-i[0]);
let results = await job.exec();
console.log(results);
Output: [ 1, 1 ]
4. compute.for() form 3
let job = compute.for({ ranges: [{start: 1, end: 2}, {start: 3, end: 5}] },
(i,j) => (progress(1), i*j));
let results = await job.exec();
console.log(results);
Output: [[3, 4, 5], [6, 8, 10]]
5. compute.for(), form 3
let job = compute.for({ ranges: [{start: 1, end: 2}, {start: 3, end: 5}] }, function(i,j) {
  progress(1);
  return [i, j, i*j];
})
let results = await job.exec();
console.log(results);
Output: [[[1, 3, 3], [1, 4, 4], [1, 5, 5]], [[2, 3, 6], [2, 4, 8], [2, 5, 10]]]
6. compute.for() form 4
let job = compute.for([123,456], function(i) {
progress(1);
return i;
})
let results = await job.exec();
console.log(results);
Output: [ 123, 456 ]
7. Typical ENOFUNDS handler
job.on("ENOFUNDS", async (fundsRequired) => {
  await job.escrow(fundsRequired);
  job.resume();
})