Developer Guide

Welcome to the beta release of the Distributed Computer! We're excited to have you aboard, and we're confident this document will get you from zero to distributed hero in no time.

Overview

This tutorial will cover how to:

  1. Register for a Portal account
  2. Download and set up your keystores
  3. Set up your development environment
  4. Deploy your first job

Tutorial

DCP First Devs Program

Note: You currently cannot deploy any jobs on the Distributed Computer unless you are part of our beta program. As of November 2020, we are inviting a limited set of developers into this beta program, called First Developers. If you would like to be part of the "First Devs" cohort, you can sign up here!

1. Register For an Account

First, register for an account on our Portal. Once you register and verify your account, you'll have access to the API keys you need to deploy compute jobs.

Here is a quick tour of the Portal:

Worker Tab

The Worker tab is where you can provide your computer's spare CPU & GPU resources to the Distributed Computer. After pressing "Start", your computer will begin computing work in exchange for "credits". You may even hear your fans rev up as your computer performs work! In addition, under Credits Earned you can see how many credits you have accumulated by doing work for other jobs on the Distributed Computer.

Worker Tab UI

The Worker tab runs in the background — even if you switch to another tab or go into screensaver mode — until you click the "Stop" button or close the browser tab it is operating in. This process does not materially impact other tasks running on your computer; it adapts in real time to use only the processing time that would otherwise go to waste!

Accounts

The Accounts tab displays all of your Bank Keystores (click here to learn more about Keystores). You will see a default Bank Keystore, and if you have signed up for the First Developers program, it should contain 10 credits after you log out and back in. You may create as many other Bank Keystores as you want, but whichever one you download for deploying jobs should be saved as default.keystore (more on this below).

If you wish to view your credits in any of these Bank Keystores, click the refresh icon (the small circular-arrow icon next to "Balance" under each Bank Keystore) to retrieve the latest balance. You can even watch your balance increase as you run a worker (a node contributing its CPU and/or GPU power to the Distributed Computer). You may add an extra layer of encryption with a password for individual Bank Keystores by clicking the down arrow, then clicking "Change Keystore Passphrase".

Accounts Tab UI

2. Download and Set Up Your Keystores

Download Bank Keystore

To download your Bank Keystore, navigate to the Accounts tab in the Portal, click the down arrow, and then click "Download" for the specific keystore you wish to save. This Bank Keystore will be the one that credits are drawn from as you deploy your own jobs on the Distributed Computer.

We recommend not changing the name of the downloaded .keystore file from default.keystore, as this is the default name that applications using the Distributed Computer will search for on your computer when you deploy a job.

Download Bank Keystore

Download Identity Keystores

If your Bank Keystore is like a wallet that holds your funds, think of your Identity Keystore as a passport or ID card. With the Distributed Computer, you will need both your wallet (Bank Keystore) and your passport/ID card (Identity Keystore) to deploy jobs on the network.

To find and download your Identity Keystore, navigate to the First Devs tab in the Portal. Click the green '+' icon under "Your Keys To DCP" near the bottom of the page. Give it a name you will recognize in the Portal, then click the green download button next to that Identity Keystore to save it locally. When you click download, you can choose to add a passphrase to the keystore. This is optional; if you want to leave it unencrypted, simply click the "Continue" button without entering a passphrase in the fields.

Download Identity Keystore

Your Identity Keystore is unique to you, and it lets the Distributed Computer know who is deploying a job for security reasons.

Once it's downloaded, be sure to name the file id.keystore (or leave it as id.keystore if it already is). This is the default name that applications using the Distributed Computer will search for when they try to find your Identity Keystore.

Setting Up Your Keystores

Every application that uses the Distributed Computer looks for both your Bank Keystore and Identity Keystore in a default location on your local file system. If the application doesn't find them here, it won't be able to deploy a computational job. This is why it is important to make sure they are named correctly (default.keystore for your Bank Keystore, and id.keystore for your Identity Keystore), as well as located in the correct place (described below).

Windows

We recommend saving your Bank Keystore file in your home directory as C:\Users\<YOUR_USER_NAME>\.dcp\default.keystore. Similarly, we recommend saving your Identity Keystore at C:\Users\<YOUR_USER_NAME>\.dcp\id.keystore.

If .dcp is not in this directory on your computer, you can simply make a new folder and move your id.keystore & default.keystore files into it.
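For example, from a Command Prompt you could create the folder and move the keystores into place (a sketch assuming the files were saved to your Downloads folder):

mkdir %USERPROFILE%\.dcp
move %USERPROFILE%\Downloads\default.keystore %USERPROFILE%\.dcp
move %USERPROFILE%\Downloads\id.keystore %USERPROFILE%\.dcp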

macOS & Linux

Both macOS & Linux machines use the same file path for configuring keystores.

We recommend saving your Bank Keystore file in your home directory as /home/<YOUR_USER_NAME>/.dcp/default.keystore. Similarly, we recommend saving your Identity Keystore at /home/<YOUR_USER_NAME>/.dcp/id.keystore.

If .dcp is not in this directory on your computer, you can simply make a new folder and move your id.keystore & default.keystore files into it.
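For example, in a terminal (a sketch assuming the files were saved to your Downloads folder):

mkdir -p ~/.dcp
mv ~/Downloads/default.keystore ~/Downloads/id.keystore ~/.dcp/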

3. Set Up Your Development Environment

The process for launching jobs on the Distributed Computer depends on the JavaScript environment you prefer to use. Right now, you can launch a DCP application from:

  1. A local Node.js instance on your device
  2. Vanilla JavaScript in a web project
  3. Computational notebooks like Google Colab, Jupyter, Azure Notebooks, Observable, & more

We'll show you how to do all three to get you started, though you might prefer a specific one for your own development.

Node.js

First, install the latest version of Node.js. Next, install dcp-client, the client library for the Distributed Computer, in order to launch your own computational jobs.

If this is a new project, be sure to create a new directory and in a terminal run npm init to create a package.json file.

To install dcp-client as a Node.js dependency, run the following command in your terminal:

npm install dcp-client

You're now ready to write and deploy your first job! Check out our Node example to see how to launch your first Node.js application on the Distributed Computer.

Vanilla Web

If you prefer to launch jobs on the Distributed Computer from the web with HTML & JavaScript, this approach is for you!

To gain access to the Distributed Computer within your web project, all you need to do is add the following script tag to the head of your HTML document before any other JavaScript code:

<script src="https://scheduler.distributed.computer/dcp-client/dcp-client.js"></script>

This will create a dcp global that can be used to access the modules of the Distributed Computer.
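For reference, here is a minimal sketch of a bare HTML page that loads dcp-client and then uses the dcp global (the page structure is illustrative, not required):

<!DOCTYPE html>
<html>
  <head>
    <!-- Load dcp-client before any other JavaScript -->
    <script src="https://scheduler.distributed.computer/dcp-client/dcp-client.js"></script>
  </head>
  <body>
    <script>
      // The script tag above defines the global `dcp` object
      const { compute } = dcp;
      // compute.for() and the other modules are now available here
    </script>
  </body>
</html>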

Don't know what that means? No problem! Here's a quick example we made using JSFiddle to help get you started (click below for the inline demo, or follow this link to go to the full JSFiddle page).

Inline Demo

If you'd like to see a web example on GitHub that uses the Distributed Computer, check out this link. An actual web app that uses the Distributed Computer in this way is Diskfit, a tool for astrophysicists. It may inspire you to build a similar frontend for your application!

Computational Notebooks

There are many types of computational notebooks that let you easily share your work. Perhaps the best known of these is Jupyter, although there are several others like Google Colab, Observable, and more.

The advantage of using these notebooks is that they are powerful, versatile, and shareable, and they let you perform data visualization in the same environment. For example, Jupyter is a web-based interactive environment that lets you combine text, media, animations, and other elements that improve the user experience.

Using the Distributed Computer from within these notebook environments is very easy. You just have to follow their respective instructions for getting started and configuring your environment (if necessary). Google Colab is perhaps the easiest to use, since every notebook can be executed through the web without the user having to download or install anything. Good instructions on how to use each of these can be found below.

Although the Distributed Computer requires most code to be written in JavaScript, it is compatible with Python variables and the .ipynb format common in these notebooks. To learn more, please see our guide on a Python to JavaScript variable-sharing tool we have developed called Bifrost.

Check out our example of a simple machine learning project that uses Google Colab if you'd like to see these notebooks in action. Hint: You will need a special type of API Key that combines the Bank Keystore and ID Keystore if you'd like to run it. Send us a message on Slack or at info@distributed.computer if you'd like one!

4. Deploying Jobs

At its core, a job can be thought of as an input set & a work function. Executing any job takes these and creates an output set.

Jobs can be created with the compute.for method. The arguments to this method determine how many 'Slices' the job will have and what data each 'Slice' will be executed with. A 'Slice' is your work function paired with one of your provided inputs. It is the basic unit of parallelism in the job you create: each slice is transmitted to a compute node on the Distributed Computer. The compute node that picks up a particular slice follows the instructions given by the input set & work function to create an output set. This output set is transmitted back to you as the job owner, and can be thought of as one "piece of the puzzle" being completed.

As you might imagine, some jobs might have only a few hundred or even a few dozen slices, while others can have thousands or even millions. The amount of time each slice takes to complete will also vary dramatically depending on your input set & work function. Sometimes it may take as little as a few hundred milliseconds, while other slices might take hours or days to finish. Slices from the same job will usually not span a range this large, although some jobs do have a wide distribution of slice completion times. If you're new to all this, welcome to the exciting world of Parallel Computing!

In Node

Create 10 slices. In this case, the first slice will run your work function with 1 as its input, then 2, and so on, all the way up to 10.

require('dcp-client').initSync();
const compute = require('dcp/compute');
// dcp-client is ready to be used

const job = compute.for(1, 10, function (n) {
  let result = n + 100;
  progress('50%');
  result = result * n;
  progress('100%');
  return result;
});

// execute the job
job.exec();
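
Note that exec() returns a Promise that resolves with the job's results once all slices are complete. To capture the output set in a plain CommonJS script, you can wrap the call in an async function (a minimal sketch using the job defined above):

(async () => {
  // Deploy the job and wait for all slices to finish
  const results = await job.exec();
  console.log('results:', results);
})();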

In Vanilla-Web

Create a slice for each color in an array:

const { compute } = dcp;

const job = compute.for(['red', 'green', 'blue'], function (colour) {
  console.log(colour);
  progress();
  return colour;
});

// execute the job
job.exec();

Note

The last argument, the work function, is what will be executed on the worker nodes. It must be either a string, or stringifyable via toString().

The work function must call the global function progress() at least once during its execution, and at least once per minute. This ensures that the slice is operating normally and has not crashed. If the slice does not call progress, or does not call it frequently enough, the job will be cancelled with the error ENOPROGRESS. Progress can be provided with a value that shows how far the slice has completed. This value can be a number between 0 and 1, a string like '50%', or undefined to indicate indeterminate progress (when a worker node can't tell how long it will take to finish a Slice or how much progress it has made; this is actually very common for a lot of computational jobs!).
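
For instance, a work function might report progress in any of these forms (a small sketch; the values are illustrative):

function work (n) {
  progress(0.25);   // determinate: a number between 0 and 1
  progress('50%');  // determinate: a percentage string
  progress();       // indeterminate: still running, but can't estimate completion
  return n;
}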

Limitations to consider: Since the work function must be either a string or stringifyable via toString(), native functions (i.e. Node functions written in C++) cannot be used for work. Additionally, the function must be completely defined and not a closure, since stringification cannot take the closure environment into account. This means all dependent code must live inside the work function, since dependencies required outside of it will not be available on the workers. A rule of thumb is that if you cannot eval() it, you cannot distribute it.
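
To illustrate the closure limitation, here is a quick sketch (badJob, goodJob, and factor are hypothetical names for illustration):

// Will NOT work: `factor` lives in the enclosing scope, which is lost
// when the work function is stringified and shipped to a worker
const factor = 2;
const badJob = compute.for(1, 10, function (n) {
  progress();
  return n * factor; // ReferenceError on the worker
});

// Works: everything the work function needs is defined inside it
const goodJob = compute.for(1, 10, function (n) {
  const factor = 2;
  progress();
  return n * factor;
});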

A Word of Advice: Deploy Locally & Save Credits ⚠️

If you want to test deploying a job but don't want to use up credits from your Bank Keystore, we suggest deploying your job to your own computer first using the localExec() method rather than the regular exec(). It is always better to be safe than sorry here by making sure your job works before you spend credits distributing it to hundreds of computers! You will find some examples of this method in our documentation. If you have any questions, don't hesitate to reach out to our core developers on Slack or via email (info@distributed.computer).
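
For example (a minimal sketch; see the Compute API documentation for localExec()'s exact options):

// Runs the job's slices on your own machine instead of the network; no credits are spent
const results = await job.localExec();
console.log('results:', results);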

Using Credits With The Distributed Computer

Credits on the Distributed Computer are used to pay for work (after all, the people who own those worker nodes paid for the computer and its electricity!). The Distributed Computer's centralized 'Bank' maintains a ledger of credits that are associated with user accounts and Bank Keystores. Accounts are represented and accessed by keystore files, which contain the private key used to authorize access to a Bank Keystore. Try launching a job and watch your credit balance decrease, or do some work on the Worker tab and see your credit balance go up!

By default, a job you deploy from Node.js will look for your Bank Keystore at ~/.dcp/default.keystore. When deploying a job from the web, a modal will prompt you for your Bank Keystore.

Note: In its current implementation, the value of these credits is not tied to anything; they are currently a placeholder for testing & experimentation purposes. After the First Developers beta, the Distributed Computer will be upgraded with a full implementation of the costing and metering algorithms needed to make sure credits match the actual amount of work done. This full release of the Distributed Computer will also give credits real monetary value, plus a way to cash them out. So hang on to those credits you aren't using - they will be worth real money soon, no matter what country you are in!

Listening for Events

There are a number of useful events available on the job handle, which is itself an Event Emitter. The table below lists the most frequently used events; the rest can be found in the Compute API Documentation.

Event Name | Emitter | Description
accepted   | job     | Emitted when the scheduler accepts this job during deploy
status     | job     | Emitted when a slice is distributed or finished being computed
result     | job     | Emitted when a result is submitted by a worker
console    | job     | Emitted when a console method (log, warn, error, debug) is invoked in the work function
error      | job     | Emitted when a slice throws an error. By default, uncaughtExceptions are logged to the console

Examples

  1. Deploying a simple job and checking the results
const job = compute.for(1, 3, function (i) {
  progress();
  return i * 10;
});

job.on('accepted', () => console.log('Job accepted', job.id));
job.on('complete', () => console.log('Job complete!'));

const results = await job.exec();
console.log('results:    ', results);
console.log('entries:    ', results.entries());
console.log('fromEntries:', results.fromEntries());
console.log('keys:       ', results.keys());
console.log('values:     ', results.values());
console.log('key(2):     ', results.key(2));
See demo

  2. Listening for console events
const job = compute.for(1, 5, function (i) {
  progress();
  console.log('Hello, World!', i);
  return i;
});

job.on('console', (event) => console.log(event));

await job.exec();
See demo

  3. Listening for exceptions thrown during execution
const job = compute.for(1, 3, function (i) {
  throw new Error(
    'Oops, an error was thrown! Better listen for these events to know when it happens!',
  );
});

job.on('error', (event) => {
  console.error('An exception was thrown by the work function:', event.message);
});
await job.exec();
See demo

Modules

The following modules are available from the DCP Client package:

Global dcp (script tag) | CommonJS Module | Description | Documentation
dcp['client-modal'] | n/a | Module for creating modals in the browser | module:dcp/client-modal
dcp.compute | 'dcp/compute' | The Compute API, used to deploy and manage work that has been deployed to the network | module:dcp/compute
dcp['dcp-build'] | 'dcp/dcp-build' | The build object, exported for introspection | module:dcp/dcp-build
dcp['dcp-cli'] | 'dcp/dcp-cli' | Provides a standard set of DCP CLI options and related utility functions via yargs | module:dcp/dcp-cli
dcp['dcp-config'] | 'dcp/dcp-config' | The active configuration object | module:dcp/dcp-config
dcp['dcp-events'] | 'dcp/dcp-events' | Provides classes related to cross-platform event emitting | module:dcp/dcp-events
dcp.wallet | 'dcp/wallet' | The Wallet API, used to access and manipulate DCP addresses and keystores | module:dcp/wallet
dcp.worker | 'dcp/worker' | The Worker API, used to start workers on DCP | module:dcp/worker
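
As a quick sketch of the two access styles shown in the table (assuming dcp-client has already been loaded as described earlier in this guide):

// In the browser, the script tag defines the global `dcp` object:
const { compute, wallet } = dcp;

// In Node.js, after require('dcp-client').initSync(), use CommonJS requires:
// const compute = require('dcp/compute');
// const wallet = require('dcp/wallet');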

GPUs

The Distributed Computer currently supports WebGL and WebGL 2.0 for parallel data processing. This allows for fractal rendering, massively parallelized neural networks, WebGL-based render farming, and other GPU-heavy tasks.

Since WebGPU promises significantly more functionality than WebGL, our core developers are working on supporting the WebGPU Compute API (not yet for rendering graphics), which exposes the new compute pipelines available from the latest GPU backends such as Vulkan, Metal, and DX12. Direct access to modern backend GPU APIs will make the Distributed Computer's GPU capabilities even more powerful!

For information on how to set requirements for your jobs, see Requirements Object Properties.

What's Next

You're all set to deploy your own jobs on the Distributed Computer! Please feel free to check out our other guides or read through our documentation to explore what the technology has to offer. If you haven't already, don't forget to sign up for our First Developers program, where we will send you a few emails with more code examples, inspirational projects, and more. We will respect your inbox, so don't worry about that.

Happy coding!

The Team @ Distributed Compute Labs
Kingston, ON, Canada