Creating a Workflow

A workflow is normally built and run (executed) as follows:

  1. Create a blank workflow as described in Create a Workflow.

  2. Create a node that starts the workflow, that is, a node that provides one or more sources of data for the data mining operations. Such a node identifies a database object. For example, the starting node might be one of the Data Nodes.

  3. Create nodes that perform mining tasks, such as data preparation or model build and test. Nodes are described in Workflow Nodes.

  4. Connect nodes as described in Link.


    Note:

    Steps 3 and 4 are usually performed together; that is, you create a node, edit it, and connect it to an existing node.

  5. Run nodes, as described in Run Node.

  6. Examine results.

  7. Iterate the steps as necessary.
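The build-and-run sequence above can be sketched in code. The following Python example is only an illustration of the idea, not the Data Miner interface: the Workflow class, its add_node, link, and run methods, and the Count Rows node are hypothetical names invented for this sketch. It assumes that a workflow can be modeled as a directed graph in which a node runs only after all of its parents have run.

```python
from graphlib import TopologicalSorter


class Workflow:
    """Hypothetical stand-in for a workflow: named nodes plus directed links."""

    def __init__(self, name):
        self.name = name
        self.actions = {}   # node name -> callable that receives the parents' results
        self.parents = {}   # node name -> list of parent node names

    def add_node(self, name, action):
        self.actions[name] = action
        self.parents[name] = []

    def link(self, source, target):
        # A link makes `source` a parent of `target`.
        self.parents[target].append(source)

    def run(self):
        # Run each node only after all of its parents have run.
        results = {}
        for name in TopologicalSorter(self.parents).static_order():
            inputs = [results[p] for p in self.parents[name]]
            results[name] = self.actions[name](*inputs)
        return results


# Step 1: create a blank workflow.
wf = Workflow("demo")
# Step 2: create the starting node, which supplies the data (made-up rows).
wf.add_node("Data Source 1", lambda: [("a", 1), ("b", 0), ("c", 1)])
# Step 3: create a node that performs a task on that data.
wf.add_node("Count Rows", lambda rows: len(rows))
# Step 4: connect the nodes.
wf.link("Data Source 1", "Count Rows")
# Steps 5 and 6: run the nodes and examine the results.
results = wf.run()
print(results["Count Rows"])  # prints 3
```

Because the run order is derived from the links, the starting node always runs first, regardless of the order in which the nodes were created.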

Workflows must contain one or more sources of data, such as a table or a model. For example, to build a Naive Bayes model, you first identify the input data with a Data Source node. You then create a Classification node to build and test the model.

Workflows are built so that they read from left to right. The following is a simple workflow:

[Figure: workflow from the Getting Started cue card set]

The Data Source node, Data Source 1, is at the start of the workflow; normally, it identifies a table or a view. The Data Source node is the ancestor (in this case, the parent) of the Classification node, Class Build 5. The Classification node is the descendant (in this case, the child) of the Data Source node.
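The difference between a parent and an ancestor, and between a child and a descendant, can be made concrete with a short Python sketch. It is illustrative only: it assumes that links are stored as a mapping from each node to its parents, and the third node, Apply 1, is an invented name added so that the example contains an ancestor that is not a parent.

```python
# Hypothetical links, stored as node -> list of its parents.
parents = {
    "Data Source 1": [],
    "Class Build 5": ["Data Source 1"],
    "Apply 1": ["Class Build 5"],  # invented node, not part of the workflow shown above
}


def ancestors(node, parents):
    """Return every node reachable by following parent links upward."""
    found = set()
    stack = list(parents[node])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents[current])
    return found


def descendants(node, parents):
    """Return every node that has `node` among its ancestors."""
    return {n for n in parents if node in ancestors(n, parents)}


print(sorted(ancestors("Apply 1", parents)))
# prints ['Class Build 5', 'Data Source 1']
print(sorted(descendants("Data Source 1", parents)))
# prints ['Apply 1', 'Class Build 5']
```

In this sketch, Data Source 1 is an ancestor of Apply 1 but not its parent, whereas it is both the parent and an ancestor of Class Build 5.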

Workflow Terminology describes how workflow components are named.