Getting Started with Data Miner

Data Miner uses workflows to specify data mining operations. The operations specified in the workflow are executed in the Data Miner server.

First create a database connection to the user account where the data to mine resides. Then create a project where the workflows reside. In the workflow, create a node that contains the mining data and connect it to a model build node. After building models, select the best model and apply it to new data to make predictions.
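One way to picture a workflow is as a directed graph in which each node runs only after the nodes linked into it have finished. The sketch below is not Data Miner's API; the node names and the `execution_order` helper are hypothetical, and Python's standard `graphlib` module stands in for whatever scheduling the server actually performs.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a node, each value is the set of
# nodes that must finish before it can run (its incoming links).
workflow = {
    "data_source": set(),
    "new_data": set(),
    "class_build": {"data_source"},
    "apply": {"class_build", "new_data"},
    "create_table": {"apply"},
}


def execution_order(graph):
    """Return one valid order in which the workflow nodes could run."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(workflow)
print(order)
```

The exact order of independent nodes (here, the two data nodes) may vary; the only guarantee is that every node appears after the nodes it depends on.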

Launch the cue cards to learn how to use the Data Miner GUI and build classification models.

This cue card set introduces you to the Data Miner graphical user interface (GUI) and illustrates how to solve a business problem.

Business Problem

Problem: A retailer wants to increase revenues by identifying all potentially high-value customers in order to offer incentives to them. For example, the incentive might be an affinity card that gives a discount on purchases.

Solution: Build a Data Mining Classification model to predict which customers are likely to be high spenders in the next year. Award these customers an affinity card.

An important part of any data mining problem is data gathering and preparation, which often consume more than 50% of the time and effort of a data mining project. In this example, you are provided with data that is ready to be mined.

An important part of building a model is deciding which algorithm to use. In this example, you build and test four classification models with the same target using four different algorithms. You then select the best model by comparing the test results.
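The comparison step can be sketched as scoring each model against the same test set and keeping the one with the highest score. This is only an illustration under stated assumptions: the model names, the test data, and the choice of accuracy as the single metric are all made up here, and the test results you view in the GUI may offer other measures.

```python
def accuracy(actual, predicted):
    """Fraction of test cases where the prediction matches the actual target."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


# Made-up test targets: 1 = high-value customer, 0 = not.
actual = [1, 0, 1, 1, 0, 0, 1, 0]

# Made-up predictions from four hypothetical models on the same test set.
test_predictions = {
    "model_a": [1, 0, 1, 0, 0, 0, 1, 0],
    "model_b": [1, 1, 1, 1, 0, 1, 1, 0],
    "model_c": [0, 0, 1, 1, 0, 0, 0, 1],
    "model_d": [1, 0, 0, 1, 0, 0, 1, 1],
}

# Score every model, then pick the name with the highest accuracy.
scores = {name: accuracy(actual, preds) for name, preds in test_predictions.items()}
best = max(scores, key=scores.get)

for name, score in sorted(scores.items()):
    print(f"{name}: {score:.3f}")
print("best:", best)
```

Because all four models share one target and one test set, their scores are directly comparable; a different metric could be swapped in by replacing the `accuracy` function.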

Identifying the best model and applying it to new data is not the end of the problem. It is necessary to deploy the results; in this case, you might want to distribute affinity cards to high-value customers. The first step in deploying the results is to export the predictions (apply results) to a table in the database.
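To make the export step concrete, the sketch below writes a few apply results to a table and then queries the customers predicted to be high value. It assumes an in-memory SQLite database as a stand-in for the real database; the table name `affinity_predictions`, its columns, and the sample rows are hypothetical and are not produced by Data Miner.

```python
import sqlite3

# Made-up apply results: (customer id, predicted class, probability).
apply_results = [
    (101, 1, 0.91),
    (102, 0, 0.12),
    (103, 1, 0.78),
]


def export_predictions(conn, rows):
    """Create the predictions table if needed and insert the apply results."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS affinity_predictions ("
        "cust_id INTEGER PRIMARY KEY, "
        "prediction INTEGER NOT NULL, "
        "probability REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO affinity_predictions VALUES (?, ?, ?)", rows)
    conn.commit()


# In-memory SQLite database used only as a stand-in for the real one.
conn = sqlite3.connect(":memory:")
export_predictions(conn, apply_results)

# Customers predicted to be high value, who would receive an affinity card.
high_value = [
    row[0]
    for row in conn.execute(
        "SELECT cust_id FROM affinity_predictions "
        "WHERE prediction = 1 ORDER BY cust_id"
    )
]
print(high_value)
```

Once the predictions are in a table, any downstream process that can query the database can use them, which is what makes the export a natural first deployment step.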

Problem Solution

The following steps build several classification models, select the best model, apply the best model to new data, and write the results of the model apply to a database table:

  1. Create a Database Connection and a Project: Use the New/Select Database Connection dialog to create a database connection to the user account that holds the data to mine. Select the connection, and use the Create Project dialog to create a project for data mining.
  2. Create a Workflow: Use the Create Workflow dialog to create a workflow.
  3. Create a Data Source: To specify the data used to build the models, drag a Data Source Node onto the workflow and select the view that includes the data.
  4. Create a Classification Node: Drag a Classification node onto the workflow.
  5. Link Data Source to Classification Node and Specify Model Target: To specify data for building the classification models, link the data source node to the classification node; then specify the target of the models.
  6. Run Workflow: To build and test the models, run the workflow.
  7. View Model: After the execution of the workflow completes, view the models and the test results.
  8. Identify and Select the Best Model: Compare the models to identify the one that makes the best predictions.
  9. Apply the Best Model to New Data: Apply the best model to new data to make predictions.
  10. Create Table: Create a table in the database containing model predictions.
