Define Filter Columns Settings

This dialog allows you to create and edit Filter Columns settings.

There are three kinds of settings: Data Quality, Sampling, and Attribute Importance.

You can specify the following data quality criteria:

By default, Filter Columns uses a sample of the data to determine data quality and attribute importance. The default Sample Size is 2,000 records. You can turn off sampling (that is, use all of the data) or increase the sample size.
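The sampling behavior described above can be illustrated with a minimal Python sketch. It is not how Filter Columns is implemented. The function names, the record layout, and the use of percent nulls as the data quality measure are all assumptions made for illustration.

```python
import random

def sample_records(records, sample_size=2000, use_sampling=True, seed=0):
    """Return a random sample of records, or all records if sampling is off.

    Hypothetical helper: mirrors the default Sample Size of 2,000 records.
    """
    if not use_sampling or len(records) <= sample_size:
        return list(records)
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(records, sample_size)

def percent_null(records, column):
    """Percentage of records whose value for `column` is None (assumed metric)."""
    if not records:
        return 0.0
    nulls = sum(1 for r in records if r.get(column) is None)
    return 100.0 * nulls / len(records)

# Hypothetical data: 10,000 rows, every 4th row has a missing "age".
rows = [{"id": i, "age": None if i % 4 == 0 else 30} for i in range(10_000)]

sampled = sample_records(rows)                     # default: 2,000 records
full = sample_records(rows, use_sampling=False)    # sampling turned off
larger = sample_records(rows, sample_size=5000)    # increased sample size

print(len(sampled), len(full), len(larger))
print(percent_null(sampled, "age"), percent_null(full, "age"))
```

A statistic computed on the sample is an estimate of the value for the full data. Turning sampling off or increasing the sample size trades processing time for a more precise estimate.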

The default values for Data Quality and Sampling are specified in preferences, and you can change them; see Filter Columns for details.

By default, Filter Columns does not calculate Attribute Importance. To calculate Attribute Importance, select Attribute Importance and specify the Target. Attribute Importance is most useful when used in conjunction with Classification. The target for Attribute Importance in Filter Columns should be the same as the target of the classification model that you plan to build.
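To show why the target matters, the following Python sketch ranks attributes by how much information each carries about a chosen target. It uses mutual information as a stand-in measure, which is not necessarily the algorithm that Filter Columns uses. The column names and data are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length categorical lists."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi

def rank_attributes(rows, target):
    """Score every non-target column against the target, highest score first."""
    ys = [r[target] for r in rows]
    scores = {
        col: mutual_information([r[col] for r in rows], ys)
        for col in rows[0]
        if col != target
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical data: "plan" determines "churn"; "region" is unrelated to it.
rows = [
    {"plan": "basic", "region": "east", "churn": "yes"},
    {"plan": "basic", "region": "west", "churn": "yes"},
    {"plan": "premium", "region": "east", "churn": "no"},
    {"plan": "premium", "region": "west", "churn": "no"},
]

ranked = rank_attributes(rows, target="churn")
print(ranked)
```

The ranking is always relative to the target. Scoring the same columns against a different target can produce a different order, which is why the target here should match the target of the classification model.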