MINING_DATA_TEXT_BUILD_V contains the attribute COMMENTS, which consists of unstructured data. COMMENTS must be prepared before it is used for model building.
In the Component Palette, expand Text. Click Build Text. Move the mouse to the workflow; click again. Name the new node PrepareBuild.
Right-click the data source node MINING_DATA_TEXT_BUILD_V and select Connect. Drag the line to the new node and click again.
COMMENTS is the only attribute that consists of text. Follow these steps to define how to transform COMMENTS:
Right-click PrepareBuild; select Edit. The Edit Build Text Node dialog opens.
Select COMMENTS and click [...]. COMMENTS_TOK is listed.
COMMENTS has [...].
Right-click PrepareBuild and select Run.
Running the node produces the attribute COMMENTS_TOK, which consists of tokens. To view the tokens, double-click PrepareBuild and select COMMENTS_TOK in the upper pane. Note that COMMENTS_TOK has type DM_NESTED_NUMERICALS. The lower pane has two tabs.
The Tokens tab lists all the tokens and their frequency. The Output tab shows the tokens in an individual comment; select a comment by number to see the tokens.
The text column is now prepared so that you can use the data source to build models. The prepared attribute COMMENTS_TOK will be passed to the model build node, and COMMENTS will not be passed on.
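To make the idea of the prepared attribute more concrete, here is a minimal Python sketch of what the transformation does conceptually. It is not the tokenizer Data Miner uses: the sample rows, the CUST_ID column, the helper names tokenize and prepare, and the simple "lowercase and split on letters" rule are all assumptions for illustration. It also counts raw occurrences, which may differ from how the Tokens tab defines frequency.

```python
import re
from collections import Counter

# Hypothetical sample rows; the real COMMENTS values live in MINING_DATA_TEXT_BUILD_V.
rows = [
    {"CUST_ID": 1, "COMMENTS": "Great service, great price"},
    {"CUST_ID": 2, "COMMENTS": "Price was too high"},
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def prepare(row):
    """Return a copy of the row with COMMENTS replaced by COMMENTS_TOK."""
    counts = Counter(tokenize(row["COMMENTS"]))
    # Drop the raw text column, as COMMENTS is not passed on to the model build.
    prepared = {k: v for k, v in row.items() if k != "COMMENTS"}
    # Token/number pairs, loosely analogous to a nested numeric attribute.
    prepared["COMMENTS_TOK"] = sorted(counts.items())
    return prepared


prepared_rows = [prepare(r) for r in rows]

# Rough analogue of the Tokens tab: every token with its count across all comments.
token_frequency = Counter()
for r in rows:
    token_frequency.update(tokenize(r["COMMENTS"]))

# Rough analogue of the Output tab: the tokens of one selected comment.
print(prepared_rows[0]["COMMENTS_TOK"])
print(dict(token_frequency))
```

Each prepared row keeps its other columns, gains COMMENTS_TOK as a list of token/number pairs, and no longer carries the raw COMMENTS text, which mirrors the hand-off to the model build node described above.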
The next step is to build models.
Copyright © 2011, 2012, Oracle. All rights reserved.