A ODM Interface Comparison

The Java and PL/SQL interfaces for Oracle Data Mining (ODM) provide similar functionality; however, they are not identical. They are aimed at different audiences; they support different features on a detailed level; and they must be used differently in different programming environments. This appendix compares the two interfaces.

A.1 Target Users of the ODM Interfaces

The two interfaces are aimed at different audiences, as follows:

The ODM Java interface is designed to support the development of interactive and batch mining applications and tools. The ODM Java interface is a complete interface for general-purpose application development. All mining operations, such as model build, apply, test, and lift, are executed as asynchronous tasks. The interface also supports wrapper operations -- which may involve one or more iterations of the core operations -- such as Model Seeker and utilities like import and export of PMML models. The interface also provides internal packages to support Data Mining for Java (DM4J).
The PL/SQL interface is designed to support traditional Oracle RDBMS application developers and DBAs who are familiar with SQL and PL/SQL packages. The DBMS_DATA_MINING package provides a set of core data mining primitives that enable creation, drop, or rename of a data mining model, and scoring of new data using a given model. The package also contains a set of helper functions for evaluating a model and for inspecting the contents of a model. All operations are synchronous. Data is not preprocessed implicitly by any of the operations. DBMS_DATA_MINING_TRANSFORM, a complementary, open-sourced package, provides a set of utilities to preprocess the data to be used for model creation, model testing, and for scoring new data using an existing model.

A.2 Feature Comparison of the ODM Interfaces

Table A-1 summarizes the differences in features between the two data mining interfaces.

Table A-1 ODM Java and PL/SQL Interface Feature Comparison

Feature	ODM Java Interface	DBMS_DATA_MINING Interface
Operation mode	Asynchronous.	Synchronous. If asynchronous execution is required, use other Oracle database features, such as the unified scheduler.
Algorithms	ODM k-means algorithm.	Different version of the k-means algorithm: faster, handles sparse data, supports new distance metrics (cosine and fast cosine), handles categorical and numerical attributes, and does not require binning (it normalizes numeric attributes instead). O-Cluster is not supported.
Model build specification	Based on ODM classes: LAD (data location), PDS (format of input data), MFS (function settings), MAS (algorithm settings)	Data location (schema) is passed in the argument list (default is user schema); mining function is passed in the argument list; settings (function and algorithm) are passed in a single optional table
Settings	Provided through Java objects MAS (optional) and MFS.	Provided through an optional settings table.
Default settings	Available for algorithms.	Available for function and algorithm.
Attribute form type	LDS (explicit or convenience function)	Automatically inferred from column data type; form types can be modified using views
Location of input data and result tables	LAD (Java object)	Provided in the argument list as schema information; default is user schema
Input data structure	Supports both the single-record case (as a conventional relational table input) and the multi-record case (as a table input in "transactional format")	Supports both the single-record case (as a conventional relational table input) and the multi-record case (as a conventional relational table with nested table columns that associate multiple attributes of the same kind with the same case identifier; "wide data")
Model apply (data scoring)	Flexible filtering specification	Apply interface is provided; a separate interface to rank apply accepts a cost matrix input to enable results generation on the basis of cost
Model evaluation	Confusion matrix and lift metrics for classification, tightly coupled with models for maximum ease of use	Provides a variety of evaluation metrics: confusion matrix, lift, RMSE, and ROC. Not coupled with a model, for maximum flexibility; allows use of different cost matrices at evaluation time and performance evaluation of non-ODM models
Transformations (data preparation)	Internal support for automatic binning and normalization. Other transformations must be performed as pre-processing.	All transformations must be performed as pre-processing. Normalization and binning are supported by DBMS_DATA_MINING_TRANSFORM.
Model export and import	PMML export/import for Naive Bayes and Association models; no support for native format	Export and import of all models in native format; no support for PMML.
Model comparison (finding the best model)	Model Seeker builds multiple NB and ABN models and selects the "best" one	Not supported
Cross validation	Automatic for NB models	Not supported
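
The settings table described in Table A-1 can be illustrated from a Java client. The following is a minimal sketch, not a definitive implementation: it assumes a two-column table (setting_name, setting_value), and the class name, method names, column sizes, and the example setting names and values (ALGO_NAME, ALGO_KMEANS, KMNS_DISTANCE, KMNS_COSINE) are assumptions that should be checked against the DBMS_DATA_MINING reference for the release in use.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;

public class SettingsTableSketch {

    // DDL for a two-column settings table; the column sizes are an assumption.
    static String createTableSql(String table) {
        return "CREATE TABLE " + table
                + " (setting_name VARCHAR2(30), setting_value VARCHAR2(128))";
    }

    // Parameterized insert for one (name, value) row.
    static String insertRowSql(String table) {
        return "INSERT INTO " + table
                + " (setting_name, setting_value) VALUES (?, ?)";
    }

    // Example overrides: choose k-means with the cosine distance metric.
    // The setting names and values are assumptions, not taken from this appendix.
    static Map<String, String> exampleSettings() {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("ALGO_NAME", "ALGO_KMEANS");
        settings.put("KMNS_DISTANCE", "KMNS_COSINE");
        return settings;
    }

    // Creates the table and inserts one row per setting, using a
    // connection that the caller has already opened.
    static void writeSettings(Connection conn, String table,
                              Map<String, String> settings) throws SQLException {
        try (Statement ddl = conn.createStatement()) {
            ddl.execute(createTableSql(table));
        }
        try (PreparedStatement ins = conn.prepareStatement(insertRowSql(table))) {
            for (Map.Entry<String, String> e : settings.entrySet()) {
                ins.setString(1, e.getKey());
                ins.setString(2, e.getValue());
                ins.executeUpdate();
            }
        }
    }
}
```

Because the settings table is optional, a caller that is satisfied with the defaults for the function and algorithm can skip this step entirely. The table name is concatenated into the SQL text, so it should come from trusted code rather than user input.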

A.3 The ODM Interfaces in Different Programming Environments

Table A-2 compares using the two interfaces in different programming environments.


Table A-2 ODM APIs in Programming Environments

Environment	ODM Java interface	ODM PL/SQL interface
Programming in PL/SQL	Package ODM routines as PL/SQL (Java Stored Procedures)	Use the ODM packages DBMS_DATA_MINING and DBMS_DATA_MINING_TRANSFORM.
Programming in Java	Use native calls.	Use JDBC to call the ODM PL/SQL packages.
Programming in OCI/C or OCCI/C++	Invoke PL/SQL (Java stored procedures) through OCIStatement() calls.	Use the ODM packages DBMS_DATA_MINING and DBMS_DATA_MINING_TRANSFORM.
Programming in Pro*C, COBOL, or FORTRAN	Invoke PL/SQL (Java stored procedures) using EXEC calls.	Use standard EXEC SQL interface.
Programming in XML	JDeveloper9i (and the new JDK1.4 release) enable seamless Java/XML and generation of execution objects (SOAP).	NA
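
The "Programming in Java" row of Table A-2 notes that a Java program can reach the PL/SQL interface through JDBC. The following is a minimal sketch of such a call, under stated assumptions: the class and method names are hypothetical, the parameter names passed to DBMS_DATA_MINING.CREATE_MODEL should be verified against the package reference for the release in use, and the caller is assumed to supply an open connection.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

public class CreateModelCallSketch {

    // Anonymous PL/SQL block using named-parameter notation.
    // The data schema is omitted, so the default (user schema) applies.
    static final String CREATE_MODEL_CALL =
            "BEGIN DBMS_DATA_MINING.CREATE_MODEL("
          + "model_name => ?, mining_function => ?, data_table_name => ?, "
          + "case_id_column_name => ?, target_column_name => ?, "
          + "settings_table_name => ?); END;";

    // Binds a string, or SQL NULL when the optional value is absent.
    private static void bind(CallableStatement cs, int index, String value)
            throws SQLException {
        if (value == null) {
            cs.setNull(index, Types.VARCHAR);
        } else {
            cs.setString(index, value);
        }
    }

    // Builds a model; the call returns only after the build completes,
    // because the PL/SQL interface is synchronous.
    static void createModel(Connection conn, String modelName,
                            String miningFunction, String dataTable,
                            String caseIdColumn, String targetColumn,
                            String settingsTable) throws SQLException {
        try (CallableStatement cs = conn.prepareCall(CREATE_MODEL_CALL)) {
            bind(cs, 1, modelName);
            bind(cs, 2, miningFunction);
            bind(cs, 3, dataTable);
            bind(cs, 4, caseIdColumn);
            bind(cs, 5, targetColumn);   // may be null for functions without a target
            bind(cs, 6, settingsTable);  // may be null to accept default settings
            cs.execute();
        }
    }
}
```

A caller might invoke this as createModel(conn, "NB_MODEL", "CLASSIFICATION", "BUILD_DATA", "CASE_ID", "TARGET", null), where every argument value is a placeholder chosen for illustration. Since the call blocks until the build finishes, an application that needs the asynchronous behavior of the Java interface would have to arrange it separately, for example through the database scheduler mentioned in Table A-1.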
