Skip Headers

Oracle9i Warehouse Builder User's Guide
Release 9.2

Part Number B10996-01
Go To Table Of Contents
Contents
Go To Index
Index

Go to previous page Go to next page

11
Configuring ETL Objects

Earlier in the design phase, you defined a logical model for your target system using Warehouse Builder design objects. This chapter includes reference information and recommendations for assigning physical properties to mappings and process flows.

This chapter includes:

Configuring Mappings Reference

When you correctly configure mappings, you can improve the ETL performance. Use this section as a reference for setting configuration parameters that govern how Warehouse Builder loads data and optimizes code for better performance.

This section includes the following topics:

Procedure for Configuring Mappings

To configure physical properties for a mapping:
  1. Select a mapping from the navigation tree, select Object from the menu bar, and select Configure.

    Or, right-click the mapping you want to configure and select Configure from the pop-up menu.

    Warehouse Builder displays the configuration Properties dialog for a mapping as shown in Figure 11-1.

Figure 11-1 Configuration Properties dialog For Mapping Tables

Text description of configmc.gif follows.

Text description of the illustration configmc.gif

  1. Set Deployable to true to enable Warehouse Builder to generate a set of scripts for mapping entities marked as deployable.

    The default setting is true. If you set deployable to false for a mapping, Warehouse Builder does not generate scripts for that mapping.

  2. Set Step Type to the type of code you want to generate for the selected mapping.

    The options you can choose from depend upon the design and use of the operators in the mapping. Warehouse Builder sets the correct value depending on the mapping: PL/SQL, SQL*Loader, ABAP (for an SAP source mapping).

  3. Expand the Runtime Parameters Reference to configure your mapping for deployment.

    For a description of each runtime parameter, see Runtime Parameters Reference.

  4. Expand Code Generation Options Reference to enable performance options that optimize the code generated for the mapping.

    For a description of each option, see "Code Generation Options Reference".

  5. Expand Sources and Targets Reference to set the physical properties of the operators in the mapping.

    The Configuration Properties dialog displays all source and target operators in a mapping under Sources and Targets. Each operator contains a set of properties you can edit. For information on configuring sources and targets in a mapping, see "Sources and Targets Reference".

  6. If the mapping contains a flat file as a source or target, configure additional parameters. For information on configuring mappings with flat file sources and targets, see "Configuring Flat File Operators".

Runtime Parameters Reference

When you configure Runtime Parameters for a mapping, you set the default behaviors for the mapping. You can override these parameters when you execute the mapping either in the Deployment Manager, the Process Flow Editor, or Oracle Enterprise Manager.

The Runtime Parameters include the following parameters:

Default Operating Mode

For mappings with a PL/SQL implementation, select a default operating mode. The operating mode you select can greatly affect mapping performance. For details on how operating modes affect performance, see "Selecting a Default Operating Mode". You can select one of the following operating modes:

Bulk Size

Use Bulk Size to specify the number of rows in each bulk for PL/SQL Bulk Processing. Warehouse Builder uses the Bulk Size parameter only when Generate Bulk Processing Mode is set to true. For more information, see the Oracle PL/SQL Reference Guide.

Default Audit Level

Use Default Audit Level to indicate the audit level used when executing the package. Audit levels dictate the amount of audit information captured in the runtime schema when the package is run. The audit level settings are:

Maximum Number of Errors

Use Maximum Number of Errors to indicate the maximum number of errors allowed when executing the package. Execution of the package terminates when the number of errors reached is greater than the maximum number of errors value.

Commit Frequency

Commit frequency only applies to non-bulk mode mappings. Bulk mode mappings commit according to the bulk size.

When you set the Default Operating Mode to row based and Bulk Processing Code to false, Warehouse Builder uses the Commit Frequency parameter when executing the package. Warehouse Builder commits data to the database after processing the number of rows specified in this parameter.

If you set Bulk Processing Code to true, set the Commit Frequency equal to the Bulk Size. If the two values are different, Bulk Size overrides the commit frequency and Warehouse Builder implicitly performs a commit for every bulk size.

Default Purge Group

The Default Purge Group is used when executing the package. Each audit record in the runtime schema is assigned to the purge group specified.

Analyze Table Sample Percentage

When you set Analyze Table Statements to true, Warehouse Builder estimates when gathering statistics on the target tables. After data is loaded into the target tables, statistics used for cost-based optimization are gathered on each target table. You can set this parameter to the percentage of rows in each target table used for this analysis.

Code Generation Options Reference

The Code Generation Options include the following:

Bulk Processing Code

If this configuration parameter is set to true, Warehouse Builder generates PL/SQL bulk processing code. PL/SQL bulk processing improves row-based ETL performance by collecting, processing, and writing rows in bulk, instead of doing it row by row. The size of each bulk is determined by the configuration parameter Bulk Size. Set based mode offers optimal performance, followed by bulk processing, and finally by row based mode. For more information, see the Oracle PL/SQL Reference Guide.

Row Based Mode

Row based mode is an operating mode option. Row based mode allows for the maximum amount of runtime auditing. A mapping that runs in row-based mode processes data from a source record by record.

Parallel Row Code

Set this parameter to true to generate code using the table function feature supported in Oracle9i. This setting improves performance by enabling table functions to use parallel execution in the Oracle Server for Row based and Row based (Target only) operating modes.

You can set this parameter to true when the mapping generates PL/SQL code and is executed by Oracle9i server.

If the mapping generates SQL* Loader code, set the parameter to false (default). This property is only available for mappings that will generate PL/SQL code. It will not be available as a configuration property for mappings that will generate SQL* Loader code. This property is not available for set-based mode, but for set-based-fail-over-rowbased and set-based-fail-over-rowbased-target modes, this property is available if the set based mode fails and the code executes in row-based/row-based-target mode. This property is not available for mappings that include a Mapping Output Parameter, a Flat File operator, or a Sequence operator as a source.

Scenarios when parallelism is lost for table functions:

  1. When the input cursor refers to objects (tables, views, etc.) in remote schemas. For example, a mapping containing a table operator bound to a table in a remote database schema cannot be executed in parallel.

  2. When the input cursor passed to table function refers to objects in non-Oracle databases. Oracle database server parallel engine is unable to extract parallelism when the objects are accessed using an Oracle gateway. For example, when a mapping contains a table operator that is bounded to a table in a DB2 database, the mapping will not be executed in parallel.

  3. Parallel DML operation cannot be performed on tables with triggers. For example, a mapping with a table and a pre-map or post-map triggers that operate on the table.

  4. Updates and deletes can only be parallelized on partitioned tables. Update and delete parallelism is not possible within a partition or on a non-partitioned table. For example, if a mapping contains a table operator with a delete/update load, the target table must be partitioned.

Analyze Table Statements

If you set this parameter to true, Warehouse Builder generates code for analyzing the target table after the target is loaded if the resulting target table only is double or half its original size.

Optimized Code

Set this parameter to true to improve performance for mappings that include the Splitter operator and inserts into multiple target tables. When this parameter is set to true and the mapping is executed by Oracle9i, Warehouse Builder generates a single SQL statement (multi_table_insert) that inserts data into multiple tables based on same set of source data.

Warehouse Builder performs the multiple table insert only if this parameter is set to true and the Oracle target module database is Oracle9i. The multiple tables insert is performed only for mappings in set based mode that include a Splitter operator and no active operators such as an Aggregator or Joiner operator between the Splitter and the target. Also, the multiple insert is available only for tables, not views, materialized views, dimensions, or cubes. Each target table must have fewer than 999 columns. For detailed instructions on how to create a mapping with multiple targets, see "Example: Creating Mappings with Multiple Targets".

Set this parameter to false for mappings run in row based mode or for mappings executed by Oracle8i server. You may also want to set this parameter to false when auditing is required. When set to true, Warehouse Builder returns one total SELECT and INSERT count for all targets.

Correlated Commit

This applies to mappings with multiple targets. When set to true, Warehouse Builder commits data based on the selected records. When set to false, Warehouse Builder commits data based on target by target processing (insert, update, delete).

The mapping behavior varies according to the operating mode you select. For more information about correlated commit, see "Committing Data from a Single Source to Multiple Targets"

Sources and Targets Reference

The Configuration Properties dialog displays all source and target operators in a mapping under the Sources and Targets. The properties that are available for configuration vary according to the types of source and target operators used in the mapping.

For relational and dimensional sources and targets such as tables, views, and cubes, Warehouse Builder displays the following set of properties for each operator:

Bound Name

When Warehouse Builder generates code, it identifies each source and target operator by its bound name. By default, the bound name is the same name as the operator name.

Schema

This parameter maintained for backwards compatibility only.

In previous version of Warehouse Builder, you could link the mapping to a particular schema by clicking on the Schema field and typing a name.

Database Link

This parameter maintained for backwards compatibility only.

In previous version of Warehouse Builder, you could select a database link by name from the drop-down list. Source operators can be configured for schemas and database links, but targets can be configured for schemas only. Sources and targets can reside in different schemas, but they must reside in the same database instance.

Partition Exchange Loading

Set this parameter to true to enable partition exchange loading (PEL) into a target table. For information on how to design mappings for PEL, see "Using Partition Exchange Loading"

Hints

Define extraction or loading hints. Application developers often develop insights into their data. For example, they know that a query runs much faster if a set of tables is joined in one order rather than another. Warehouse Builder can incorporate these insights into the generated SQL code packages as SQL Optimizer Hints.

When you select a hint from the Hints dialog, the hint appears in the Existing Hints field. Type additional text as appropriate in the Extra Text column. The editor includes the hint in the mapping definition as is. There is no validation or checking on this text.

For information on optimizer hints and how to use them, see Oracle9i Designing and Tuning for Performance.

Constraint Management

Configure the following Constraint Management parameters:

SQL*Loader Parameters

Configuring Flat File Operators

The Configuration Properties dialog contains additional settings for Mapping Flat File operators, depending on how the operators are used in the mapping.

Flat File Operators as a Target

To configure properties unique to mappings with flat file targets:
  1. Select a mapping from the navigation tree, select Object from the menu bar, and select Configure.

    Or, right-click the mapping you want to configure and select Configure from the pop-up menu.

    Warehouse Builder displays the Configuration Properties dialog as shown in Figure 11-2.

Figure 11-2 Configuration Properties dialog (Flat File Target)

Text description of configpr.gif follows.

Text description of the illustration configpr.gif

  1. Choose the parameters you want to configure and click the space to the right of the parameter name to edit its value.

    For each parameter, you can either select an option from a list, type a value, or click ... to display another properties dialog.

  2. Set Deployable to true to enable Warehouse Builder to generate a set of scripts for mapping objects marked as deployable. The default setting is true. If you set deployable to false for a mapping, Warehouse Builder does not generate scripts for that mapping.

  3. Set Step Type to the type of code you want to generate for the selected mapping. The options you can choose from depend upon the design and use of the operators in the mapping. Depending on the mapping, you can select from PL/SQL, ABAP (for an SAP source mapping), or SQL*Loader.

  4. Specify the location to deploy the mapping.

  5. Under Runtime Parameters Reference, set the Default Operating Mode to Row based (target only). This type of mapping will not generate code in any other default operating mode. For a description of each runtime parameter, see "Runtime Parameters Reference".

  6. Set the Code Generation Options Reference as described in "Code Generation Options Reference"

  7. Set the Sources and Targets Referenceas described in "Sources and Targets Reference".

  8. For Access Specification, specify the name of the flat file target in Target Data File Name. For the Target Data File Location, specify a target file located on the machine where you installed the Warehouse Builder Runtime Platform.

Flat File Operator as a Source

To configure a mapping with a flat file operator as a source:
  1. Select a mapping from the navigation tree, select Object from the menu bar, and select Configure. Or, right-click the mapping you want to configure and select Configure from the pop-up menu.

    The Configuration Properties dialog appears as shown in Figure 11-3.

Figure 11-3 Configuration Properties Dialog (Flat File Source)

Text description of configpa.gif follows.

Text description of the illustration configpa.gif

  1. Choose the parameters you want to configure and click the space to the right of the parameter name to edit its value.

    For each parameter, you can either select an option from a list, type a value, or click ... to display another properties dialog.

  2. Set Deployable to true to a generate SQL*Loader script.

  3. Specify the Execution Location, the location where the mapping is executed.

  4. For the Source Data File, enter the full physical filename of the data file to be loaded.

    Type a complete path name for the file in the syntax of the operating system on which the SQL*Loader script is deployed. The value you provide is placed in the INFILE clause. If you enter a value enclosed in single quote, the generated code will be LOAD. CONTINUE_LOAD is used when a direct load of multiple tables is discontinued and needs to be restarted. It is used in conjunction with the operator-level SKIP option.

  5. Specify the Log File Location and Log File Name.

  6. Set Continue Load.

    If SQL*Loader runs out of space for data rows or index entries, the load is discontinued. If Continue Load is set to true, Warehouse Builder attempts to continue discontinued loads.

  7. In Nls Characterset, specify the character set to place in the CHARACTERSET clause.

  8. In Direct Mode, specify the value of the DIRECT option as either TRUE or FALSE. True indicates that a direct path load will be done. False indicates that a conventional load will be done. In general, direct mode is faster.

  9. In Operation Recoverable, select true to indicate that the load is recoverable. Select False to indicate that the load is not recoverable and records are not recorded in the redo log. This parameter controls the output of the RECOVERABLE clause.

  10. Configure the following parameters that affect the OPTIONS clause in the SQL *Loader scripts Warehouse Builder generates for mappings with flat file sources.

    Perform Parallel Load: Specifies the value of the PARALLEL option as either =TRUE or =FALSE. True indicates that direct loads can operate in multiple concurrent sessions.

    Errors Allowed: If the value specified is greater than 0, then the ERRORS = n option is generated. SQL*Loader terminates the load at the first consistent point after it reaches this error limit.

    Records To Skip: If the value specified is greater than 0, then the SKIP = n option is generated. This value indicates the number of records from the beginning of the file that should not be loaded. If the value is not specified, no records are skipped.

    Records To Load: If the value specified is greater than 0, then the LOAD = n option will be generated. This value specifies the maximum number of records to load. If a value is not specified all of the records are loaded.

    Rows Per Commit: If the value specified is greater than 0, then the ROWS = n option is generated. For direct path loads, the value identifies the number of rows to read from the source before a data is saved. For conventional path loads, the value specifies the number of rows in the bind array.

    Read Size: If the value specified is greater than 0, then the READSIZE = n option is generated. The value is used to specify the size of the read buffer.

    Bind Size: If the value specified is greater than 0, then the BINDSIZE = n option is generated. The value indicates the maximum size in bytes of the bind array.

    Read Buffers: If the value specified is greater than 0, then the READBUFFERS n clause is generated. READBUFFERS specifies the number of buffers to use during a direct path load. Do not specify a value for READBUFFERS unless it becomes necessary.

    Preserve Blanks: If this parameter is set to TRUE, then the PRESERVE BLANKS clause is generated. PRESERVE BLANKS retains leading white space when optional enclosure delimiters are not present. It also leaves the trailing white space intact when fields are specified with a predetermined size.

    Database File Name: This parameter enables you to specify the characteristics of the physical files to be loaded. The initial values for these parameters are set from the properties of the flat file used in the mapping.

    If this parameter is set to a non-blank value, then the FILE= option is generated. The value specified is enclosed in single quotes in the generated code.

    Control File Location and Control File Name: The control file name necessary for audit details.

    For more information on each SQL*Loader option and clause, see Oracle9i Database Utilities.

  11. Expand the Runtime Parameters Reference to configure your mapping for deployment.

    Audit: Perform audit when the step is executed.

    Default Purge Group: The Default Purge Group is used when executing the package. Each audit record in the runtime schema is assigned to the purge group specified.

  12. Expand Sources and Targets Reference to set the physical properties of the operators in the mapping as described in "Sources and Targets Reference"
.

Strategies for Configuring PL/SQL Mappings

This section discusses various strategies for configuring mappings. Use this section to determine which configuration settings are appropriate for a given scenario.

This section includes the following topics:

Selecting a Default Operating Mode

For mappings with a PL/SQL implementation, select one of the following operating modes:

The default operating mode you select depends upon the performance you expect, the amount of auditing data you require, and how you design the mapping. Mappings have at least one and as many as three valid operating modes, excluding the options for failing over to row based modes. During code generation, Warehouse Builder generates code for the specified default operating mode as well as the unselected modes. Therefore, at runtime, you can select to run in the default operating mode or any one of the other valid operating modes.

The types of operators in the mapping may limit the operating modes you can select. As a general rule, mappings run in set based mode can include any of the operators except for Match-Merge, Name-Address, and Transformations used as procedures. Although you can include any of the operators in row based and row based (target only) modes, there are important restrictions on how you use SQL based operators such as Aggregators, Joins, and Key Lookups. To use SQL based operators in either of the row based modes, ensure that the operation associated with the operator can be included in the cursor.

These general rules are explained in the following sections. For more information on how to use operators, see "Avoiding Invalid Designs for PL/SQL Mappings"

Set based

In set based mode, Warehouse Builder generates a single SQL statement that processes all data and performs all operations. Although processing data as a set improves performance, the auditing information available is limited. Runtime Auditing is limited to reporting to the execution error only. In set based mode, you cannot view details on which rows contain errors.

Table 11-4 shows a simple mapping and the associated logic Warehouse Builder uses to generate code for the mapping when run in set based operating mode. TAB1, FLTR, and TAB2 are processed as a set using SQL.

Figure 11-4 Simple Mapping Run in Set Based Mode

Text description of setex.gif follows.

Text description of the illustration setex.gif

To correctly design a mapping for the set based mode, avoid operators that require row by row processing such as Match-Merge and Name-Address operators. If you include an operator in the dataflow that cannot be performed in SQL, Warehouse Builder does not generate set based code and issues an error when you execute the package in set based mode.

For target operators in a mapping, set their loading types to either INSERT/UPDATE or UPDATE/INSERT. Warehouse Builder does not support UPDATE or DELETE loading in set based mode. For a complete listing of how Warehouse Builder handles operators in set based mappings, see Table 11-3

Row based

In row based mode, Warehouse Builder generates statements that process data row by row. The select statement is in a SQL cursor. All subsequent statements are PL/SQL. You can access full runtime auditing information for all operators performed in PL/SQL and only limited information for operations performed in the cursor.

Table 11-5 shows a simple mapping and the associated logic Warehouse Builder uses to generate code for the mapping when run in row based operating mode. TAB1 is included in the cursor and processed as a set using SQL. FLTR and TAB2 are processed row by row using PL/SQL.

Figure 11-5 Simple Mapping Run in Row Based Mode

Text description of rowex.gif follows.

Text description of the illustration rowex.gif

If the mapping includes any SQL based operators, which cannot be performed in PL/SQL, Warehouse Builder attempts to generate code with those operations in the cursor. To generate valid row based code, design your mapping such that , if you include any of the following SQL based operators, Warehouse Builder can include the operations in the cursor:

In order for the above operators to be included in the cursor, do not directly precede it by an operator that generates PL/SQL code. In other words, you can not run the mapping in row-based mode if it contains a Transformation implemented as procedure, a Flat File used as a source, a Match-Merge, or Name-Address operator directly followed by any of the seven SQL based operators. For the design to be valid, include a staging table between the PL/SQL generating operator and the SQL based operator. For more information about how to design mappings, see "Avoiding Invalid Designs for PL/SQL Mappings"

Row based (Target Only)

In row based (Target Only) mode, Warehouse Builder generates a cursor select statement and attempts to include as many operations as possible in the cursor. For each target, Warehouse Builder inserts each row into the target separately. You can access full runtime auditing information for all operators performed in PL/SQL and only limited information for operations performed in the cursor. Use this mode when you expect fast set based operations to extract and transform the data but need extended auditing for the loading the data, which is where errors are likely to occur.

Table 11-6 shows a simple mapping and the associated logic Warehouse Builder uses to generate code for the mapping when run in row based (target only) operating mode. TAB1 and FLTR are included in the cursor and processed as a set using SQL. TAB2 is processed row by row.

Figure 11-6 Simple Mapping Run in Row Based (Target Only) Mode

Text description of rowtarge.gif follows.

Text description of the illustration rowtarge.gif

Row based (target only) places the same restrictions on SQL based operators as the row based operating mode. Additionally, for mappings with multiple targets, Warehouse Builder generates code with a cursor for each target.

Committing Data from a Single Source to Multiple Targets

If you want to populate multiple targets based on a common source, you may also want to ensure that every row from the source is represented correctly across all targets.

Figure 11-7 shows a mapping that illustrates this case. The target tables all depend upon the source table. For every row added to TARGET_1, one row should be added to the other targets. If this relationship is not maintained when you reload data, the data becomes inaccurate and possibly unusable.

Figure 11-7 Mapping with Multiple Targets Dependent on One Source

Text description of corcommi.gif follows.

Text description of the illustration corcommi.gif

If the number of rows from SOURCE is relatively small, maintaining the three targets may not be difficult. Manually maintaining targets dependent on a common source, however, becomes more tedious as you increase the number of rows from the source, or as you design more complex mappings with more targets and transformations.

To use Warehouse Builder to ensure that every row in the source is properly represented in every target, configure the mapping to use a correlated commit strategy.

Commit Strategies

When you assign a commit strategy for a mapping in Warehouse Builder, you determine the commit and rollback behaviors for the multiple targets dependent on a single source. You can choose between the following strategies:

The combination of the commit strategy and operating mode determines mapping behavior. Table 11-1 shows the valid combinations you can select.

Table 11-1 Valid Commit Strategies for Operating Modes  
Operating Mode Correlated Commit Independent Commit

Set based

Valid

Valid

Row based

Valid

Valid

Row based (target only)

Not Recommended

Valid

Correlated commit is not recommended for row based (target only). By definition, this operating mode places the cursor as close to the target as possible. In most cases, this results in only one target for each select statement and negates the purpose of committing data to multiple targets. If you design a mapping with the row based (target only) and correlated commit combination, Warehouse Builder runs the mapping but, in most cases, does not perform the correlated commit.

To understand the affects each operating mode and commit strategy combination has on a mapping, consider the mapping from Figure 11-7Assume the data from source table equates to 1,000 new rows. When the mapping runs successfully, Warehouse Builder loads 1,000 rows to each of the targets. If the mapping fails to load the 100th new row to Target_2, you can expect the following results, ignoring the influence from other configuration settings such as Commit Frequency and Number of Maximum Errors:

Using the Correlated Commit Strategy

Correlated commit may impact the size of your rollback segments. Space for rollback segments may be a concern when you merge data (insert/update or updated/insert) in set based mode.

Correlated commit operates transparently with PL/SQL bulk processing code.

When you use Correlated Commit to ensure all source rows are represented in multiple targets, consider using the Set based fail over to row based operating mode. In this way, you can find a balance between performance and error handling.

The correlated commit strategy is not available for mappings run in any mode that are configured for Partition Exchange Loading or include an Advanced Queue, Match-Merge, or Table Function operator.

Avoiding Invalid Designs for PL/SQL Mappings

Warehouse Builder generates code for PL/SQL mappings that meet the following criteria:

As you design a mapping, you can evaluate its validity by taking note of the input and output code types for each operator in the mapping. For example, you can see that the mapping in Figure 11-8 is invalid because the Match-Merge operator MM generates PL/SQL output but the subsequent Join operator accepts SQL input only.

Figure 11-8 Mapping Violates Input Requirement for Join Operator

Text description of invalidm.gif follows.

Text description of the illustration invalidm.gif

To achieve the desired results for the mapping, consider joining the source tables before performing the Match-Merge, as shown in Figure 11-9, or loading the results from the Match-Merge to a staging table before performing the join, as shown in Figure 11-10.

Figure 11-9 Valid Mapping Design with Sources Joined Before Match-Merge

Text description of validmap.gif follows.

Text description of the illustration validmap.gif

Figure 11-10 Valid Mapping Design with Staging Table

Text description of validmaa.gif follows.

Text description of the illustration validmaa.gif

Table 11-3 and Table 11-3 list the implementation types for each Warehouse Builder operator. These tables also indicate whether or not PL/SQL code includes the operation associated with the operator in the cursor. This information is relevant in determining which operating modes are valid for a given mapping design. It also determines what auditing details are available during error handling.

Table 11-2 Source-Target Operators Implementation in PL/SQL Mappings
Operator Implementation Types Valid in Set Based Mode Valid in Row Based Mode Valid in Row Based (Target Only)

Source Operators: Tables, Cubes, Views, External Tables, etc.

SQL

Yes.

Yes.

Yes. Part of cursor.

Target Operators: Tables, Cubes, Views, etc.

SQL

PL/SQL

Yes, except when loading =UPDATE or DELETE.

Yes.

Yes. Not part of cursor.

Flat File as source

For PL/SQL, create External Table.

Yes.

Yes.

Yes. Part of the cursor.

Flat File as target

SQL

Yes, but not targets with loading =UPDATE or DELETE.

Yes.

Yes. Not part of cursor.

Advanced Queue as source

SQL

Yes.

Yes.

Yes, part of cursor.

Advanced Queue as target

SQL

Yes, but not targets with loading =UPDATE or DELETE.

Yes.

Yes. Not part of cursor

Sequence as source

SQL

Yes.

Yes.

Yes, part of cursor.

Table 11-3 Data Flow Operator Implementation in PL/SQL Mappings
Operator Name Implementation Types Valid in Set Based Mode Valid in Row Based Mode Valid in Row Based (Target Only) Mode

Aggregators

SQL

Yes.

Yes, only if part of the cursor.

Yes, only if part of the cursor.

Constant Operator

PL/SQL

SQL

Yes.

Yes.

Yes.

Data Generator

SQL*Loader Only

N/A

N/A

N/A

Deduplicator

SQL

Yes.

Yes, only if part of the cursor.

Yes, only if part of the cursor.

Expression

SQL

PL/SQL

Yes.

Yes.

Yes.

Filter

SQL

PL/SQL

Yes.

Yes.

Yes.

Joiner

SQL

Yes.

Yes, only if part of the cursor.

Yes, only if part of the cursor.

Key Lookup

SQL

Yes.

Yes, only if part of the cursor.

Yes, only if part of the cursor.

Mapping Input Parameter

SQL

PL/SQL

Yes.

Yes.

Yes.

Mapping Output Parameter

SQL

PL/SQL

Yes.

Yes.

Yes.

Match-Merge

SQL input

PL/SQL output

(PL/SQL input from XREF group only)

No.

Yes.

Yes. Not part of cursor.

Name-Address

PL/SQL

No.

Yes.

Yes. Not part of cursor.

Pivot

SQL

PL/SQL

Yes.

Yes.

Yes.

Post-Mapping Process

Irrelevant

Yes, independent of dataflow.

Yes.

Yes.

Pre-Mapping Process

Irrelevant

Yes, independent of dataflow.

Yes.

Yes.

Set

SQL

Yes.

Yes, only if part of the cursor.

Yes, only if part of the cursor.

Sorter

SQL

Yes.

Yes, only if part of the cursor.

Yes, as part of the cursor.

Splitter

SQL

PL/SQL

Yes.

Yes.

Yes.

Table Function

SQL or PL/SQL input

SQL output only

Yes.

Yes.

Yes.

Transformation as a procedure

PL/SQL

No.

Yes.

Yes. Not part of cursor.

Transformation as a function that does not perform DML

SQL

PL/SQL

Yes.

Yes.

Yes, included in the cursor.

Using Partition Exchange Loading

Data partitioning can improve performance when loading or purging data in a target system. This performance feature is referred to as Partition Exchange Loading (PEL).

PEL is recommended when loading a relatively small amount of data into a target containing a much larger volume of historical data. The target can be a table, a dimension, or a cube in a data warehouse.

This section includes the following topics:

About Partition Exchange Loading

By manipulating partitions in your target system, you can use Partition Exchange Loading (PEL) to instantly add or delete data. When a table is exchanged with an empty partition, new data is added.

You can use PEL to load new data by exchanging it into a target table as a partition. For example, a table that holds the new data assumes the identity of a partition from the target table and this partition assumes the identity of the source table. This exchange process is a DDL operation with no actual data movement. Figure 11-11 illustrates this example.

Figure 11-11 Overview of Partition Exchange Loading

Text description of concepto.gif follows.

Text description of the illustration concepto.gif

In Figure 11-11, data from a source table Source is inserted into a target table consisting of four partitions (Target_P1, Target_P2, Target_P3, and Target_P4). If the new data needs to be loaded into Target_P3, the partition exchange operation only exchanges the names on the data objects without moving the actual data. After the exchange, the formerly labeled Source is renamed to Target_P3, and the former Target_P3 is now labeled as Source. The target table still contains four partitions: Target_P1, Target_P2, Target_P3, and Target_P4. The partition exchange operation available in Oracle9i completes the loading process without data movement.

Configuring a Mapping for PEL

To configure a mapping for partition exchange loading, complete the following steps:
  1. In the navigation tree, right-click on a mapping and select Configure.

    Warehouse Builder displays the Configuration Properties window as shown in Figure 11-12.

Figure 11-12 PEL Configuration Properties

Text description of configpe.gif follows.

Text description of the illustration configpe.gif

  • By default, PEL is disabled for all mappings. Set PEL Enabled to True to use Partition Exchange Loading.

  • Use Data Collection Frequency to specify the amount of new data to be collected per run of the mapping. Set this parameter to specify if you want the data collected by Year, Quarter, Month, Day, Hour, or Minute. This determines the number of partitions.

  • Set Direct to True if you want to create a temporary table to stage the collected data before performing the partition exchange. If you set this parameter to False, Warehouse Builder directly swaps the source table into the target table as a partition without creating a temporary table. For more information, see "Direct vs. Indirect PEL".

  • If you set Replace Data to True, Warehouse Builder replaces the existing data in the target partition with the newly collected data. If set to False (default), Warehouse Builder preserves the existing data in the target partition. The new data is inserted into a non-empty partition. This parameter affects the local partition and can be used to remove or swap a partition out of a target table. At the table level, you can set Truncate/Insert properties.

    Direct vs. Indirect PEL

    When you use Warehouse Builder to load a target by exchanging partitions, you can load the target indirectly or directly.

    Using Indirect PEL

    If you design a mapping using PEL and it includes remote sources or a join of multiple sources, Warehouse Builder must perform source processing and stage the data before partition exchange can proceed. Therefore, configure such mappings with Direct PEL set to False. Warehouse Builder transparently creates and maintains a temporary table that stores the results from source processing. After performing the PEL, Warehouse Builder drops the table.

    Figure 11-13 shows a mapping that joins two sources and performs an aggregation. If all new data loaded into the ORDER_SUMMARY table is always loaded into same partition, then you can use Indirect PEL on this mapping to improve load performance. In this case, Warehouse Builder transparently creates a temporary table after the Aggregator and before ORDER_SUMMARY.

    Figure 11-13 Mapping with Multiple Sources

    Text description of getsales.gif follows.

    Text description of the illustration getsales.gif

    Warehouse Builder creates the temporary table using the same structure as the target table with the same columns, indexes, and constraints. For the fastest performance, Warehouse Builder loads the temporary table using parallel direct-path loading INSERT. After the INSERT, Warehouse Builder indexes and constrains the temporary table in parallel.

    Example: Using Direct PEL to Publish Fact Tables

    Use Direct PEL when the source table is local and the data is of good quality. You must design the mapping such that the source and target are in the same database and have exactly the same structure. The source and target must have the same indexes and constraints, the same number of columns, and the same column types and lengths.

    For example, assume that you have the same mapping from Figure 11-13 but would like greater control on when data is loaded into the target. Depending on the amount of data, it could take hours to load and you would not know precisely when the target table would be updated.

    To instantly load data to a target using Direct PEL:
    1. Design one mapping to join source data, if necessary, transform data, ensure data validity, and load it to a staging table. Do not configure this mapping to use PEL.

      Design the staging table to exactly match the structure of the final target that you will load in a separate mapping. For example, the staging table in Figure 11-13 is ORDER_SUMMARY and should be of the same structure as the final target, ORDER_CUBE in Figure 11-14.

    2. Create a second mapping that loads data from the staging table to the final target such as shown in Figure 11-14. Configure this mapping to use Direct PEL.

    Figure 11-14 Publish_Sales_Summary Mapping

    Text description of direct_p.gif follows.

    Text description of the illustration direct_p.gif

  • Use either the Warehouse Builder Process Flow Editor or Oracle Workflow to launch the second mapping after the completion of the first.

    Using PEL Effectively

    You can use PEL effectively for scalable loading performance if the following conditions are true:

    Configuring Targets in a Mapping

    To configure targets in a mapping for PEL:

    Step 1: Create All Partitions

    Warehouse Builder does not automatically create partitions during runtime. You must create all partitions before you can use PEL. See "Creating Partitions".

    For example, if you select Month as the frequency of new data collection, you need to create all the required partitions for each month of new data. Use the Configuration Properties window to create partitions for a table, dimension, or cube. Figure 11-15 shows the property inspector window for table ORDER_SUMMARY. This figure shows six partitions that have been added for this table.

    To use PEL, all partition names must follow a naming convention. For example, for a partition that will hold data for May 2002, the partition name must be in the format Y2002_Q2_M05.

    For PEL to recognize a partition, its name must fit one of the following formats.

    Ydddd

    Ydddd_Qd

    Ydddd_Qd_Mdd

    Ydddd_Qd_Mdd_Ddd

    Ydddd_Qd_Mdd_Ddd_Hdd

    Ydddd_Qd_Mdd_Ddd_Hdd_Mdd

    Where d represents a decimal digit. All the letters must be in upper case. Lower case is not recognized.

    Figure 11-15 Configuration Properties for Table ORDER_SUMMARY

    Text description of ordersum.gif follows.

    Text description of the illustration ordersum.gif

    If all partitions are added with correct names, Warehouse Builder automatically computes the Value Less Than property for each partition. Otherwise, the Value Less Than is must be manually configured for each partition in order for Warehouse Builder to generate a DDL statement. The following is an example of a DDL statement generated by Warehouse Builder:

    . . .
    PARTITION A_PARTITION_NAME 
          VALUES LESS THAN (TO_DATE(`01-06-2002','DD-MM-YYYY')),
    . . .
    
    

    Figure 11-16 shows automatically generated configuration values for the Value Less Than parameter.

    Figure 11-16 Automatically Generated "Value Less Than" Setting

    Text description of autogenv.gif follows.

    Text description of the illustration autogenv.gif

    Step 2: Create All Indexes Using the LOCAL Option

    Figure 11-17 shows an index (ORDER_SUMMARY_PK_IDX) added to the ORDER_SUMMARY table. This index has two columns, ORDER_DATE and ITEM_ID. Configure the following:

    Now Warehouse Builder can generate a DDL statement for a unique local index on table ORDER_SUMMARY.

    Using local indexes provides the most important PEL performance benefit. Local indexes require all indexes to be partitioned in the same way as the table. When the temporary table is swapped into the target table using PEL, so are the identities of the index segments.

    Figure 11-17 Configure an Index as a Local Index

    Text description of configin.gif follows.

    Text description of the illustration configin.gif

    If an index is created as a local index, the Oracle server requires that the partition key column must be the leading column of the index. In the above example, the partition key is ORDER_DATE and it is the leading column in the index ORDER_SUMMARY_PK_IDX.

    Step 3: Primary/Unique Keys Use "USING INDEX" Option

    In this step you must specify that all primary key and unique key constraints are created with the USING INDEX option. Figure 11-18 shows an example where the primary key constraint ORDER_SUMMARY_PK on the ORDER_SUMMARY table is specified with the USING INDEX option.

    With the USING INDEX option, a constraint will not trigger automatic index creation when it is added to the table. The server will search existing indexes for an index with same column list as that of the constraint. Thus, each primary or unique key constraint must be backed by a user-defined unique local index. The index required by the constraint ORDER_SUMMARY_PK is ORDER_SUMMARY_PK_IDX which was created in "Step 2: Create All Indexes Using the LOCAL Option".

    Figure 11-18 Specify a Constraint with USING INDEX option

    Text description of specifyc.gif follows.

    Text description of the illustration specifyc.gif

    Restrictions for Using PEL in Warehouse Builder

    These are the restrictions for using PEL in Warehouse Builder:

    Configuring Process Flows

    To configure a process flow:
    1. Navigate to a process flow package, right-click on a process flow and select Configure.

      Warehouse Builder displays the Configuration Properties sheet for the process flow as shown in Figure 11-19.

    Figure 11-19 Process Flow Configuration Properties Sheet

    Text description of configpf.gif follows.

    Text description of the illustration configpf.gif

    1. Expand Execution Settings and set the property Use Return as Status.

      This setting governs the behavior for activities that return NUMBER in their output. These activities include the FTP, External Process, and Transformation activities. When you set Use Return as Status to true, the Process Flow Editor assigns the outgoing transition conditions based on the following numerical return values for the activity:

      1 = Success Transition

      2 = Warning Transition

      3 = Error Transition

    2. Expand Path Settings and set the following properties for each activity in the process flow:

      Execution Location: The location from which this activity is executed. If you configured Oracle Enterprise Manager, you can select an OEM agent to execute the process flow.

      Remote Location: The remote location for FTP activities only.

      Working Location: The working location for FTP, FILE EXISTS and External Process activities only.

      Deployed Location: The deployment location. This setting applies to transformation activities only. For activities referring to pre-defined transformations, you must change the setting from Use Default Location and specify a valid location.

    3. Expand General Properties. You can view the bound name which is the name of the object that the activity represents in the process flow. Only mapping, transformation, and subprocess activities have bound names.


  • Go to previous page Go to next page
    Oracle
    Copyright © 2001, 2003 Oracle Corporation.

    All Rights Reserved.
    Go To Table Of Contents
    Contents
    Go To Index
    Index