Friday, 23 October 2015

Goldengate - Problems converting bit datatypes from sybase, MS Sql and My-Sql database to Oracle Number data type.

Recently encountered an issue where the client was taking data from MS-SQL Server to Oracle 11g Database. Both source and target environments were running the latest Goldengate Version 12.


Columns with Bit Data Type are not converted and sent to Oracle database. i.e. the corresponding column in Oracle db table is blank.


Goldengate Processes (Extracts and Replicats) don't give any kind of errors and their status is shown  as Running.


Symptoms -


 When Bit data type is used in sqlserver source, and the replicat site maps the BIT to char or number, the bit value will not show in target site.


Cause -


Bit values used to be treated as ASCII prior to 11.1. Now they are treated as binary, so a pre-11.1 trail with a bit value of 0 applied to a target with an 11.1 or later replicat will treat the ASCII value of 0x30 as 1 since it is not  0x00


NOTE - Although Oracle support lists the problem with Goldengate Ver 11. The issue was duplicated in Goldengate Ver 12 also.


Solution -


Apply @NUMTOBIN while converting and writing data on the target column.


e.g


MAP scott.emp,  target ora.emp,
COLMAP (usedefaults, source_bit_column_name = @NUMBIN (target_bit_column_name);










Friday, 9 October 2015

LMHB Process Trace Files in Oracle RAC - Potential Database Terminator.

During a routine performance analysis of Exadata system I came along trace files by the names on dbname_lmhb_number.trc i.e. testdb_lmhb_129875.trc being generated in the alert log location with the following lines contained in them


LMDO (ospid: 95477) has not moved for 74 sec (123453423.14444321115)
kjfmGCR_HBCheckAll: LMD0 (ospid: 95477) has status 2
: waiting for event 'ges remote message' for 0 secs with wait_id 123.

There are repeated messages like the above with only the wait_id changing in each.

We can see that the highlighted call above is something related to heartbeat check in RAC.

What Exactly is a LMHB Process

Global Cache/Enqueue Service Heartbeat Monitor
Monitor the heartbeat of LMON, LMD, and LMSn processes
LMHB monitors LMON, LMD, and LMSn processes to ensure they are running normally without blocking or spinning.

Cause of these waits  - Identified as BUG 13718279 - Affected Versions - 11.2.0.3

Effects of This Bug - DB instance terminated due to ORA-29770 in RAC.
.
Solution

Apply patch 13718279 or set the hidden init parameter _gc_defer_time=3.
Fixed in Version 11.2.0.4

Reference - MOS DOC ID - 1440112.1


Goldengate Replicats abending with Warnings related to Trigger and no clear error

Situation 
GG replicats at client site abended with following errors in the report file


015-10-08 16:03:40  WARNING OGG-02904  Replication of PARTIAL XML containing NCHAR/NVARCHAR/NCLOB data may cause divergence.

2015-10-09 11:18:34  WARNING OGG-00869  OCI Error ORA-04098: trigger 'UATPROD.TRG_VALIDATE' is invalid and failed re-validation (status = 4098), SQL <INSERT INTO "UATPROD.TESTTAB" 
 
2015-10-09 11:18:34  WARNING OGG-01004  Aborted grouped transaction on ' UATPROD.TESTTAB ', Database error 4098 (OCI Error ORA-04098: 
 


2015-10-09 11:18:34  WARNING OGG-01003  Repositioning to rba 28437291 in seqno 3.2015-10-09 11:18:34 


Analysis

Although there is no clear OGG error, the warnings themselves reveal the issue. Particularly the warning -- OGG-00869  OCI Error ORA-04098: trigger

In this case the entire schema was replicated at the target db and as a result, during impdp, its triggers also got imported.

When we have trigger firing at source and modifying or checking any data, then we can be rest assured that the data processed and committed as part of this trigger action will find its way into the Goldengate Trail.

Therefore there is no need to have these triggers replicated at the Target Site

The only exception to this case is if you are planning to do switchovers between source and target and have them acting as a data source at different points in time.

Solution

The simple solution was to drop the triggers present in the schema on the Target database and just start the Replicat processes again.