XA failure - Database saved - Queue rolled back


XA failure - Database saved - Queue rolled back

bhanner@technobabblist.com
We are using BTM 2.1.2 with WebSphere MQ and DB2.  We have had the
following situation happen a couple of times: during the commit there is
an error, and the database changes are saved but the message is rolled
back to the queue.

The program uses Spring Batch and is multithreaded, reading from the
queue and writing to the database.  There are five threads, each reading,
parsing, and writing to the database tables.  When this happens, the
message is returned to the queue and picked up by the next available
thread, which causes the job to fail because the record already exists in
the database.

We have two questions: why did the commit fail, and why was the database
not rolled back?

Here is the exception that we get.

<resource activemq failed on a Bitronix XID [737072696E672D62746D0000013927EA950C00005407 : 737072696E672D62746D0000013927EA951400005410]>
bitronix.tm.internal.BitronixXAException: unknown heuristic termination, global state of this transaction is unknown - guilty: an XAResourceHolderState with uniqueName=activemq XAResource=com.ibm.mq.jmqi.JmqiXAResource@17389fe (ended) with XID a Bitronix XID [737072696E672D62746D0000013927EA950C00005407 : 737072696E672D62746D0000013927EA951400005410]
        at bitronix.tm.twopc.Committer$CommitJob.handleXAException(Committer.java:209)
        at bitronix.tm.twopc.Committer$CommitJob.commitResource(Committer.java:198)
        at bitronix.tm.twopc.Committer$CommitJob.execute(Committer.java:183)
        at bitronix.tm.twopc.executor.Job.run(Job.java:70)
        at bitronix.tm.twopc.executor.SyncExecutor.submit(SyncExecutor.java:31)
        at bitronix.tm.twopc.AbstractPhaseEngine.runJobsForPosition(AbstractPhaseEngine.java:129)
        at bitronix.tm.twopc.AbstractPhaseEngine.executePhase(AbstractPhaseEngine.java:90)
        at bitronix.tm.twopc.Committer.commit(Committer.java:82)
        at bitronix.tm.BitronixTransaction.commit(BitronixTransaction.java:239)
        at bitronix.tm.BitronixTransactionManager.commit(BitronixTransactionManager.java:120)
        at xxx.xx.xxx.rapply.batch.TransactionalChunkListener.afterChunk(TransactionalChunkListener.java:124)
        at org.springframework.batch.core.listener.CompositeChunkListener.afterChunk(CompositeChunkListener.java:59)
        at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:272)
        at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
        at org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate$ExecutingRunnable.run(TaskExecutorRepeatTemplate.java:258)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
Caused by: javax.transaction.xa.XAException: The method 'xa_commit' has failed with errorCode '-4'.
        at com.ibm.mq.jmqi.JmqiXAResource.commit(JmqiXAResource.java:407)
        at bitronix.tm.twopc.Committer$CommitJob.commitResource(Committer.java:194)
        ... 16 more
2012-08-14 21:34:11,090 ERROR [batch.TransactionalChunkListener:taskExecutor-3] - <Heuristic Mixed.>
bitronix.tm.internal.BitronixHeuristicMixedException: transaction failed during commit of a Bitronix Transaction with GTRID [737072696E672D62746D0000013927EA950C00005407], status=UNKNOWN, 2 resource(s) enlisted (started Tue Aug 14 21:34:08 EDT 2012): resource(s) [activemq] improperly unilaterally rolled back (or hazard happened)
        at bitronix.tm.twopc.Committer.throwException(Committer.java:147)
        at bitronix.tm.twopc.Committer.commit(Committer.java:86)
        at bitronix.tm.BitronixTransaction.commit(BitronixTransaction.java:239)
        at bitronix.tm.BitronixTransactionManager.commit(BitronixTransactionManager.java:120)
        at xxx.xx.xxx.rapply.batch.TransactionalChunkListener.afterChunk(TransactionalChunkListener.java:124)
        at org.springframework.batch.core.listener.CompositeChunkListener.afterChunk(CompositeChunkListener.java:59)
        at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:272)
        at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
        at org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate$ExecutingRunnable.run(TaskExecutorRepeatTemplate.java:258)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
Caused by: bitronix.tm.twopc.PhaseException: collected 1 exception(s):
 [activemq - bitronix.tm.internal.BitronixXAException(XA_HEURHAZ) - unknown heuristic termination, global state of this transaction is unknown - guilty: an XAResourceHolderState with uniqueName=activemq XAResource=com.ibm.mq.jmqi.JmqiXAResource@17389fe (ended) with XID a Bitronix XID [737072696E672D62746D0000013927EA950C00005407 : 737072696E672D62746D0000013927EA951400005410]]
        at bitronix.tm.twopc.AbstractPhaseEngine.executePhase(AbstractPhaseEngine.java:110)
        at bitronix.tm.twopc.Committer.commit(Committer.java:82)
        ... 10 more
2012-08-14 21:34:11,097 ERROR [batch.TransactionalChunkListener:taskExecutor-3] - <Sub Exception>
bitronix.tm.twopc.PhaseException: collected 1 exception(s):
 [activemq - bitronix.tm.internal.BitronixXAException(XA_HEURHAZ) - unknown heuristic termination, global state of this transaction is unknown - guilty: an XAResourceHolderState with uniqueName=activemq XAResource=com.ibm.mq.jmqi.JmqiXAResource@17389fe (ended) with XID a Bitronix XID [737072696E672D62746D0000013927EA950C00005407 : 737072696E672D62746D0000013927EA951400005410]]
        at bitronix.tm.twopc.AbstractPhaseEngine.executePhase(AbstractPhaseEngine.java:110)
        at bitronix.tm.twopc.Committer.commit(Committer.java:82)
        at bitronix.tm.BitronixTransaction.commit(BitronixTransaction.java:239)
        at bitronix.tm.BitronixTransactionManager.commit(BitronixTransactionManager.java:120)
        at xxx.xx.xxx.rapply.batch.TransactionalChunkListener.afterChunk(TransactionalChunkListener.java:124)
        at org.springframework.batch.core.listener.CompositeChunkListener.afterChunk(CompositeChunkListener.java:59)
        at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:272)
        at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
        at org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate$ExecutingRunnable.run(TaskExecutorRepeatTemplate.java:258)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

Any help would be appreciated.

Thanks,
Ben





---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email



Re: XA failure - Database saved - Queue rolled back

Ludovic Orban-2
The logs you posted make me believe that, for some reason, your JMS
server sometimes does not follow the XA protocol strictly and decides
to unilaterally roll back the message after the first phase of the 2PC
protocol without waiting for the transaction manager's directives.

Here's the bit of information that points me in that direction:

 Caused by: javax.transaction.xa.XAException: The method 'xa_commit' has failed with errorCode '-4'.
         at com.ibm.mq.jmqi.JmqiXAResource.commit(JmqiXAResource.java:407)
  ...

errorCode -4 is the value of XAException.XAER_NOTA (see:
http://docs.oracle.com/javase/1.4.2/docs/api/constant-values.html#javax.transaction.xa.XAException.XAER_NOTA)
which means that the XAResource doesn't know about the transaction.
This exception is flowing through the Committer class which indicates
that it happened during the 2nd phase of the 2PC protocol.
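
To make the error code mapping concrete, here is a minimal sketch that
translates a few numeric codes into the names of the standard
javax.transaction.xa.XAException constants. The class name XaErrorCodes and
the describe method are hypothetical helpers made up for illustration; they
are not part of BTM or of the MQ client, and only a handful of codes are
covered.

```java
import javax.transaction.xa.XAException;

public class XaErrorCodes {

    /** Returns the XAException constant name for a few common error codes. */
    public static String describe(int errorCode) {
        switch (errorCode) {
            case XAException.XAER_NOTA:   return "XAER_NOTA";   // XID not known to the resource
            case XAException.XAER_RMERR:  return "XAER_RMERR";  // resource manager error
            case XAException.XAER_RMFAIL: return "XAER_RMFAIL"; // resource manager unavailable
            case XAException.XA_HEURHAZ:  return "XA_HEURHAZ";  // outcome may be heuristic
            case XAException.XA_HEURMIX:  return "XA_HEURMIX";  // partly committed, partly rolled back
            case XAException.XA_HEURRB:   return "XA_HEURRB";   // heuristically rolled back
            case XAException.XA_HEURCOM:  return "XA_HEURCOM";  // heuristically committed
            default:                      return "unknown (" + errorCode + ")";
        }
    }

    public static void main(String[] args) {
        // The code reported by xa_commit in the posted stack trace.
        System.out.println(describe(-4)); // prints "XAER_NOTA"
    }
}
```

A helper like this could be called from the catch block that logs the
XAException, using the exception's public errorCode field, so the log shows
the symbolic name next to the number.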

In brief, my understanding is that you started a transaction, updated
your DB, sent or received a message, and committed the JTA transaction.
BTM executed prepare successfully on both resources (first phase), then
managed to commit your DB (which is why you see the update) but failed
to commit the message on the JMS server, as the latter reported that it
had 'lost' knowledge of that transaction. That's why BTM reports the
problem as a heuristic hazard: one resource ended up committed as
expected, the other rolled back for an unknown reason, so the outcome
of that transaction is not atomic. Beyond reporting this error, the
only other thing BTM can do is report the resources that made the
transaction fail. This is what the "guilty: an XAResourceHolderState
with uniqueName=activemq" error message is telling you.
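
The sequence described above can be sketched as a toy simulation. This is
not BTM's code and it does not use the real XAResource interface: the
Resource class, its forgetsAfterPrepare flag, the state strings and the
outcome names are all assumptions made up for illustration. It only shows
why a resource that forgets a prepared transaction leaves the coordinator
with a non-atomic outcome that it can report but not repair.

```java
import java.util.Arrays;
import java.util.List;

public class TwoPhaseCommitSketch {

    /** Hypothetical, heavily simplified stand-in for an XA resource. */
    public static class Resource {
        public final String name;
        public final boolean forgetsAfterPrepare; // simulates the misbehaving JMS server
        public String state = "ACTIVE";

        public Resource(String name, boolean forgetsAfterPrepare) {
            this.name = name;
            this.forgetsAfterPrepare = forgetsAfterPrepare;
        }

        public boolean prepare() {
            state = "PREPARED";
            return true; // votes "yes" in phase 1
        }

        public void commit() {
            if (forgetsAfterPrepare) {
                // The resource dropped the prepared transaction on its own.
                state = "ROLLED_BACK";
                throw new IllegalStateException(name + ": unknown transaction (cf. XAER_NOTA)");
            }
            state = "COMMITTED";
        }

        public void rollback() {
            state = "ROLLED_BACK";
        }
    }

    /** Runs both phases and returns a made-up outcome label. */
    public static String commit(List<Resource> resources) {
        // Phase 1: every resource must vote yes, otherwise roll everything back.
        for (Resource r : resources) {
            if (!r.prepare()) {
                for (Resource each : resources) {
                    each.rollback();
                }
                return "ROLLED_BACK";
            }
        }
        // Phase 2: the decision is final; a failure here can only be reported.
        int committed = 0;
        int failed = 0;
        for (Resource r : resources) {
            try {
                r.commit();
                committed++;
            } catch (IllegalStateException e) {
                failed++;
            }
        }
        if (failed == 0) {
            return "COMMITTED";
        }
        return committed > 0 ? "HEURISTIC_MIXED" : "HEURISTIC_HAZARD";
    }

    public static void main(String[] args) {
        Resource db = new Resource("db", false);
        Resource jms = new Resource("jms", true);
        String outcome = commit(Arrays.asList(db, jms));
        System.out.println(outcome + ": db=" + db.state + ", jms=" + jms.state);
        // prints "HEURISTIC_MIXED: db=COMMITTED, jms=ROLLED_BACK"
    }
}
```

Once phase 2 has started, the database commit cannot be undone, which is
why the update stays in the DB even though the message went back to the
queue.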

The reason why your JMS server unilaterally rolled back is unknown to
me; you'll have to investigate to figure out why it misbehaved like
this. One possibility is that a configurable timeout somewhere tells
it to heuristically roll back in-doubt XA transactions after a certain
amount of time, but that's just a guess.

I highly recommend that you contact your vendor about this problem. If
they want to blame BTM for whatever reason, you can direct them to me
and I'll happily debate their suspicions with them.




On Wed, Aug 15, 2012 at 3:34 PM, [hidden email]
<[hidden email]> wrote:

> We are using BTM 2.1.2 with Websphere MQ and DB2.  We have had the
> following situation happen a couple of times.  During the commit there is
> an error and the database changes are saved but the message is rolled back
> to the queue.