threads dying when ForceBatchingEnabled is true

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

threads dying when ForceBatchingEnabled is true

jkuchta
Hi,
I am seeing weird behavior in my application (spring+btm+swiftmq+oracle+jdk1.6.0_18). From time to time a thread dies. The last entry in the debug log is always :

bitronix.tm.journal.DiskForceWaitQueue - waiting for a TransactionLogAppender on journal1.btm to get dequeued

The log snippet is bellow. The troubled threads are  SRRoute-1 and SRRoute-4.  

First they get enqued:
[SRRoute-4] () bitronix.tm.journal.DiskForceWaitQueue - enqueued a TransactionLogAppender on journal1.btm, 1 TransactionLogAppender waiting for a disk force
[SRRoute-1] () bitronix.tm.journal.DiskForceWaitQueue - enqueued a TransactionLogAppender on journal1.btm, 2 TransactionLogAppender waiting for a disk force

Then DiskForceBatcherThread runs:
2011-03-05 11:14:39.581 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.DiskForceBatcherThread - wait queue is not empty anymore (1 in queue)
2011-03-05 11:14:39.588 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.DiskForceBatcherThread - forcing...
2011-03-05 11:14:39.588 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.DiskForceWaitQueue - returning head TransactionLogAppender
2011-03-05 11:14:39.588 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.TransactionLogAppender - forcing log writing
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.TransactionLogAppender - done forcing log
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.DiskForceWaitQueue - clearing list of waiting TransactionLogAppender
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.DiskForceBatcherThread - waiting for the wait queue to fill up
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.DiskForceWaitQueue - waiting for some TransactionLogAppender to get enqueued

finally the threads wakes up but they find their TransactionLogAppender still queued
[SRRoute-4] () bitronix.tm.journal.DiskForceWaitQueue - waiting for a TransactionLogAppender on journal1.btm to get dequeued
[SRRoute-1] () bitronix.tm.journal.DiskForceWaitQueue - waiting for a TransactionLogAppender on journal1.btm to get dequeued

I checked the DiskForceWaitQueue code and it seems fine to me. I am not expert on JVM optimizations and have no idea how a race condition can happen here or maybe I just do not interpret the log correctly. Just wonder if anyone came across the same problem or what do you thing.

When I turn off ForceBatchingEnabled then all threads are fine and healthy and I do not see the problem anymore.

More detail log snippet:

2011-03-05 11:14:39.562 DEBUG [SRRoute-2] () bitronix.tm.internal.XAResourceHolderState - applying resource timeout of 30s on an XAResourceHolderState with uniqueName=KUJ_ImpJMS XAResource=com.swiftmq.jms.v750.XAResourceImpl@130c8e9 with XID a Bitronix XID [6B756A2D696D702D636F72652D62746D0000012E858304C7000392B1 : 6B756A2D696D702D636F72652D62746D0000012E858304C9000392B3]
2011-03-05 11:14:39.566 DEBUG [SRRoute-4] () bitronix.tm.twopc.Preparer - prepared resource an XAResourceHolderState with uniqueName=KUJ_ImpDB XAResource=oracle.jdbc.driver.T4CXAResource@168bcb7 (ended) with XID a Bitronix XID [6B756A2D696D702D636F72652D62746D0000012E858300E80003929B : 6B756A2D696D702D636F72652D62746D0000012E858303B8000392AA] voted XA_OK
2011-03-05 11:14:39.567 DEBUG [SRRoute-4] () bitronix.tm.twopc.AbstractPhaseEngine - phase executed with 0 exception(s)
2011-03-05 11:14:39.567 DEBUG [SRRoute-4] () bitronix.tm.twopc.AbstractPhaseEngine - ran 2 job(s) for position '1'
2011-03-05 11:14:39.567 DEBUG [SRRoute-4] () bitronix.tm.BitronixTransaction - changing transaction status to PREPARED
2011-03-05 11:14:39.567 DEBUG [SRRoute-4] () bitronix.tm.journal.TransactionLogAppender - between 832225 and 832313, writing a Bitronix TransactionLogRecord with status=PREPARED, recordLength=80, headerLength=28, time=1299320079567, sequenceNumber=234164, crc32=-1839180916, gtrid=6B756A2D696D702D636F72652D62746D0000012E858300E80003929B, uniqueNames=KUJ_ImpDB,KUJ_ImpJMS
2011-03-05 11:14:39.568 DEBUG [SRRoute-1] () bitronix.tm.twopc.Preparer - prepared resource an XAResourceHolderState with uniqueName=KUJ_ImpDB XAResource=oracle.jdbc.driver.T4CXAResource@165e55e (ended) with XID a Bitronix XID [6B756A2D696D702D636F72652D62746D0000012E858300BD00039297 : 6B756A2D696D702D636F72652D62746D0000012E858302D3000392A6] voted XA_OK
2011-03-05 11:14:39.568 DEBUG [SRRoute-1] () bitronix.tm.twopc.AbstractPhaseEngine - phase executed with 0 exception(s)
2011-03-05 11:14:39.569 DEBUG [SRRoute-1] () bitronix.tm.twopc.AbstractPhaseEngine - ran 2 job(s) for position '1'
2011-03-05 11:14:39.569 DEBUG [SRRoute-1] () bitronix.tm.BitronixTransaction - changing transaction status to PREPARED
2011-03-05 11:14:39.569 DEBUG [SRRoute-6] () bitronix.tm.twopc.Preparer - prepared resource an XAResourceHolderState with uniqueName=KUJ_ImpDB XAResource=oracle.jdbc.driver.T4CXAResource@18212f3 (ended) with XID a Bitronix XID [6B756A2D696D702D636F72652D62746D0000012E8583010A000392A3 : 6B756A2D696D702D636F72652D62746D0000012E858303B2000392A9] voted XA_OK
2011-03-05 11:14:39.569 DEBUG [SRRoute-6] () bitronix.tm.twopc.AbstractPhaseEngine - phase executed with 0 exception(s)
2011-03-05 11:14:39.569 DEBUG [SRRoute-6] () bitronix.tm.twopc.AbstractPhaseEngine - ran 2 job(s) for position '1'
2011-03-05 11:14:39.569 DEBUG [SRRoute-6] () bitronix.tm.BitronixTransaction - changing transaction status to PREPARED
2011-03-05 11:14:39.577 DEBUG [SRRoute-4] () bitronix.tm.journal.TransactionLogAppender - disk journal appender now at position 832313
2011-03-05 11:14:39.577 DEBUG [SRRoute-4] () bitronix.tm.BitronixTransaction - transaction status is changing from PREPARING to PREPARED - executing 0 listener(s)
2011-03-05 11:14:39.577 DEBUG [SRRoute-4] () bitronix.tm.twopc.Preparer - successfully prepared 2 resource(s)
2011-03-05 11:14:39.577 DEBUG [SRRoute-4] () bitronix.tm.BitronixTransaction - 2 interested resource(s)
2011-03-05 11:14:39.577 DEBUG [SRRoute-4] () bitronix.tm.BitronixTransaction - changing transaction status to COMMITTING (forced)
2011-03-05 11:14:39.578 DEBUG [SRRoute-4] () bitronix.tm.journal.TransactionLogAppender - between 832313 and 832401, writing a Bitronix TransactionLogRecord with status=COMMITTING, recordLength=80, headerLength=28, time=1299320079577, sequenceNumber=234165, crc32=-1946994592, gtrid=6B756A2D696D702D636F72652D62746D0000012E858300E80003929B, uniqueNames=KUJ_ImpDB,KUJ_ImpJMS
2011-03-05 11:14:39.579 DEBUG [SRRoute-4] () bitronix.tm.journal.TransactionLogAppender - disk journal appender now at position 832401
2011-03-05 11:14:39.579 DEBUG [SRRoute-4] () bitronix.tm.journal.DiskForceWaitQueue - enqueued a TransactionLogAppender on journal1.btm, 1 TransactionLogAppender waiting for a disk force
2011-03-05 11:14:39.579 DEBUG [SRRoute-4] () bitronix.tm.journal.DiskForceBatcherThread - batching disk force, there are 1 TransactionLogAppender in the wait queue
2011-03-05 11:14:39.580 DEBUG [SRRoute-4] () bitronix.tm.journal.DiskForceWaitQueue - waiting for a TransactionLogAppender on journal1.btm to get dequeued
2011-03-05 11:14:39.580 DEBUG [SRRoute-1] () bitronix.tm.journal.TransactionLogAppender - between 832401 and 832489, writing a Bitronix TransactionLogRecord with status=PREPARED, recordLength=80, headerLength=28, time=1299320079580, sequenceNumber=234166, crc32=-437286430, gtrid=6B756A2D696D702D636F72652D62746D0000012E858300BD00039297, uniqueNames=KUJ_ImpDB,KUJ_ImpJMS
2011-03-05 11:14:39.581 DEBUG [SRRoute-1] () bitronix.tm.journal.TransactionLogAppender - disk journal appender now at position 832489
2011-03-05 11:14:39.581 DEBUG [SRRoute-1] () bitronix.tm.BitronixTransaction - transaction status is changing from PREPARING to PREPARED - executing 0 listener(s)
2011-03-05 11:14:39.581 DEBUG [SRRoute-1] () bitronix.tm.twopc.Preparer - successfully prepared 2 resource(s)
2011-03-05 11:14:39.581 DEBUG [SRRoute-1] () bitronix.tm.BitronixTransaction - 2 interested resource(s)
2011-03-05 11:14:39.581 DEBUG [SRRoute-1] () bitronix.tm.BitronixTransaction - changing transaction status to COMMITTING (forced)
2011-03-05 11:14:39.582 DEBUG [SRRoute-1] () bitronix.tm.journal.TransactionLogAppender - between 832489 and 832577, writing a Bitronix TransactionLogRecord with status=COMMITTING, recordLength=80, headerLength=28, time=1299320079582, sequenceNumber=234167, crc32=861186692, gtrid=6B756A2D696D702D636F72652D62746D0000012E858300BD00039297, uniqueNames=KUJ_ImpDB,KUJ_ImpJMS
2011-03-05 11:14:39.583 DEBUG [SRRoute-1] () bitronix.tm.journal.TransactionLogAppender - disk journal appender now at position 832577
2011-03-05 11:14:39.583 DEBUG [SRRoute-1] () bitronix.tm.journal.DiskForceWaitQueue - enqueued a TransactionLogAppender on journal1.btm, 2 TransactionLogAppender waiting for a disk force
2011-03-05 11:14:39.584 DEBUG [SRRoute-1] () bitronix.tm.journal.DiskForceBatcherThread - batching disk force, there are 2 TransactionLogAppender in the wait queue
2011-03-05 11:14:39.584 DEBUG [SRRoute-1] () bitronix.tm.journal.DiskForceWaitQueue - waiting for a TransactionLogAppender on journal1.btm to get dequeued
2011-03-05 11:14:39.585 DEBUG [SRRoute-6] () bitronix.tm.journal.TransactionLogAppender - between 832577 and 832665, writing a Bitronix TransactionLogRecord with status=PREPARED, recordLength=80, headerLength=28, time=1299320079585, sequenceNumber=234168, crc32=-958657710, gtrid=6B756A2D696D702D636F72652D62746D0000012E8583010A000392A3, uniqueNames=KUJ_ImpDB,KUJ_ImpJMS
2011-03-05 11:14:39.586 DEBUG [SRRoute-6] () bitronix.tm.journal.TransactionLogAppender - disk journal appender now at position 832665
2011-03-05 11:14:39.586 DEBUG [SRRoute-6] () bitronix.tm.BitronixTransaction - transaction status is changing from PREPARING to PREPARED - executing 0 listener(s)
2011-03-05 11:14:39.586 DEBUG [SRRoute-6] () bitronix.tm.twopc.Preparer - successfully prepared 2 resource(s)
2011-03-05 11:14:39.586 DEBUG [SRRoute-6] () bitronix.tm.BitronixTransaction - 2 interested resource(s)
2011-03-05 11:14:39.586 DEBUG [SRRoute-6] () bitronix.tm.BitronixTransaction - changing transaction status to COMMITTING (forced)
2011-03-05 11:14:39.586 DEBUG [SRRoute-6] () bitronix.tm.journal.TransactionLogAppender - between 832665 and 832753, writing a Bitronix TransactionLogRecord with status=COMMITTING, recordLength=80, headerLength=28, time=1299320079586, sequenceNumber=234169, crc32=-1622014559, gtrid=6B756A2D696D702D636F72652D62746D0000012E8583010A000392A3, uniqueNames=KUJ_ImpDB,KUJ_ImpJMS
2011-03-05 11:14:39.587 DEBUG [SRRoute-6] () bitronix.tm.journal.TransactionLogAppender - disk journal appender now at position 832753
2011-03-05 11:14:39.588 DEBUG [SRRoute-4] () bitronix.tm.journal.DiskForceWaitQueue - waiting for a TransactionLogAppender on journal1.btm to get dequeued
2011-03-05 11:14:39.581 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.DiskForceBatcherThread - wait queue is not empty anymore (1 in queue)
2011-03-05 11:14:39.588 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.DiskForceBatcherThread - forcing...
2011-03-05 11:14:39.588 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.DiskForceWaitQueue - returning head TransactionLogAppender
2011-03-05 11:14:39.588 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.TransactionLogAppender - forcing log writing
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.TransactionLogAppender - done forcing log
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.DiskForceWaitQueue - clearing list of waiting TransactionLogAppender
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.DiskForceBatcherThread - waiting for the wait queue to fill up
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.DiskForceWaitQueue - waiting for some TransactionLogAppender to get enqueued
2011-03-05 11:14:39.598 DEBUG [SRRoute-6] () bitronix.tm.journal.DiskForceWaitQueue - enqueued a TransactionLogAppender on journal1.btm, 1 TransactionLogAppender waiting for a disk force
2011-03-05 11:14:39.599 DEBUG [SRRoute-6] () bitronix.tm.journal.DiskForceBatcherThread - batching disk force, there are 0 TransactionLogAppender in the wait queue
2011-03-05 11:14:39.599 DEBUG [SRRoute-6] () bitronix.tm.journal.DiskForceBatcherThread - wait queue got emptied, disk force is done
2011-03-05 11:14:39.600 DEBUG [SRRoute-6] () bitronix.tm.BitronixTransaction - transaction status is changing from PREPARED to COMMITTING - executing 0 listener(s)
2011-03-05 11:14:39.600 DEBUG [SRRoute-6] () bitronix.tm.twopc.AbstractPhaseEngine - executing phase on 2 resource(s) enlisted in 1 position(s) in reverse position order
2011-03-05 11:14:39.600 DEBUG [SRRoute-6] () bitronix.tm.twopc.AbstractPhaseEngine - running 2 job(s) for position '1'
2011-03-05 11:14:39.600 DEBUG [SRRoute-6] () bitronix.tm.twopc.Committer - committing resource an XAResourceHolderState with uniqueName=KUJ_ImpDB XAResource=oracle.jdbc.driver.T4CXAResource@18212f3 (ended) with XID a Bitronix XID [6B756A2D696D702D636F72652D62746D0000012E8583010A000392A3 : 6B756A2D696D702D636F72652D62746D0000012E858303B2000392A9]
2011-03-05 11:14:39.603 DEBUG [SRRoute-4] () bitronix.tm.journal.DiskForceWaitQueue - waiting for a TransactionLogAppender on journal1.btm to get dequeued
2011-03-05 11:14:39.607 DEBUG [SRRoute-2] () bitronix.tm.internal.XAResourceHolderState - started an XAResourceHolderState with uniqueName=KUJ_ImpJMS XAResource=com.swiftmq.jms.v750.XAResourceImpl@130c8e9 (started) with XID a Bitronix XID [6B756A2D696D702D636F72652D62746D0000012E858304C7000392B1 : 6B756A2D696D702D636F72652D62746D0000012E858304C9000392B3] with NOFLAGS
2011-03-05 11:14:39.607 DEBUG [SRRoute-2] () bitronix.tm.resource.common.AbstractXAResourceHolder - putting XAResourceHolderState [an XAResourceHolderState with uniqueName=KUJ_ImpJMS XAResource=com.swiftmq.jms.v750.XAResourceImpl@130c8e9 (started) with XID a Bitronix XID [6B756A2D696D702D636F72652D62746D0000012E858304C7000392B1 : 6B756A2D696D702D636F72652D62746D0000012E858304C9000392B3]] on a DualSessionWrapper in state ACCESSIBLE of a JmsPooledConnection of pool KUJ_ImpJMS in state ACCESSIBLE with underlying connection com.swiftmq.jms.v750.XAConnectionImpl@12e7234
2011-03-05 11:14:39.607 DEBUG [SRRoute-2] () bitronix.tm.resource.common.AbstractXAResourceHolder - GTRID [6B756A2D696D702D636F72652D62746D0000012E858304C7000392B1] previously unknown to a DualSessionWrapper in state ACCESSIBLE of a JmsPooledConnection of pool KUJ_ImpJMS in state ACCESSIBLE with underlying connection com.swiftmq.jms.v750.XAConnectionImpl@12e7234, adding it to the resource's transactions list
2011-03-05 11:14:39.607 DEBUG [SRRoute-2] () org.springframework.jms.listener.DefaultMessageListenerContainer - Received message of type [class com.swiftmq.jms.BytesMessageImpl] from consumer [a MessageConsumerWrapper of a DualSessionWrapper in state ACCESSIBLE of a JmsPooledConnection of pool KUJ_ImpJMS in state ACCESSIBLE with underlying connection com.swiftmq.jms.v750.XAConnectionImpl@12e7234] of session [a DualSessionWrapper in state ACCESSIBLE of a JmsPooledConnection of pool KUJ_ImpJMS in state ACCESSIBLE with underlying connection com.swiftmq.jms.v750.XAConnectionImpl@12e7234]
...
... only [SRRoute-2] operations
...
2011-03-05 11:14:39.614 DEBUG [SRRoute-2] () bitronix.tm.resource.jms.DualSessionWrapper - choosing XA session
2011-03-05 11:14:39.614 DEBUG [SRRoute-2] () bitronix.tm.resource.common.TransactionContextHelper - enlisting a DualSessionWrapper in state ACCESSIBLE of a
 JmsPooledConnection of pool KUJ_ImpJMS in state ACCESSIBLE with underlying connection com.swiftmq.jms.v750.XAConnectionImpl@12e7234 into a Bitronix Transa
ction with GTRID [6B756A2D696D702D636F72652D62746D0000012E858304C7000392B1], status=ACTIVE, 1 resource(s) enlisted (started Sat Mar 05 11:14:39 CET 2011)
2011-03-05 11:14:39.614 DEBUG [SRRoute-2] () bitronix.tm.resource.common.TransactionContextHelper - avoiding re-enlistment of already enlisted but not ende
d resource an XAResourceHolderState with uniqueName=KUJ_ImpJMS XAResource=com.swiftmq.jms.v750.XAResourceImpl@130c8e9 (started) with XID a Bitronix XID [6B
756A2D696D702D636F72652D62746D0000012E858304C7000392B1 : 6B756A2D696D702D636F72652D62746D0000012E858304C9000392B3]
2011-03-05 11:14:39.614 INFO  [SRRoute-2] () com.o2.cz.indra.imp.core.ChargeHandler - CDR request created, for BaNumber:19911, status:ACCEPTED
2011-03-05 11:14:39.614 DEBUG [SRRoute-2] () org.springframework.beans.factory.support.DefaultListableBeanFactory - Returning cached instance of singleton
bean 'longMsgAggregator'
2011-03-05 11:14:39.615 DEBUG [SRRoute-1] () bitronix.tm.journal.DiskForceWaitQueue - waiting for a TransactionLogAppender on journal1.btm to get dequeued
2011-03-05 11:14:39.617 DEBUG [SRRoute-2] () com.o2.cz.indra.imp.core.LongMessageAggregator - StatusReport {ACCEPTED} IdSufix {0303} for Long SMS id {ID-33
099-1299240332477-1-100360} is processing
Reply | Threaded
Open this post in threaded view
|

Re: threads dying when ForceBatchingEnabled is true

Ludovic Orban-2
I guess your JVM deadlocks when you observe this problem?

That can only mean there is a race condition somewhere that is extremely uncommon. What is the OS are you running on? Is that the Sun JVM? Have you tried with the latest version?

Could you please send a debug log file (it's not possible to wok with the logs you copied in your email) as well as a full thread dump of your JVM when the problem happens?


2011/3/5 jkuchta <[hidden email]>

Hi,
I am seeing weird behavior in my application
(spring+btm+swiftmq+oracle+jdk1.6.0_18). From time to time a thread dies.
The last entry in the debug log is always :

bitronix.tm.journal.DiskForceWaitQueue - waiting for a
TransactionLogAppender on journal1.btm to get dequeued

The log snippet is bellow. The troubled threads are  SRRoute-1 and
SRRoute-4.

First they get enqued:
[SRRoute-4] () bitronix.tm.journal.DiskForceWaitQueue - enqueued a
TransactionLogAppender on journal1.btm, 1 TransactionLogAppender waiting for
a disk force
[SRRoute-1] () bitronix.tm.journal.DiskForceWaitQueue - enqueued a
TransactionLogAppender on journal1.btm, 2 TransactionLogAppender waiting for
a disk force

Then DiskForceBatcherThread runs:
2011-03-05 11:14:39.581 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.DiskForceBatcherThread - wait queue is not empty anymore
(1 in queue)
2011-03-05 11:14:39.588 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.DiskForceBatcherThread - forcing...
2011-03-05 11:14:39.588 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.DiskForceWaitQueue - returning head
TransactionLogAppender
2011-03-05 11:14:39.588 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.TransactionLogAppender - forcing log writing
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.TransactionLogAppender - done forcing log
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.DiskForceWaitQueue - clearing list of waiting
TransactionLogAppender
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.DiskForceBatcherThread - waiting for the wait queue to
fill up
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.DiskForceWaitQueue - waiting for some
TransactionLogAppender to get enqueued

finally the threads wakes up but they find their TransactionLogAppender
still queued
[SRRoute-4] () bitronix.tm.journal.DiskForceWaitQueue - waiting for a
TransactionLogAppender on journal1.btm to get dequeued
[SRRoute-1] () bitronix.tm.journal.DiskForceWaitQueue - waiting for a
TransactionLogAppender on journal1.btm to get dequeued

I checked the DiskForceWaitQueue code and it seems fine to me. I am not
expert on JVM optimizations and have no idea how a race condition can happen
here or maybe I just do not interpret the log correctly. Just wonder if
anyone came across the same problem or what do you thing.

When I turn off ForceBatchingEnabled then all threads are fine and healthy
and I do not see the problem anymore.

More detail log snippet:

2011-03-05 11:14:39.562 DEBUG [SRRoute-2] ()
bitronix.tm.internal.XAResourceHolderState - applying resource timeout of
30s on an XAResourceHolderState with uniqueName=KUJ_ImpJMS
XAResource=com.swiftmq.jms.v750.XAResourceImpl@130c8e9 with XID a Bitronix
XID [6B756A2D696D702D636F72652D62746D0000012E858304C7000392B1 :
6B756A2D696D702D636F72652D62746D0000012E858304C9000392B3]
2011-03-05 11:14:39.566 DEBUG [SRRoute-4] () bitronix.tm.twopc.Preparer -
prepared resource an XAResourceHolderState with uniqueName=KUJ_ImpDB
XAResource=oracle.jdbc.driver.T4CXAResource@168bcb7 (ended) with XID a
Bitronix XID [6B756A2D696D702D636F72652D62746D0000012E858300E80003929B :
6B756A2D696D702D636F72652D62746D0000012E858303B8000392AA] voted XA_OK
2011-03-05 11:14:39.567 DEBUG [SRRoute-4] ()
bitronix.tm.twopc.AbstractPhaseEngine - phase executed with 0 exception(s)
2011-03-05 11:14:39.567 DEBUG [SRRoute-4] ()
bitronix.tm.twopc.AbstractPhaseEngine - ran 2 job(s) for position '1'
2011-03-05 11:14:39.567 DEBUG [SRRoute-4] () bitronix.tm.BitronixTransaction
- changing transaction status to PREPARED
2011-03-05 11:14:39.567 DEBUG [SRRoute-4] ()
bitronix.tm.journal.TransactionLogAppender - between 832225 and 832313,
writing a Bitronix TransactionLogRecord with status=PREPARED,
recordLength=80, headerLength=28, time=1299320079567, sequenceNumber=234164,
crc32=-1839180916,
gtrid=6B756A2D696D702D636F72652D62746D0000012E858300E80003929B,
uniqueNames=KUJ_ImpDB,KUJ_ImpJMS
2011-03-05 11:14:39.568 DEBUG [SRRoute-1] () bitronix.tm.twopc.Preparer -
prepared resource an XAResourceHolderState with uniqueName=KUJ_ImpDB
XAResource=oracle.jdbc.driver.T4CXAResource@165e55e (ended) with XID a
Bitronix XID [6B756A2D696D702D636F72652D62746D0000012E858300BD00039297 :
6B756A2D696D702D636F72652D62746D0000012E858302D3000392A6] voted XA_OK
2011-03-05 11:14:39.568 DEBUG [SRRoute-1] ()
bitronix.tm.twopc.AbstractPhaseEngine - phase executed with 0 exception(s)
2011-03-05 11:14:39.569 DEBUG [SRRoute-1] ()
bitronix.tm.twopc.AbstractPhaseEngine - ran 2 job(s) for position '1'
2011-03-05 11:14:39.569 DEBUG [SRRoute-1] () bitronix.tm.BitronixTransaction
- changing transaction status to PREPARED
2011-03-05 11:14:39.569 DEBUG [SRRoute-6] () bitronix.tm.twopc.Preparer -
prepared resource an XAResourceHolderState with uniqueName=KUJ_ImpDB
XAResource=oracle.jdbc.driver.T4CXAResource@18212f3 (ended) with XID a
Bitronix XID [6B756A2D696D702D636F72652D62746D0000012E8583010A000392A3 :
6B756A2D696D702D636F72652D62746D0000012E858303B2000392A9] voted XA_OK
2011-03-05 11:14:39.569 DEBUG [SRRoute-6] ()
bitronix.tm.twopc.AbstractPhaseEngine - phase executed with 0 exception(s)
2011-03-05 11:14:39.569 DEBUG [SRRoute-6] ()
bitronix.tm.twopc.AbstractPhaseEngine - ran 2 job(s) for position '1'
2011-03-05 11:14:39.569 DEBUG [SRRoute-6] () bitronix.tm.BitronixTransaction
- changing transaction status to PREPARED
2011-03-05 11:14:39.577 DEBUG [SRRoute-4] ()
bitronix.tm.journal.TransactionLogAppender - disk journal appender now at
position 832313
2011-03-05 11:14:39.577 DEBUG [SRRoute-4] () bitronix.tm.BitronixTransaction
- transaction status is changing from PREPARING to PREPARED - executing 0
listener(s)
2011-03-05 11:14:39.577 DEBUG [SRRoute-4] () bitronix.tm.twopc.Preparer -
successfully prepared 2 resource(s)
2011-03-05 11:14:39.577 DEBUG [SRRoute-4] () bitronix.tm.BitronixTransaction
- 2 interested resource(s)
2011-03-05 11:14:39.577 DEBUG [SRRoute-4] () bitronix.tm.BitronixTransaction
- changing transaction status to COMMITTING (forced)
2011-03-05 11:14:39.578 DEBUG [SRRoute-4] ()
bitronix.tm.journal.TransactionLogAppender - between 832313 and 832401,
writing a Bitronix TransactionLogRecord with status=COMMITTING,
recordLength=80, headerLength=28, time=1299320079577, sequenceNumber=234165,
crc32=-1946994592,
gtrid=6B756A2D696D702D636F72652D62746D0000012E858300E80003929B,
uniqueNames=KUJ_ImpDB,KUJ_ImpJMS
2011-03-05 11:14:39.579 DEBUG [SRRoute-4] ()
bitronix.tm.journal.TransactionLogAppender - disk journal appender now at
position 832401
2011-03-05 11:14:39.579 DEBUG [SRRoute-4] ()
bitronix.tm.journal.DiskForceWaitQueue - enqueued a TransactionLogAppender
on journal1.btm, 1 TransactionLogAppender waiting for a disk force
2011-03-05 11:14:39.579 DEBUG [SRRoute-4] ()
bitronix.tm.journal.DiskForceBatcherThread - batching disk force, there are
1 TransactionLogAppender in the wait queue
2011-03-05 11:14:39.580 DEBUG [SRRoute-4] ()
bitronix.tm.journal.DiskForceWaitQueue - waiting for a
TransactionLogAppender on journal1.btm to get dequeued
2011-03-05 11:14:39.580 DEBUG [SRRoute-1] ()
bitronix.tm.journal.TransactionLogAppender - between 832401 and 832489,
writing a Bitronix TransactionLogRecord with status=PREPARED,
recordLength=80, headerLength=28, time=1299320079580, sequenceNumber=234166,
crc32=-437286430,
gtrid=6B756A2D696D702D636F72652D62746D0000012E858300BD00039297,
uniqueNames=KUJ_ImpDB,KUJ_ImpJMS
2011-03-05 11:14:39.581 DEBUG [SRRoute-1] ()
bitronix.tm.journal.TransactionLogAppender - disk journal appender now at
position 832489
2011-03-05 11:14:39.581 DEBUG [SRRoute-1] () bitronix.tm.BitronixTransaction
- transaction status is changing from PREPARING to PREPARED - executing 0
listener(s)
2011-03-05 11:14:39.581 DEBUG [SRRoute-1] () bitronix.tm.twopc.Preparer -
successfully prepared 2 resource(s)
2011-03-05 11:14:39.581 DEBUG [SRRoute-1] () bitronix.tm.BitronixTransaction
- 2 interested resource(s)
2011-03-05 11:14:39.581 DEBUG [SRRoute-1] () bitronix.tm.BitronixTransaction
- changing transaction status to COMMITTING (forced)
2011-03-05 11:14:39.582 DEBUG [SRRoute-1] ()
bitronix.tm.journal.TransactionLogAppender - between 832489 and 832577,
writing a Bitronix TransactionLogRecord with status=COMMITTING,
recordLength=80, headerLength=28, time=1299320079582, sequenceNumber=234167,
crc32=861186692,
gtrid=6B756A2D696D702D636F72652D62746D0000012E858300BD00039297,
uniqueNames=KUJ_ImpDB,KUJ_ImpJMS
2011-03-05 11:14:39.583 DEBUG [SRRoute-1] ()
bitronix.tm.journal.TransactionLogAppender - disk journal appender now at
position 832577
2011-03-05 11:14:39.583 DEBUG [SRRoute-1] ()
bitronix.tm.journal.DiskForceWaitQueue - enqueued a TransactionLogAppender
on journal1.btm, 2 TransactionLogAppender waiting for a disk force
2011-03-05 11:14:39.584 DEBUG [SRRoute-1] ()
bitronix.tm.journal.DiskForceBatcherThread - batching disk force, there are
2 TransactionLogAppender in the wait queue
2011-03-05 11:14:39.584 DEBUG [SRRoute-1] ()
bitronix.tm.journal.DiskForceWaitQueue - waiting for a
TransactionLogAppender on journal1.btm to get dequeued
2011-03-05 11:14:39.585 DEBUG [SRRoute-6] ()
bitronix.tm.journal.TransactionLogAppender - between 832577 and 832665,
writing a Bitronix TransactionLogRecord with status=PREPARED,
recordLength=80, headerLength=28, time=1299320079585, sequenceNumber=234168,
crc32=-958657710,
gtrid=6B756A2D696D702D636F72652D62746D0000012E8583010A000392A3,
uniqueNames=KUJ_ImpDB,KUJ_ImpJMS
2011-03-05 11:14:39.586 DEBUG [SRRoute-6] ()
bitronix.tm.journal.TransactionLogAppender - disk journal appender now at
position 832665
2011-03-05 11:14:39.586 DEBUG [SRRoute-6] () bitronix.tm.BitronixTransaction
- transaction status is changing from PREPARING to PREPARED - executing 0
listener(s)
2011-03-05 11:14:39.586 DEBUG [SRRoute-6] () bitronix.tm.twopc.Preparer -
successfully prepared 2 resource(s)
2011-03-05 11:14:39.586 DEBUG [SRRoute-6] () bitronix.tm.BitronixTransaction
- 2 interested resource(s)
2011-03-05 11:14:39.586 DEBUG [SRRoute-6] () bitronix.tm.BitronixTransaction
- changing transaction status to COMMITTING (forced)
2011-03-05 11:14:39.586 DEBUG [SRRoute-6] ()
bitronix.tm.journal.TransactionLogAppender - between 832665 and 832753,
writing a Bitronix TransactionLogRecord with status=COMMITTING,
recordLength=80, headerLength=28, time=1299320079586, sequenceNumber=234169,
crc32=-1622014559,
gtrid=6B756A2D696D702D636F72652D62746D0000012E8583010A000392A3,
uniqueNames=KUJ_ImpDB,KUJ_ImpJMS
2011-03-05 11:14:39.587 DEBUG [SRRoute-6] ()
bitronix.tm.journal.TransactionLogAppender - disk journal appender now at
position 832753
2011-03-05 11:14:39.588 DEBUG [SRRoute-4] ()
bitronix.tm.journal.DiskForceWaitQueue - waiting for a
TransactionLogAppender on journal1.btm to get dequeued
2011-03-05 11:14:39.581 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.DiskForceBatcherThread - wait queue is not empty anymore
(1 in queue)
2011-03-05 11:14:39.588 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.DiskForceBatcherThread - forcing...
2011-03-05 11:14:39.588 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.DiskForceWaitQueue - returning head
TransactionLogAppender
2011-03-05 11:14:39.588 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.TransactionLogAppender - forcing log writing
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.TransactionLogAppender - done forcing log
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.DiskForceWaitQueue - clearing list of waiting
TransactionLogAppender
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.DiskForceBatcherThread - waiting for the wait queue to
fill up
2011-03-05 11:14:39.592 DEBUG [bitronix-disk-force-batcher] ()
bitronix.tm.journal.DiskForceWaitQueue - waiting for some
TransactionLogAppender to get enqueued
2011-03-05 11:14:39.598 DEBUG [SRRoute-6] ()
bitronix.tm.journal.DiskForceWaitQueue - enqueued a TransactionLogAppender
on journal1.btm, 1 TransactionLogAppender waiting for a disk force
2011-03-05 11:14:39.599 DEBUG [SRRoute-6] ()
bitronix.tm.journal.DiskForceBatcherThread - batching disk force, there are
0 TransactionLogAppender in the wait queue
2011-03-05 11:14:39.599 DEBUG [SRRoute-6] ()
bitronix.tm.journal.DiskForceBatcherThread - wait queue got emptied, disk
force is done
2011-03-05 11:14:39.600 DEBUG [SRRoute-6] () bitronix.tm.BitronixTransaction
- transaction status is changing from PREPARED to COMMITTING - executing 0
listener(s)
2011-03-05 11:14:39.600 DEBUG [SRRoute-6] ()
bitronix.tm.twopc.AbstractPhaseEngine - executing phase on 2 resource(s)
enlisted in 1 position(s) in reverse position order
2011-03-05 11:14:39.600 DEBUG [SRRoute-6] ()
bitronix.tm.twopc.AbstractPhaseEngine - running 2 job(s) for position '1'
2011-03-05 11:14:39.600 DEBUG [SRRoute-6] () bitronix.tm.twopc.Committer -
committing resource an XAResourceHolderState with uniqueName=KUJ_ImpDB
XAResource=oracle.jdbc.driver.T4CXAResource@18212f3 (ended) with XID a
Bitronix XID [6B756A2D696D702D636F72652D62746D0000012E8583010A000392A3 :
6B756A2D696D702D636F72652D62746D0000012E858303B2000392A9]
2011-03-05 11:14:39.603 DEBUG [SRRoute-4] ()
bitronix.tm.journal.DiskForceWaitQueue - waiting for a
TransactionLogAppender on journal1.btm to get dequeued
2011-03-05 11:14:39.607 DEBUG [SRRoute-2] ()
bitronix.tm.internal.XAResourceHolderState - started an
XAResourceHolderState with uniqueName=KUJ_ImpJMS
XAResource=com.swiftmq.jms.v750.XAResourceImpl@130c8e9 (started) with XID a
Bitronix XID [6B756A2D696D702D636F72652D62746D0000012E858304C7000392B1 :
6B756A2D696D702D636F72652D62746D0000012E858304C9000392B3] with NOFLAGS
2011-03-05 11:14:39.607 DEBUG [SRRoute-2] ()
bitronix.tm.resource.common.AbstractXAResourceHolder - putting
XAResourceHolderState [an XAResourceHolderState with uniqueName=KUJ_ImpJMS
XAResource=com.swiftmq.jms.v750.XAResourceImpl@130c8e9 (started) with XID a
Bitronix XID [6B756A2D696D702D636F72652D62746D0000012E858304C7000392B1 :
6B756A2D696D702D636F72652D62746D0000012E858304C9000392B3]] on a
DualSessionWrapper in state ACCESSIBLE of a JmsPooledConnection of pool
KUJ_ImpJMS in state ACCESSIBLE with underlying connection
com.swiftmq.jms.v750.XAConnectionImpl@12e7234
2011-03-05 11:14:39.607 DEBUG [SRRoute-2] ()
bitronix.tm.resource.common.AbstractXAResourceHolder - GTRID
[6B756A2D696D702D636F72652D62746D0000012E858304C7000392B1] previously
unknown to a DualSessionWrapper in state ACCESSIBLE of a JmsPooledConnection
of pool KUJ_ImpJMS in state ACCESSIBLE with underlying connection
com.swiftmq.jms.v750.XAConnectionImpl@12e7234, adding it to the resource's
transactions list
2011-03-05 11:14:39.607 DEBUG [SRRoute-2] ()
org.springframework.jms.listener.DefaultMessageListenerContainer - Received
message of type [class com.swiftmq.jms.BytesMessageImpl] from consumer [a
MessageConsumerWrapper of a DualSessionWrapper in state ACCESSIBLE of a
JmsPooledConnection of pool KUJ_ImpJMS in state ACCESSIBLE with underlying
connection com.swiftmq.jms.v750.XAConnectionImpl@12e7234] of session [a
DualSessionWrapper in state ACCESSIBLE of a JmsPooledConnection of pool
KUJ_ImpJMS in state ACCESSIBLE with underlying connection
com.swiftmq.jms.v750.XAConnectionImpl@12e7234]
...
... only [SRRoute-2] operations
...
2011-03-05 11:14:39.614 DEBUG [SRRoute-2] ()
bitronix.tm.resource.jms.DualSessionWrapper - choosing XA session
2011-03-05 11:14:39.614 DEBUG [SRRoute-2] ()
bitronix.tm.resource.common.TransactionContextHelper - enlisting a
DualSessionWrapper in state ACCESSIBLE of a
 JmsPooledConnection of pool KUJ_ImpJMS in state ACCESSIBLE with underlying
connection com.swiftmq.jms.v750.XAConnectionImpl@12e7234 into a Bitronix
Transa
ction with GTRID [6B756A2D696D702D636F72652D62746D0000012E858304C7000392B1],
status=ACTIVE, 1 resource(s) enlisted (started Sat Mar 05 11:14:39 CET 2011)
2011-03-05 11:14:39.614 DEBUG [SRRoute-2] ()
bitronix.tm.resource.common.TransactionContextHelper - avoiding
re-enlistment of already enlisted but not ende
d resource an XAResourceHolderState with uniqueName=KUJ_ImpJMS
XAResource=com.swiftmq.jms.v750.XAResourceImpl@130c8e9 (started) with XID a
Bitronix XID [6B
756A2D696D702D636F72652D62746D0000012E858304C7000392B1 :
6B756A2D696D702D636F72652D62746D0000012E858304C9000392B3]
2011-03-05 11:14:39.614 INFO  [SRRoute-2] ()
com.o2.cz.indra.imp.core.ChargeHandler - CDR request created, for
BaNumber:19911, status:ACCEPTED
2011-03-05 11:14:39.614 DEBUG [SRRoute-2] ()
org.springframework.beans.factory.support.DefaultListableBeanFactory -
Returning cached instance of singleton
bean 'longMsgAggregator'
2011-03-05 11:14:39.615 DEBUG [SRRoute-1] ()
bitronix.tm.journal.DiskForceWaitQueue - waiting for a
TransactionLogAppender on journal1.btm to get dequeued
2011-03-05 11:14:39.617 DEBUG [SRRoute-2] ()
com.o2.cz.indra.imp.core.LongMessageAggregator - StatusReport {ACCEPTED}
IdSufix {0303} for Long SMS id {ID-33
099-1299240332477-1-100360} is processing

--
View this message in context: http://old.nabble.com/threads-dying-when-ForceBatchingEnabled-is-true-tp31074769p31074769.html
Sent from the Bitronix Transaction Manager mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

   http://xircles.codehaus.org/manage_email



Reply | Threaded
Open this post in threaded view
|

threads dying when ForceBatchingEnabled is true

jkuchta
In reply to this post by jkuchta
Hi,

I am sending another log showing a thread stucked in DiskForceWaitQueue.waitUntilNotContains

related to 'threads dying when ForceBatchingEnabled is true'

http://old.nabble.com/threads-dying-when-ForceBatchingEnabled-is-true-ts31074769.html

The failed thread is SRRoute-3 waiting at bitronix.tm.journal.DiskForceWaitQueue.waitUntilNotContains

Its last log entry is:
2011-03-10 14:33:09.024 DEBUG [SRRoute-3] () bitronix.tm.journal.DiskForceWaitQueue - waiting for a TransactionLogAppender on journal2.btm to get dequeued from size=1, obj=bitronix.tm.journal.DiskForceWaitQueue@1f133c8

See attached debug log and threads dump. This time I slightly modified BTM library (BTM-2.1.0). I added more detailed logging so now you can see what DiskForceWaitQueue object is used. Example is below (there is info appended 'size=1, obj=bitronix.tm.journal.DiskForceWaitQueue@1f133c8')

2011-03-10 14:33:08.996 DEBUG [SRRoute-3] () bitronix.tm.journal.DiskForceWaitQueue - enqueued a TransactionLogAppender on journal2.btm, 1 TransactionLogAp pender waiting for a disk force, obj=bitronix.tm.journal.DiskForceWaitQueue@1f133c8
2011-03-10 14:33:09.001 DEBUG [bitronix-disk-force-batcher] () bitronix.tm.journal.DiskForceWaitQueue - clearing list of waiting TransactionLogAppender siz e=1, obj=bitronix.tm.journal.DiskForceWaitQueue@1f133c8
2011-03-10 14:33:09.024 DEBUG [SRRoute-3] () bitronix.tm.journal.DiskForceWaitQueue - waiting for a TransactionLogAppender on journal2.btm to get dequeued from size=1, obj=bitronix.tm.journal.DiskForceWaitQueue@1f133c8

Hope it helps.

Environment details:

Sun JVM we are running on: Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
OS:
Red Hat Enterprise Linux AS release 4 (Nahant Update 5) Linux 2.6.9-55.ELsmp #1 SMP Fri Apr 20 17:03:35 EDT 2007 i686 athlon i386 GNU/Linux
Server: 8xOpteron QuadCore 2.4G, 64GB

Thanks

Jirka Kuchta


Este correo electrónico y, en su caso, cualquier fichero anexo al mismo, contiene información de carácter confidencial exclusivamente dirigida a su destinatario o destinatarios. Si no es vd. el destinatario indicado, queda notificado que la lectura, utilización, divulgación y/o copia sin autorización está prohibida en virtud de la legislación vigente. En el caso de haber recibido este correo electrónico por error, se ruega notificar inmediatamente esta circunstancia mediante reenvío a la dirección electrónica del remitente.
Evite imprimir este mensaje si no es estrictamente necesario.

This email and any file attached to it (when applicable) contain(s) confidential information that is exclusively addressed to its recipient(s). If you are not the indicated recipient, you are informed that reading, using, disseminating and/or copying it without authorisation is forbidden in accordance with the legislation in effect. If you have received this email by mistake, please immediately notify the sender of the situation by resending it to their email address.
Avoid printing this message if it is not absolutely necessary.

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email

btm-debug-SR3.tar.gz (83K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: threads dying when ForceBatchingEnabled is true

jkuchta
Hi,

I think I have found the problem in the code. There are actually two problems, both in the method DiskForceBatcherThread.enqueue():
1) DiskForceBatcherThread class attribute waitQueue can change between invocation of methods waitQueue.enqueue(tla) and waitQueue.waitUntilNotContains(tla) so when application tests if the given tla is no longer in the waitQueue it actually sometimes tests against a different waitQueue (new one).
2) DiskForceBatcherThread class attribute waitQueue can get cleared before another thread obtains the waitQueue lock and enques into it, i.e. application writes into the queue which was cleared already.

I have a naive solution. Not sure if it is good enough as I do not fully understand BTM code. Here it is. I suggest to modify two classes:
1) bitronix.tm.journal.DiskForceBatcherThread - method enqueue()
2) bitronix.tm.journal.DiskForceWaitQueue - new class attribute isCleared, it is accessed by methods enqueue() and clear()

in detail:
1) bitronix.tm.journal.DiskForceBatcherThread:
    public void enqueue(TransactionLogAppender tla) {
        DiskForceWaitQueue currrentWaitQueue = waitQueue;
        while (!currrentWaitQueue.enqueue(tla)) {
            if (log.isDebugEnabled()) log.debug("The current queue is cleared, wait for next one, current:" + currrentWaitQueue + ", next:" + waitQueue);
            currrentWaitQueue = waitQueue;
        };
        if (log.isDebugEnabled()) log.debug("batching disk force, there are " + currrentWaitQueue.size() + " TransactionLogAppender in the wait queue");
        try {
            currrentWaitQueue.waitUntilNotContains(tla);
        } catch (InterruptedException ex) {
            if (log.isDebugEnabled()) log.debug("interrupted while waiting for journal log to be forced but disk force will happen anyway");
        }
        if (log.isDebugEnabled()) log.debug("wait queue got emptied, disk force is done");
    }
2)  bitronix.tm.journal.DiskForceWaitQueue

    private boolean isCleared = false;

    /**
     * returns true if the tla was successfully enqueued otherwise returns false
     */
    public synchronized boolean enqueue(TransactionLogAppender tla) {
        if (isCleared) { return false; }
        objects.add(tla);
        if (log.isDebugEnabled()) log.debug("enqueued " + tla + ", " + objects.size() + " TransactionLogAppender waiting for a disk force, obj=" + toString());
        notifyAll();
        return true;
    }

    public synchronized void clear() {
        if (log.isDebugEnabled()) log.debug("clearing list of waiting TransactionLogAppender size=" + objects.size() + ", obj=" + toString());
        objects.clear();
        isCleared = true;
        notifyAll();
    }

When I apply the changes my application does not hang up anymore.
Reply | Threaded
Open this post in threaded view
|

Re: threads dying when ForceBatchingEnabled is true

Ludovic Orban-2
Hi,

Apparently you've found the bug and your proposed patch looks perfectly correct to me.

I've opened a new JIRA to track this bug (http://jira.codehaus.org/browse/BTM-102) and I've already applied your proposed change in git. You can find a pre-built version at this location if you wish: https://nexus.codehaus.org/content/repositories/snapshots/org/codehaus/btm/btm/2.1.1-SNAPSHOT/btm-2.1.1-20110311.090521-4.jar

I've also slightly edited your code before committing it so I'd be glad if you could give the 'official' snapshot version a try and report back if it's working or nor so I can close the issue.


Thanks a lot for the report and the work you did tracing and fixing this bug! I'm just sorry I couldn't really help you with this.

Thanks again,
Ludovic


2011/3/10 jkuchta <[hidden email]>

Hi,

I think I have found the problem in the code. There are actually two
problems, both in the method DiskForceBatcherThread.enqueue():
1) DiskForceBatcherThread class attribute waitQueue can change between
invocation of methods waitQueue.enqueue(tla) and
waitQueue.waitUntilNotContains(tla) so when application tests if the given
tla is no longer in the waitQueue it actually sometimes tests against a
different waitQueue (new one).
2) DiskForceBatcherThread class attribute waitQueue can get cleared before
another thread obtains the waitQueue lock and enques into it, i.e.
application writes into the queue which was cleared already.

I have a naive solution. Not sure if it is good enough as I do not fully
understand BTM code. Here it is. I suggest to modify two classes:
1) bitronix.tm.journal.DiskForceBatcherThread - method enqueue()
2) bitronix.tm.journal.DiskForceWaitQueue - new class attribute isCleared,
it is accessed by methods enqueue() and clear()

in detail:
1) bitronix.tm.journal.DiskForceBatcherThread:
   public void enqueue(TransactionLogAppender tla) {
       DiskForceWaitQueue currrentWaitQueue = waitQueue;
       while (!currrentWaitQueue.enqueue(tla)) {
           if (log.isDebugEnabled()) log.debug("The current queue is
cleared, wait for next one, current:" + currrentWaitQueue + ", next:" +
waitQueue);
           currrentWaitQueue = waitQueue;
       };
       if (log.isDebugEnabled()) log.debug("batching disk force, there are
" + currrentWaitQueue.size() + " TransactionLogAppender in the wait queue");
       try {
           currrentWaitQueue.waitUntilNotContains(tla);
       } catch (InterruptedException ex) {
           if (log.isDebugEnabled()) log.debug("interrupted while waiting
for journal log to be forced but disk force will happen anyway");
       }
       if (log.isDebugEnabled()) log.debug("wait queue got emptied, disk
force is done");
   }
2)  bitronix.tm.journal.DiskForceWaitQueue

   private boolean isCleared = false;

   /**
    * returns true if the tla was successfully enqueued otherwise returns
false
    */
   public synchronized boolean enqueue(TransactionLogAppender tla) {
       if (isCleared) { return false; }
       objects.add(tla);
       if (log.isDebugEnabled()) log.debug("enqueued " + tla + ", " +
objects.size() + " TransactionLogAppender waiting for a disk force, obj=" +
toString());
       notifyAll();
       return true;
   }

   public synchronized void clear() {
       if (log.isDebugEnabled()) log.debug("clearing list of waiting
TransactionLogAppender size=" + objects.size() + ", obj=" + toString());
       objects.clear();
       isCleared = true;
       notifyAll();
   }

When I apply the changes my application does not hang up anymore.

--
View this message in context: http://old.nabble.com/threads-dying-when-ForceBatchingEnabled-is-true-tp31074769p31120664.html
Sent from the Bitronix Transaction Manager mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

   http://xircles.codehaus.org/manage_email



Reply | Threaded
Open this post in threaded view
|

Re: threads dying when ForceBatchingEnabled is true

jkuchta
I am glad I could help a little. BTM is great piece of code! Thanks for that.

I tested provided snapshot btm-2.1.1-20110311.090521-4.jar and it is ok. My load tests do not fail anymore.

Unfortunately this is true only for Sun JVM 1.6.0_18. When I run on the latest Sun JVM 1.6.0_24 my application still hangs but the reason is completely different. I am going to open new thread when I collect all the information. You can definitely close this issue as the newly discovered problem is not related.