Question on BTM's recovery behavior

Karl Cassaigne
Hi,

Like a good boy, I read the full "XA Exposed I, II and III" series recommended on your site ;-)
I now have a question about BTM's recovery process: what is BTM's default behavior in case of an XA failure? Does it fail fast and return an error to the client application, or does it keep trying forever?

In my case, I use a MySQL database (not in XA mode, but through the Last Resource Commit optimization) together with SwiftMQ 7.0 (full XA, with a user console to manually finalize in-doubt transactions).

Knowing that no recovery can be done on the MySQL side (so I guess we can assume all in-doubt transactions will be rolled back) and that I can manually roll back any transaction on the SwiftMQ side, is there a way (or simply a need) to access BTM's in-doubt transaction list and manually roll those transactions back too, so that BTM can forget about them?

Excuse me in advance if this is not clear enough or if this is a stupid question since I'm quite new to all this XA stuff ;-)

Thanks in advance,
Karl

Re: Question on BTM's recovery behavior

Ludovic Orban
Administrator
Hi Karl,

BTM's default behavior in case of failure is to try to fix it when possible. There are zillions of different failures and I hope I did a good job identifying them all.

Some errors are not fixable (for instance when a heuristic inconsistency happens). In that case BTM gives up immediately and reports the problem with an ERROR log.

Some are retried (for instance when the XA resource is temporarily unavailable) until the transaction times out. It is then left pending. An ERROR is also logged in that case.
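To make the "retry until the transaction times out" idea concrete, here is a minimal sketch in plain Java. It is only an illustration of the pattern, not BTM's code: the class name, the `Outcome` enum and the `retryUntil` helper are all hypothetical, and the `attempt` callback stands in for whatever call on the XA resource is being retried.

```java
import java.util.function.BooleanSupplier;

// Illustrative only: a generic "retry until the deadline" loop, NOT BTM's implementation.
class RetryUntilTimeoutSketch {

    enum Outcome { COMPLETED, LEFT_PENDING }

    /**
     * Retries the attempt until it succeeds or the deadline passes.
     *
     * @param deadlineNanos absolute deadline, on the System.nanoTime() clock
     * @param pauseMillis   pause between two attempts
     * @param attempt       hypothetical operation; returns true once it succeeds
     */
    static Outcome retryUntil(long deadlineNanos, long pauseMillis, BooleanSupplier attempt) {
        while (true) {
            if (attempt.getAsBoolean()) {
                return Outcome.COMPLETED;
            }
            if (System.nanoTime() - deadlineNanos >= 0) {
                // Timed out: stop retrying and leave the transaction for recovery.
                return Outcome.LEFT_PENDING;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up early.
                Thread.currentThread().interrupt();
                return Outcome.LEFT_PENDING;
            }
        }
    }
}
```

A transaction that ends up in the LEFT_PENDING state here corresponds to the "left pending" case above, which the recovery process picks up later.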

Some are recovered by the recovery process (for instance after an application server crash, or when the transaction was left pending by the TM). Once it has completed, the recovery process outputs a summary of its work as an INFO log. Recovery always runs at startup (to avoid working on inconsistent resources) and at regular intervals in the background if configured to.

In your case, since you're not using MySQL's XA support but the Last Resource Commit optimization instead, there can never be an in-doubt transaction on MySQL's side. In-doubt transactions could still appear on SwiftMQ's side though; you can monitor them with its console and, if necessary, terminate them manually. BTW, did you know that proper XA transaction termination was implemented in SwiftMQ at my request (see: http://www.nabble.com/XA-heuristics-support-to7387894.html)?

Most (99.999%) of the time you do not want to finish transactions manually but rather leave that job to the transaction manager's recovery process. The reason is that you need to know the global outcome: does the in-doubt transaction need to be committed or rolled back? By definition, only the TM has this knowledge, so it's better to let it handle all of that alone.
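As a rough illustration of why the global outcome matters, here is a toy model of a recovery decision in plain Java. It assumes the common "presumed abort" rule (commit only if the TM's journal recorded a commit decision, roll back otherwise); whether BTM follows exactly this rule is an assumption on my part, and the class, enum and method names are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of a presumed-abort recovery decision; NOT BTM's actual code.
class RecoveryDecisionSketch {

    enum Decision { COMMIT, ROLLBACK }

    /**
     * @param inDoubtIds      ids a resource reports as prepared but not yet finished
     * @param committingInLog ids for which the TM's journal recorded a commit decision
     */
    static Map<String, Decision> decide(List<String> inDoubtIds, Set<String> committingInLog) {
        Map<String, Decision> decisions = new LinkedHashMap<>();
        for (String id : inDoubtIds) {
            // Only the journal knows the global outcome. No commit record means
            // the TM never decided to commit, so the branch is rolled back.
            decisions.put(id, committingInLog.contains(id) ? Decision.COMMIT : Decision.ROLLBACK);
        }
        return decisions;
    }

    public static void main(String[] args) {
        Map<String, Decision> result = decide(List.of("tx-1", "tx-2"), Set.of("tx-1"));
        System.out.println(result); // prints {tx-1=COMMIT, tx-2=ROLLBACK}
    }
}
```

Someone looking only at the resource's console sees "tx-1" and "tx-2" as equally in doubt; the second argument is the piece of information a manual decision would be missing.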

In the very uncommon case where you would have to terminate a transaction manually (aka the heuristic case), there is a way to read and analyze BTM's recovery logs. Information about the transactions' state is stored there and can be extracted with the GUI console. It is not documented yet because it is not very polished, but it is already useful. You can try it out by executing the bitronix.tm.gui.Console class. This is very low-level, so you really need to understand what's going on before making heuristic decisions, or you might break data integrity.

After a heuristic decision has been made, the recovery process will still run and check that you made the right decision. If not, yet another ERROR will be logged.

As you can see, you are better off relying on the TM for XA error handling and closely monitoring what it logs at INFO level and above.

Ludovic

Re: Question on BTM's recovery behavior

Karl Cassaigne
Hi Ludovic,

Thanks a lot, it's much clearer to me now. So most of the time BTM should be able to terminate in-doubt transactions on SwiftMQ's side automatically during its recovery process. Fine!

I remember seeing that post on SwiftMQ's forum about XA implementation but at that time I was not even aware I would have to use it one day ;-)

One last little question for you: I currently use Spring's PlatformTransactionManager object explicitly to control transactions, together with JdbcTemplate and plain JMS code. I was wondering whether I should close JMS resources (connection/session/producer/consumer) before or after proceeding with the commit/rollback?

I'll give JmsTemplate and declarative transaction management a try later on, but I always prefer to take some time to understand how things work before using any abstraction layer.

Just out of curiosity: while looking for Bitronix on Google I found another site at www.bitronix.be... do you come from Belgium? (I'm French myself.)

Regards,
Karl

Re: Question on BTM's recovery behavior

Ludovic Orban
Administrator
Hi Karl,

BTM doesn't care too much and can handle resources being released after transaction termination, but not all transaction managers are that flexible, so I would recommend closing connections (and sessions and so on) before the transaction commit/rollback.
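The recommended ordering can be sketched as follows. This is only an ordering illustration with stand-in types, so that it runs on a plain JDK: the `Resource` class is a hypothetical placeholder for a JMS connection, session or producer, and the "begin"/"commit"/"rollback" events stand where the calls on Spring's PlatformTransactionManager would go in real code.

```java
import java.util.ArrayList;
import java.util.List;

// Ordering sketch with stand-in types; real code would use javax.jms objects
// and Spring's PlatformTransactionManager instead of these placeholders.
class CloseBeforeCommitSketch {

    // Hypothetical stand-in for a JMS connection, session or producer.
    static final class Resource implements AutoCloseable {
        private final String name;
        private final List<String> events;

        Resource(String name, List<String> events) {
            this.name = name;
            this.events = events;
            events.add("open " + name);
        }

        @Override
        public void close() {
            events.add("close " + name);
        }
    }

    /** Runs the work inside a pretend transaction and returns the sequence of events. */
    static List<String> run(Runnable work) {
        List<String> events = new ArrayList<>();
        events.add("begin");
        try {
            // try-with-resources closes in reverse order: producer, session, connection.
            try (Resource connection = new Resource("connection", events);
                 Resource session = new Resource("session", events);
                 Resource producer = new Resource("producer", events)) {
                work.run();
            }
            // All resources are already closed when we get here.
            events.add("commit");
        } catch (RuntimeException e) {
            // Resources are closed before this handler runs; real code would rethrow.
            events.add("rollback");
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(() -> { }));
        // prints [begin, open connection, open session, open producer,
        //         close producer, close session, close connection, commit]
    }
}
```

The nested try block is what guarantees that every close happens before either the commit or the rollback, whichever path is taken.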

Bitronix is my own company, which sponsors BTM's development. And yes, I'm Belgian.

Ludovic