From ede4f1621afe744b3321d2fa006fe93eb3b39eb0 Mon Sep 17 00:00:00 2001
From: Joel Sherrill
Date: Tue, 7 Mar 2006 22:09:49 +0000
Subject: 2006-03-07 Steven Johnson

	PR 850/rtems
	* src/watchdogtickle.c: A Watchdog (used to time out an event) with
	a delay of 1 sometimes does not seem to time out.  The problem
	occurs, because for whatever reason when the watchdog tickle function
	executes, the watchdog->delta_interval is 0.  It is then decremented
	before being tested, becomes huge and so does not time out.  It is
	thought there is a race condition where the watchdog->delta_interval
	is calculated by reference to a head (also with a delay of 1).  But
	before it can be added after the head, the head is removed, so the
	new head now has a delay of 0.
---
 cpukit/score/ChangeLog            | 13 +++++++++++++
 cpukit/score/src/watchdogtickle.c | 34 +++++++++++++++++++++++++++++++---
 2 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/cpukit/score/ChangeLog b/cpukit/score/ChangeLog
index a8de148d84..58d62054c4 100644
--- a/cpukit/score/ChangeLog
+++ b/cpukit/score/ChangeLog
@@ -1,3 +1,16 @@
+2006-03-07	Steven Johnson
+
+	PR 850/rtems
+	* src/watchdogtickle.c: A Watchdog (used to time out an event) with
+	a delay of 1 sometimes does not seem to time out.  The problem
+	occurs, because for whatever reason when the watchdog tickle function
+	executes, the watchdog->delta_interval is 0.  It is then decremented
+	before being tested, becomes huge and so does not time out.  It is
+	thought there is a race condition where the watchdog->delta_interval
+	is calculated by reference to a head (also with a delay of 1).  But
+	before it can be added after the head, the head is removed, so the
+	new head now has a delay of 0.
+
 2006-03-07	Joel Sherrill
 
 	PR 866/rtems
diff --git a/cpukit/score/src/watchdogtickle.c b/cpukit/score/src/watchdogtickle.c
index f75f099e50..9bd9fcb791 100644
--- a/cpukit/score/src/watchdogtickle.c
+++ b/cpukit/score/src/watchdogtickle.c
@@ -49,9 +49,37 @@ void _Watchdog_Tickle(
     goto leave;
 
   the_watchdog = _Watchdog_First( header );
-  the_watchdog->delta_interval--;
-  if ( the_watchdog->delta_interval != 0 )
-    goto leave;
+
+  /*
+   * For some reason, on rare occasions the_watchdog->delta_interval
+   * of the head of the watchdog chain is 0.  Before this test was
+   * added, on these occasions an event (which usually was supposed
+   * to have a timeout of 1 tick) would have a delta_interval of 0, which
+   * would be decremented to 0xFFFFFFFF by the unprotected
+   * "the_watchdog->delta_interval--;" operation.
+   * This would mean the event would not time out, and also the chain would
+   * be blocked, because a timeout with a very high number would be at the
+   * head, rather than at the end.
+   * The test "if (the_watchdog->delta_interval != 0)"
+   * here prevents this from occurring.
+   *
+   * We were not able to categorically identify the situation that causes
+   * this, but proved it to be true empirically.  So this check causes
+   * correct behaviour in this circumstance.
+   *
+   * The belief is that a race condition exists whereby an event at the head
+   * of the chain is removed (by a pending ISR or higher priority task)
+   * during the _ISR_Flash( level ); in _Watchdog_Insert, but the watchdog
+   * to be inserted has already had its delta_interval adjusted to 0, and
+   * so is added to the head of the chain with a delta_interval of 0.
+   *
+   * Steven Johnson - 12/2005 (gcc-3.2.3 -O3 on powerpc)
+   */
+  if (the_watchdog->delta_interval != 0) {
+    the_watchdog->delta_interval--;
+    if ( the_watchdog->delta_interval != 0 )
+      goto leave;
+  }
 
   do {
      watchdog_state = _Watchdog_Remove( the_watchdog );
-- 
cgit v1.2.3
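
Illustrative note (not part of the patch): the wraparound described in the
comment above can be reproduced in isolation. The sketch below is a minimal
stand-alone model, not RTEMS code. The names demo_watchdog,
demo_tickle_unguarded, demo_tickle_guarded and demo_self_check are
hypothetical, and delta_interval is assumed to be a 32-bit unsigned value, as
the 0xFFFFFFFF in the comment suggests. It only contrasts the old
decrement-then-test order with the guarded version; it does not model the
suspected race in _Watchdog_Insert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the watchdog at the head of a delta chain. */
typedef struct {
  uint32_t delta_interval; /* ticks remaining relative to the previous entry */
} demo_watchdog;

/*
 * Pre-patch order: decrement first, then test.
 * Returns true if the head watchdog would fire on this tick.
 */
bool demo_tickle_unguarded( demo_watchdog *w )
{
  w->delta_interval--;              /* a delta of 0 wraps to UINT32_MAX */
  return w->delta_interval == 0;
}

/*
 * Patched order: only a nonzero delta is decremented, so a head that
 * already holds 0 fires immediately instead of wrapping.
 */
bool demo_tickle_guarded( demo_watchdog *w )
{
  if ( w->delta_interval != 0 ) {
    w->delta_interval--;
    if ( w->delta_interval != 0 )
      return false;
  }
  return true;
}

/* Small self-check covering the head-with-delta-0 case and the normal case. */
void demo_self_check( void )
{
  demo_watchdog stuck  = { 0 };
  demo_watchdog fixed  = { 0 };
  demo_watchdog normal = { 1 };

  assert( !demo_tickle_unguarded( &stuck ) );     /* never fires ...        */
  assert( stuck.delta_interval == UINT32_MAX );   /* ... and blocks the chain */

  assert( demo_tickle_guarded( &fixed ) );        /* fires on this tick */
  assert( fixed.delta_interval == 0 );

  assert( demo_tickle_guarded( &normal ) );       /* 1 -> 0, fires as before */
}
```

Calling demo_self_check() from a test program exercises both paths. With a
head delta of 0 the unguarded version leaves 0xFFFFFFFF at the head of the
chain, which matches the "chain would be blocked" symptom in the comment,
while the guarded version treats the entry as already expired.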