watchdogd
Advanced system & process supervisor for Linux
wdog.h File Reference

The libwdog API defines how to connect to watchdogd at runtime to query status (including reset cause data), adjust logging, and, for advanced users, ask watchdogd to monitor a process.

#include <time.h>
#include "compat.h"

Classes

struct  wdog_reason_t
 Reset reason data.
 

Enumerations

enum  wdog_code_t {
  WDOG_SYSTEM_NONE = 0, WDOG_SYSTEM_OK, WDOG_FAILED_SUBSCRIPTION, WDOG_FAILED_KICK,
  WDOG_FAILED_UNSUBSCRIPTION, WDOG_FAILED_TO_MEET_DEADLINE, WDOG_FORCED_RESET, WDOG_FAILED_UNKNOWN,
  WDOG_DESCRIPTOR_LEAK, WDOG_MEMORY_LEAK, WDOG_CPU_OVERLOAD
}
 Reset reason codes.
 

Functions

int wdog_set_debug (int enable)
 Toggle debug messages in daemon.

int wdog_get_debug (int *status)
 Get daemon debug status.

int wdog_set_loglevel (char *level)
 Change daemon log level.

char * wdog_get_loglevel (void)
 Get daemon log level.

int wdog_reset_counter (unsigned int *counter)
 Get system reset counter (updated on every watchdog reset, including reboots).

int wdog_reset_reason (wdog_reason_t *reason)
 Get reset reason.

int wdog_reset_reason_raw (wdog_reason_t *reason)
 Get reset reason (raw).

char * wdog_reset_reason_str (wdog_reason_t *reason)
 Translates wdog_code_t to human-readable string.

int wdog_reset_reason_clr (void)
 Clear reset reason, including reset counter.

int wdog_subscribe (char *label, unsigned int timeout, unsigned int *next_ack)
 Start supervising a subscriber.

int wdog_unsubscribe (int id, unsigned int ack)
 Stop supervising a subscriber.

int wdog_kick (int id, unsigned int timeout, unsigned int ack, unsigned int *next_ack)
 Kick the watchdog with a custom timeout (old API).

int wdog_extend_kick (int id, unsigned int timeout, unsigned int *ack)
 Kick the watchdog with a custom timeout.

int wdog_kick2 (int id, unsigned int *ack)
 Kick the watchdog.
 

Detailed Description

The libwdog API defines how to connect to watchdogd at runtime to query status (including reset cause data), adjust logging, and, for advanced users, ask watchdogd to monitor a process.

Please note, the logo, "Watch Dog Detective Taking Notes", is licensed for use by the watchdogd project, copyright © Ron Leishman

Typically, a process instruments its event loop (or while(1) loop) with a periodic call that "kicks" the watchdog, informing watchdogd that the process is still operational. See the included examples (ex1.c and ex2.c) for how this can be used.

Enumeration Type Documentation

◆ wdog_code_t

Reset reason codes.

Enumerator
WDOG_SYSTEM_NONE 

After reset/power-on.

WDOG_SYSTEM_OK 

Unused?

WDOG_FAILED_SUBSCRIPTION 

Supervised process.

WDOG_FAILED_KICK 

Supervised process.

WDOG_FAILED_UNSUBSCRIPTION 

Supervised process.

WDOG_FAILED_TO_MEET_DEADLINE 

Supervised process.

WDOG_FORCED_RESET 

Operator requested system reboot.

WDOG_FAILED_UNKNOWN 

Likely, WDT timed out.

WDOG_DESCRIPTOR_LEAK 

filenr plugin

WDOG_MEMORY_LEAK 

meminfo plugin

WDOG_CPU_OVERLOAD 

loadavg plugin
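
The enumerator descriptions above fall into a few groups (supervised process, plugin, operator, watchdog timer). A minimal sketch of how a client might bucket a code by origin follows. The reset_origin_t type and reset_origin() helper are hypothetical and not part of libwdog, and the enum is repeated only so the sketch compiles on its own; real code should include wdog.h instead.

```c
/* Repeated from the enumerator list above so this sketch is standalone.
 * Real code should include wdog.h rather than redefining the enum. */
typedef enum {
    WDOG_SYSTEM_NONE = 0, WDOG_SYSTEM_OK, WDOG_FAILED_SUBSCRIPTION, WDOG_FAILED_KICK,
    WDOG_FAILED_UNSUBSCRIPTION, WDOG_FAILED_TO_MEET_DEADLINE, WDOG_FORCED_RESET,
    WDOG_FAILED_UNKNOWN, WDOG_DESCRIPTOR_LEAK, WDOG_MEMORY_LEAK, WDOG_CPU_OVERLOAD
} wdog_code_t;

/* Hypothetical grouping, derived only from the descriptions above. */
typedef enum {
    ORIGIN_POWER_ON,            /* after reset/power-on        */
    ORIGIN_SUPERVISED_PROCESS,  /* a supervised process failed */
    ORIGIN_OPERATOR,            /* operator requested a reboot */
    ORIGIN_WATCHDOG_TIMER,      /* likely a WDT timeout        */
    ORIGIN_PLUGIN,              /* filenr, meminfo or loadavg  */
    ORIGIN_UNSPECIFIED          /* anything else               */
} reset_origin_t;

/* Hypothetical helper: map a reset reason code to its documented origin. */
reset_origin_t reset_origin(wdog_code_t code)
{
    switch (code) {
    case WDOG_SYSTEM_NONE:
        return ORIGIN_POWER_ON;
    case WDOG_FAILED_SUBSCRIPTION:
    case WDOG_FAILED_KICK:
    case WDOG_FAILED_UNSUBSCRIPTION:
    case WDOG_FAILED_TO_MEET_DEADLINE:
        return ORIGIN_SUPERVISED_PROCESS;
    case WDOG_FORCED_RESET:
        return ORIGIN_OPERATOR;
    case WDOG_FAILED_UNKNOWN:
        return ORIGIN_WATCHDOG_TIMER;
    case WDOG_DESCRIPTOR_LEAK:
    case WDOG_MEMORY_LEAK:
    case WDOG_CPU_OVERLOAD:
        return ORIGIN_PLUGIN;
    default:
        return ORIGIN_UNSPECIFIED;
    }
}
```

For display purposes wdog_reset_reason_str() is the documented route; a grouping like this is only useful when a program needs to branch on the kind of failure.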

Function Documentation

◆ wdog_extend_kick()

int wdog_extend_kick ( int  id,
unsigned int  timeout,
unsigned int *  ack 
)

Kick the watchdog with a custom timeout.

Checks ack, resets timer with new timeout and sets ack. Use this to extend the kick interval set in wdog_subscribe().

Parameters
id: Return value from wdog_subscribe().
timeout: Number of milliseconds to set the timeout to.
[in,out] ack: Pointer to the ack received from the last wdog API call. Will be updated with the new ack.
Returns
0 on success, negative on error (also sets errno)

◆ wdog_get_debug()

int wdog_get_debug ( int *  status)

Get daemon debug status.

Parameters
[out] status: Non-zero when enabled; must not be NULL.
Returns
POSIX OK(0) on success, non-zero on error.
Examples:
ex2.c.

◆ wdog_get_loglevel()

char* wdog_get_loglevel ( void  )

Get daemon log level.

Returns
The current log level as a string; see wdog_set_loglevel() for the possible values.

◆ wdog_kick()

int wdog_kick ( int  id,
unsigned int  timeout,
unsigned int  ack,
unsigned int *  next_ack 
)

Kick the watchdog with a custom timeout (old API)

Checks ack, resets timer with provided timeout and sets next_ack. This API is kept for backwards compatibility. The new wdog_kick2() API is a lot easier to use.

Parameters
id: Return value from wdog_subscribe().
timeout: Number of milliseconds to set the timeout to.
ack: Ack received from the last wdog API call.
[out] next_ack: Ack to pass to the next wdog API call.
See also
wdog_kick2()
Returns
0 on success, negative on error (also sets errno)
Examples:
ex2.c.
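
With the old API, the caller is responsible for handing each next_ack to the following call. A minimal sketch of that handoff follows. The kick_n_times() helper and the HAVE_LIBWDOG macro are hypothetical, and the stub's ack checking is an assumption made only so the sketch compiles and runs without the library; it does not describe how watchdogd generates acks.

```c
#include <errno.h>

#ifdef HAVE_LIBWDOG            /* hypothetical build flag */
#include "wdog.h"              /* install path may differ */
#else
/* Stand-in with the documented signature. The ack scheme here is assumed. */
static unsigned int stub_expected = 7;

static int wdog_kick(int id, unsigned int timeout, unsigned int ack,
                     unsigned int *next_ack)
{
    (void)id;
    (void)timeout;
    if (ack != stub_expected) {     /* stale or wrong ack */
        errno = EINVAL;
        return -1;
    }
    *next_ack = ++stub_expected;
    return 0;
}
#endif

/* Hypothetical helper: kick `count` times, passing each next_ack on to the
 * following call. Returns the number of kicks that succeeded. */
int kick_n_times(int id, unsigned int timeout, unsigned int first_ack, int count)
{
    unsigned int ack = first_ack;
    int done = 0;

    while (done < count) {
        unsigned int next_ack;

        if (wdog_kick(id, timeout, ack, &next_ack) < 0)
            break;                  /* errno is set by the failed call */
        ack = next_ack;             /* the handoff the old API requires */
        done++;
    }
    return done;
}
```

wdog_kick2() and wdog_extend_kick() take the ack as an in/out pointer, which removes the separate next_ack variable.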

◆ wdog_kick2()

int wdog_kick2 ( int  id,
unsigned int *  ack 
)

Kick the watchdog.

Checks ack, resets the timer and updates ack with the value for the next call. Uses the timeout value provided in wdog_subscribe().

Parameters
id: The ID returned from wdog_subscribe().
[in,out] ack: Pointer to the ack received from the last wdog API call. Will be updated with the new ack.
Returns
0 on success, negative on error (also sets errno)
Examples:
ex1.c.
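
The subscribe, kick, unsubscribe sequence can be sketched as below. This is not the bundled ex1.c: the supervised_loop() function, the "example" label, the 3000 ms timeout and the HAVE_LIBWDOG macro are illustrative assumptions, and the stubs exist only so the sketch compiles and runs without the library and a running daemon.

```c
#include <errno.h>
#include <stdio.h>

#ifdef HAVE_LIBWDOG            /* hypothetical build flag */
#include "wdog.h"              /* install path may differ */
#else
/* Stand-ins with the documented signatures. The ack scheme is assumed. */
static unsigned int stub_ack = 1;

static int wdog_subscribe(char *label, unsigned int timeout, unsigned int *next_ack)
{
    (void)label;
    (void)timeout;
    *next_ack = stub_ack;
    return 0;                       /* subscriber ID */
}

static int wdog_kick2(int id, unsigned int *ack)
{
    (void)id;
    if (*ack != stub_ack) {
        errno = EINVAL;
        return -1;
    }
    *ack = ++stub_ack;
    return 0;
}

static int wdog_unsubscribe(int id, unsigned int ack)
{
    (void)id;
    if (ack != stub_ack) {
        errno = EINVAL;
        return -1;
    }
    return 0;
}
#endif

/* Run `rounds` units of work, kicking after each one.
 * Returns 0 on success, -1 if any wdog call fails. */
int supervised_loop(int rounds)
{
    char label[] = "example";       /* illustrative subscriber name */
    unsigned int ack;
    int id = wdog_subscribe(label, 3000, &ack);

    if (id < 0) {
        perror("wdog_subscribe");
        return -1;
    }
    for (int i = 0; i < rounds; i++) {
        /* ... one unit of real work goes here ... */
        if (wdog_kick2(id, &ack) < 0) {
            perror("wdog_kick2");
            return -1;
        }
    }
    if (wdog_unsubscribe(id, ack) < 0) {
        perror("wdog_unsubscribe");
        return -1;
    }
    return 0;
}
```

The same ack variable is threaded through every call: wdog_subscribe() fills it in, each wdog_kick2() updates it, and wdog_unsubscribe() consumes the last value.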

◆ wdog_reset_counter()

int wdog_reset_counter ( unsigned int *  counter)

Get system reset counter (updated on every watchdog reset, including reboots).

Parameters
[out] counter: Pointer to where to return the counter; must not be NULL.
Returns
POSIX OK(0) on success, non-zero on error.

◆ wdog_reset_reason()

int wdog_reset_reason ( wdog_reason_t *  reason)

Get reset reason.

This function fetches the reset reason data from the daemon. The retrieved data can then be passed to wdog_reset_reason_str() to get a human-readable string.

Parameters
[out] reason: Pointer to where to return the wdog_reason_t; must not be NULL.
Returns
POSIX OK(0) on success, non-zero on error.
Examples:
ex2.c.

◆ wdog_reset_reason_clr()

int wdog_reset_reason_clr ( void  )

Clear reset reason, including reset counter.

Please note, in general you should never call this function: there is no need for it, and some fault cases cannot be detected afterwards. It exists only to satisfy requirements from some customers.

Returns
POSIX OK(0) on success, non-zero on error.

◆ wdog_reset_reason_raw()

int wdog_reset_reason_raw ( wdog_reason_t *  reason)

Get reset reason (raw).

Similar to wdog_reset_reason(), except this reads the prepared reset reason from disk, which will be used in case of sudden power loss.

Parameters
[out] reason: Pointer to where to return the wdog_reason_t; must not be NULL.
Returns
POSIX OK(0) on success, non-zero on error.

◆ wdog_reset_reason_str()

char* wdog_reset_reason_str ( wdog_reason_t *  reason)

Translates wdog_code_t to human-readable string.

Parameters
reason: Pointer to reset reason data.
Returns
always returns a constant string, even for NULL reason.
Examples:
ex2.c.

◆ wdog_set_debug()

int wdog_set_debug ( int  enable)

Toggle debug messages in daemon.

Parameters
enable: When non-zero, enables LOG_DEBUG syslog messages.
Returns
POSIX OK(0) on success, non-zero on error.
Examples:
ex2.c.

◆ wdog_set_loglevel()

int wdog_set_loglevel ( char *  level)

Change daemon log level.

Parameters
level: One of: none, err, info, notice, debug.
Returns
POSIX OK(0) on success, non-zero on error.
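
Because level is a free-form string, a caller may want to check it against the names listed above before contacting the daemon. A minimal sketch follows. The loglevel_is_documented() helper is hypothetical and not part of libwdog, the comparison is assumed to be case-sensitive, and the daemon may accept names beyond the five documented here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 when `level` is one of the names listed
 * for wdog_set_loglevel(), and 0 otherwise (including for NULL). */
int loglevel_is_documented(const char *level)
{
    static const char *const names[] = { "none", "err", "info", "notice", "debug" };
    size_t i;

    if (!level)
        return 0;
    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        if (strcmp(level, names[i]) == 0)
            return 1;
    }
    return 0;
}
```

A caller would typically run this check first and call wdog_set_loglevel() only when it returns 1.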

◆ wdog_subscribe()

int wdog_subscribe ( char *  label,
unsigned int  timeout,
unsigned int *  next_ack 
)

Start supervising a subscriber.

After this, one of the kick functions must be called at least once every timeout milliseconds until wdog_unsubscribe() is called. Otherwise watchdogd will, depending on its configuration, reset the system or call the supervisor script.

Parameters
label: Name of this subscriber. If NULL, the process ID will be used.
timeout: Timeout in milliseconds.
[out] next_ack: Ack value that must be passed to the next wdog API call.
Returns
ID on success, negative on error (also sets errno)
Examples:
ex1.c, and ex2.c.
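
Since the timeout is given in milliseconds while nanosleep() takes a struct timespec, a loop that sleeps between kicks needs a conversion. A minimal sketch follows. The kick_interval() helper is hypothetical, and kicking at half the timeout is an assumed safety margin rather than anything libwdog prescribes.

```c
#include <time.h>

/* Hypothetical helper: sleep interval between kicks, chosen here as half
 * of the subscription timeout so a late wakeup still beats the deadline. */
struct timespec kick_interval(unsigned int timeout_ms)
{
    unsigned int half_ms = timeout_ms / 2;
    struct timespec ts;

    ts.tv_sec  = half_ms / 1000;                       /* whole seconds     */
    ts.tv_nsec = (long)(half_ms % 1000) * 1000000L;    /* remainder, in ns  */
    return ts;
}
```

The result can be passed to nanosleep() between kicks; a process whose work units vary in length may need a larger margin or a call to wdog_extend_kick() instead.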

◆ wdog_unsubscribe()

int wdog_unsubscribe ( int  id,
unsigned int  ack 
)

Stop supervising a subscriber.

Checks ack and stops supervision of this subscriber.

Parameters
id: Return value from wdog_subscribe().
ack: Last ack received from the wdog API.
Returns
0 on success, negative on error (also sets errno)
Examples:
ex1.c, and ex2.c.