Difference between revisions of "Troubleshooting bind issues"

From Notes_Wiki

Revision as of 03:20, 28 August 2014


Troubleshooting bind issues

Very high CPU usage (200%+) by bind

When bind runs in a chroot environment with a sufficiently complex configuration, its CPU usage may exceed 200%. This happens when the configuration file mentions directories such as '/var/named/data' or '/var/named/dynamic' that do not exist at the corresponding chroot locations '/var/named/chroot/var/named/data' or '/var/named/chroot/var/named/dynamic', etc. To solve the problem, create all such directories under '/var/named/chroot/var/named' and make them owned by named:named. Then restart bind; the CPU usage should drop back to its usual low level.
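The check described above can be sketched as a small shell helper. This is only an illustration: the function name missing_chroot_dirs is hypothetical, and the directory list has to be taken from your own configuration file.

```shell
# Print every directory that is referenced in the configuration
# but does not exist inside the chroot.
# Arguments: the chroot root, then the directory paths as written in the config.
missing_chroot_dirs() {
    chroot_root=$1
    shift
    for dir in "$@"; do
        # /var/named/data in the config lives at <chroot_root>/var/named/data on disk
        if [ ! -d "${chroot_root}${dir}" ]; then
            printf '%s\n' "${chroot_root}${dir}"
        fi
    done
}

# Possible use as root (paths assumed to contain no spaces):
# for d in $(missing_chroot_dirs /var/named/chroot /var/named/data /var/named/dynamic); do
#     mkdir -p "$d" && chown named:named "$d"
# done
# service named restart
```

Listing the missing directories first, instead of creating them blindly, makes it easy to review what will change before touching the chroot.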


broken trust chain error

If the bind logs show 'broken trust chain' errors such as:

15-Apr-2014 06:06:11.667 lame-servers: info: error (no valid RRSIG) resolving 'google.co.in/DS/IN': 125.19.40.90#53
15-Apr-2014 06:06:11.942 lame-servers: info: error (no valid RRSIG) resolving 'google.co.in/DS/IN': 199.7.87.1#53
15-Apr-2014 06:06:12.212 lame-servers: info: error (no valid RRSIG) resolving 'google.co.in/DS/IN': 199.253.57.1#53
15-Apr-2014 06:06:12.334 lame-servers: info: error (no valid RRSIG) resolving 'google.co.in/DS/IN': 194.0.1.7#53
15-Apr-2014 06:06:12.379 lame-servers: info: error (no valid RRSIG) resolving 'google.co.in/DS/IN': 115.249.164.142#53
15-Apr-2014 06:06:12.470 lame-servers: info: error (no valid RRSIG) resolving 'google.co.in/DS/IN': 199.249.125.1#53
15-Apr-2014 06:06:12.618 lame-servers: info: error (no valid RRSIG) resolving 'google.co.in/DS/IN': 199.249.117.1#53
15-Apr-2014 06:06:12.860 lame-servers: info: error (no valid RRSIG) resolving 'google.co.in/DS/IN': 199.253.56.1#53
15-Apr-2014 06:06:12.861 lame-servers: info: error (no valid DS) resolving 'www.google.co.in/A/IN': 216.239.34.10#53
15-Apr-2014 06:06:12.985 lame-servers: info: error (broken trust chain) resolving 'www.google.co.in/A/IN': 216.239.36.10#53
15-Apr-2014 06:06:13.055 lame-servers: info: error (broken trust chain) resolving 'www.google.co.in/A/IN': 216.239.34.10#53

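Then the most probable cause is an incorrect system time: DNSSEC signatures (RRSIG records) are only valid within a fixed time window, so a clock that is far off makes validation fail. To resolve this permanently, configure an NTP server or client on each system. For a quick fix use: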

ntpdate -b 0.centos.pool.ntp.org

This assumes that 0.centos.pool.ntp.org can be resolved using some other DNS server.
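To illustrate why a wrong clock breaks validation, here is a minimal sketch of the time check that applies to every RRSIG record. The function name rrsig_time_ok is hypothetical. The timestamps use the YYYYMMDDHHMMSS (UTC) form in which RRSIG inception and expiration times are printed, for example by 'dig +dnssec'.

```shell
# Succeed (exit status 0) if NOW lies within the signature validity window.
# Arguments: inception, expiration, and optionally the time to test,
# all as UTC timestamps in YYYYMMDDHHMMSS form. NOW defaults to the system clock.
rrsig_time_ok() {
    inception=$1
    expiration=$2
    now=${3:-$(date -u +%Y%m%d%H%M%S)}
    # The fixed-width timestamps compare correctly as plain integers
    [ "$now" -ge "$inception" ] && [ "$now" -le "$expiration" ]
}

# Possible use, with made-up timestamps:
# rrsig_time_ok 20140401000000 20140501000000 && echo "clock is inside the window"
```

If the system clock is off by more than the window allows, every signature looks expired or not yet valid, and the resolver reports the chain of trust as broken even though the zone data is fine.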


bind fails to stop and hence fails to start without any good reason

Sometimes, especially after an unclean shutdown, bind may fail to stop and then fail to start. To solve this, try the following steps:
1. Use 'ps aux | grep named' and ensure that bind is not running. Kill the process if necessary.
2. Use 'mount' and verify that nothing is mounted inside '/var/named/chroot'. Unmount all folders and files mounted inside this folder.
3. Then go to the '/var/named/chroot/var/run/named' folder and delete any pid files that exist.
4. Now try 'service named restart' again.
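The steps above can be sketched in shell as follows. The helper name mounts_under is hypothetical, and the '*.pid' pattern is an assumption about how the pid files are named. Treat this as a starting point, not a tested recovery script.

```shell
# Print the mount points located below the given directory.
# Reads 'mount' output ("<device> on <mountpoint> type <fstype> (<options>)")
# from standard input. Assumes mount points contain no spaces.
mounts_under() {
    awk -v prefix="$1/" '$2 == "on" && index($3, prefix) == 1 { print $3 }'
}

# Possible use as root, following the four steps:
# ps aux | grep '[n]amed'                        # 1. check whether bind is still running
# mount | mounts_under /var/named/chroot | sort -r |
#     while read -r m; do umount "$m"; done      # 2. unmount everything inside the chroot
# rm -f /var/named/chroot/var/run/named/*.pid    # 3. remove stale pid files (pattern assumed)
# service named restart                          # 4. try the restart again
```

Sorting the mount points in reverse order unmounts nested paths before their parents.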
