Apache dies 2nd night in a row


i2iweb
04-01-2004, 12:21 AM
Hi,

Wanted to see if anyone can point me in the right direction. This is the second night in a row that Apache has gone down at exactly the same time. About 3 days ago I used portupgrade to upgrade a few things, which most likely upgraded Perl in the process. A simple restart of Apache fixes the issue, but how can I fix it for good? Below is an excerpt from my error_log.

System: FreeBSD 4.9
Current Perl Version: 5.8.2


Can't locate Cwd.pm in @INC (@INC contains: /usr/local/lib/perl5/site_perl/5.8.0/mach /usr/local/lib/perl5/site_perl/5.8.0 /usr/local/lib/perl5/site_perl /usr/local/lib/perl5/5.8.0/BSDPAN /usr/local/lib/perl5/5.8.0/mach /usr/local/lib/perl5/5.8.0 .) at (eval 2) line 1.
Can't locate Cwd.pm in @INC (@INC contains: /usr/local/lib/perl5/site_perl/5.8.0/mach /usr/local/lib/perl5/site_perl/5.8.0 /usr/local/lib/perl5/site_perl /usr/local/lib/perl5/5.8.0/BSDPAN /usr/local/lib/perl5/5.8.0/mach /usr/local/lib/perl5/5.8.0 .) at (eval 2) line 1.
Can't locate Cwd.pm in @INC (@INC contains: /usr/local/lib/perl5/site_perl/5.8.0/mach /usr/local/lib/perl5/site_perl/5.8.0 /usr/local/lib/perl5/site_perl /usr/local/lib/perl5/5.8.0/BSDPAN /usr/local/lib/perl5/5.8.0/mach /usr/local/lib/perl5/5.8.0 .) at (eval 2) line 1.
Can't locate Cwd.pm in @INC (@INC contains: /usr/local/lib/perl5/site_perl/5.8.0/mach /usr/local/lib/perl5/site_perl/5.8.0 /usr/local/lib/perl5/site_perl /usr/local/lib/perl5/5.8.0/BSDPAN /usr/local/lib/perl5/5.8.0/mach /usr/local/lib/perl5/5.8.0 .) at (eval 2) line 1.
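
Every directory listed in the error above is a 5.8.0 path, while the installed Perl is 5.8.2. A minimal sketch for checking which of those directories actually exist and whether any of them holds Cwd.pm. The helper names inc_dirs_from_error and find_module are made up for illustration, and the parsing assumes the message format shown above.

```shell
#!/bin/sh
# Print each directory named in the "(@INC contains: ...)" part of a Perl
# error line, one per line. Hypothetical helper; assumes the format above.
inc_dirs_from_error() {
    printf '%s\n' "$1" |
        sed -n 's/.*(@INC contains: \(.*\)) at .*/\1/p' |
        tr ' ' '\n'
}

# Read directories on stdin and report where the module file is found,
# and which directories do not exist at all.
find_module() {
    module_file=$1
    while IFS= read -r dir; do
        [ -n "$dir" ] || continue
        if [ -f "$dir/$module_file" ]; then
            echo "found: $dir/$module_file"
        elif [ ! -d "$dir" ]; then
            echo "missing directory: $dir"
        fi
    done
}
```

Possible usage, assuming the log is in the current directory: inc_dirs_from_error "$(grep "Can't locate" error_log | head -n 1)" | find_module Cwd.pm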

bjseiler
04-01-2004, 08:07 AM
Can't locate Cwd.pm in @INC (@INC contains: /usr/local/lib/perl5/site_perl/5.8.0/mach /usr/local/lib/perl5/site_perl/5.8.0 /usr/local/lib/perl5/site_perl /usr/local/lib/perl5/5.8.0/BSDPAN /usr/local/lib/perl5/5.8.0/mach /usr/local/lib/perl5/5.8.0 .) at (eval 2) line 1.

Same thing here, and I am running FreeBSD 5.1. It seems Apache is dying every night some time between midnight and 12:31.

i2iweb
04-01-2004, 08:24 AM
Yep, mine dies at 12:31 as well, so it looks like something running in cron kills it at that time. I'll do a little more research and let you know what I come up with. One thing that can be done is to run perl -c between 12:00 and 12:30 to see what's killing it. In the meantime I am trying to symlink the new Perl to the old one to see if it helps.

ProWebUK
04-01-2004, 09:14 AM
Try installing Cwd:


perl -MCPAN -e shell
cpan> install Cwd


Chris

bjseiler
04-01-2004, 09:17 AM
Well, would this mean that something in the install process did not work and this should have already been installed?

ProWebUK
04-01-2004, 09:28 AM
I don't think DA even installs Perl.


Please do not install services such as Apache, PHP, MySQL, Ftp, Sendmail, etc., as we will do this for you. All we need is a clean install of Redhat.


In the packages directory:

da_exim-4.24-1.i386.rpm
MySQL-client-4.0.16-0.i386.rpm
da_vm-pop3d-1.1.7e-1.i386.rpm
MySQL-devel-4.0.16-0.i386.rpm
squirrelmail-1.4.2.tar.gz
imapd
MySQL-server-4.0.16-0.i386.rpm
webalizer-2.01_10-11.i386.rpm
Mail-SpamAssassin-2.55.tar.gz
phpMyAdmin-2.5.4-php.tar.gz
webmail.tar.gz
proftpd-1.2.9-1.i386.rpm
majordomo-1.94.5.tar.gz
proftpd-standalone-1.2.9-1.i386.rpm

Unless DA installs them if they are not there? There aren't any Perl RPMs from DA on the system I'm checking here anyway, and Perl is installed via RPM on this one. (Obviously FreeBSD doesn't use RPM, but if DA doesn't install Perl on Red Hat, it's doubtful it installs it on FreeBSD either.)

Will leave Mark or John to confirm this....

Chris

DirectAdmin Support
04-01-2004, 10:05 AM
DA doesn't touch Perl. It will install it with pkg_add if there is *nothing* on the system (FreeBSD only), but it doesn't do any upgrades or module installs.

I'm currently looking into a possible cause with Apache log rotation, where the logs are removed during the tally and recreated when Apache restarts. The issue is that the time between the deletion and the restart can be several minutes, so Apache is writing to a non-existent file. That may or may not be related to the problem at hand, but I thought I'd mention it in case it has some relevance.

John

ProWebUK
04-01-2004, 10:12 AM
Originally posted by DirectAdmin Support
The issue is that the time between the deletion and the restart can be several minutes, so Apache is writing to a non-existent file.

Would logrotate (rather than the DA app) remove this problem, and also allow the nightly restart to be removed? That's another thing I really dislike at the moment (besides the non-live tally :p )

Or... another idea... instead of:

rm -f log
then create the log

(assuming that's how it's done now...)

Couldn't you make the copy and then

echo " " > log

?

Chris
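
A minimal sketch of the copy-then-truncate idea from the post above, not DirectAdmin's actual code. The function name rotate_copytruncate and the ".1" suffix are assumptions for illustration. It uses ": > log" rather than echo " " > log, since the echo form leaves a space and a newline in the file instead of an empty one.

```shell
#!/bin/sh
# Rotate a log by copying it and then truncating the original in place.
# The file is never removed, so its inode stays the same and a process
# that already has it open keeps writing to the same file.
# rotate_copytruncate is a hypothetical name.
rotate_copytruncate() {
    log=$1
    cp -p "$log" "$log.1" || return 1   # keep a copy of the current contents
    : > "$log"                          # truncate to zero bytes, same inode
}
```

One caveat with this approach: anything written between the cp and the truncation is lost, so it trades a small window of missing log lines for not having to touch the running server.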

bjseiler
04-01-2004, 10:16 AM
It is a little strange because I had DA installed on basically the same server setup and this did not happen.

Brad

DirectAdmin Support
04-01-2004, 10:50 AM
The reason Apache is dying (maybe) is because it holds the error logs open from startup. For all other logs, it's not an issue because they're opened and appended each time, so if the log doesn't exist, it's created. Since the file descriptor is held open, you should in theory be able to rename the log file and it would still fill up the renamed file, because the file descriptor just points to the node in the filesystem and isn't dependent on the name once it's open. Once the node is deleted, Apache would need to reopen the file descriptor in order to write to a pipe that has another end, but that isn't done until the tally is complete.

By just creating a new file, the node is different from the one apache is using, so it won't be used. I may be way off on this, but I'll change how it's done anyway so we can rule it out if it's not the cause.

It's possible that different OSes handle the dangling file descriptor differently, but I'm not too sure on that.

John
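
The file descriptor behaviour described above can be tried out in a throwaway directory. This is only a sketch of the general Unix behaviour, using the shell's own descriptor 3 as a stand-in for a daemon that holds its log open. It says nothing about what Apache itself does when this happens.

```shell
#!/bin/sh
# Show that an open file descriptor follows the inode, not the file name.
dir=$(mktemp -d)
log="$dir/error_log"

exec 3>>"$log"                  # open once and hold it, like a long-running daemon
echo "line 1" >&3

mv "$log" "$log.old"            # rename: fd 3 still points at the same inode
echo "line 2" >&3               # so this lands in error_log.old
after_rename=$(cat "$log.old")  # holds "line 1" and "line 2"

rm "$log.old"                   # delete: the inode lives on until fd 3 is closed
: > "$log"                      # a new file under the old name is a different inode
echo "line 3" >&3               # written to the unlinked inode, not the new file

exec 3>&-                       # close the descriptor
new_size=$(wc -c < "$log" | tr -d ' ')

echo "renamed file held:"
echo "$after_rename"
echo "bytes in recreated error_log: $new_size"   # 0: the new file was never written
rm -rf "$dir"
```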

ProWebUK
04-01-2004, 11:02 AM
Yeah, I realise that the kernel won't remove an active file until the application, in this case Apache, has finished with it. Is that the reason for Apache being restarted?

Chris

i2iweb
04-02-2004, 02:02 AM
Well, Apache crashed once more, this time about 20 minutes later than usual. Any other thoughts on what may be happening?

TurtleBay
04-02-2004, 10:45 AM
Originally posted by ProWebUK
Would logrotate (rather than the DA app) remove this problem, and also allow the nightly restart to be removed? That's another thing I really dislike at the moment

I would like to say - I would like to get rid of this restart too. It causes me headaches. Once or twice a week it does not restart properly, and it stays down until I notice it in the morning.

Can we get DA up to Apache 2 and use piped logs instead?

Please, please get rid of this restart...

John

DirectAdmin Support
04-02-2004, 11:28 AM
Can all affected users please post their Operating Systems?

Is there *any* information in the apache logs as to why it goes down?

I've changed the code to truncate the logs to zero (instead of deleting them), which Apache doesn't seem to mind at all.

The restart is done when the httpd.conf files are rewritten. For example, if a user gets suspended, his DocumentRoot changes and a reload is needed. I don't think having the restart is the problem; I think the reason the restart isn't working is the problem (restarts are done all the time: when new domains, SSL certs, or subdomains are added, etc.).

John

TurtleBay
04-02-2004, 11:59 AM
RedHat 9.0

From my post in another thread:

Originally posted by TurtleBay
No, not Zend Optimizer. When I tried to install it last time, it made Apache segfault. I have not tried to install it lately.

Log file looks like this:

[Tue Mar 9 00:15:02 2004] [warn] child process 30162 still did not exit, sending a SIGTERM
[Tue Mar 9 00:15:05 2004] [error] Cannot remove module mod_frontpage.c: not found in module list
[Tue Mar 9 00:15:05 2004] [crit] (98)Address already in use: make_sock: could not bind to port 443

Apache then sits in memory, but not responding to anything.


Happens once or twice a week.

John

bjseiler
04-02-2004, 12:02 PM
Can't locate Cwd.pm in @INC (@INC contains: /usr/local/lib/perl5/site_perl/5.8.0/mach /usr/local/lib/perl5/site_perl/5.8.0 /usr/local/lib/perl5/site_perl /usr/local/lib/perl5/5.8.0/BSDPAN /usr/local/lib/perl5/5.8.0/mach /usr/local/lib/perl5/5.8.0 .) at (eval 2) line 1.

FreeBSD 5.1

That is really the only information I could find. Nothing at all in /var/log/messages. If there is anywhere else I can look, let me know.

DirectAdmin Support
04-02-2004, 12:58 PM
[Tue Mar 9 00:15:02 2004] [warn] child process 30162 still did not exit, sending a SIGTERM
[Tue Mar 9 00:15:05 2004] [error] Cannot remove module mod_frontpage.c: not found in module list
[Tue Mar 9 00:15:05 2004] [crit] (98)Address already in use: make_sock: could not bind to port 443

It could also be possible that a script run by Apache is hanging around during the time of the restart.

Another thing to try would be to put 1 or 2 seconds of "sleep" between the stop and start in the apache boot script during a restart.

John

TurtleBay
04-02-2004, 01:28 PM
Originally posted by DirectAdmin Support
It could also be possible that a script run by Apache is hanging around during the time of the restart.

Another thing to try would be to put 1 or 2 seconds of "sleep" between the stop and start in the apache boot script during a restart.

John

John,

How would I test for the first possibility, and how do I implement the second?

Thanks,
John

DirectAdmin Support
04-03-2004, 02:37 PM
You'd need to do a "ps -ax" and look for a script, probably running as a user right when apache refuses to restart. Just cross check that with any scripts your users might have installed (could take a while).. and then see if the script is buggy.

The sleep can be implemented by editing the apache boot script:
Redhat: /etc/init.d/httpd
FreeBSD: /usr/local/etc/rc.d/httpd

Scroll down to the part that says "restart". It should have "stop" and "start" after each other. Insert a line in between those 2 commands and enter:

sleep 2

to give it a 2 second pause after it stops before starting up again.

John
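
A sketch of what the edited "restart" branch might look like. The real boot scripts differ between Red Hat and FreeBSD, so stop_httpd and start_httpd are hypothetical stand-ins that only print a message, and the RESTART_PAUSE variable is an addition for illustration, not something the stock script has.

```shell
#!/bin/sh
# Sketch of an init-script dispatcher with a pause added to "restart".
# stop_httpd and start_httpd stand in for whatever the real script runs.
stop_httpd()  { echo "stopping httpd"; }
start_httpd() { echo "starting httpd"; }

handle() {
    case "$1" in
        start)
            start_httpd
            ;;
        stop)
            stop_httpd
            ;;
        restart)
            stop_httpd
            # Give the old processes time to exit and release their ports.
            sleep "${RESTART_PAUSE:-2}"
            start_httpd
            ;;
        *)
            echo "usage: {start|stop|restart}" >&2
            return 1
            ;;
    esac
}
```

Calling handle restart runs the stop, waits 2 seconds by default, and then runs the start.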

i2iweb
04-03-2004, 08:33 PM
Originally posted by DirectAdmin Support
You'd need to do a "ps -ax" and look for a script, probably running as a user right when apache refuses to restart. Just cross check that with any scripts your users might have installed (could take a while).. and then see if the script is buggy.

The sleep can be implemented by editing the apache boot script:
Redhat: /etc/init.d/httpd
FreeBSD: /usr/local/etc/rc.d/httpd

Scroll down to the part that says "restart". It should have "stop" and "start" after each other. Insert a line in between those 2 commands and enter:

sleep 2

to give it a 2 second pause after it stops before starting up again.

John

The server that's doing it doesn't have anyone on it yet. I just added the command and hopefully everything goes well tonight.

i2iweb
04-04-2004, 08:40 AM
The site went down again last night even though I added "sleep 2" to the httpd script. The only thing that's scheduled to run in crontab at the time Apache dies is:

30 0 * * * root echo 'action=tally&value=all' >> /usr/local/directadmin/data/task.queue

I'm using FreeBSD 4.9
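
In the crontab entry above, the first two fields are the minute and the hour, so the tally is queued at 00:30, one minute before the 12:31 AM failures reported earlier in the thread. A small sketch for reading that time off a crontab line. The helper name cron_time is made up, and it only handles plain numbers in the first two fields, not ranges, lists, or steps.

```shell
#!/bin/sh
# Print the HH:MM at which a crontab line with a fixed minute and hour runs.
# cron_time is a hypothetical helper; anything other than plain numbers in
# the first two fields makes it fail with a non-zero exit status.
cron_time() {
    printf '%s\n' "$1" | awk '
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
            printf "%02d:%02d\n", $2, $1    # hour first, then minute
            ok = 1
        }
        END { exit (ok ? 0 : 1) }'
}
```

Possible usage: cron_time "30 0 * * * root echo 'action=tally&value=all' >> /usr/local/directadmin/data/task.queue"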

DirectAdmin Support
04-04-2004, 02:51 PM
Ok.. so here's what we know.

Nobody on the server, so that rules out log rotation because logs are rotated based on size.
Also rules out bad cgi scripts, as there are none.

Found this:
http://archive.apache.org/gnats/6627

So we need to find out what would prevent an Apache child from exiting. That thread mentioned NFS, but I'm not sure you use one.

How about checking /var/log/directadmin/errortaskq.log and system.log for Apache restarts around that time? See if the restart is failing.

Another thing to try is to put a long pause *after* the stop/start, so that there is no extra downtime but the taskq is kept from checking whether it worked until a few seconds after the restart. If the check is done too soon, Apache might not be running yet.

I'm starting to run out of ideas for this..

John

bjseiler
04-04-2004, 02:56 PM
In both of the log files you mention, I have this same basic thing over and over:

2004:04:04-10:06:06: httpd restarted
2004:04:04-10:07:02: httpd started
2004:04:04-10:07:06: httpd restarted
2004:04:04-10:08:00: httpd started
2004:04:04-10:08:05: httpd restarted
2004:04:04-10:09:01: httpd started
2004:04:04-10:09:05: httpd restarted
2004:04:04-10:10:01: httpd started
2004:04:04-10:10:05: httpd restarted
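
The excerpt above shows a restart roughly once a minute. A minimal sketch for counting "httpd restarted" lines per hour, which makes that kind of loop easier to spot in a long log. It assumes the timestamp format shown above, and restarts_per_hour is a made-up name.

```shell
#!/bin/sh
# Count "httpd restarted" entries per hour in a log whose lines look like
# "2004:04:04-10:06:06: httpd restarted". Hypothetical helper.
restarts_per_hour() {
    awk '/httpd restarted$/ {
             hour = substr($1, 1, 13)    # "YYYY:MM:DD-HH"
             count[hour]++
         }
         END { for (h in count) print h, count[h] }' "$1" | sort
}
```

Possible usage: restarts_per_hour /var/log/directadmin/system.log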

i2iweb
04-05-2004, 05:46 PM
After doing some searches on Google, I think I've narrowed it down to mod_perl. I'll try some tricks tonight and report back on what I find.

i2iweb
04-07-2004, 07:22 AM
I rebuilt Apache and it seems to have fixed the issue. It hasn't died for 1 night so far.

bjseiler
04-07-2004, 07:25 AM
Did you rebuild it in any special way? Can you say how?

I have found issues like this in the past with non-DA FreeBSD systems: when I add a port or change PHP in some way, the only real way to get Apache working again is a make deinstall / make reinstall.

i2iweb
04-07-2004, 07:42 AM
Here is what I did.

cd /usr/local/directadmin/customapache
./build mod_perl


If that doesn't work then just do the following:

cd /usr/local/directadmin/customapache
./build clean
./build update
./build all

I think that when Perl got updated, mod_perl somehow needed to be updated as well to recognize the new version of Perl.

Kevin

bjseiler
04-07-2004, 07:47 AM
Sorry to ask this stupid question in this thread, but does running the update all erase any data files associated with the programs that are being upgraded?

bjseiler
04-07-2004, 07:48 AM
www6# ./build mod_perl
Found /usr/local/directadmin/customapache/mod_perl-1.0-current.tar.gz
Extracting ...
Done.
Configuring mod_perl-1.27...
perl: not found
Done. Making mod_perl-1.27...
Trying to make mod_perl...
make: no target to make.

*** The make has failed, do you want to try to make again? (y,n):


Any ideas?

DirectAdmin Support
04-08-2004, 10:13 AM
Hello,

"does running the update all erase any data files associated with the programs that are being upgraded?"

No, it doesn't overwrite anything already existing other than the build script and the README. If you want to revert to the default files, just delete them and then update again.

"perl: not found"

What happens when you type:
which perl

John

bjseiler
04-08-2004, 10:18 AM
1. So if I do an upgrade of MySQL, the database will not be wiped out?

2. www6# which perl
perl: Command not found.

www6# uname -a
FreeBSD 5.1-RELEASE FreeBSD 5.1-RELEASE #0: Thu Jun 5 02:55:42 GMT 2003 root@wv1u.btc.adaptec.com:/usr/obj/usr/src/sys/GENERIC i386

i2iweb
04-08-2004, 10:48 PM
If you're using FreeBSD, just do a perl -v or /usr/bin/perl -v to find out which version.
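
When "which perl" comes back empty, as above, the binary may still exist somewhere outside the PATH. A small sketch that tries a list of candidate paths in order. The name find_perl is made up, and the candidate list is up to the caller: /usr/bin/perl comes from the post above, while /usr/local/bin/perl is an assumption about where a ports install would put it.

```shell
#!/bin/sh
# Print the first candidate path that is an executable file.
# find_perl is a hypothetical helper; pass the paths to try as arguments.
find_perl() {
    for candidate in "$@"; do
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no perl binary found" >&2
    return 1
}
```

Possible usage: perl_bin=$(find_perl /usr/bin/perl /usr/local/bin/perl) && "$perl_bin" -v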

bjseiler
04-11-2004, 12:14 PM
uninstalled perl b/c my server was not recognizing it even though it claimed to be installed

reinstalled perl

updated everything through customapache

everything works now and no more nightly crashing

phadley
11-10-2004, 09:40 AM
Thank you for having this thread. We, too, spent a week tracking it down before I thought to search DirectAdmin support. I have not yet tested it overnight, but it has survived a domain addition.

I have one more datum, which I'm not completely confident in: while Apache was restarting itself reliably (until the upgrades), @INC was looking for things in /usr/local/lib/perl5/5.8.5; afterwards, it was looking in 5.8.0. This was problematic because the whole mach directory in which Cwd.pm is located does not exist in 5.8.0. I suspect that somewhere out there in port land, there is a port or package which redefines @INC to an earlier version.

It would be nice to track this down as it forms a very bad first impression of DirectAdmin.