[Thu Mar 18 04:57:17 2004] [notice] child pid 2644 exit signal Segmentation fault (11
RSanders
03-17-2004, 10:02 PM
Hi,
Tonight, one of our machines dropped offline. It fills /var/log/httpd/error_log with:
[Thu Mar 18 04:57:59 2004] [notice] child pid 23037 exit signal Segmentation fault (11)
[Thu Mar 18 04:58:08 2004] [notice] child pid 25877 exit signal Segmentation fault (11)
[Thu Mar 18 04:58:12 2004] [notice] child pid
I've checked the up2date log, and do not see any incorrect packages being installed lately.
No one was in the shell, but we do show the client logging in to the control panel at the same time this started.
I'm rebuilding apache from DirectAdmin, but I think that's a long shot.
I'm still digging around, but this is odd...
RSanders
03-17-2004, 10:16 PM
Well,
rebuilding apache brings the system back up, but we still don't know what took it down.
ProWebUK
03-17-2004, 10:45 PM
Check that none of your log files are large (that can clog up apache). Is there anything prior to those messages, or anything in /var/log/messages?
Also, did the entire system go down or just apache? Your quote "rebuilding apache brings the system back up" makes me think that it was only apache that failed.
Chris
Icheb
03-17-2004, 10:52 PM
This is really strange, but I'm starting to think this is DA related.
I see it more and more often; I manage the DA servers for a certain Dutch company, and one of those servers still has problems with this.
My own company's equipment hasn't had problems with it (the last Apache compile on all of my own company's machines was Nov 2003).
ProWebUK: Apache just quits without leaving much in the logs, only those child exits. It simply dies, but DA doesn't detect it since the parent survives but no longer responds to requests (and no, it's not reported as a zombie process). Restarting the apache service works for a while, but within 24 hours you most likely have the same problem again...
ProWebUK
03-17-2004, 11:02 PM
Have you tried an entire rebuild (removing files also)?
rm -rf /usr/lib/apache/
cd /usr/local/directadmin/customapache
rm -f configure.*
./build clean
./build update
./build all
Chris
RSanders
03-17-2004, 11:07 PM
Just apache. None of the logs were very large.
cd /usr/local/directadmin/customapache
./build clean
rm -f configure.*
./build update
./build all
and it pops to life. Also, afaik no one has made any system changes to the machine. Nothing that I can see was installed, etc., that might have got in the way.
DirectAdmin Support
03-18-2004, 12:33 AM
Hello,
What OS are you using? We've recently changed around the configure.apache_ssl script. The old one may have been causing problems, so try out the new one. (It affected RedHat 9.)
DA only restarts apache and changes the user httpd.conf files. Perhaps the old configure.apache_ssl script plus a certain condition caused the segfault. Hard to say.
John
RSanders
03-18-2004, 01:53 AM
$ cat /etc/redhat-release
Red Hat Linux release 7.3 (Valhalla)
*shrug*
I do have a 9 box that drops off almost nightly. Strangest thing,
service httpd status
shows pids,
stop it, then status again,
shows _different_ pids
stop it again, status shows stopped,
start it, and it runs *shrug*
Odd, but I still bought yet another license tonight. I have confidence in ya ;)
RSanders
03-18-2004, 01:55 AM
I should mention, I have seen that oddity with RH 9 on non-DA machines too. Usually, the first status shows one pid and the second shows half a dozen or more. Stopping twice and then starting always brings it up.
DirectAdmin Support
03-18-2004, 11:55 AM
What time of the day do they start to happen? Wondering if the cpu usage from the dataskq just after midnight is causing it.
John
RSanders
03-18-2004, 12:06 PM
For the first issue it was 2300 EST; the machine thought it was 0415 EST (all my fault).
The second issue mentioned happens around 0000, with the correct system time. So you might be on to something. Here's the monitor output for one machine.
HTTP OK 03-17-2004 00:21:47 HTTP ok: HTTP/1.1 200 OK - 0.007 second response time
HTTP OK 03-17-2004 00:21:47 HTTP ok: HTTP/1.1 200 OK - 0.007 second response time
HTTP CRITICAL 03-17-2004 00:18:58 Socket timeout after 10 seconds
HTTP CRITICAL 03-17-2004 00:18:58 Socket timeout after 10 seconds
HTTP OK 03-10-2004 00:19:27 HTTP ok: HTTP/1.1 200 OK - 0.004 second response time
HTTP OK 03-10-2004 00:19:27 HTTP ok: HTTP/1.1 200 OK - 0.004 second response time
HTTP CRITICAL 03-10-2004 00:16:37 Socket timeout after 10 seconds
HTTP CRITICAL 03-10-2004 00:16:37 Socket timeout after 10 seconds
HTTP OK 03-09-2004 00:20:27 HTTP ok: HTTP/1.1 200 OK - 0.004 second response time
HTTP OK 03-09-2004 00:20:27 HTTP ok: HTTP/1.1 200 OK - 0.004 second response time
HTTP CRITICAL 03-09-2004 00:17:37 Socket timeout after 10 seconds
HTTP CRITICAL 03-09-2004 00:17:37 Socket timeout after 10 seconds
If I do say so myself, our response time for service failures on managed machines is pretty fast :)
DirectAdmin Support
03-18-2004, 01:01 PM
Hmm. Looks to me as though it may be the log rotator. Just a hunch, but since the dataskq takes a while to chug through all the data, apache isn't restarted for quite some time. But before it's restarted the rotated logs are tar.gz'ed and copied over, then the original log is deleted. Now, for the error logs, apache holds the filedescriptor open, so if it's trying to write to the deleted file before apache is restarted, I'm not sure what would happen. On our 7.2 build system, nothing happens. But I can't really say for other ones..
Perhaps do a test.. try deleting someone's error log file, then try generating errors (page not found errors, or php).. see if it generates a segfault.
That's just a guess.. it's the only real thing that I can think of which might cause apache to go down at that time.. as it's the only thing that DA does to apache, other than restart it after the tally.
John
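The suggested test can be tried away from a production box first. A minimal sketch, assuming a POSIX shell with mktemp available; the file name error_log and descriptor 3 are placeholders, not anything DA or apache uses. It keeps a descriptor open on a scratch log, deletes the file, and writes again. This only mimics the writer's side of the situation and says nothing about what apache itself does with a deleted log.

```shell
#!/bin/sh
# Scratch directory so nothing real is touched (assumption: mktemp exists).
tmpdir=$(mktemp -d)
log="$tmpdir/error_log"

# Hold the log open for appending on descriptor 3, as a long-running writer would.
exec 3>>"$log"
echo "line written before delete" >&3

# Delete the file while the descriptor is still open.
rm -f "$log"

# Try another write; the data goes to the now-unlinked inode.
echo "line written after delete" >&3
write_status=$?

# Record whether the path still exists after the write.
if [ -e "$log" ]; then path_exists=yes; else path_exists=no; fi

exec 3>&-          # closing the descriptor finally releases the inode
rm -rf "$tmpdir"
echo "write status: $write_status, path exists: $path_exists"
```

A write status of 0 with the path gone means the writer never notices the deletion; the disk space is only released once the descriptor is closed, which is why the restart matters.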
RSanders
03-18-2004, 01:17 PM
I'll keep my eye on it. I would rather not 'test' too much on these as they are production machines. The time I test something is the time they are doing something 'important' *sighs*
But, I will watch it a bit closer and see what I can find out. Next time one locks up it's open season. If it's down from a failure, I can justify taking 5 minutes to check it out.
arvydas
03-31-2004, 01:43 AM
Originally posted by DirectAdmin Support
Now, for the error logs, apache holds the filedescriptor open, so if it's trying to write to the deleted file before apache is restarted, I'm not sure what would happen. On our 7.2 build system, nothing happens. But I can't really say for other ones..
John
Isn't it better to truncate log files instead of deleting them? This would also avoid the need to restart apache.
DirectAdmin Support
03-31-2004, 12:02 PM
How do you mean? Anything other than appending would rewrite the file from zero and create a different link in the filesystem, essentially deleting the file and starting over causing a dangling filedescriptor...
John
arvydas
03-31-2004, 02:52 PM
Originally posted by DirectAdmin Support
How do you mean? Anything other than appending would rewrite the file from zero and create a different link in the filesystem, essentially deleting the file and starting over causing a dangling filedescriptor...
John
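Below is an example of Perl code that truncates a file. It doesn't delete the file and it doesn't leave a dangling file descriptor; it just sets the file size to zero. This code was tested on apache logs and it works perfectly.
use Fcntl qw(:flock SEEK_SET);

# Open read/write without creating or replacing the file, so the inode stays the same.
open(my $fh, '+<', 'access.log') or die "open: $!";
flock($fh, LOCK_EX)    or die "flock: $!";     # take an exclusive lock
seek($fh, 0, SEEK_SET) or die "seek: $!";      # rewind to the start
truncate($fh, 0)       or die "truncate: $!";  # set the size to zero in place
flock($fh, LOCK_UN)    or die "unlock: $!";    # release the lock
close($fh)             or die "close: $!";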
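The same behaviour can be checked from a shell. A minimal sketch, assuming a POSIX shell with mktemp, ls -i, awk and cat; the scratch file name and descriptor 3 are placeholders. A writer holds the log open in append mode, the file is truncated in place with `: >`, and the inode number is compared before and after. It illustrates truncation in general, not DA's actual rotation code.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
log="$tmpdir/access.log"

# A writer holds the log open in append mode on descriptor 3.
exec 3>>"$log"
echo "old entry" >&3

inode_before=$(ls -i "$log" | awk '{print $1}')

# Truncate in place: same file, same inode, size set to zero.
: > "$log"

# The writer's next append lands at the new end of the file (offset 0).
echo "new entry" >&3

inode_after=$(ls -i "$log" | awk '{print $1}')
contents=$(cat "$log")

exec 3>&-
rm -rf "$tmpdir"
echo "inode before: $inode_before, after: $inode_after, contents: $contents"
```

This relies on the writer using append mode. A writer that opened the file without append would keep its old offset, and its next write would leave a run of NUL bytes at the start of the truncated file.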
DirectAdmin Support
04-01-2004, 09:50 AM
OK, I'll switch to in-place log truncation and see what happens :)
http://www.directadmin.com/features.php?id=359
Thanks,
John
duke27
07-09-2004, 12:05 AM
My httpd crashes every day,
sometimes at around 20:00
and sometimes at around midnight....
What do I need to do???
I rebuilt everything
and it did it again....
duke27
07-09-2004, 12:07 AM
[Thu Jul 8 20:13:45 2004] [notice] child pid 31459 exit signal Segmentation fault (11)
[Thu Jul 8 20:13:45 2004] [notice] child pid 31460 exit signal Segmentation fault (11)
[Thu Jul 8 20:13:45 2004] [notice] child pid 31461 exit signal Segmentation fault (11)
[Thu Jul 8 20:13:45 2004] [notice] child pid 31462 exit signal Segmentation fault (11)
[Thu Jul 8 20:13:45 2004] [notice] child pid 31463 exit signal Segmentation fault (11)
[Thu Jul 8 20:13:45 2004] [notice] child pid 31464 exit signal Segmentation fault (11)
[Thu Jul 8 20:13:45 2004] [notice] child pid 31465 exit signal Segmentation fault (11)
[Thu Jul 8 20:13:45 2004] [notice] child pid 31468 exit signal Segmentation fault (11)
[Thu Jul 8 20:13:46 2004] [notice] child pid 31474 exit signal Segmentation fault (11)
or
[Fri Jul 9 00:15:16 2004] [crit] (98)Address already in use: make_sock: could not bind to port 8090
Seriously, what can I do???
arvydas
07-09-2004, 01:11 AM
It is useful to have a monitoring script, which checks apache responsiveness every minute and restarts it if there is any problem with it. Also, sometimes apache needs to be stopped several times before it can start correctly.
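A minimal sketch of such a watchdog, assuming a POSIX shell and curl. The URL, the 10-second timeout, the script path and the `service httpd restart` command are placeholders to adjust, and the function names are made up for illustration. The check and the restart are passed in as commands so the logic can be exercised without touching a live server.

```shell
#!/bin/sh
# Hypothetical settings; adjust for the machine being watched.
URL="http://localhost/"
TIMEOUT=10

# Returns 0 if the web server answers successfully within TIMEOUT seconds.
check_httpd() {
    curl --silent --fail --output /dev/null --max-time "$TIMEOUT" "$URL"
}

# Run one check; if it fails, run the restart command.
# Usage: monitor_once <check command> <restart command...>
monitor_once() {
    check=$1
    shift
    if "$check"; then
        echo "ok"
    else
        echo "restarting"
        "$@"
    fi
}

# Example cron entry (every minute), assuming this file is saved as
# /usr/local/sbin/httpd_watchdog.sh and ends with the call shown below:
#   * * * * * /usr/local/sbin/httpd_watchdog.sh
#   monitor_once check_httpd service httpd restart
```

Since apache sometimes needs more than one stop before it starts cleanly, the restart command passed in could itself be a small wrapper script rather than a single `service` call.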
duke27
07-09-2004, 01:18 AM
Yeah, BUT HOW do I make apache take a BREAK before restarting?? lol
I'm sure it would help.
DirectAdmin Support
07-13-2004, 11:48 AM
cd /usr/local/directadmin/customapache
rm -f httpd
rm -f httpd_freebsd
./build update
Redhat:
cp -f httpd /etc/init.d/httpd
chmod 755 /etc/init.d/httpd
chkconfig httpd reset
FreeBSD:
cp -f httpd_freebsd /usr/local/etc/rc.d/httpd
chmod 755 /usr/local/etc/rc.d/httpd
The newer boot script will wait for all apache processes to quit before restarting. Hopefully that will allow enough time for the port to be closed.
John
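The wait described above can be pictured roughly like this. This is not DA's boot script, just a minimal sketch assuming a POSIX shell and pgrep; the function name, the process name httpd and the 60-second limit are placeholders.

```shell
#!/bin/sh
# Wait until no process with the given exact name is left, or give up
# after max_wait seconds. Returns 0 if all are gone, 1 on timeout.
wait_for_exit() {
    name=$1
    max_wait=$2
    waited=0
    while pgrep -x "$name" >/dev/null 2>&1; do
        if [ "$waited" -ge "$max_wait" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Sketch of a restart that waits before starting again (commented out so
# the snippet is safe to paste):
#   service httpd stop
#   wait_for_exit httpd 60 && service httpd start
```

Starting only after every old process has gone gives the listening port time to be released, which is the condition behind the "Address already in use" error quoted earlier in the thread.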
jmstacey
07-29-2004, 05:16 PM
I am having a similar problem, though not during operation, only when I'm trying to shut apache down or restart it.
I get a lot of these
[Thu Jul 29 20:18:38 2004] [notice] child pid 45684 exit signal Segmentation fault (11)
and a lot of these
httpd in free(): warning: recursive call
Maybe this is normal during a shutdown procedure? But it takes apache almost 5 minutes to shut down on my system.
jmstacey
08-02-2004, 04:26 PM
Did some searching on Google and it may be related to the php4 module. Any word on this though?
Dr-Host
08-03-2004, 03:11 AM
I'm getting
[Tue Aug 3 12:02:42 2004] [notice] caught SIGTERM, shutting down
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.45:80 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.45:443 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.46:80 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.46:443 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.41:80 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.41:443 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.61:80 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.61:443 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.60:80 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.60:443 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.57:80 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.57:443 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.59:80 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.59:443 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.58:80 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [warn] NameVirtualHost 212.179.2.58:443 has no VirtualHosts
[Tue Aug 3 12:02:43 2004] [notice] Apache/1.3.31 (Unix) mod_ssl/2.8.19 OpenSSL/0.9.7a PHP/4.3.8 mod_perl/1.27 FrontPage/5.0.2.2510 configured -- resuming normal operations
[Tue Aug 3 12:02:43 2004] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Tue Aug 3 12:02:43 2004] [notice] Accept mutex: sysvsem (Default: sysvsem)
[Tue Aug 3 12:09:33 2004] [notice] caught SIGTERM, shutting down
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.45:80 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.45:443 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.46:80 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.46:443 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.41:80 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.41:443 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.61:80 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.61:443 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.60:80 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.60:443 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.57:80 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.57:443 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.59:80 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.59:443 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.58:80 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [warn] NameVirtualHost 212.179.2.58:443 has no VirtualHosts
[Tue Aug 3 12:09:34 2004] [notice] Apache/1.3.31 (Unix) mod_ssl/2.8.19 OpenSSL/0.9.7a PHP/4.3.8 mod_perl/1.27 FrontPage/5.0.2.2510 configured -- resuming normal operations
[Tue Aug 3 12:09:34 2004] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Tue Aug 3 12:09:34 2004] [notice] Accept mutex: sysvsem (Default: sysvsem)
every 5 min
jmstacey
08-03-2004, 11:21 PM
My problem has been solved. I did a module-by-module test to pinpoint the problem and it was TurckMMCache. After I disabled that, all is well again :D