Backup jobs overloading system

Omines

I have a rather large DirectAdmin server on which it has become impossible to make backups: they cause so much I/O load that Apache and MySQL start to hang. While this is most probably caused by the underlying SAN infrastructure (it's a VM), I cannot solve it there; Linux just keeps on writing.

Now I suspect there's a very simple solution to this problem: if DirectAdmin backup tasks launched their tar commands with "ionice -c3", or even -c2, this problem shouldn't occur. Backups would just take longer to prepare, but who cares. Alternatively, it would also be possible to just nice the tar processes themselves, since the default I/O priority is derived from CPU niceness; I'm just not sure this would have enough of an effect.

ionice class -c3 (idle) should have the effect of adding no visible system load, which is exactly the effect I'm seeking on DA servers that are loaded 24/7.
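
As a rough illustration of the idea above, a tar run can be started in the idle I/O class and at the lowest CPU priority like this. This is only a sketch: the function name idle_backup and its two arguments are made up for the example, and it is not how DirectAdmin itself invokes tar.

```shell
#!/bin/sh
# Hypothetical helper (not part of DirectAdmin): archive a directory
# in the idle I/O class (-c3) and at CPU niceness 19.
idle_backup() {
    archive=$1   # path of the .tar.gz to create
    src_dir=$2   # directory to back up
    ionice -c3 nice -n 19 tar -czf "$archive" -C "$src_dir" .
}
```

Calling idle_backup with an archive path and a source directory should produce the same archive as a plain tar run, just more slowly when the disks are busy.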

My main question: how would I go about testing my theories, which DA scripts should I modify?

Secondary question: shouldn't this be a feature of DA, most probably even enabled by default since backups should never hamper the server's main task of 'webserving'?
 
Run them at night when it doesn't matter.
Our customers will actually call us if their site goes down for 10 minutes at night without prior maintenance notice, and hang us if it's down for over an hour, which is what backups are currently causing even when run at 3am. Not every DirectAdmin server runs only thousands of inactive sites with only timezone-local visitors.
If nice 19 is already the default, it doesn't have enough effect, which only proves that ionice control is also needed.

ionice allows you to specify 3 classes of I/O niceness: realtime, best effort, and idle. Within the default 'best effort' class further prioritization is done according to CPU niceness, but it's still 'unthrottled', allowing an application to overload the I/O subsystem, especially on this webserver (quad core 2.93GHz Xeon) that is overpowered in all other areas.
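
To make the link between CPU niceness and the 'best effort' class concrete: the ioprio_set(2) man page gives the derived level as (nice + 20) / 5 for processes that have not set an I/O priority themselves. A small sketch of that arithmetic (the function name is invented for this example):

```shell
#!/bin/sh
# Best-effort I/O level (0 = highest, 7 = lowest) that is derived from a
# CPU nice value when no I/O priority was set explicitly, following the
# formula in ioprio_set(2): io_priority = (cpu_nice + 20) / 5
be_level_from_nice() {
    echo $(( ($1 + 20) / 5 ))
}

for n in -20 0 10 19; do
    printf 'nice %3d -> best-effort level %d\n' "$n" "$(be_level_from_nice "$n")"
done
```

So even nice 19 only lands on best-effort level 7; the process stays in the 'best effort' class and never reaches 'idle'.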

An ionice class of 'idle' means the program only gets I/O timeslices if no one else is requesting them, and a grace period has expired. This should mean that you could even run backups during your busiest hours as Apache and MySQL always go first, and your backups would just be damn slow.

Extra info on ionice: http://www.askapache.com/linux-unix/optimize-nice-ionice.html#ionice-tool
The kernel function DA should probably use if doing this from C/C++ code: http://www.kernel.org/doc/man-pages/online/pages/man2/ioprio_set.2.html
 
Well it seems I just solved my own problem... edited /etc/cron.d/directadmin_cron call to dataskq:
Code:
* * * * * root /usr/bin/ionice -c3 /usr/local/directadmin/dataskq
Just ran a backup of some several-GB accounts, system load never went above 2.30 and all websites continued to load instantly, even uncached pages hitting MySQL in full.

Can make this a formal feature request for DA now I suppose :)
 
I don't see how you could; ionice doesn't even exist on FreeBSD.
 
I'm discussing the issue with John and the reason that he didn't add it as default in this feature is indeed that not all platforms support it - we're running Debian though where it is present by default (it's probably a Linux kernel feature that's not in BSD kernels yet).

Also, I am aware that on servers that are not I/O-limited and/or don't have such heavy customers, the feature would only slow backup creation without providing a significant benefit in return, so I suggested he add facilities for power users to enable it at will if they run a supporting OS.
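
One way such an opt-in could look from the shell side is a small wrapper that only uses ionice where it is installed. This is a sketch under my own assumptions: run_with_idle_io is a made-up name, and the -t flag (ignore failures to set the class) is the one documented for util-linux and BusyBox ionice.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command in the idle I/O class when ionice
# is available, and run it unchanged on platforms without ionice.
run_with_idle_io() {
    if command -v ionice >/dev/null 2>&1; then
        # -c3 = idle class; -t = carry on even if the class cannot be set
        ionice -t -c3 "$@"
    else
        "$@"
    fi
}
```

For example, run_with_idle_io /usr/local/directadmin/dataskq would behave like the plain command on a system without ionice.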
 
I need to ask: how is this supposed to help, since dataskq isn't the only command run when backing up? What about tar, gzip, etc.?
 
Any process that does not explicitly request an I/O priority class inherits it from its parent process. Since tar and gzip are started by dataskq, they inherit its class, which is why it works. This also makes sense from an OS perspective: it would be rather ridiculous if an 'idle' process could still kill the system by forking 'realtime' children.
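
The inheritance is easy to check by hand. In this sketch (function name invented for the example) a shell is started in the idle class, forks a sleep that never touches its own I/O priority, and then asks ionice which class that child ended up in:

```shell
#!/bin/sh
# Print the I/O class of a child forked by a shell that runs under ionice -c3.
child_io_class() {
    ionice -c3 sh -c '
        sleep 2 &                 # forked child, sets no I/O priority itself
        ionice -p "$!"            # report the class of that child
        kill "$!" 2>/dev/null     # clean up the sleep
    '
}

child_io_class
```

On a Linux box where setting the idle class is permitted, the reported class should be idle, the same as the parent shell.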

http://superuser.com/questions/6309...ce-priorities-from-their-parents-how-do-you-c
I don't see how you could; ionice doesn't even exist on FreeBSD.
Since DirectAdmin already does all sorts of customizations based on the OS it's installed on, it wouldn't hurt to add one more that improves performance on the operating systems that do support it.
 
Thanks, Omines. I'm going to run the backups with ionice tonight.

Did this completely solve your issue? The largest problem we have here while backing up is that some users have abnormal quantities of files in their folders (some of them 700,000 inodes or more), so we've had problems every night with high load and unresponsive sites.
 