| Started by | Woozy Song <suzyw0ng@outlook.com> |
|---|---|
| First post | 2024-05-24 10:34 +0800 |
| Last post | 2024-05-24 16:18 +0000 |
| Articles | 4 — 3 participants |
setting load limit for atd batch system? Woozy Song <suzyw0ng@outlook.com> - 2024-05-24 10:34 +0800
Re: setting load limit for atd batch system? Woozy Song <suzyw0ng@outlook.com> - 2024-05-24 12:24 +0800
Re: setting load limit for atd batch system? Andreas Eder <a_eder_muc@web.de> - 2024-05-24 17:25 +0200
Re: setting load limit for atd batch system? Rich <rich@example.invalid> - 2024-05-24 16:18 +0000
| From | Woozy Song <suzyw0ng@outlook.com> |
|---|---|
| Date | 2024-05-24 10:34 +0800 |
| Subject | setting load limit for atd batch system? |
| Message-ID | <v2oucb$252d9$1@dont-email.me> |
So the atd supposedly will not start another job until the load factor falls below a limit. Different documentation gives the default as 0.8 or 1.5.

Now I launch a job that uses 4 cores on a 6-core CPU. If I run the top command, I see four processes running close to 100%.

Now if I submit another job 10 seconds later, that starts, thereby overloading the CPU. Documentation suggests setting the load limit to more than n-1 for n CPU cores, but I think that is intended for single-thread jobs. I have tried altering the load limit in the atd.service file to all sorts of values, but the second job keeps starting while the first is flogging the CPU. I check with 'ps -ef|grep atd' to see that it is using the desired load limit. I am aware that the load factor is an average; you can see it change slowly in top/htop/glances. So I also increased the delay between jobs to 30 seconds, but still nothing works. So it looks like I have to specify a time like 'now+60 minutes' when I submit, which requires some guess at how long the first job runs. I know I can install a proper job scheduler such as Some Grid Engine, but that is more work.

This is on Debian 11, by the way.
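The load-limit test described above can be tried by hand before touching atd. The following is a minimal sketch, not atd's actual code: it assumes the comparison is made against the 1-minute load average (the first field of `/proc/loadavg`) and that a job may start only while that figure is strictly below the limit. The function name `load_below` and its optional file argument are made up for illustration.

```shell
# load_below LIMIT [FILE]
# Succeeds (exit 0) when the 1-minute load average is below LIMIT.
# FILE defaults to /proc/loadavg; the parameter exists only so the
# check can be tried against a saved sample. (Hypothetical helper.)
load_below() {
    limit=$1
    file=${2:-/proc/loadavg}
    # The first whitespace-separated field is the 1-minute average.
    read -r one_min _rest < "$file" || return 2
    # Shell arithmetic is integer-only, so let awk compare the floats.
    awk -v l="$one_min" -v m="$limit" 'BEGIN { exit !(l < m) }'
}
```

With four busy processes the 1-minute figure would be expected to climb toward 4, so `load_below 5.5` would still succeed while `load_below 1.5` would not. That is one way to see why a limit above n-1 does not hold back a second multi-threaded job.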
| From | Woozy Song <suzyw0ng@outlook.com> |
|---|---|
| Date | 2024-05-24 12:24 +0800 |
| Message-ID | <v2p4p5$25ver$1@dont-email.me> |
| In reply to | #56433 |
Woozy Song wrote:
> So the atd supposedly will not start another job until load factor falls
> below a limit. Different documentation gives the default as 0.8 or 1.5
> Now I launch a job that uses 4 cores on a 6-core CPU. If I run top
> command, I see four processes running close to 100%.
> Now if I submit another job 10 seconds later, that starts thereby
> overloading the CPU. Documentation suggests setting load limit to more
> than n-1 for n CPU cores, but I think that is intended for single-thread
> jobs. I have tried altering the load limit in atd.service file to all
> sorts of values, but second job keeps starting while the first is
> flogging the CPU. I check with 'ps -ef|grep atd' to see it is using the
> desired load limit. I am aware that the load factor is an average, you
> can see it changes slowly in top/htop/glances. So I also increase the
> delay between jobs to 30 seconds, but still nothing works. So it looks
> like I have to specify a time like 'now+60 minutes' when I submit,
> requiring some guess how long first job runs. I know I can install a
> proper job scheduler such as Some Grid Engine, but that is more work.
> This is on Debian 11, by the way.

I found the trick: you have to add '-q B' to the command, then the load-limit rule applies (it behaves like the batch command instead of at). Otherwise it uses the default queue 'a', which only uses time, without a load limit.
| From | Andreas Eder <a_eder_muc@web.de> |
|---|---|
| Date | 2024-05-24 17:25 +0200 |
| Message-ID | <87msofz0yf.fsf@eder.anydns.info> |
| In reply to | #56435 |
On Fri 24 May 2024 at 12:24, Woozy Song <suzyw0ng@outlook.com> wrote:
> Woozy Song wrote:
>> So the atd supposedly will not start another job until load factor falls
>> below a limit. Different documentation gives the default as 0.8 or 1.5
>> Now I launch a job that uses 4 cores on a 6-core CPU. If I run top
>> command, I see four processes running close to 100%.
>> Now if I submit another job 10 seconds later, that starts thereby
>> overloading the CPU. Documentation suggests setting load limit to more
>> than n-1 for n CPU cores, but I think that is intended for single-thread
>> jobs. I have tried altering the load limit in atd.service file to all
>> sorts of values, but second job keeps starting while the first is flogging
>> the CPU. I check with 'ps -ef|grep atd' to see it is using the desired
>> load limit. I am aware that the load factor is an average, you can see it
>> changes slowly in top/htop/glances. So I also increase the delay between
>> jobs to 30 seconds, but still nothing works. So it looks like I have to
>> specify a time like 'now+60 minutes' when I submit, requiring some guess
>> how long first job runs. I know I can install a proper job scheduler such
>> as Some Grid Engine, but that is more work.
>> This is on Debian 11, by the way.
>
> I found the trick: you have to add '-q B' to command, then load-limit rule
> applies (it behaves like batch command instead of at). Otherwise it uses
> default queue 'a' that only uses time without load limit.

I think it is '-q b', the lowercase letter b, for the batch queue (a is for at). The other letters are not used by default and just serve to indicate niceness.

Andreas

--
ceterum censeo redmondinem esse delendam
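The two spellings in this subthread need not conflict: the at(1) man page says that a job submitted to a queue named with an uppercase letter is treated as if it had been submitted to batch at the time of the job, while lowercase 'b' is the batch queue itself. A minimal sketch of a submission helper follows. The function name `submit_batch`, the `AT` variable, and the script path in the usage note are made up for illustration; only the `-q`, `-f` and `now` arguments come from at(1).

```shell
# AT names the at binary; it is a variable only so that the call can be
# redirected to a stub while experimenting. (Hypothetical convention.)
AT=${AT:-at}

# submit_batch SCRIPT
# Queue SCRIPT on the batch queue 'b', so that atd applies its load
# limit before starting it. Per the man page this should behave like
# running `batch` with the script on standard input.
submit_batch() {
    "$AT" -q b -f "$1" now
}
```

Usage would be something like `submit_batch ./job1.sh` followed by `submit_batch ./job2.sh`. The second job should then wait in queue 'b' until the load average drops below the limit that atd was started with.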
| From | Rich <rich@example.invalid> |
|---|---|
| Date | 2024-05-24 16:18 +0000 |
| Message-ID | <v2qekd$2d6e3$2@dont-email.me> |
| In reply to | #56433 |
In comp.os.linux.misc Woozy Song <suzyw0ng@outlook.com> wrote:
> So the atd supposedly will not start another job until load factor falls
> below a limit. Different documentation gives the default as 0.8 or 1.5
> Now I launch a job that uses 4 cores on a 6-core CPU. If I run top
> command, I see four processes running close to 100%.
> Now if I submit another job 10 seconds later, that starts thereby
> overloading the CPU. Documentation suggests setting load limit to more
> than n-1 for n CPU cores, but I think that is intended for single-thread
> jobs. I have tried altering the load limit in atd.service file to all
> sorts of values, but second job keeps starting while the first is
> flogging the CPU. I check with 'ps -ef|grep atd' to see it is using the
> desired load limit. I am aware that the load factor is an average, you
> can see it changes slowly in top/htop/glances. So I also increase the
> delay between jobs to 30 seconds, but still nothing works. So it looks
> like I have to specify a time like 'now+60 minutes' when I submit,
> requiring some guess how long first job runs. I know I can install a
> proper job scheduler such as Some Grid Engine, but that is more work.
> This is on Debian 11, by the way.

An alternative to using at and batch (batch is what observes the load limit, by the way) is to install Task Spooler and use it for 'background jobs'. You can tell it to run jobs sequentially, or at most X in parallel (you get to pick X).

https://viric.name/soft/ts/

You can also submit jobs that "depend upon" other jobs, so that the dependent job won't run until the "parent" completes successfully.
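A rough sketch of how that could look with Task Spooler. The assumptions here are that the binary is installed as `tsp` (the name used by the Debian package; upstream calls it `ts`), that `-S N` sets the number of jobs allowed to run at once, that queueing a job prints its id, and that `-D ID` makes a job depend on job ID. These are as I understand the ts man page, so check `tsp -h` on your system. The function names and the `TS` variable are made up for illustration.

```shell
# TS names the Task Spooler client; a variable only so that the calls
# can be pointed at a stub while experimenting. (Hypothetical convention.)
TS=${TS:-tsp}

# set_slots N
# Allow at most N queued jobs to run simultaneously (1 = sequential).
set_slots() {
    "$TS" -S "$1"
}

# queue_chain PARENT CHILD
# Queue PARENT, capture the job id the client prints, then queue CHILD
# so that it depends on that job.
queue_chain() {
    parent_id=$("$TS" "$1") || return 1
    "$TS" -D "$parent_id" "$2"
}
```

For the case in the original post, `set_slots 1` followed by two plain submissions should be enough to keep the second 4-core job from starting until the first has finished, with no guess needed about its run time.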