Parallelized for loop in Bash
I am using a Bash script to execute a Python script multiple times. To speed up execution, I would like to run these (independent) processes in parallel. The code below does so:

    #!/usr/bin/env bash

    script="path_to_python_script"
    N=16 # number of processors
    mkdir -p data/

    for i in `seq 1 1 100`; do
        for j in {1..100}; do
        ((q=q%N)); ((q++==0)) && wait
        if [ -e data/file_$i-$j.txt ]
        then
        echo "data/file_$i-$j.txt exists"
        else
        ($script -args_1 $i > data/file_$i-$j.txt ;
        $script -args_1 $i -args_2 value -args_3 value >> data/file_$i-$j.txt) &
        fi
        done
    done

However, I am wondering whether this code follows common best practices for parallelizing for loops in Bash. Are there ways to improve its efficiency?

bash concurrency iteration multiprocessing
asked 8 hours ago by user213544 (new contributor); edited 5 hours ago by 200_success
1 Answer
Some suggestions:

- The trailing slash in the `mkdir` command is redundant.
- `$(…)` is preferred over backticks for command substitution.
- Why use `seq` in only one of the loops? Both loops iterate over the same range, so you might as well use `{1..100}` in both places.
- Semicolons are unnecessary in the vast majority of cases. Simply use a newline to achieve the same separation between commands.
- Use More Quotes™
- `set -o errexit -o noclobber -o nounset` at the start of the script will be helpful. It'll exit the script instead of overwriting any files, for example, so you can get rid of the inner `if` statement if it's OK that the script stops when the file exists.
- `[[` is preferred over `[`.
- The whole exercise is probably easier to achieve with some standard tool like GNU parallel. Currently the script starts `N` commands, then waits for all of them to finish before starting any more. Unless the processes take very similar amounts of time, this is going to waste a lot of time waiting.
- `N` (or, for example, `processors` for readability) should be determined dynamically, using for example `nproc --all`, rather than hardcoded.
- If you're worried about speed you should probably not create a subshell for your two script commands. `{` and `}` will group commands without creating a subshell.
- For the same reason you probably want to do a single redirection like `{ "$script" … && "$script" …; } > "data/file_$i-$j.txt"`
- Since you're "only" counting to 10,000 you don't need to reset `q` every time. You can for example set `process_count=0` outside the outer loop and check the modulo in a readable way such as:

      if (( process_count % processors == 0 ))
      then
          wait
      fi

- The inner code (from the line starting with `((q=q%N))`) should be indented one more time.
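Several of the suggestions above can be combined into one loop. The following is only a minimal sketch under stated assumptions, not the answer author's code: `first_pass`, `second_pass` and `run_all` are hypothetical names, the two stand-in functions replace the real `"$script"` invocations, and `wait -n` (which waits for any single job) needs Bash 4.3 or newer.

```shell
#!/usr/bin/env bash

# Hypothetical stand-ins for the two "$script" invocations in the question.
first_pass()  { printf 'first pass %s\n' "$1"; }
second_pass() { printf 'second pass %s\n' "$1"; }

# Run both passes for every (i, j) pair, keeping at most max_jobs in flight.
# Arguments: max_jobs, output directory, upper bound for i, upper bound for j.
run_all() {
    local max_jobs=$1 out_dir=$2 last_i=$3 last_j=$4
    local i j out running=0
    mkdir -p "$out_dir"
    for ((i = 1; i <= last_i; i++)); do
        for ((j = 1; j <= last_j; j++)); do
            out="$out_dir/file_$i-$j.txt"
            if [[ -e $out ]]; then
                echo "$out exists"
                continue
            fi
            if ((running >= max_jobs)); then
                # Wait for any one job to finish (assumes Bash 4.3 or newer).
                wait -n || echo "a job failed" >&2
                running=$((running - 1))
            fi
            # One group, one redirection; & still runs the group in a
            # background process of its own.
            { first_pass "$i" && second_pass "$i"; } > "$out" &
            running=$((running + 1))
        done
    done
    wait  # let the remaining jobs finish
}
```

A call such as `run_all "$(nproc --all)" data 100 100` would mirror the loop bounds of the question, assuming `nproc` from GNU coreutils is available. Compared with waiting for a whole batch of `N` jobs, this starts a new job as soon as any slot frees up, which should help when run times vary.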
answered 8 hours ago by l0b0 (edited 8 hours ago)
`[` is more portable and more consistent than `[[`. So your preference is certainly arguable.
– Toby Speight, 6 hours ago
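To make the trade-off in this comment a little more concrete, here is a small sketch with hypothetical function names. It shows one thing `[[` offers in Bash (glob pattern matching without word splitting) next to a portable way to get the same result in plain POSIX `sh`, where `[[` is not available.

```shell
#!/usr/bin/env bash

# Bash-only: [[ ]] matches glob patterns and does not word-split $1.
is_data_file_bash() {
    [[ $1 == data/file_*.txt ]]
}

# Portable alternative: case performs the same pattern match in POSIX sh.
is_data_file_posix() {
    case $1 in
        data/file_*.txt) return 0 ;;
        *) return 1 ;;
    esac
}
```

Both functions accept a name such as `data/file_1-2.txt` and reject `other.txt`; which form to prefer depends on whether the script has to run under shells other than Bash.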