Subject: Discussion group for the ARIA software
Re: [aria-discuss] Running Aria on multiple nodes on a cluster managed by Slurm
- From: "Benedikt Soeldner" <benedikt.soeldner AT tu-dortmund.de>
- To: bardiaux AT pasteur.fr
- Cc: aria-discuss AT services.cnrs.fr
- Subject: Re: [aria-discuss] Running Aria on multiple nodes on a cluster managed by Slurm
- Date: Mon, 3 May 2021 10:43:16 +0200
- Importance: Normal
Dear Benjamin,
thank you for the tip. I tried it out on Friday, but unfortunately the
following error occurred right at the beginning:
MESSAGE [Project]: Checking host list ...
WARNING [Job manager]: Command "sbatch --job-name=CNS_Aria_run_2021-04-30
--error=/.../cns_error_msg.txt --output=/.../cns_
output_msg.txt --partition=short --ntasks=1
--cpus-per-task=1" ... failed (connection failed or
unable to open temporary file.)
WARNING [Project]: At least one command could not be executed. Please check
your host list setup.
Could the reason be that the check_host.csh file in the temporary
directory doesn't look like a Slurm job script (in contrast to the
files refine.csh and refine_water.csh in the temporary directory)? If
so, does this file need to be modified, or is there another problem?
Best regards,
Benedikt
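
One possibly relevant detail: sbatch only accepts a batch script whose
first line starts with "#!" followed by an interpreter path. The sketch
below checks a script for that line and, if sbatch is available, submits
it with the resource options from the warning above. It is a diagnostic
sketch only: the has_shebang helper and the temporary test script are
hypothetical stand-ins and not part of Aria, and whether a missing "#!"
line is the actual cause of the failure is not confirmed.

```shell
#!/bin/bash
# Diagnostic sketch (assumption: a missing "#!" line may be what sbatch rejects).

# has_shebang FILE: succeed if the first line of FILE starts with "#!".
# Hypothetical helper, not part of Aria or Slurm.
has_shebang() {
    local first_line
    IFS= read -r first_line < "$1" || true
    [[ "$first_line" == '#!'* ]]
}

# Hypothetical stand-in for a script from Aria's temporary directory.
tmp_script=$(mktemp)
printf '%s\n' '#!/bin/csh -f' 'echo "host check ok"' > "$tmp_script"

if has_shebang "$tmp_script"; then
    echo "shebang present"
else
    echo "no shebang"
fi

# Submit only where sbatch exists; options taken from the warning message.
if command -v sbatch >/dev/null 2>&1; then
    sbatch --partition=short --ntasks=1 --cpus-per-task=1 "$tmp_script"
fi

rm -f "$tmp_script"
```

Pointing has_shebang at the real check_host.csh, refine.csh and
refine_water.csh would show whether the files differ in this respect.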
> Dear Benedikt,
>
> If your Slurm setup allows submission from the node where you're running
> Aria, you could use
>
> <host enabled="yes" command="sbatch --your.sbatch.options"
> executable="/.../programs/cns_solve_1.21/intel-x86_64bit-linux/bin/cns_solve"
> n_cpu="20" use_absolute_path="yes"/>
> </job_manager>
>
>
> This means that each CNS calculation will be submitted independently via
> sbatch and will therefore run in parallel.
>
> Best regards,
>
> Benjamin
>
>
> On 30/04/2021 11:19, Benedikt Soeldner wrote:
>> Dear Aria discussion group,
>>
>> for a few months now, I have been running my Aria calculations on a
>> cluster managed with the Slurm scheduling system. Unfortunately, I have
>> not yet found out how to run Aria, i.e., the CNS calculation processes,
>> on multiple nodes in parallel. Does any of you know if and how this
>> works?
>>
>> At the moment, I start my Aria calculations by submitting a job script,
>> which looks like the following:
>>
>> #!/bin/bash -l
>> #SBATCH --job-name=Aria-Project_run01
>> #SBATCH --output=Aria-Project_run01.out
>> #SBATCH --error=Aria-Project_run01.err
>> #SBATCH --partition=short
>> #SBATCH --ntasks=1
>> #SBATCH --cpus-per-task=20
>> #SBATCH --mem=8G
>> module purge
>> module load python/2.7.18
>> cd /.../Aria-Project
>> srun python -O /.../programs/aria2.3.2/aria2.py --output=run01msg.txt Aria-Project_run01.xml
>>
>> And in the Aria project file, the following lines are written:
>>
>> <job_manager default_command="csh -f">
>> <host enabled="yes" command="csh -f"
>> executable="/.../programs/cns_solve_1.21/intel-x86_64bit-linux/bin/cns_solve"
>> n_cpu="20" use_absolute_path="yes"/>
>> </job_manager>
>>
>> When I increase the number of nodes (#SBATCH --nodes=... in the job
>> script), the job still runs on just one node, as there is only one
>> initial task. I guess the lines above from the project file also need
>> to be modified somehow, and not only the Slurm job script. Do you know
>> what these two files should look like? Or do I need to modify other
>> files in addition?
>>
>> Thank you for your help!
>>
>> Best regards,
>>
>> Benedikt Söldner
>>
>> --------------------------------------
>> Benedikt Söldner (PhD student)
>> benedikt.soeldner AT tu-dortmund.de
>> Technical University Dortmund, Germany
>> Research group of Prof. Dr. Rasmus Linser
>>
>
>
> --
> ---------------------------------------------
> Dr Benjamin Bardiaux bardiaux AT pasteur.fr
> Unité de Bioinformatique Structurale
> CNRS UMR3528 - Institut Pasteur
> 25,28 rue du Docteur Roux 75015 Paris, France
> ---------------------------------------------
>