Installing Julia Packages Locally
Where possible, we recommend installing Julia packages in your home directory. The example below demonstrates how to do this on Andromeda.
1. Start an interactive session and load Julia
[johnchris@andromeda ~]$ interactive
Executing: srun --pty -N1 -n1 -c4 --mem=8g -pinteractive -t0-04:00:00 /bin/bash
Press any key to continue or ctrl+c to abort.
srun: job 710479 queued and waiting for resources
srun: job 710479 has been allocated resources
cpu-bind=MASK - c049, task 0 0 [3394141]: mask 0x888800000000 set
[johnchris@c049 ~]$ module load julia
2. Create a directory for local Julia packages
Here we create a directory called Julia_lib in the home directory, then launch Julia:
[johnchris@c049 ~]$ cd ~
[johnchris@c049 ~]$ mkdir Julia_lib
[johnchris@c049 ~]$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.9 (2025-03-10)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
3. Activate the new package environment
In Julia, use Pkg.activate() to tell Julia to use the new directory for packages:
julia> import Pkg
julia> Pkg.activate("/home/johnchris/Julia_lib")
Activating new project at `~/Julia_lib`
Replace /home/johnchris with your actual home path.
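If you prefer not to hard-code the path, Julia's built-in homedir() can construct it for you (a small equivalent sketch):
julia> import Pkg
julia> Pkg.activate(joinpath(homedir(), "Julia_lib"))  # resolves to /home/<your-user>/Julia_lib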
4. Install and verify packages
Here, we install two packages:
julia> Pkg.add(name="ITensors", version="0.3.52")
julia> Pkg.add("StatsBase")
- Use Pkg.add(name="PackageName", version="x.y.z") to install a specific version.
- Use Pkg.add("PackageName") to install the default compatible version.
- Specifying the version is recommended to avoid unexpected compatibility issues and to ensure reproducibility.
Once the packages are installed, verify the installation with:
julia> Pkg.status()
Status `~/Julia_lib/Project.toml`
⌃ [9136182c] ITensors v0.3.52
[2913bbd2] StatsBase v0.34.6
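You can also confirm that a package loads from the new environment before exiting (a hypothetical session; mean is re-exported by StatsBase):
julia> using StatsBase
julia> mean([1, 2, 3])
2.0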
When the installation is complete, you can exit Julia:
julia> exit()
The installed packages will be stored in your home directory.
In future Julia sessions, remember to activate the corresponding environment before using these packages.
To do this, add the following two lines at the very top of your .jl file, before any other using or import statements:
import Pkg
Pkg.activate("/home/johnchris/Julia_lib")
Only after activation will Julia know to use the packages from that environment.
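For example, a script that uses StatsBase from this environment might begin like this (a minimal sketch; the file name and the statistic computed are illustrative, and the path should be replaced with your own):
# my_analysis.jl (hypothetical example)
import Pkg
Pkg.activate("/home/johnchris/Julia_lib")  # activate the local environment first

using StatsBase  # now resolved from ~/Julia_lib

data = rand(100)
println("mean = ", mean(data))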
Running Multiple Julia Jobs with Parameters (Multiple Submissions)
1. Julia Script (log_ab.jl)
Suppose we want to compute log(a * b) for a range of values: a = 1, 2, 3, 4 and b = 2, 4, 6, 8.
Create a Julia script that reads a and b from command-line arguments:
a = parse(Float64, ARGS[1]) # 1,2,3,4
b = parse(Float64, ARGS[2]) # 2,4,6,8
println(a, ",", b, ",", log(a*b))
2. Single job submission (single_job.sl)
Submit the script via a Slurm wrapper that takes two arguments (a and b):
#!/bin/bash -l
#SBATCH --job-name=log_ab
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=4G
#SBATCH --time=01:00:00
#SBATCH --partition=short
module load julia
julia ../log_ab.jl $1 $2
Here, $1 and $2 are placeholders for the values of a and b, passed in when sbatch is called.
Example call (single run):
To submit one job (e.g. a=3 and b=4):
[johnchris@c049 test]$ sbatch single_job.sl 3 4
This will compute log(3*4). The result appears in the job's Slurm output file (slurm-<jobid>.out by default).
Note: The script above calls ../log_ab.jl. Place single_job.sl inside a subdirectory and log_ab.jl in its parent, or adjust the path accordingly.
3. Wrapper Script for All Jobs (multiple_submission.sl)
To automate all 16 combinations of a and b:
#!/bin/bash -l
#SBATCH --job-name=multiple_submission
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G
#SBATCH --time=00:10:00
#SBATCH --partition=short
for a in $(seq 1 1 4)
do
    for b in $(seq 2 2 8)
    do
        mkdir "a${a}b${b}"
        cd "a${a}b${b}"
        sbatch ../single_job.sl $a $b
        cd ..
    done
done
Each job is submitted from its own subdirectory (a1b2, a1b4, ...), which keeps each run's output organized. Here, seq 1 1 4 produces 1 2 3 4 and seq 2 2 8 produces 2 4 6 8, giving the 16 combinations.
4. Running the Batch
Make sure all the required files (multiple_submission.sl, single_job.sl, and log_ab.jl) are in the same directory (for example, the test folder). Then run:
[johnchris@c049 test]$ sbatch multiple_submission.sl
This will submit a batch of jobs corresponding to all combinations of a and b.
5. Monitoring the Jobs
You can check the queue with:
[johnchris@c049 test]$ squeue -u johnchris
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
710623 short log_ab johnchris PD 0:00 1 (Priority)
710622 short log_ab johnchris PD 0:00 1 (Priority)
...
710608 short log_ab johnchris PD 0:00 1 (Priority)
Julia Parallel Computing
Running Multithreading Tasks Using Julia Threads
When you need to run multiple tasks in parallel with shared memory or communication between them, Julia’s multithreading is a simple and efficient solution.
This approach is suitable when your job runs on a single node with one task (--nodes=1, --ntasks=1) but uses multiple CPU cores.
1. Example: Generating Random Numbers in Parallel and Summing Them
Suppose we want to generate 10 random numbers between 0 and 1 in parallel, and then compute their sum.
The following Julia script (sum_of_random_num.jl) defines this task:
println("Number of threads: ", Threads.nthreads())
@time begin
num = 10
test = zeros(num)
Threads.@threads for i=1:num
sleep(1) # simulate a time-consuming task
test[i] = rand()
end
println(sum(test))
end
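A note on thread safety: each iteration above writes to its own element test[i], so no two threads ever touch the same memory location. Accumulating into a shared variable inside the loop, by contrast, would be a data race; a minimal sketch of the safe alternative using an atomic (not part of the original script) is:
total = Threads.Atomic{Float64}(0.0)
Threads.@threads for i in 1:10
    # total += rand() on a plain shared variable would be a data race
    Threads.atomic_add!(total, rand())  # atomic update is safe across threads
end
println(total[])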
2. Slurm Script for Running the Multithreaded Job
To execute the script using 10 threads, configure Slurm as follows (sum_of_random_num.sl):
#!/bin/bash -l
#SBATCH --job-name=sum_of_random_num
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10 # Request 10 CPUs for threading
#SBATCH --mem-per-cpu=4G
#SBATCH --time=01:00:00
#SBATCH --partition=short
module load julia
export JULIA_NUM_THREADS=$SLURM_CPUS_PER_TASK # Set number of threads for Julia
julia sum_of_random_num.jl
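Equivalently, you can pass the thread count on the julia command line instead of setting the environment variable:
julia --threads $SLURM_CPUS_PER_TASK sum_of_random_num.jl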
3. Running the Job
Make sure both sum_of_random_num.sl and sum_of_random_num.jl are in the same directory. Then, submit the job with:
[johnchris@c049 test]$ sbatch sum_of_random_num.sl
4. Example Output
In the output, we get:
Number of threads: 10
4.847715876885147
1.140287 seconds (97.47 k allocations: 7.262 MiB, 33.40% compilation time)
This output confirms that:
- 10 threads were launched
- 10 random numbers were generated in parallel (note the ~1 second runtime despite sleep(1) in each thread)
- Their sum was correctly computed
This is a basic but effective demonstration of multithreading with shared memory in Julia.
Running Distributed and Hybrid Tasks
When your calculations need to go beyond multithreading, you can distribute multiple Julia processes across workers on different nodes, each with its own memory space. In addition, each worker can use multithreading over multiple CPU cores, combining the two in a hybrid approach to maximize efficiency.
1. Example: Julia Scripts and Different Parallelization Methods
The following Julia script (example.jl) demonstrates different parallelization methods for computing the sum of an array of x^2 values.
(a) To start, we load the packages needed for distributed parallelization on the cluster:
#!/usr/bin/env julia
using Distributed
using Base.Threads
using SlurmClusterManager
# Launch workers across nodes via SlurmManager
addprocs(SlurmManager())
@info "Master process: $(myid()), workers: $(workers()), threads per worker: $(Threads.nthreads())"
Here, the Distributed package provides Julia's standard distributed-computing framework, and addprocs(SlurmManager()) adds one Julia worker process per Slurm task, spread across the allocated nodes.
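Note that SlurmClusterManager is a registered package and must be installed into the environment you run with (for example, following the local-installation steps above):
julia> import Pkg
julia> Pkg.add("SlurmClusterManager")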
(b) Then we define expensive_function, which computes x^2, and threaded_sum, a multithreaded function that sums expensive_function over an array and is used later for hybrid parallelization:
@everywhere function expensive_function(x)
    sleep(0.1)  # simulate heavy work
    return x^2
end

@everywhere function threaded_sum(arr)
    s = Threads.Atomic{Int64}(0)
    Threads.@threads for x in arr
        Threads.atomic_add!(s, expensive_function(x))
    end
    return s[]
end
Here, the @everywhere macro is essential: it evaluates the definitions on the master and on every worker, so all processes have their own copy of the functions.
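As a quick illustration (a hypothetical session, separate from example.jl), a function defined without @everywhere exists only on the master, and calling it on a worker fails:
using Distributed
addprocs(2)                       # two local workers for the demo

f(x) = x + 1                      # defined on the master only
# fetch(@spawnat 2 f(1))          # would error: f is not defined on worker 2

@everywhere g(x) = x + 1          # defined on the master and all workers
println(fetch(@spawnat 2 g(1)))   # prints 2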
(c) Finally, we define four ways to compute the same sum: pmap maps the function over the workers, @distributed performs a parallel reduction, Distributed.@spawn creates one remote task per element, and the hybrid method splits the array into one chunk per worker so that each worker sums its chunk with threads:
function sum_with_pmap(n)
    results = pmap(expensive_function, 1:n)
    return sum(results)
end

function sum_with_distributed(n)
    return @distributed (+) for i in 1:n
        expensive_function(i)
    end
end

function sum_with_spawn(n)
    tasks = [Distributed.@spawn expensive_function(i) for i in 1:n]
    results = fetch.(tasks)
    return sum(results)
end

function sum_with_hybrid(n)
    arr = 1:n
    nwrk = max(nworkers(), 1)
    parts = Iterators.partition(1:length(arr), cld(length(arr), nwrk))
    chunks = [arr[first(p):last(p)] for p in parts]
    results = pmap(threaded_sum, chunks)
    return sum(results)
end
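If you want to try these four functions without a Slurm allocation, you can replace addprocs(SlurmManager()) with a plain local addprocs call (a minimal sketch under that assumption):
using Distributed
addprocs(2)  # two local workers instead of SlurmManager()

@everywhere function expensive_function(x)
    sleep(0.01)  # lighter simulated work for a local test
    return x^2
end

println(sum(pmap(expensive_function, 1:10)))  # 385 = 1^2 + 2^2 + ... + 10^2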
2. Slurm Script for Running the Distributed Job
Use the following Slurm script and experiment with different parameters (run_example.sl). With --nodes=2 and --ntasks-per-node=2, SlurmManager launches 2*2 = 4 workers, each allowed --cpus-per-task=4 threads:
#!/bin/bash
#SBATCH --job-name=julia-demo
#SBATCH --output=julia-demo-%j.out
#SBATCH --error=julia-demo-%j.err
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=4
#SBATCH --time=00:10:00
module load julia
# Export number of threads to Julia runtime
export JULIA_NUM_THREADS=$SLURM_CPUS_PER_TASK
# Prevent CPU binding conflicts with Julia's process management
export SLURM_CPU_BIND=none
julia example.jl
3. Running the Example
Here is an example driver that compares the running times of the different functions (add it at the bottom of example.jl):
if myid() == 1
    println("Number of workers: ", nworkers())
    println("Number of threads per worker: ", Threads.nthreads())
    n = 200  # size of computation
    @time s1 = sum_with_pmap(n)
    println("Result: ", s1)
    @time s2 = sum_with_distributed(n)
    println("Result: ", s2)
    @time s3 = sum_with_spawn(n)
    println("Result: ", s3)
    @time s4 = sum_with_hybrid(n)
    println("Result: ", s4)
    println("All sums equal: ", s1 == s2 == s3 == s4)
end
As in the previous parts of the tutorial, submit the job:
sbatch run_example.sl
In the output, we get:
Number of workers: 4
Number of threads per worker: 4
6.008759 seconds (2.15 M allocations: 146.670 MiB, 12.86% compilation time)
Result: 2686700
7.427056 seconds (480.65 k allocations: 32.088 MiB, 8.08% compilation time)
Result: 2686700
1.216503 seconds (360.94 k allocations: 24.434 MiB, 74.88% compilation time)
Result: 2686700
1.926931 seconds (787.02 k allocations: 53.600 MiB, 20.51% compilation time)
Result: 2686700
All sums equal: true
This confirms that we are running 2*2 = 4 workers with 4 threads per worker. The timings depend on the specific functions and the cluster setup.