A1-to-A2 Data Migration

A computer in the HPC cluster is referred to as a node, and there are two types of nodes on Andromeda: login nodes and compute nodes.

Guide for copying data from Andromeda 1 to Andromeda 2

Before copying data it is important to be aware of a couple of things.

  • Much user and project data has already been copied to A2 under the a002:/migration/data directory, a temporary location used during the migration, and is currently refreshed from A1 daily. Please check there first to see whether your data has already been copied. The data there is read-only, so please don’t try to modify it or use it as part of a Slurm job. If your data is there, you can simply copy it from the appropriate migration subdirectory to your home directory or project directory (for example, using the “cp -pr” command). -p preserves permissions, ownership, and timestamps, and -r copies data recursively, including subdirectories.
  • To better manage data on A2, quotas limit how much data can be stored in each type of space. Each user’s home directory has a quota of 50GB, and each scratch and project directory has a quota of 1TB. If you believe your source data exceeds these limits, decide where it will go before you start copying. For instance, if your home directory on A1 holds 200GB, it won’t fit in your home directory on A2 because of the 50GB limit. However, you could copy that data to your project directory on A2, which has the 1TB limit. If you have any problems or questions related to this, please open a ticket here: http://bc.edu/researchhelp.
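The quota check described above can be sketched as a small shell helper. This is only an illustration: fits_quota is a hypothetical function, not a command provided on Andromeda, and it assumes quotas are counted in binary units, which may not match how the cluster accounts for usage. $HOME/my_data is a placeholder for your own directory.

```shell
# Sketch: check whether a directory fits under a quota before copying it.
# fits_quota is a hypothetical helper, not an Andromeda-provided command.
# Assumption: 1 GB is counted as 1024 * 1024 KB; cluster accounting may differ.
fits_quota() {
    dir=$1
    quota_gb=$2
    used_kb=$(du -sk "$dir" | cut -f1)    # total size of the directory in KB
    [ "$used_kb" -le $((quota_gb * 1024 * 1024)) ]
}

# Example: compare a directory against the 50GB home quota.
# SRC is a placeholder; set it to the directory you plan to copy.
SRC=${SRC:-$HOME/my_data}
if [ -d "$SRC" ] && fits_quota "$SRC" 50; then
    echo "$SRC fits within the 50GB home quota"
fi
```

Passing 1024 as the second argument runs the same check against the 1TB scratch or project quota.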

The rsync command can be used to easily copy data from A1 → A2. Here are instructions for doing so.

In the following example the user, represented by the environment variable $USER, will copy the directory called my_data, which is located under their home directory, to their home directory on a002 (Andromeda 2). The last character in the following command is a tilde “~”, which represents the home directory on the a002 remote host.

[l001 ~]$ rsync -avuz $HOME/my_data $USER@a002:~

Here is some more info about the command.

  • -a: Archive mode; copies directories recursively and preserves permissions, timestamps, and symbolic links.
  • -v: Verbose mode; shows you what files are being transferred.
  • -u: Ensures that newer files on the destination do not get overwritten.
  • -z: Compresses data during transfer, which can speed things up.
  • $HOME/my_data: The directory or file you want to copy.
    • Make sure to include a trailing slash if you want to copy the contents of the directory, not the directory itself.
  • $USER@a002: Your username and the hostname of the Andromeda 2 login node.
  • ~ (after the colon): The directory where you want to copy the data on the destination node. Here it is your home directory on a002, the same as writing $USER@a002:/home/$USER.
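The trailing-slash rule is easy to get wrong, so here is a small local demonstration, offered as a sketch rather than part of the migration procedure. It works entirely inside a throwaway directory and sends nothing to a002; the names demo, dest_a, and dest_b are made up for the example.

```shell
# Local demonstration in a temporary directory; nothing is sent to a002.
demo=$(mktemp -d)
mkdir -p "$demo/my_data" "$demo/dest_a" "$demo/dest_b"
echo "sample" > "$demo/my_data/file.txt"

if command -v rsync >/dev/null 2>&1; then
    # No trailing slash: the directory itself is copied,
    # creating dest_a/my_data/file.txt.
    rsync -a "$demo/my_data" "$demo/dest_a/"

    # Trailing slash: only the contents are copied,
    # creating dest_b/file.txt.
    rsync -a "$demo/my_data/" "$demo/dest_b/"

    # List the resulting files to compare the two layouts.
    find "$demo/dest_a" "$demo/dest_b" -type f
fi
```

Adding -n (dry run) to a real rsync command lists what would be transferred without copying anything, which is a cheap way to confirm the destination layout first.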

Note: If you are copying a large amount of data, consider running the rsync command within a screen or tmux session so that the transfer keeps running if your SSH connection drops.


Setting Up Passwordless SSH (optional)

Passwordless SSH lets you log in without typing your password every time, making rsync transfers more convenient.

1. Generate SSH Keys (only if you don’t have them already; to check, list the contents of $HOME/.ssh).
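The check for existing keys can be sketched as follows. has_ssh_key is a hypothetical helper written for this guide, and it only looks for the default ed25519 file names; keys of another type or with a custom name would not be detected.

```shell
# Sketch: report whether an ed25519 key pair already exists.
# has_ssh_key is a hypothetical helper; it takes the directory to inspect
# and defaults to $HOME/.ssh.
has_ssh_key() {
    key_dir=${1:-$HOME/.ssh}
    [ -f "$key_dir/id_ed25519" ] && [ -f "$key_dir/id_ed25519.pub" ]
}

if has_ssh_key; then
    echo "Found an existing ed25519 key pair; you can skip ssh-keygen."
else
    echo "No ed25519 key pair found; generate one with ssh-keygen."
fi
```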

On a001, run:

ssh-keygen -t ed25519

  • Press Enter to accept the default file location (~/.ssh/id_ed25519).
  • When prompted for a passphrase, you can either enter one (for extra security) or press Enter twice to leave it blank. 

2. Copy your public key to a002 (you will be prompted for your a002 password):

ssh-copy-id $USER@a002

3. Test the Connection:

ssh $USER@a002 hostname

  • If you set up everything correctly, the command should run without asking for your password.

It should return the following:

a002.m31.bc.edu
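For scripts, the same test can be run non-interactively. The sketch below uses a hypothetical helper, can_login_without_password; the ssh options BatchMode and ConnectTimeout are standard OpenSSH client options. Note that it also reports failure if your key has a passphrase and no ssh-agent is holding it, or if the host key for a002 has not been accepted yet.

```shell
# Sketch: check non-interactively whether key-based login works.
# can_login_without_password is a hypothetical helper.
# BatchMode=yes makes ssh fail instead of prompting for a password;
# ConnectTimeout=5 gives up after five seconds if the host is unreachable.
can_login_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

# Usage (on a001):
#   can_login_without_password "$USER@a002" && echo "passwordless SSH works"
```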
