This guide explains how to set up and use rclone to sync data between HPC clusters and Google Drive.

Rclone works with a wide variety of cloud services, including Box, but this example deals specifically with Google Drive.

Full documentation on using rclone can be found here: https://rclone.org/

We recommend using rclone with your ISU Google account, which provides unlimited space. You can also use a personal account, but it will not have unlimited space.

To use rclone you must have a CyMail account and have accessed it at least once to initialize it in the Google cloud.

To set up rclone, log in to a cluster DTN node (condodtn, novadtn, hpc-class-dtn) with ssh, load the rclone module with "module load rclone", and then run "rclone config".
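Before starting the configuration dialog, it can help to confirm that the module actually put rclone on your PATH. The following is a minimal sketch to run on the DTN node after logging in; the helper name rclone_ready is hypothetical, and the guards are only there so the snippet does nothing harmful on a machine without environment modules.

```shell
# "module" only exists where environment modules are installed, so guard the call.
if command -v module >/dev/null 2>&1; then
    module load rclone
fi

# rclone_ready (hypothetical helper): succeed if the rclone command is on PATH.
rclone_ready() {
    command -v rclone >/dev/null 2>&1
}

if rclone_ready; then
    rclone version        # show which version the module provides
    rclone config file    # show where the configuration file is (or will be) stored
else
    echo "rclone not found; check 'module avail rclone'"
fi
```

If rclone is reported as not found, the module did not load in this shell and "rclone config" will fail as well.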

Then run through the setup dialog as shown below.

$ rclone config

2019/02/12 16:28:28 NOTICE: Config file “/home/netid/.config/rclone/rclone.conf” not found – using defaults

No remotes found – make a new one

n) New remote

s) Set configuration password

q) Quit config

n/s/q> n

The “name” can be whatever you like. You will need to type it in every rclone command, so keep it short and memorable.

name> gdrive

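At the time this documentation was written, Google Drive was option 11. The numbering can differ between rclone versions, so check the list before answering, or type the string value drive instead of the number.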

Type of storage to configure.

Enter a string value. Press Enter for the default (“”).

Choose a number from below, or type in your own value

 1 / Alias for a existing remote

   \ “alias”

 2 / Amazon Drive

   \ “amazon cloud drive”

 3 / Amazon S3 Compliant Storage Providers (AWS, Ceph, Dreamhost, IBM COS, Minio)

   \ “s3”

 4 / Backblaze B2

   \ “b2”

 5 / Box

   \ “box”

 6 / Cache a remote

   \ “cache”

 7 / Dropbox

   \ “dropbox”

 8 / Encrypt/Decrypt a remote

   \ “crypt”

 9 / FTP Connection

   \ “ftp”

10 / Google Cloud Storage (this is not Google Drive)

   \ “google cloud storage”

11 / Google Drive

   \ “drive”

12 / Hubic

   \ “hubic”

13 / JottaCloud

   \ “jottacloud”

14 / Local Disk

   \ “local”

15 / Mega

   \ “mega”

16 / Microsoft Azure Blob Storage

   \ “azureblob”

17 / Microsoft OneDrive

   \ “onedrive”

18 / OpenDrive

   \ “opendrive”

19 / Openstack Swift (Rackspace Cloud Files, Memset Memstore, OVH)

   \ “swift”

20 / Pcloud

   \ “pcloud”

21 / QingCloud Object Storage

   \ “qingstor”

22 / SSH/SFTP Connection

   \ “sftp”

23 / Webdav

   \ “webdav”

24 / Yandex Disk

   \ “yandex”

25 / http Connection

   \ “http”

Storage> 11

The next 2 parameters should be left blank

Google Application Client Id

Leave blank normally.

Enter a string value. Press Enter for the default (“”).

client_id> 

Google Application Client Secret

Leave blank normally.

Enter a string value. Press Enter for the default (“”).

client_secret> 

 

Choose scope 1 (full access) if you wish to write to the drive.

Scope that rclone should use when requesting access from drive.

Enter a string value. Press Enter for the default (“”).

Choose a number from below, or type in your own value

 1 / Full access all files, excluding Application Data Folder.

   \ “drive”

 2 / Read-only access to file metadata and file contents.

   \ “drive.readonly”

   / Access to files created by rclone only.

 3 | These are visible in the drive website.

   | File authorization is revoked when the user deauthorizes the app.

   \ “drive.file”

   / Allows read and write access to the Application Data folder.

 4 | This is not visible in the drive website.

   \ “drive.appfolder”

   / Allows read-only access to file metadata but

 5 | does not allow any access to read or download file content.

   \ “drive.metadata.readonly”

scope> 1

Leave the next 2 parameters blank

ID of the root folder

Leave blank normally.

Fill in to access “Computers” folders. (see docs).

Enter a string value. Press Enter for the default (“”).

root_folder_id> 

Service Account Credentials JSON file path 

Leave blank normally.

Needed only if you want use SA instead of interactive login.

Enter a string value. Press Enter for the default (“”).

service_account_file> 

Enter n for advanced config

Edit advanced config? (y/n)

y) Yes

n) No

y/n> n

Enter n for auto config, because you are working on a remote machine.

Remote config

Use auto config?

 * Say Y if not sure

 * Say N if you are working on a remote or headless machine or Y didn’t work

y) Yes

n) No

y/n> n

If your browser doesn’t open automatically go to the following link: https://accounts.google.com/o/oauth2/auth?access_type=offline&client_id=3534534f44.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&response_type=code&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive&state=fgsdrgrvaeebwergrvsgbe5rd9e7

Log in and authorize rclone for access

Enter verification code>

At this point, open a browser on your local workstation and paste the entire URL into a private/incognito browser window.

Enter your ISU NetID as an email address on the Google sign-in page.

You should then get your usual ISU Okta SSO login.

After you have signed in successfully, you will be asked whether to allow rclone to access your account.

Click Allow.

Now enter the code shown on the next screen at the rclone config prompt.

Enter verification code> rgsgsgsrgsgreX06ZesfaevaefavsdgsrsrsrvsvrvsrrYff

Do not configure this as a team drive

Configure this as a team drive?

y) Yes

n) No

y/n> n

You will now get a summary. Type y if everything is OK.

--------------------

[gdrive]

type = drive

scope = drive

token = {"access_token":"asdfadsfasdfaeaefaseaefaefaefaeJ6asdfaefaewfaaefaevaeaefaewfawefawefeffvvrvrvrvr","token_type":"Bearer","refresh_token":"1asdfasdfasfaefaefaefafaafaefawesfaefaewf","expiry":"2019-02-12T17:31:49.265667681-06:00"}

--------------------

y) Yes this is OK

e) Edit this remote

d) Delete this remote

y/e/d> y

You will get another summary listing your remotes. Now type q to quit the config.

Current remotes:

 

Name                 Type

====                 ====

gdrive               drive

 

e) Edit existing remote

n) New remote

d) Delete remote

r) Rename remote

c) Copy remote

s) Set configuration password

q) Quit config

e/n/d/r/c/s/q> q
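After quitting the dialog, you can confirm that the remote was saved. The sketch below assumes the remote is named gdrive as in the dialog above; the helper name remote_configured and the RCLONE variable are hypothetical additions for illustration. It relies on "rclone listremotes", which prints one configured remote per line with a trailing colon.

```shell
# Command to run; defaults to rclone. (Hypothetical override hook.)
RCLONE="${RCLONE:-rclone}"

# remote_configured NAME (hypothetical helper)
# Succeeds if NAME is one of the remotes printed by "rclone listremotes".
remote_configured() {
    "$RCLONE" listremotes | grep -Fqx "$1:"
}
```

For example, run: remote_configured gdrive && echo "gdrive is ready"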

You can now use the remote as follows.

List directories in top level of your drive

rclone lsd gdrive:

List all the files in your drive

rclone ls gdrive:

To copy a local directory to a drive directory called backup

rclone copy /home/source gdrive:backup
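For larger transfers, one common pattern is to preview the copy, run it, and then verify it. The sketch below wraps those three steps in a hypothetical helper, backup_dir. The remote name gdrive and the paths come from the examples above; the RCLONE and REMOTE variables are assumptions added so the commands can be printed instead of run (for example with RCLONE=echo).

```shell
# Command and remote to use; both can be overridden before calling backup_dir.
RCLONE="${RCLONE:-rclone}"
REMOTE="${REMOTE:-gdrive}"

# backup_dir LOCAL_DIR REMOTE_DIR (hypothetical helper)
backup_dir() {
    src="$1"
    dest="${REMOTE}:$2"
    # Preview the transfer first; --dry-run makes no changes.
    "$RCLONE" copy --dry-run "$src" "$dest" || return 1
    # Do the real copy; files that are already present and unchanged are skipped.
    "$RCLONE" copy "$src" "$dest" || return 1
    # Compare source and destination and report any differences.
    "$RCLONE" check "$src" "$dest"
}
```

Calling backup_dir /home/source backup runs the same copy as the example above, with a preview before it and a check after it.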

 

Limitations

Google Drive applies rate limiting, which restricts rclone to transferring about 2 files per second. Individual files may be transferred much faster, at 100s of MBytes/s, but lots of small files can take a long time. Google Drive also limits us to 750 GB per person per day. If you exceed 750 GB you may be banned until the following day, which is a problem for large transfers or large numbers of files. If you are planning to transfer more than 750 GB, use the "--bwlimit 8.6M" option, which should keep you under the limit and allow arbitrarily large transfers. The M suffix matters: rclone reads a bare number such as 8.6 as KiB/s, which would be far slower than intended. Google Drive supports single files up to 5 TB. There are also some limits on filenames. The “Advanced Rclone” instructions show a possible way of dealing with these limitations.
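The arithmetic behind the bandwidth cap can be sketched as follows. This assumes the 750 GB quota is counted in decimal gigabytes, which is not stated above, and the helper name bwlimit_for_quota is hypothetical. It converts a daily quota into a sustained rate in MiB/s, the unit rclone uses for the M suffix.

```shell
# bwlimit_for_quota QUOTA_GB [SECONDS] (hypothetical helper)
# Print the sustained rate, as an rclone-style value in MiB/s, that moves
# QUOTA_GB decimal gigabytes in SECONDS seconds (default: one day).
bwlimit_for_quota() {
    awk -v gb="$1" -v secs="${2:-86400}" 'BEGIN {
        bytes_per_sec = gb * 1000 * 1000 * 1000 / secs
        printf "%.2fM\n", bytes_per_sec / (1024 * 1024)
    }'
}

# Example use (not run here):
#   rclone copy --bwlimit "$(bwlimit_for_quota 750)" /home/source gdrive:backup
```

Under that assumption, bwlimit_for_quota 750 prints 8.28M, a little more conservative than the 8.6 figure above. If long transfers still hit the daily quota, the lower value is the safer choice.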

More Commands

For full information on the rclone commands and their syntax see here: https://rclone.org/

Group Accounts

We recommend using group role accounts to store data long term so that your entire research group can access it in the future. Google Drive does have a “team drive” feature, but it is not currently enabled for Iowa State accounts.

Transferring ownership of files

If you store your data in an individual account, you may need to transfer your files to another user or your major professor when you leave ISU.

These are the Google instructions on doing so:

from https://support.google.com/drive/answer/2494892?hl=en

Transfer file ownership

You’re the owner by default for files that you create in Docs, Sheets, and Slides, or upload into Drive. But, you can transfer ownership of your Google files (Docs, Sheets, and Slides) and folders to anyone you’d like, as long as that person has a Google Account.

Note: If you use Google apps through work or school, you can’t transfer ownership to or receive ownership from someone else who is outside of your domain.

How to change owners

You can change who owns a file or folder in Drive.

  1. Go to Drive or a Docs, Sheets, or Slides home screen.
  2. Open the sharing box:
    • In Drive: Select the file or folder and click the share icon at the top.
    • In a Docs, Sheets, or Slides home screen: Open the file and click Share in the top-right corner of the file.
  3. If the new owner already has edit access, skip to Step 4. Otherwise, follow these steps:
    1. Type the email address of the new owner in the “Invite people” field
    2. Click Share & save.
  4. Click Advanced in the bottom-right corner of the sharing box.
  5. Click the drop-down menu next to the name of the person you want to own the file or folder.
  6. Select Is owner.
  7. Click Done.

You’ll have access to the file as an editor after you transfer ownership.

Things to consider before you transfer ownership

  • The things you’ll no longer be able to do once you transfer file ownership include:
    • Remove others from the file
    • Share with as many people as you like
    • Change visibility options
    • Allow your collaborators to change access privileges for others
    • Permanently delete something from Google Drive. After it’s deleted, no one can access it, including those it was shared with.
  • When you transfer ownership of a folder from yourself to another person, the new owner of the folder becomes an editor of the files in that folder. The original owners of the files remain the owners, and if the original owner deletes a file, it’ll be removed from the folder.
  • If your current Google Account is being deleted, transfer ownership of your files, folders, and Google files to another active account. Once the original account is deleted, you won’t be able to recover any of your files or folders from it.