How to Back Up DocuSign Documents via SFTP Using Mirror
One company uses DocuSign and stores a large number of documents there. At some point, a requirement came up to periodically download and back up all files.
The task sounded simple: connect via FTP or API, download everything, and store it somewhere safe. Unfortunately, DocuSign's options for backups are quite limited, and reality is a bit more complicated.
DocuSign does not provide many convenient ways to export or push data to external storage like S3. There is no simple full backup mechanism through their API either. The only supported method in this case was SCP/SFTP access.
That would still be manageable, but there is another limitation: authentication requires an SSH key protected with a password, so every connection prompts for it interactively. This makes fully automated backups slightly more complicated.
Another challenge is the amount of data. Downloading the full dataset every day could easily mean transferring tens or hundreds of gigabytes. That is obviously inefficient.
The goal of this approach is simple: avoid transferring the same data again and again. Instead of downloading everything every day, we can use a mirroring approach and only fetch new or changed files.
Mirroring with lftp
The following script uses lftp and its mirror functionality to synchronize remote files with a local directory.
$ cat backup-expect.sh
#!/bin/bash
...
USERHOST=11111111-2222-3333-4444-555555555555@sftp00.springcm.com
...
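# ssh flags: -a disables agent forwarding, -x disables X11 forwarding, -i selects the key file
# mirror flags: -c resumes interrupted transfers, -e deletes local files missing on the server,
# -p skips setting file permissions, -P5 transfers up to 5 files in parallel
lftp -c "set sftp:connect-program 'ssh -a -x -i /path/to/id_rsa'; connect sftp://$USERHOST; mirror -cep -P5 . /path/to/dst"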
...
# rotating process...
When running this script manually, the system will ask for the SSH key password. That is fine for manual usage, but not ideal for automation.
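Because mirror is called with -e, local files that no longer exist on the server are deleted, so it may be worth previewing a run before trusting it. The sketch below is only an illustration: build_mirror_cmd is a hypothetical helper that assembles the same lftp command string as the script above and optionally adds lftp's --dry-run option, which prints the planned actions instead of executing them. The paths are placeholders, as in the original script.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): builds the lftp
# command string for mirroring. The optional 4th argument is passed to mirror.
build_mirror_cmd() {
    local userhost=$1 key=$2 dst=$3 extra=${4:-}
    printf "set sftp:connect-program 'ssh -a -x -i %s'; connect sftp://%s; mirror -cep -P5 %s . %s" \
        "$key" "$userhost" "$extra" "$dst"
}

# Preview only: with --dry-run, mirror prints what it would do and changes nothing.
# lftp -c "$(build_mirror_cmd "$USERHOST" /path/to/id_rsa /path/to/dst --dry-run)"
```

Once the preview shows only the expected downloads and deletions, the same call without --dry-run performs the real mirror.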
Automating password input with Expect
To simulate user input we can use the expect tool. It allows us to automatically respond to prompts during script execution.
$ cat backup.sh
#!/usr/bin/expect
spawn /path/to/backup-expect.sh
expect "Password:"
send "MyPassword\r"
# the mirror itself can run for a long time, so wait for it to finish without a timeout
set timeout -1
expect eof
Now the process can be automated.
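The expect script above keeps the password in plain text inside the script. One possible alternative, sketched here under assumptions, is to store it in a separate file readable only by its owner and pass it through the environment. The function name load_passphrase, the variable BACKUP_KEY_PASSPHRASE and the file path are all hypothetical, and the permission check relies on GNU stat, so it assumes a Linux host.

```shell
#!/bin/bash
# Hypothetical wrapper: loads the key password from a file that only its
# owner can read and exports it for the expect script.
load_passphrase() {
    local file=$1 mode
    mode=$(stat -c '%a' "$file") || return 1   # GNU stat (assumption: Linux)
    if [ "$mode" != "600" ] && [ "$mode" != "400" ]; then
        echo "refusing to use $file: mode $mode is too permissive" >&2
        return 1
    fi
    BACKUP_KEY_PASSPHRASE=$(cat "$file")
    export BACKUP_KEY_PASSPHRASE
}

# Hypothetical usage:
#   load_passphrase /path/to/passphrase.txt && /path/to/backup.sh
# In backup.sh, the send line would then read the variable from Tcl's env array:
#   send "$env(BACKUP_KEY_PASSPHRASE)\r"
```

This keeps the secret out of the script and out of version control, although the variable is still visible to other processes of the same user, so it is a modest improvement rather than real secret management.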
A scheduler such as cron or an automation platform (I personally prefer Rundeck) runs backup.sh. The expect script starts backup-expect.sh, waits for the password prompt, and automatically provides the password required for the SSH key.
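When a scheduler starts the backup periodically, a long-running mirror could overlap with the next scheduled run. A minimal sketch of one way to guard against that is shown below; run_locked and the lock path are hypothetical names, and the approach relies on mkdir being atomic, so no extra tools are assumed.

```shell
#!/bin/bash
# Hypothetical guard: runs a command only if no other instance holds the lock.
# mkdir either creates the directory or fails, which makes it a simple lock.
run_locked() {
    local lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "another backup is still running, skipping" >&2
        return 75
    fi
    local status=0
    "$@" || status=$?
    rmdir "$lockdir"          # release the lock once the command has finished
    return "$status"
}

# Hypothetical usage from cron or Rundeck:
#   run_locked /tmp/docusign-backup.lock /path/to/backup.sh
```

If the process is killed before rmdir runs, the lock directory stays behind and has to be removed by hand, so this is a sketch rather than a hardened solution.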
The workflow looks like this:
backup.sh -> run -> backup-expect.sh -> run -> lftp auth -> enter password -> start mirroring...
With this approach only new or changed files are transferred, which makes periodic backups significantly faster and reduces network traffic.