Tricking a shell script to show tar progress in GitLab jobs
Hello, Internet! In this short post I’d like to share a small trick I used to solve a problem some time ago.
In brief, I’ll show how to “trick” a script into using a modified tar executable with an additional parameter.
If that sounds interesting - this post is for you.
The problem
This was a very specific issue, so let me first describe the setup. I had a server where a large amount of data was stored (not on the server itself, but on an attached network volume). Once a day, a CI job in GitLab triggered a custom backup script that handled this data. The script did a number of things, but at some point it called tar like this:
tar -cf ${backup_tars_path}/${backup_name}.tar -C $backups_path .
It worked quite well, except for one major issue. The tar command ran silently, and because there was a huge amount of data to back up (sometimes taking 2-3, even 4 hours), nothing was written to the job log. As a result, GitLab terminated the job, considering it stuck due to the lack of log activity.
The obvious solution would be to modify the script - but that wasn’t an option. Due to various restrictions, I wasn’t allowed to change the script itself. I also wasn’t allowed to modify the GitLab runner’s settings.
That’s why I had to find…
The solution
Since I couldn’t change the script itself, the only option was to adjust the environment it was running in. So that’s exactly what I did.
First, I created a small wrapper script like this:
#!/bin/bash
exec /usr/bin/tar --checkpoint=1000 --checkpoint-action=exec='echo Backup in progress...' "$@"
Then I saved it as a file named tar in a directory under the home directory of the user that runs the backup job (/home/backup/apps in my case) and made it executable:
chmod +x /home/backup/apps/tar
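The key part of this wrapper is these two options:
--checkpoint=1000 --checkpoint-action=exec='echo Backup in progress...'
These flags tell tar to execute a command at every 1000th record, in my case printing Backup in progress... each time. Note that a record is a chunk of archive data, not a file: with GNU tar's default blocking factor of 20, one record is 10 KiB, so the message appears roughly once per 10 MB written. This feature is explained in detail in the official documentation.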
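To see the checkpoint behaviour in isolation, here is a minimal sketch that can be run on throwaway data. It assumes GNU tar is the tar on PATH. The temporary directory, the 5 MiB file size and the checkpoint interval of 100 are arbitrary values chosen for the demo, not part of the original setup.

```shell
#!/bin/bash
# Sketch: observe GNU tar checkpoints on throwaway data (assumes GNU tar).
set -euo pipefail

workdir=$(mktemp -d)                 # temporary scratch area, removed on exit
trap 'rm -rf "$workdir"' EXIT

mkdir "$workdir/data"
# One 5 MiB file: with the default 10 KiB record size that is roughly 512 records.
head -c $((5 * 1024 * 1024)) /dev/zero > "$workdir/data/blob.bin"

# Run the action every 100 records and capture whatever it prints to stdout.
progress=$(tar --checkpoint=100 \
               --checkpoint-action=exec='echo Backup in progress...' \
               -cf "$workdir/out.tar" -C "$workdir/data" .)

# Count the progress lines; "|| true" keeps the script alive if there are none.
progress_lines=$(printf '%s\n' "$progress" | grep -c 'Backup in progress' || true)
echo "checkpoint messages: $progress_lines"
```

The number of messages depends on the archive size rather than on the number of files, which is why a single large file is enough to trigger several of them.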
And finally, I added /home/backup/apps to the PATH environment variable, but only for the backup process, so no other users or processes were affected:
PATH=/home/backup/apps:$PATH /path/to/original/backup-script
And that’s it! Now, when the backup script runs, it uses the wrapper tar from /home/backup/apps instead of the default /usr/bin/tar. Effectively, this means that the command executed by the backup script becomes:
/usr/bin/tar --checkpoint=1000 --checkpoint-action=exec='echo Backup in progress...' -cf ${backup_tars_path}/${backup_name}.tar -C $backups_path .
As a result, the GitLab job log periodically displays Backup in progress..., keeping GitLab happy with ongoing output. And me happy as well.
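If you want to convince yourself that the PATH prefix really shadows the system tar only for the one command it is attached to, a small check like the one below may help. It is a sketch, not the setup from my server: the temporary directory is a hypothetical stand-in for /home/backup/apps, and the real tar location is looked up rather than assumed to be /usr/bin/tar.

```shell
#!/bin/bash
# Sketch: confirm that a PATH prefix makes a wrapper shadow the real tar.
set -euo pipefail

bindir=$(mktemp -d)            # hypothetical stand-in for /home/backup/apps
trap 'rm -rf "$bindir"' EXIT

real_tar=$(command -v tar)     # resolved before PATH is touched

# Wrapper that forwards every argument to the real tar, as in the post.
cat > "$bindir/tar" <<EOF
#!/bin/bash
exec "$real_tar" --checkpoint=1000 --checkpoint-action=exec='echo Backup in progress...' "\$@"
EOF
chmod +x "$bindir/tar"

# The prefix applies only to this one child process, not to the calling shell.
resolved=$(PATH="$bindir:$PATH" bash -c 'command -v tar')
after=$(command -v tar)

echo "inside the prefixed command: $resolved"
echo "in the calling shell:        $after"
```

The first line should point at the wrapper and the second at the original tar, which is the whole point of setting PATH per command instead of exporting it.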
And that’s all from me today. See you next time!