Questions & Answers

How to capture error messages from a program that fails only outside the terminal?

On a Linux server, I have a script that works fine when I start it from the terminal but fails when it is started and then detached by another process. So there is probably a difference in the script's environment that needs fixing.

The trouble is that the other process, which integrates the script, does not provide access to its error messages when it fails. What is an easy (and ideally generic) way to see the output of such a script when it fails?

Let's assume I have no easy way to change the code of the application calling this script. The failure happens right at the start of the script's run, so there is not enough time to manually attach to it with strace to see its output. An automated solution to attach to it, maybe using a shell script, would be great.

(The specifics should not matter, but for what it's worth: the failing script is the backup script of Discourse, a widespread open source forum software. Discourse and this script are written in Ruby.)

Comments:
2023-01-21 23:52:04
Is it allowed to rename the original script and put a wrapper in its place under the original name? Or, if the script is looked up via PATH, put a wrapper with the same name in a separate directory and give the caller a modified PATH with that directory first. The wrapper can capture stdout and stderr of the original script to one or more files.
2023-01-21 23:52:04
strace -f -p $pid_of_parent_process will monitor the parent and every child it forks. This assumes that the moment the child process is spawned is predictable (e.g. it can be triggered manually or happens on a well-known cron schedule) and that stracing the parent temporarily is acceptable.
2023-01-21 23:52:04
@dimich and Bartosz: Good ideas, feel free to post them as answers. (A wrapper script would just need to pass through the command line arguments.)
Answers (1):

You can use a shell script that (1) busy-waits until it sees the target process and then (2) immediately attaches to it with strace and prints its output to the terminal.

#!/bin/sh

# Adapt to a regex that matches only your target process' full command.
name_pattern="bin/ruby.*spawn_backup_restore.rb"

# Wait for a process to start, based on its name, and capture its PID.
# Inspiration and details: https://unix.stackexchange.com/a/410075
pid=
while [ -z "$pid" ] ; do
    pid="$(pgrep --full "$name_pattern" | head -n 1)"

    # Set delay for next check to 1ms to try capturing all output.
    # Remove completely if this is not enough to capture from the start.
    sleep 0.001
done

echo "target process has started, pid is $pid"

# Print every write() call of the process we found, which includes its stdout and stderr output.
# Source and explanations: https://unix.stackexchange.com/a/58601
strace -p "$pid" -s 9999 -e write
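
The wrapper idea from the comments avoids the race with the process start altogether. A minimal sketch, assuming the original script can be renamed to `<name>.real` and this wrapper installed under the original name; the `.real` suffix, the `/tmp` log location, and the `run_logged` helper are hypothetical choices for illustration, not anything Discourse provides. Note that the caller no longer receives the script's output, because it is redirected into the log file, and that the dumped environment may contain secrets.

```shell
#!/bin/sh
# Hypothetical wrapper: logs environment and output of the wrapped command.

# run_logged LOG_FILE COMMAND [ARGS...]
# Appends a header with the arguments, working directory and environment
# to LOG_FILE, then runs COMMAND with stdout and stderr appended to LOG_FILE.
# Returns the exit status of COMMAND.
run_logged() {
    log_file=$1
    shift
    {
        echo "=== $(date) wrapper pid=$$ cwd=$(pwd)"
        echo "--- command: $*"
        echo "--- environment:"
        env | sort
        echo "--- output:"
    } >> "$log_file"
    "$@" >> "$log_file" 2>&1
}

# Assumption: the original script was renamed to "<this file>.real".
# If it is missing, the wrapper does nothing.
if [ -x "$0.real" ]; then
    run_logged "/tmp/$(basename "$0").log" "$0.real" "$@"
    exit "$?"
fi
```

Comparing the logged environment with the output of `env | sort` in a terminal where the script works should show which variable (often `PATH`, `HOME`, or a locale setting) differs.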