On a Linux server, I have a script that works fine when I start it from the terminal, but fails when it is started and then detached by another process. So there is probably a difference in the script's environment that needs fixing.
The trouble is, the other process that runs this script does not provide access to its error messages when it fails. What is an easy (and ideally generic) way to see the output of such a script when it's failing?
Let's assume I have no easy way to change the code of the application calling this script. The failure happens right at the start of the script's run, so there is not enough time to manually attach to it with strace to see its output. An automated solution to attach to it, maybe using a shell script, would be great.
(The specifics should not matter, but for what it's worth: the failing script is the backup script of Discourse, a widespread open source forum software. Discourse and this script are written in Ruby.)
If the caller finds the script via PATH, put a wrapper with the same name into a separate directory and give the caller a modified PATH with this directory first. The wrapper can capture the stdout and stderr of the original script to one or more files.
strace -f -p $pid_of_parent_process will monitor the parent and every child it forks. This assumes that the moment the child process is spawned is predictable (e.g. it can be triggered manually or happens on a well-known cron schedule) and that stracing the parent temporarily is acceptable.
You can use a shell script that (1) busy-waits until it sees the target process, and then (2) immediately attaches to it with strace and prints its output to the terminal.
#!/bin/sh
# Adapt to a regex that matches only your target process' full command.
name_pattern="bin/ruby.*spawn_backup_restore.rb"
# Wait for a process to start, based on its name, and capture its PID.
# Inspiration and details: https://unix.stackexchange.com/a/410075
pid=
while [ -z "$pid" ] ; do
    pid="$(pgrep --full "$name_pattern" | head -n 1)"
    # Set delay for next check to 1ms to try capturing all output.
    # Remove completely if this is not enough to capture from the start.
    sleep 0.001
done
echo "target process has started, pid is $pid"
# Print all stdout and stderr output of the process we found.
# Source and explanations: https://unix.stackexchange.com/a/58601
strace -p "$pid" -s 9999 -e write
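For the wrapper approach mentioned earlier (a script with the same name placed first in the caller's PATH), a minimal sketch could look like the one below. Everything named here is an assumption for illustration: REAL_SCRIPT and WRAPPER_LOG are hypothetical environment variables, the default log path is arbitrary, and run_logged is just a helper name. In practice you would probably hardcode the absolute path of the original script instead of reading it from the environment.

```shell
#!/bin/sh
# Hypothetical wrapper: install it under the same name as the real script,
# in a directory that comes first in the caller's PATH.

# Assumption: where the captured output goes; adjust as needed.
log_file="${WRAPPER_LOG:-/tmp/script-wrapper.log}"

# Run the given command, appending its stdout and stderr to $log_file.
# The environment is logged first, since a differing environment is the
# suspected cause. Returns the exit status of the command.
run_logged() {
    {
        echo "=== $(date) args: $*"
        echo "--- environment ---"
        env
        echo "--- output ---"
    } >>"$log_file"
    "$@" >>"$log_file" 2>&1
}

# Assumption: REAL_SCRIPT holds the absolute path of the original script.
# If it is unset, this file only defines run_logged and does nothing else.
if [ -n "${REAL_SCRIPT:-}" ]; then
    run_logged "$REAL_SCRIPT" "$@"
    exit $?
fi
```

Because the wrapper records the full environment before running the original script, the log can be compared against the output of env in a terminal session where the script works. Note that the log may then contain secrets from the environment, so it should not be left world-readable.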