Showing posts with label regex. Show all posts

14 March 2007

get environment variables of a process with ps - 2: the win!

Yeah, I got it!
First the solution, with some comments right afterwards:
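#!/usr/bin/perl -w
use strict;

# name of the environment variable to extract (e.g. PWD)
my $catch = shift @ARGV;
while (<>) {
    # skip lines without the requested variable instead of printing stale captures
    next unless /(.*?) ([A-Z_]*?=.* )+\Q$catch\E=(\S*)/;
    print "$1 $catch=$3\n";
}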
Things were more confusing with sed because it insists on working with an s/pattern/replacement/ pair. Now I simply parse each line, capture what I'm interested in, and use only that to build the output.
The interpretation is now straightforward:
  • take the first part of the string, up to the first group of capital letters or underscores followed by an equal sign; this is the normal ps output for each process, and it will be printed out;
  • don't be greedy when looking for the equal sign: take the first one you encounter (the ? after the *), so that the group matches a single VAR=value pair; but then let there be more than one of these pairs (the + just after the second group of parentheses);
  • then find one particular pair, whose name is the first argument I gave you ($catch), and capture its value.
Let's say this Perl script is named select.pl: you can pipe it after your ps call just like my previous alias... but now it's parametric:
ps -ely --forest eww | grep myscript.pl | select.pl PWD
ps -ely --forest eww | grep myscript.pl | select.pl PATH
ps -ely --forest eww | grep myscript.pl | select.pl LOGIN
And if by now you can't live without the --color feature of grep, Perl can emulate it for you using the Term::ANSIColor module:
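#!/usr/bin/perl -w
use strict;
use Term::ANSIColor;

# name of the environment variable to extract (e.g. PWD)
my $catch = shift @ARGV;
while (<>) {
    # skip lines that do not contain the requested variable
    next unless /(.*?) ([A-Z_]*?=.* )+\Q$catch\E=(\S*)/;
    print "$1 ";
    # highlight the variable name, as grep --color would
    print color("red bold"), $catch, color("reset");
    print "=$3\n";
}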
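For readers who would rather stay in the shell, here is a minimal sketch of the same highlighting done with raw ANSI escape sequences, which is roughly what the color() calls emit. The function name highlight_var is hypothetical (it is not part of the scripts above), the sample line is made up, and the sketch assumes the variable name contains no characters that are special to sed.

```shell
# highlight_var NAME: hypothetical helper, not part of the original scripts.
# Reads lines on stdin and wraps the first " NAME=" on each line in bold red,
# using the escape codes ESC[1;31m (bold red) and ESC[0m (reset).
highlight_var() {
    esc=$(printf '\033')   # the literal ESC character
    sed "s/ $1=/ ${esc}[1;31m$1${esc}[0m=/"
}

# Example on a made-up line, in the format printed by select.pl
printf '%s\n' '4242 perl myscript.pl PWD=/data/run1' | highlight_var PWD
```

It could be appended to any pipeline whose output contains the pair, for example after the uncolored select.pl.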
Please enjoy!

get environment variables of a process with ps

Have you ever heard of the e option of ps? As the man page says, it shows the environment after the command. Used together with the doubled wide-output option, ww, it makes ps flood the screen with the whole environment each process had at the moment it was launched. Try it yourself, for example:
ps eww | tail -1
As the tail -1 suggests, the huge mess you get is a single line of output, corresponding to a single process (most likely just the tail process itself...). If you look closely at that mess, you can see that it is a long list of VAR=value pairs: as said, a snapshot of the environment at the moment the process started. For example, you can find SHELL=/bin/bash, USER=hronir, HOME=/home/hronir and so on.
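To make that single line easier to inspect, one option is to split it into one VAR=value pair per line. The sketch below is only an illustration: the helper name split_env is hypothetical, the sample line is made up, and it assumes that values contain no spaces (a value with spaces would be cut at the first one).

```shell
# split_env: hypothetical helper. Reads ps-like lines on stdin and prints
# every space-separated token that looks like an upper-case VAR=value pair.
split_env() {
    tr ' ' '\n' | grep -E '^[A-Z_]+='
}

# Made-up sample standing in for one line of `ps eww` output
sample='4242 pts/0 S 0:00 tail -1 SHELL=/bin/bash USER=hronir HOME=/home/hronir'
printf '%s\n' "$sample" | split_env
```

With real data it would be used as ps eww | tail -1 | split_env.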
Yeah, all this information is too much, but there are situations where one or a few of those values can be valuable. Imagine, for example, that you launch the same script from different directories (say, to analyze different data sets stored in those directories). Then, after a while, you sadly find out that some of these scripts, for whatever reason, failed to finish. For example, you check with top or ps, grepping for your script, and it turns out that fewer scripts are running than you launched. The big question now is: which scripts died, and which did not? Which data are still being analyzed, and which need to be resubmitted? (Your analysis takes a lot of time, and you hope to find a way not to resubmit all the scripts...!)
Well, the answer to all these questions lies in the eww options of ps, and in particular in the PWD=/full/path/ pair, which tells you where each still-running script was launched from.
The point now is to make it easy to read the value of this pair among the many others, since several screen lines for each line of ps output are very cumbersome to handle.
Well, after a full afternoon struggling with sed, awk and regex, I came up with this rather poor result. Take your ps call, grepping for your scripts:
ps -ely --forest | grep myscript.pl
make sure to add the eww options:
ps -ely --forest eww | grep myscript.pl
and pipe its output to sed as follows:
ps -ely --forest eww | grep myscript.pl | sed -r 's/(.* )? ([A-Z_]+)=.* (PWD=\S* ).*/\1 \3/' | grep --color=auto PWD
It would be too boring and pedantic to explain the full path that led to this regular expression. Let me point out only a few things.
First of all, I still don't understand the behavior of this regex pattern, in particular the (.* )? part (which is supposed to be related to greedy matching; I think I understand that concept well, but not the particular behavior I see in this case), nor that of some reasonable variations I tried...
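One possible reading of this behavior, offered as a sketch rather than a definitive analysis: the substitution is not anchored with ^, so sed replaces only the leftmost stretch of the line that matches and leaves everything before it untouched. On a line with single spaces between tokens, the optional (.* )? group can then match nothing at all, and the match simply starts at the first " VAR=" token. The example below uses a made-up line, writes [^ ] instead of \S and -E instead of -r for portability, and replaces the match with [\1|\3] so that the content of each group is visible.

```shell
# Made-up line in the spirit of the ps output discussed above
line='S 1000 4242 perl myscript.pl SHELL=/bin/bash PWD=/data/run1 USER=hronir'

# Same pattern shape as in the post, but the replacement shows the captured
# groups: the text before the match is kept as-is, and group 1 is empty.
result=$(printf '%s\n' "$line" |
    sed -E 's/(.* )? ([A-Z_]+)=.* (PWD=[^ ]* ).*/[\1|\3]/')
printf '%s\n' "$result"
# prints: S 1000 4242 perl myscript.pl[|PWD=/data/run1 ]
```

If a line had two consecutive spaces right before a VAR= token, group 1 could match instead, which may be one reason why small variations of the pattern behave differently.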
Moreover, most of the time I wasted was spent trying to get a parametric version of this solution: a way, I mean, to extract any of the pairs that ps eww streams out. I actually tried to write a (ba)sh script (function), a Perl script... but I didn't find a way to make something svelte enough to be used beside the ps command. The best I got was to define an alias like this:
alias selectPWD="sed -r 's/(.* )? ([A-Z_]+)=.* (PWD=\S* ).*/\1 \3/' | grep --color=auto PWD"
to be used as follows:
ps -ely --forest eww | grep myscript.pl | selectPWD
From this, of course, I could easily derive any selectXXX I might need, but... can you find a parametric solution?!?
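As one possible answer, here is a minimal sketch of a parametric filter written as a shell function around awk. It is an illustration under stated assumptions, not the definitive solution: the name select_var is hypothetical, values are assumed to contain no spaces, and runs of spaces in the ps columns are collapsed to single spaces because awk splits each line into fields.

```shell
# select_var NAME: hypothetical helper. For each line on stdin that contains
# a NAME=value token, print the ps columns followed by that single pair.
select_var() {
    awk -v name="$1" '{
        prefix = ""; value = ""; found = 0; in_env = 0
        for (i = 1; i <= NF; i++) {
            # the environment starts at the first upper-case VAR=value token
            if (!in_env && $i ~ /^[A-Z_]+=/) in_env = 1
            if (!in_env) prefix = (prefix == "" ? $i : prefix " " $i)
            # remember the requested pair (the last one wins if repeated)
            if (index($i, name "=") == 1) { value = $i; found = 1 }
        }
        if (found) print prefix, value
    }'
}

# Made-up sample line standing in for real ps output
printf '%s\n' 'S 1000 4242 perl myscript.pl SHELL=/bin/bash PWD=/data/run1 USER=hronir' |
    select_var PWD
```

With real data it would be used as ps -ely --forest eww | grep myscript.pl | select_var PWD, and any other variable name can be passed in place of PWD.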
 
PS
Have you ever heard of the --color option for grep? Long ago I set alias grep='grep --color=auto' in my .bashrc...