Re (2): Can awk use `grep`?

Started by	no.top.post@gmail.com
First post	2011-06-26 18:03 +0000
Last post	2011-07-05 17:34 +0000
Articles	9 — 6 participants

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.

  Re (2): Can awk use `grep`? no.top.post@gmail.com - 2011-06-26 18:03 +0000
    Re: Re (2): Can awk use `grep`? Bill Marcum <bill@lat.localnet> - 2011-06-27 01:55 -0400
      Re: Re (2): Can awk use `grep`? Geoff Clare <geoff@clare.See-My-Signature.invalid> - 2011-06-27 12:51 +0100
        Re: Re (2): Can awk use `grep`? gazelle@shell.xmission.com (Kenny McCormack) - 2011-06-27 12:18 +0000
          Re: Re (2): Can awk use `grep`? Geoff Clare <geoff@clare.See-My-Signature.invalid> - 2011-06-28 13:30 +0100
      Re (3): Can awk use `grep`? no.top.post@gmail.com - 2011-06-28 03:24 +0000
    Re: Re (2): Can awk use `grep`? Loki Harfagr <l0k1@thedarkdesign.free.fr.INVALID> - 2011-06-27 07:18 +0000
      Re: Re (2): Can awk use `grep`? Janis Papanagnou <janis_papanagnou@hotmail.com> - 2011-06-27 22:02 +0200
      Re (3): Can awk use `grep`? no.top.post@gmail.com - 2011-07-05 17:34 +0000

#1473 — Re (2): Can awk use `grep`?

From	no.top.post@gmail.com
Date	2011-06-26 18:03 +0000
Subject	Re (2): Can awk use `grep`?
Message-ID	<iu7sa1$3un$1@dont-email.me>

In article <iu7414$bnp$1@speranza.aioe.org>, pk <pk@pk.invalid> wrote: 

> On Sun, 26 Jun 2011 10:36:52 +0000 (UTC)
> no.top.post@gmail.com wrote:
> 
> > I've got two inputs.  Input1 looks like:-
> > 3
> > 5
> > 7
> > 9
> > 
> > Input2 looks like:-
> > 9  pathC
> > 3  pathB
> > 7  pathA
> > 
> > I want to get:-
> > 3  pathB
> > 5
> > 7  pathA
> > 9  pathC
> 
> 
> With awk, the usual way to do that is:
> 
> awk 'NR==FNR{a[$1]=$2;next}{print $0, a[$0]}' Input2 Input1
> 
Thanks, a superficial test looks ok.
My real problem is more complex than I described,
and I'd like Input2 Input1 to be 2 functions: PT DV.
So I'm guessing that I'd just use:
`PT' `DV`

I too used to be guilty of one-lineism when younger.
Too bad for other maintainers - I'm not paid to be a tutor.
{   // the matching conditions are
a[$1]  // the 1st field of an entity bound to 'a'
=$2  // equals the 2nd field of the input line
;    // and also  ??
}   // end of matching conditions
{  // start action for matching condition/s
print $0   // print 1st field of the input line
,   // and
print 1st field  <related to> the 'a' entity
}'  // end of <awk> except for input args
Input2 Input1  // input 2 files in unknown sequence.

`awk` is a DEVIL !!

#1477

From	Bill Marcum <bill@lat.localnet>
Date	2011-06-27 01:55 -0400
Message-ID	<slrnj0g6qm.3ft.bill@lat.localnet>
In reply to	#1473

On 2011-06-26, no.top.post@gmail.com <no.top.post@gmail.com> wrote:
> In article <iu7414$bnp$1@speranza.aioe.org>, pk <pk@pk.invalid> wrote: 
>
>> With awk, the usual way to do that is:
>> 
>> awk 'NR==FNR{a[$1]=$2;next}{print $0, a[$0]}' Input2 Input1
>> 
> Thanks, a superficial test looks ok.
> My real problem is more complex than I described,
> and I'd like Input2 Input1 to be 2 functions: PT DV.
> So I'm guessing that I'd just use:
> `PT' `DV`
>
> I too used to be guilty of one-lineism when younger.
> Too bad for other maintainers - I'm not paid to be a tutor.
> {   // the matching conditions are
> a[$1]  // the 1st field of an entity bound to 'a'
> =$2  // equals the 2nd field of the input line
> ;    // and also  ??
> }   // end of matching conditions
No, conditions are outside of braces.  The expression above is the
action for the condition NR==FNR. You left out "next", which means to
ignore any following actions and start processing the next record.
The following action is always executed unless it's skipped by "next".
By the way, NR==FNR is true while reading the first file in a list.
FNR counts the records in the current file, and NR counts all records.
> {  // start action for matching condition/s
> print $0   // print 1st field of the input line
> ,   // and
> print 1st field  <related to> the 'a' entity
> }'  // end of <awk> except for input args
> Input2 Input1  // input 2 files in unknown sequence.
>
> `awk` is a DEVIL !!
>
>
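
A quick way to watch the two counters described above is to print them
for every record. This is only a sketch under stated assumptions: the
scratch directory and the "first-file"/"later-file" labels are invented
for the illustration; the data are the sample Input2 and Input1 from the
start of the thread.

```shell
# Scratch copies of the thread's sample inputs (the location is an assumption).
tmp=$(mktemp -d)
printf '9 pathC\n3 pathB\n7 pathA\n' > "$tmp/Input2"
printf '3\n5\n7\n9\n' > "$tmp/Input1"

# For every record print NR, FNR, and which way the test NR==FNR goes.
trace=$(awk '{ print NR, FNR, (NR == FNR ? "first-file" : "later-file"), $0 }' \
    "$tmp/Input2" "$tmp/Input1")
printf '%s\n' "$trace"

rm -rf "$tmp"
```

NR keeps climbing from 1 to 7, while FNR starts again at 1 when Input1
begins, so NR==FNR holds only for the three records of Input2.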


-- 
Inside every older person is a younger person wondering what the hell
happened.

#1481

From	Geoff Clare <geoff@clare.See-My-Signature.invalid>
Date	2011-06-27 12:51 +0100
Message-ID	<pf7md8-1i8.ln1@leafnode-msgid.gclare.org.uk>
In reply to	#1477

Bill Marcum wrote:

> By the way, NR==FNR is true while reading the first file in a list.

Nit-pick: NR==FNR is true while reading the first *non-empty* file.
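
The nit-pick can be reproduced with an empty first file. The sketch below
uses made-up scratch files named "empty" and "keys"; the second command
compares FILENAME with ARGV[1], which is one portable alternative (an
assumption about what suits the task, and it misbehaves if the same file
is named twice or if var=value assignments appear among the arguments).

```shell
tmp=$(mktemp -d)
: > "$tmp/empty"                 # first file has no records at all
printf '3\n5\n' > "$tmp/keys"

# NR==FNR: the empty file contributes nothing, so the test stays true
# throughout the second file and every record is skipped.
naive=$(awk 'NR == FNR { next } { print }' "$tmp/empty" "$tmp/keys")

# FILENAME == ARGV[1]: true only while reading the first named file.
robust=$(awk 'FILENAME == ARGV[1] { next } { print }' "$tmp/empty" "$tmp/keys")

printf 'naive:  [%s]\n' "$naive"
printf 'robust: [%s]\n' "$robust"
rm -rf "$tmp"
```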

-- 
Geoff Clare <netnews@gclare.org.uk>

#1482

From	gazelle@shell.xmission.com (Kenny McCormack)
Date	2011-06-27 12:18 +0000
Message-ID	<iu9sep$t8t$1@news.xmission.com>
In reply to	#1481

In article <pf7md8-1i8.ln1@leafnode-msgid.gclare.org.uk>,
Geoff Clare  <geoff@clare.See-My-Signature.invalid> wrote:
>Bill Marcum wrote:
>
>> By the way, NR==FNR is true while reading the first file in a list.
>
>Nit-pick: NR==FNR is true while reading the first *non-empty* file.

The point being that it (NR==FNR) is a kludge for what is really intended
(which is: Is this the first file?  Or, more generally: which file is this?)

What people *should* do is "man up" and accept that we are all using GAWK
(and if we're not, we should be) and use ARGIND as it was intended to be
used.  Using the kludge for the sake of compatibility with old, broken AWKs,
is stupid (IMHO, of course...)

Note that NR==FNR doesn't generalize to other than the first file (without
some serious hackery to make it work).
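
For awks without ARGIND, one common workaround is to count files by
bumping a counter whenever FNR returns to 1. This is a sketch only: the
variable name fileno and the scratch files f1, f2, f3 are invented here,
and, like NR==FNR, the counter never sees an empty file.

```shell
tmp=$(mktemp -d)
printf 'a\n' > "$tmp/f1"
printf 'b\nc\n' > "$tmp/f2"
printf 'd\n' > "$tmp/f3"

# fileno goes up by one at the first record of each non-empty file.
tagged=$(awk 'FNR == 1 { fileno++ } { print fileno ": " $0 }' \
    "$tmp/f1" "$tmp/f2" "$tmp/f3")
printf '%s\n' "$tagged"

rm -rf "$tmp"
```

A rule such as fileno == 2 { ... } then plays the role that NR==FNR plays
for the first file.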

-- 
> No, I haven't, that's why I'm asking questions. If you won't help me,
> why don't you just go find your lost manhood elsewhere.

CLC in a nutshell.

#1493

From	Geoff Clare <geoff@clare.See-My-Signature.invalid>
Date	2011-06-28 13:30 +0100
Message-ID	<v3uod8-7q2.ln1@leafnode-msgid.gclare.org.uk>
In reply to	#1482

Kenny McCormack wrote:

> Geoff Clare  <geoff@clare.See-My-Signature.invalid> wrote:
>>Bill Marcum wrote:
>>
>>> By the way, NR==FNR is true while reading the first file in a list.
>>
>>Nit-pick: NR==FNR is true while reading the first *non-empty* file.
>
> The point being that it (NR==FNR) is a kludge for what is really intended
> (which is: Is this the first file?  Or, more generally: which file is this?)
>
> What people *should* do is "man up" and accept that we are all using GAWK
> (and if we're not, we should be) and use ARGIND as it was intended to be
> used.  Using the kludge for the sake of compatibility with old, broken AWKs,
> is stupid (IMHO, of course...)

Personally, I try to stick to what POSIX guarantees, and avoid using
extensions in any version of awk.

BTW I'm reading this thread in comp.lang.awk. Those reading in
comp.os.linux.misc might be less concerned about portability and
therefore more inclined to use extensions. However, I have come across
Linux systems in the past where "awk" was mawk, not gawk, so if you
use gawk extensions make sure you use the name gawk to run it, not
just "awk".

-- 
Geoff Clare <netnews@gclare.org.uk>

#1489 — Re (3): Can awk use `grep`?

From	no.top.post@gmail.com
Date	2011-06-28 03:24 +0000
Subject	Re (3): Can awk use `grep`?
Message-ID	<iubhh8$upg$1@dont-email.me>
In reply to	#1477

In article <slrnj0g6qm.3ft.bill@lat.localnet>, Bill Marcum <bill@lat.localnet> wrote: 

> On 2011-06-26, no.top.post@gmail.com <no.top.post@gmail.com> wrote:
> > In article <iu7414$bnp$1@speranza.aioe.org>, pk <pk@pk.invalid> wrote: 
> >
> >> With awk, the usual way to do that is:
> >> 
> >> awk 'NR==FNR{a[$1]=$2;next}{print $0, a[$0]}' Input2 Input1
> >> 
> > Thanks, a superficial test looks ok.
> > My real problem is more complex than I described,
> > and I'd like Input2 Input1 to be 2 functions: PT DV.
> > So I'm guessing that I'd just use:
> > `PT' `DV`
> >
> > I too used to be guilty of one-lineism when younger.
> > Too bad for other maintainers - I'm not paid to be a tutor.
> > {   // the matching conditions are
> > a[$1]  // the 1st field of an entity bound to 'a'
> > =$2  // equals the 2nd field of the input line
> > ;    // and also  ??
> > }   // end of matching conditions
> No, conditions are outside of braces.  
> The expression above is the action for the condition NR==FNR. 
> You left out "next", which means to ignore any following 
> actions and start processing the next record.
> The following action is always executed unless it's skipped by "next".
> By the way, NR==FNR is true while reading the first file in a list.
> FNR counts the records in the current file, and NR counts all records.

Great thanks, I'm filing this for further study.
What about how "a" <looks at the other file>?
Are the input-files bound to "a,b,c.." automatically?

== TIA.

PS. I've just d/l-ed the other contributions,
which I'll study before I have inet access again,
next weekend.
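
On the question of how "a" relates to the other file: in pk's one-liner
"a" is simply the name chosen for an associative array, filled while the
first file is read; awk binds no names to input files automatically. A
minimal sketch, feeding the thread's Input2 sample on stdin (the message
strings are invented for the illustration):

```shell
lookup=$(printf '9 pathC\n3 pathB\n7 pathA\n' | awk '
    { a[$1] = $2 }                 # store: key is field 1, value is field 2
    END {
        print a[3]                 # value stored under key 3
        print a[7]                 # value stored under key 7
        msg = (5 in a) ? "5 is stored" : "5 is not stored"
        print msg
    }')
printf '%s\n' "$lookup"
```

So a[$0] in the second rule means "the value stored under a key equal to
the whole current line", which is empty when no such key was stored.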

#1478

From	Loki Harfagr <l0k1@thedarkdesign.free.fr.INVALID>
Date	2011-06-27 07:18 +0000
Message-ID	<4e082ecd$0$26323$426a74cc@news.free.fr>
In reply to	#1473

Sun, 26 Jun 2011 18:03:47 +0000, no.top.post did cat :

> In article <iu7414$bnp$1@speranza.aioe.org>, pk <pk@pk.invalid> wrote:
> 
>> On Sun, 26 Jun 2011 10:36:52 +0000 (UTC) no.top.post@gmail.com wrote:
>> 
>> > I've got two inputs.  Input1 looks like:-
>> > 3
>> > 5
>> > 7
>> > 9
>> > 
>> > Input2 looks like:-
>> > 9  pathC
>> > 3  pathB
>> > 7  pathA
>> > 
>> > I want to get:-
>> > 3  pathB
>> > 5
>> > 7  pathA
>> > 9  pathC
>> 
>> 
>> With awk, the usual way to do that is:
>> 
>> awk 'NR==FNR{a[$1]=$2;next}{print $0, a[$0]}' Input2 Input1
>> 
> Thanks, a superficial test looks ok.
> My real problem is more complex than I described, and I'd like Input2
> Input1 to be 2 functions: PT DV. So I'm guessing that I'd just use:
> `PT' `DV`
> 
> I too used to be guilty of one-lineism when younger.

Were you ever younger? Anyway, posting on these groups you may be
surprised about the age of some posters, now of course there are
other meanings to 'youth'

> Too bad for other
> maintainers - I'm not paid to be a tutor.

It's so nice we're heavily paid to post on Usenet !->

> {   // the matching conditions are
> a[$1]  // the 1st field of an entity bound to 'a'
> =$2  // equals the 2nd field of the input line
> ;    // and also  ??
> }   // end of matching conditions
> {  // start action for matching condition/s
> print $0   // print 1st field of the input line
> ,   // and
> print 1st field  <related to> the 'a' entity
> }'  // end of <awk> except for input args
> Input2 Input1  // input 2 files in unknown sequence.

You'd better keep on one-liners if your deployments all go that wrong ;D)
here's the developed version of pk's post:
--------
awk '
NR==FNR{	a[$1]=$2
		next
}
{print $0, a[$0]}
' Input2 Input1
--------

If you're really allergic to using/understanding 'next',
here's a "nextless" version (at the possible price of RAM use
and possible goofs if your data don't follow your initial rules;
the search for possible goofs is left as an exercise for the million
overpriced paid tutors posting here ;-)
--------
awk '
NR>FNR{print $0,a[$0]}
{a[$1]=$2}
' Input2 Input1
--------

and mind that in any case your assertion:
 "input 2 files in unknown sequence."
is *wrong*: awk reads its file arguments in the order given on the
command line. Then, if your 'coming soon' 2 "functions" are
respectively PT for Input2 and DV for Input1, and supposing,
yes just supposing, that these "functions" are actually scripts
or shell functions, you may have a start with:
--------
awk '
NR>FNR{print $0,a[$0]}
{a[$1]=$2}
' <(PT) <(DV)
--------

e-g:
--------
DV() { printf "3\n5\n7\n9\n" ; }
PT() { printf "9 pathC\n3 pathB\n7 pathA\n" ; }
--------

--------
awk 'NR>FNR{print $0,a[$0]}{a[$1]=$2}' <(PT) <(DV)           
3 pathB
5 
7 pathA
9 pathC
--------

#1484

From	Janis Papanagnou <janis_papanagnou@hotmail.com>
Date	2011-06-27 22:02 +0200
Message-ID	<iuanlh$2c2$1@news-1.m-online.net>
In reply to	#1478

On 27.06.2011 09:18, Loki Harfagr wrote:
>
[snip]
> 
> You'd better keep on one-liners if your deployments all go that wrong ;D)
> here's the developed version of pk's post:
> --------
> awk '
> NR==FNR{	a[$1]=$2
> 		next
> }
> {print $0, a[$0]}
> ' Input2 Input1
> --------
> 
> If you're really have allergies on using/understanding 'next'
> here's a "nextless" version (to the possible price of RAM use
> and possibly goofs if you data don't follow your initial rules,
> the possible goofs search is left as an exercise to the million
> overpriced paid tutors posting here ;-)
> --------
> awk '
> NR>FNR{print $0,a[$0]}
> {a[$1]=$2}
> ' Input2 Input1
> --------

Another form of a next-less version, one that I often prefer BTW,
is to specify all conditions (in whatever order)...

  NR == FNR {a[$1]=$2}
  NR != FNR {print $0,a[$0]}

(With greetings to Dijkstra.)

Janis

>
> [...]

#1601 — Re (3): Can awk use `grep`?

From	no.top.post@gmail.com
Date	2011-07-05 17:34 +0000
Subject	Re (3): Can awk use `grep`?
Message-ID	<iuvhvb$k3d$1@dont-email.me>
In reply to	#1478

In article <4e082ecd$0$26323$426a74cc@news.free.fr>, Loki Harfagr <l0k1@thedarkdesign.free.fr.INVALID> wrote: 

> Sun, 26 Jun 2011 18:03:47 +0000, no.top.post did cat:
> 
> > In article <iu7414$bnp$1@speranza.aioe.org>, pk <pk@pk.invalid> wrote:
> > 
> >> On Sun, 26 Jun 2011 10:36:52 +0000 (UTC) no.top.post@gmail.com wrote:
> >> 
> >> > I've got two inputs.  Input1 looks like:-
> >> > 3
> >> > 5
> >> > 7
> >> > 9
> >> > 
> >> > Input2 looks like:-
> >> > 9  pathC
> >> > 3  pathB
> >> > 7  pathA
> >> > 
> >> > I want to get:-
> >> > 3  pathB
> >> > 5
> >> > 7  pathA
> >> > 9  pathC
> >> 
> >> 
> >> With awk, the usual way to do that is:
> >> 
> >> awk 'NR==FNR{a[$1]=$2;next}{print $0, a[$0]}' Input2 Input1
> >> 
> > Thanks, a superficial test looks ok.
> > My real problem is more complex than I described, and I'd like Input2
> > Input1 to be 2 functions: PT DV. So I'm guessing that I'd just use:
> > `PT' `DV`
> > 
> > I too used to be guilty of one-lineism when younger.
> 
> Were you ever younger? Anyway, posting on these groups you may be
> surprised about the age of some posters, now of course there are
> other meanings to 'youth'
> 
> > Too bad for other
> > maintainers - I'm not paid to be a tutor.
> 
> It's so nice we're heavily paid to post on Usenet !
> 
> > {   // the matching conditions are
> > a[$1]  // the 1st field of an entity bound to 'a'
> > =$2  // equals the 2nd field of the input line
> > ;    // and also  ??
> > }   // end of matching conditions
> > {  // start action for matching condition/s
> > print $0   // print 1st field of the input line
> > ,   // and
> > print 1st field  <related to> the 'a' entity
> > }'  // end of <awk> except for input args
> > Input2 Input1  // input 2 files in unknown sequence.
> 
> You'd better keep on one-liners if your deployments all go that wrong ;D)
> here's the developed version of pk's post:
> --------
> awk '
> NR==FNR{	a[$1]=$2  # IF AllRcdCnt==CurrentFileRcdCnt
> 		next
> }
> {print $0, a[$0]}  # print the Current Rcd, a[$0]
> ' Input2 Input1
> --------
> 
> If you're really have allergies on using/understanding 'next'
> here's a "nextless" version (to the possible price of RAM use
> and possibly goofs if you data don't follow your initial rules,
> the possible goofs search is left as an exercise to the million
> overpriced paid tutors posting here ;-)
> --------
> awk '
> NR>FNR{print $0,a[$0]}
> {a[$1]=$2}
> ' Input2 Input1
> --------
> 
> > and mind that in any case your assertion:
> >  "input 2 files in unknown sequence."
> > is *wrong*, then if your 'coming soon' 2 "functions" are
> > respectiveley PT for Input2 and DV for Input1, and supposing,
> > yes hust supposing that these "functions" are actually scripts
> > or shell functions, you may have a start with:
> > --------
> > awk '
> > NR>FNR{print $0,a[$0]}
> > {a[$1]=$2}
> > ' <(PT) <(DV)
> > --------
WHILE (#All Rcds Read) > (#Rcds Read CurrentFile)
FNR < NR ==> reading records of a file after the 1st
?? Define a[$0]

Yes PT & DV *are* shell-functions.
So far I just first save them to pt & dv and then feed these 
2 files in, because I'm overwhelmed with the chaos/complexity.

This thing is killing me. It's much bigger than `awk`.
It's about seeing immediately where each of 20 odd terminals'
paths are.  This SAME awk-task demonstrates WHY *you* need 
such a facility. THIS task needs terminals 1 to test PT, 2 to test DV,
3 to make notes, 4 to read the NewsArticle, 5 to compare 
previous notes ..6, 7.

So if you've got 2 or 3 projects going, you need 20 odd terminals.
That's why I compared it with a factory in my previous article.

I mentioned the tutorial aspect vs. throwing-out one-liners.
There's a 3rd level:
1. a specific awk query.
2. passing on general knowledge about awk.
3. collaboratively building a tool which is of value to the
 whole linux community.

The tool that I built for old Mandrake9+kde was very
valuable, but it doesn't work with my newer systems.
New kde is a monster, and `pstree -p` for xfce doesn't draw
a 'tree' with a node for each Desktop, like my old one did.

Since you've got competence and enthusiasm, you'd be
interested in looking at this tool.

= set up a number of terminals with some app. I use `mc` 
because it can navigate the dir & view/edit/copy/move... 
But `top` would probably do 'as a tracer'.

I use:--   [this is where all the terminals needed starts !!]
/usr/local/sbin/PT ==
lsof | grep DIR | grep mc | grep "/".  | cut -b-17,67- |\
awk '{print $1 " " $2 " " $4}'

exit
------------> I end my scripts with `exit` because I keep 
notes below. You can use different/better methods to 
<order the pid/s of each `mc` & show its 'path'>

PT gives me a list of all `mc` with their <path> and pid.
And `DV1` gives me the <list of pid/s>, arranged/ordered 
so that I can work out which (desktop,terminal) each 
pid has. 

/usr/local/sbin/DV1 ==
# replace bracket/s with spaces, then keep fields 1-4 (discarding fields 5,6)
DV | tr '()' '  ' |
awk '{print $1 " " $2 " " $3 " " $4}'  > dv1

cat dv1

# instead of field4 print line in PT 'matching' $4
#awk 'FS=" " {print $1 " " $2 " " $3 " " `grep $4   -f `PT` `}'
# can't do `grep` inside `awk`

exit
---------------DV ==
pstree -p |\
gawk '/Terminal/ { print $0 }
# this 1st line has NOspaceTab
       /bash/ { print $2 }'
# gawk matching action
exit
---------------> PT gives me:--
mc 3481 /mnt/p14/usr/local/bin
mc 3487 /usr/local/sbin
mc 3502 /mnt/p11/April2010
...
mc 4048 /mnt/p14/tmp
mc 4299 /mnt/hd/wily-0.13.42/misc

---> and DV1 gives me:--
|-bash 3463 ---linux.oberon.no <- 3479
|-bash 3464 ---mc <- 3481
|-bash 3465  
|-bash 3466 ---mc 4299
...
|-bash 3466 ---mc 4299
|-bash 3467 ---mc 3502
|-bash 3468 ---mc 3506
...

---> and the last/$5 [I think] field of DV
matches the $2 of PT. That's the pid.

---> so If you can construct:-
3463 linux.oberon.no 3479 <not `mc` type>
3464 mc 3481 /mnt/p14/usr/local/bin
3465  
3466 mc 4299 /mnt/hd/wily-0.13.42/misc
...
--> then when you want to 'access' <wily>, you know
 that it's at 3 terminals beyond `linux.oberon.no` @
Desktop1, Terminal1 + 3 = Desktop1, Terminal4,
in my case, where Desktop1 has [currently] 4 terminals.
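
The table asked for above can be built with the same two-file array
lookup that pk showed, with no `grep` inside `awk`. This is a hedged
sketch, not a finished tool: the scratch files pt and dv1 hold a few
lines copied from the sample output above (with the "<-" marks left
out), the array name path is invented, and it assumes the child pid is
the last field of each DV1 line and that paths contain no spaces.

```shell
tmp=$(mktemp -d)
cat > "$tmp/pt" <<'EOF'
mc 3481 /mnt/p14/usr/local/bin
mc 3502 /mnt/p11/April2010
mc 4299 /mnt/hd/wily-0.13.42/misc
EOF
cat > "$tmp/dv1" <<'EOF'
|-bash 3463 ---linux.oberon.no 3479
|-bash 3464 ---mc 3481
|-bash 3465
|-bash 3466 ---mc 4299
EOF

joined=$(awk '
    NR == FNR { path[$2] = $3; next }      # 1st file (PT): remember pid -> path
    NF < 4    { print $2; next }           # bare shell with no child listed
    {
        name = $3
        sub(/^-+/, "", name)               # drop the pstree dashes
        line = $2 " " name " " $NF
        if ($NF in path) line = line " " path[$NF]
        print line
    }' "$tmp/pt" "$tmp/dv1")
printf '%s\n' "$joined"
rm -rf "$tmp"
```

Lines whose child pid has no entry from PT (here 3479) are printed
without a path.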

Give a man a fish and you've fed him for one day.
Teach him to fish and you've fed him for life.

== TIA.

PS, I've just discovered that:
 if you `pstree -p`
and select the indicated 'pid' of some process;
then the last-field of the 1st-record of
lsof | grep <pid> 
  contains the <path of pid> !!
There are possible spurious results from `grep <pid>`,
which can be removed if you <grep only the 2nd-field>.
Which seems like a job for `awk`.  So:
Show me the records with 'pid = 3481'
=> gawk '$2=3481` <(lsof)
  and
=> lsof | gawk '$2=2030`
 both generate a MASS of garbage, as if "gawk 2030"
is generating its own records in `lsof` which are shown.

In this case [where there are no other records containing
2030 as a non-pid in `lsof`] 
lsof | grep | awk 'NR=1{print $9}
shows ALL the <paths>
Whereas I expected "NR=1" to pass only the 1st record.
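
A likely explanation for both surprises: in awk a single "=" assigns,
and "==" compares. So '$2=3481' overwrites field 2 of every record and,
being non-zero, selects every record; 'NR=1' likewise resets NR and
selects them all. (The commands as posted also open with ' and close
with a backtick.) The sketch below uses three made-up records laid out
like lsof output, with the pid in column 2 and the path in column 9.

```shell
# Made-up sample in the shape of lsof output (not real data).
sample='mc 3481 user cwd DIR 8,1 4096 2 /mnt/p14/usr/local/bin
mc 3487 user cwd DIR 8,1 4096 2 /usr/local/sbin
bash 3464 user cwd DIR 8,1 4096 2 /home/user'

# "=" assigns: every record has $2 overwritten and the pattern is always true.
assigned=$(printf '%s\n' "$sample" | awk '$2 = 3481 { print $1, $2 }')

# "==" compares: only the record whose pid is 3481 gets through.
matched=$(printf '%s\n' "$sample" | awk '$2 == 3481 { print $9 }')

# Likewise NR == 1 selects just the first record.
first=$(printf '%s\n' "$sample" | awk 'NR == 1 { print $9 }')

printf '%s\n' "$assigned" "$matched" "$first"
```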

PPS. this was written on different computers, with
different versions of linux.
Knowledge which can't be developed over time,
is just instant twitter-crap.
