uniq_vals
uniq_vals selects records from the stream by checking the values of a given key. If several
records share the same value for that key, only one of them is output (note that uniq_vals does
not locate records where the value of the specified key occurs only once). If the -i switch
is used, the non-unique records are output instead.
... | uniq_vals [options]
[-? | --help] # Print full usage description.
[-k <string> | --key=<string>] # Key for which the value is checked for uniqueness.
[-i | --invert] # Display non-unique records.
[-I <file!> | --stream_in=<file!>] # Read input from stream file - Default=STDIN
[-O <file> | --stream_out=<file>] # Write output to stream file - Default=STDOUT
[-v | --verbose] # Verbose output.
Consider the following two-column table in the file test.tab:
Human H1
Human H2
Human H3
Dog D1
Dog D2
Mouse M1
To locate all unique values of the first column, we read the table with read_tab and pipe it to uniq_vals:
read_tab -i test.tab | uniq_vals -k V0
V0: Human
V1: H1
---
V0: Dog
V1: D1
---
V0: Mouse
V1: M1
---
The result is three records, one for each distinct value of V0.
If we instead want the non-unique records, we use the -i switch with uniq_vals:
read_tab -i test.tab | uniq_vals -k V0 -i
V0: Human
V1: H2
---
V0: Human
V1: H3
---
V0: Dog
V1: D2
---
... and the result shows the records whose V0 value has already been seen in an earlier record.
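The two modes can be sketched in Python to make the selection rule explicit. This is only an illustration of the behaviour seen in the examples, not the Biopieces implementation: the function name mirrors the tool, records are modelled as plain dicts, and skipping records that lack the key is an assumption.

```python
def uniq_vals(records, key, invert=False):
    """Yield the first record seen for each value of `key`.

    With invert=True, yield the later records that repeat a value instead.
    """
    seen = set()
    for record in records:
        if key not in record:
            continue  # assumption: records lacking the key are skipped
        value = record[key]
        is_first = value not in seen
        seen.add(value)
        # Normal mode keeps first occurrences, inverted mode keeps repeats.
        if is_first != invert:
            yield record


# The contents of test.tab as records with keys V0 and V1.
rows = [("Human", "H1"), ("Human", "H2"), ("Human", "H3"),
        ("Dog", "D1"), ("Dog", "D2"), ("Mouse", "M1")]
records = [{"V0": v0, "V1": v1} for v0, v1 in rows]

first_seen = [r["V1"] for r in uniq_vals(records, "V0")]
duplicates = [r["V1"] for r in uniq_vals(records, "V0", invert=True)]

print(first_seen)  # ['H1', 'D1', 'M1']
print(duplicates)  # ['H2', 'H3', 'D2']
```

In this sketch the Mouse record appears in the normal output only because it is the first record with that value, not because the value is unique in the table.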
So, how do we get the non-duplicated Mouse record? That is in fact not a job for uniq_vals. Instead, we use count_vals and grab:
read_tab -i test.tab | count_vals -k V0 | grab -e 'V0_COUNT=1'
V0: Mouse
V1: M1
V0_COUNT: 1
---
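The counting approach can be sketched in Python as follows. The sketch assumes that count_vals annotates every record with a <key>_COUNT field, as the output above suggests, and it replaces the grab step with a plain filter; the function is a hypothetical stand-in, not the Biopieces code.

```python
from collections import Counter


def count_vals(records, key):
    """Return copies of the records with a '<key>_COUNT' field added."""
    records = list(records)  # counting needs the whole stream first
    counts = Counter(r[key] for r in records if key in r)
    annotated = []
    for r in records:
        if key in r:
            r = {**r, f"{key}_COUNT": counts[r[key]]}
        annotated.append(r)
    return annotated


rows = [("Human", "H1"), ("Human", "H2"), ("Human", "H3"),
        ("Dog", "D1"), ("Dog", "D2"), ("Mouse", "M1")]
records = [{"V0": v0, "V1": v1} for v0, v1 in rows]

# Stand-in for the grab step: keep records whose V0 value occurs once.
singletons = [r for r in count_vals(records, "V0") if r["V0_COUNT"] == 1]

print(singletons)  # [{'V0': 'Mouse', 'V1': 'M1', 'V0_COUNT': 1}]
```

Unlike the first-occurrence check, counting cannot emit anything until all records have been read, which is why the sketch buffers the input in a list.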
However, if we combine count_vals and uniq_vals, we can obtain a list of how many times the values of the first column occur:
read_tab -i test.tab | count_vals -k V0 | uniq_vals -k V0_COUNT
V0: Human
V1: H1
V0_COUNT: 3
---
V0: Dog
V1: D1
V0_COUNT: 2
---
V0: Mouse
V1: M1
V0_COUNT: 1
---
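A compact Python sketch of this combined pipeline is given below, under the same assumptions as before: records are dicts, count_vals adds a V0_COUNT field, and uniq_vals keeps the first record per value. Both helper functions are illustrative stand-ins for the tools.

```python
from collections import Counter


def count_vals(records, key):
    """Add a '<key>_COUNT' field to every record that has `key`."""
    records = list(records)
    counts = Counter(r[key] for r in records if key in r)
    return [{**r, f"{key}_COUNT": counts[r[key]]} if key in r else r
            for r in records]


def uniq_vals(records, key):
    """Keep only the first record seen for each value of `key`."""
    seen = set()
    kept = []
    for r in records:
        if key in r and r[key] not in seen:
            seen.add(r[key])
            kept.append(r)
    return kept


rows = [("Human", "H1"), ("Human", "H2"), ("Human", "H3"),
        ("Dog", "D1"), ("Dog", "D2"), ("Mouse", "M1")]
records = [{"V0": v0, "V1": v1} for v0, v1 in rows]

result = uniq_vals(count_vals(records, "V0"), "V0_COUNT")
summary = [(r["V0"], r["V0_COUNT"]) for r in result]

print(summary)  # [('Human', 3), ('Dog', 2), ('Mouse', 1)]
```

Because the uniqueness check is made on V0_COUNT, the sketch yields one record per distinct count. With this table every V0 value has a different count, so each value is represented; if two values shared a count, only the first of them would appear.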
Martin Asser Hansen - Copyright (C) - All rights reserved.
August 2007
GNU General Public License version 2
http://www.gnu.org/copyleft/gpl.html
uniq_vals is part of the Biopieces framework.