
== Tree Compare ==

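I have been using TreePad (http://www.treepad.com/).

I have merged many TreePad files and need to remove duplicate articles.

I have a way of grouping articles and counting identical lines: prefix every line with its article title, depth and line number, sort the result, then count the lines that are the same.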


#
# Lines.awk - used to analyse K1205 rec2Ascii ISUP decode and prepend lines with a sortable string.
#
# usage mawk -f linesPW.awk info.hjt
# usage mawk -f linesPW.awk info.hjt | sort | mawk -f unique.awk
#
# Description: try to analyse ISUP and collate messages to find a profile of usage. 
#	The BIG problem is that the optional parameters are in any order. 
#	It could be thought of as a Tree with leaves of variable size.
#	The branches are variable as well.
#
#
# messages have mandatory, variable mandatory, optional parameters.
#
# parameters are multi-line. 
#
# The objective is to try to totalize unique usages of lines of a parameter.
#
# find blocks of text, reset line number and prepend with block name and line number
#
# sort these lines and then use unique to count duplicate lines.
#
#


BEGIN {
  cm = ""
}

#
# define some rules to find the start of blocks of text. Set / reset want
#


##============================================================
#
# Turn off want events
#


##============================================================
#
# Turn on want events
#


## find parameter name which has four spaces preceding.
#<Treepad version 3.0>
#dt=Text
#<node>
#Personal Notes-pruned
#0
#<end node> 5P9i0s8y19Z
#dt=Text
#<node>
#ADSL tests
#2

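/<Treepad version 3.0>|<end node> 5P9i0s8y19Z/{
  want = 0

  # skip "dt=Text" and "<node>", then read the article title;
  # give up quietly if the file ends first (after the last <end node>)
  if ( (getline) <= 0 ) next
  if ( (getline) <= 0 ) next
  if ( (getline) <= 0 ) next
  msg = $0

  # the next line is the depth of the article in the tree
  if ( (getline) <= 0 ) next
  depth = $0
  cm = ":" msg ":" depth ":"

  # restart numbering at 10000 so line numbers keep the same width when sorted as text
  ln = 10000
  want = 1
}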



##============================================================

#
# While want is set, print each line prefixed with the block name and a line number.
#

( want ) {
  print cm ln " :" $0
  ln   = ln + 1
}
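
To see what the prefixing step produces, here is a small sketch run on a made-up two-article sample. The sample text, the variable names and the condensed inline awk are my assumptions for illustration; it uses plain awk rather than mawk and is not the full script above.

```shell
#!/bin/sh
# Hypothetical two-article sample in the .hjt layout shown in the comments above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<Treepad version 3.0>
dt=Text
<node>
Personal Notes
0
first note
<end node> 5P9i0s8y19Z
dt=Text
<node>
ADSL tests
1
line speed test
second test line
<end node> 5P9i0s8y19Z
EOF

# Condensed stand-in for the prefixing script: remember title and depth,
# then number each following line starting from 10000.
prefixed=$(awk '
/<Treepad version 3\.0>|<end node> 5P9i0s8y19Z/ {
  want = 0
  if ((getline) <= 0) next   # skip dt=Text (or stop at end of file)
  if ((getline) <= 0) next   # skip <node>
  if ((getline) <= 0) next   # article title
  title = $0
  if ((getline) <= 0) next   # depth in the tree
  depth = $0
  cm = ":" title ":" depth ":"
  ln = 10000
  want = 1
}
want { print cm ln " :" $0; ln = ln + 1 }
' "$sample")

printf '%s\n' "$prefixed"
rm -f "$sample"
```

Each article line comes out with a sortable prefix such as `:ADSL tests:1:10001 :line speed test`. The depth line itself is printed as line 10000 of each article, because the print rule also fires on the record left by the last getline.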


The AWK script below finds duplicate adjacent lines and counts them.


#
#
# unique.awk  
# sort op.txt | mawk -f unique.awk >> op_unique.txt
#
#

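BEGIN {
  lastline = ""
  linecnt = 0
}

( NR > 1 && $0 != lastline ) {
  # a new line has started: print the count for the group that just ended
  print linecnt "\t " lastline
  linecnt = 0
}

{
  lastline = $0
  linecnt = linecnt + 1
}

END {
  # print the final group, which the rules above never reach
  if ( NR > 0 ) print linecnt "\t " lastline
}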
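
As a cross-check on the counting step, the standard `sort | uniq -c` pair gives the same kind of per-line counts, although the column layout differs from the AWK script. The sketch below uses made-up prefixed lines for two merged copies of one article; the sample data and variable names are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical prefixed lines: two merged copies of one article,
# where the second copy has one changed line.
prefixed=':ADSL tests:1:10001 :line speed test
:ADSL tests:1:10002 :second test line
:ADSL tests:1:10001 :line speed test
:ADSL tests:1:10002 :changed test line'

# sort puts identical lines next to each other;
# uniq -c prefixes each distinct line with the number of times it occurred
counts=$(printf '%s\n' "$prefixed" | sort | uniq -c)
printf '%s\n' "$counts"

# keep only lines seen more than once, i.e. lines shared by the merged copies
dups=$(printf '%s\n' "$counts" | awk '$1 > 1')
printf '%s\n' "$dups"
```

An article whose lines all have a count of 2 or more is a likely duplicate; lines with a count of 1 show where the copies differ.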



Last edited June 19, 2013 7:18 am by dougrice.plus.com