Shell script for comparing files



  1. #1
    Join Date
    Oct 2003
    Posts
    193

    Shell script for comparing files

    Morning people.

    I am currently writing a shell script to take 2 sets of 10 files from 2 different system environments and run a diff on them to make sure that they are the same and that there are no data inconsistencies between the two.

    I need to be able to pass two arguments to the diff command, but am unsure how I can automate this process in a loop that runs through each file in two separate directories in turn.

    I thought a good way would be to create an ls -1 list of the files in each directory and pass the file list into a loop. This works for one set of files, but how can I do it so I can pass both arguments into the loop?

    I hope this is clear. If not, I can provide more information.

    I thank anyone in advance who can guide me in the right direction.

  2. #2
    Join Date
    Apr 2001
    Location
    SF Bay Area, CA
    Posts
    14,936
    Do the files have the same names in both directories? If so:

    Code:
    for file in /dir1/* ; do
        base=$(basename $file)
    
        diff -u /dir1/$base /dir2/$base
    done >/path/output.txt
    might work.

    If the files don't have the same name in both directories, then how do you decide which files to compare?

  3. #3
    Join Date
    Oct 2003
    Posts
    193
    Thanks, the names are the same and I did it exactly how you described:

    #!/bin/ksh


    TRACS_DIR="TRACS_delta"
    MQ_DIR="MQ_delta"
    TRACS_LIST="tracs_list.$$"
    MQ_LIST="mq_list.$$"

    touch ${TRACS_LIST}
    cd ${TRACS_DIR}
    ls -1 > "../${TRACS_LIST}"
    cd - > /dev/null

    touch ${MQ_LIST}
    cd ${MQ_DIR}
    ls -1 > "../${MQ_LIST}"
    cd - > /dev/null

    # Check that both directories contain the same number of files

    typeset -i NUM_TRACS
    typeset -i NUM_MQ

    NUM_TRACS=`wc -l < "${TRACS_LIST}"`
    NUM_MQ=`wc -l < "${MQ_LIST}"`

    if [ "${NUM_TRACS}" -eq "${NUM_MQ}" ]
    then
        while read LIST
        do
            echo "comparing ${LIST}'s"

            # Sort both copies so that line order does not affect the diff
            sort "${TRACS_DIR}/${LIST}" > /tmp/tracsksh.$$
            sort "${MQ_DIR}/${LIST}" > /tmp/mqksh.$$

            diff /tmp/mqksh.$$ /tmp/tracsksh.$$ >> "results/${LIST}.txt"
        done < "${MQ_LIST}"
    else
        echo "not good"
    fi

    # Remove the list files and the sorted temp files
    rm -f "${TRACS_LIST}" "${MQ_LIST}" /tmp/tracsksh.$$ /tmp/mqksh.$$

    Now I just have to create a separate report file for each pair of files compared, instead of piping it into one big file.

    I'm sure I can figure it out with time, but if someone wants to give me a head start, feel free.

    Thanks for your help.

  4. #4
    Join Date
    Apr 2001
    Location
    SF Bay Area, CA
    Posts
    14,936
    Why are you dumping the file list to another file? That will break as soon as a filename has a newline in it (which is valid in filenames).

    "for file in *" won't break no matter what characters the filenames have in them. (OTOH, you will need quotes around $file and $base inside the for loop. Forgot that the first time around...)

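    A minimal sketch of the "for file in *" loop with every expansion quoted, assuming the files have the same names in both directories. The function name compare_dirs and the "missing in" message are illustrative and not from the thread; ${file##*/} stands in for basename so that unusual filenames pass through unchanged.

```shell
#!/bin/sh
# compare_dirs DIR1 DIR2 (hypothetical helper): diff every regular file in
# DIR1 against the file of the same name in DIR2.
# Returns 0 if every pair matched, 1 otherwise.
compare_dirs() {
    dir1=$1
    dir2=$2
    status=0
    for file in "$dir1"/*; do
        # Skip subdirectories, and the literal pattern if nothing matched
        [ -f "$file" ] || continue
        base=${file##*/}              # strip the directory part
        if [ -f "$dir2/$base" ]; then
            diff -u "$file" "$dir2/$base" || status=1
        else
            echo "missing in $dir2: $base"
            status=1
        fi
    done
    return "$status"
}
```

    It could be called as: compare_dirs /dir1 /dir2 >/path/output.txt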
  5. #5
    Join Date
    Oct 2003
    Posts
    193
    Hi, I have no idea what you mean, bkwaz. Could you explain it to me a little more clearly, please?

  6. #6
    Join Date
    Apr 2001
    Location
    SF Bay Area, CA
    Posts
    14,936
    OK, let's say I do something like this:

    Code:
    $ mkdir tempdir
    $ cd tempdir
    $ touch 'file with a
    > newline in its name'
    $ ls -1
    file with a?newline in its name
    $ ls -1 >../filelist
    $ cat ../filelist
    file with a
    newline in its name
    $ while read var ; do
    >  echo "I read a line.  var is set to:"
    >  echo "$var"
    > done <../filelist
    I read a line.  var is set to:
    file with a
    I read a line.  var is set to:
    newline in its name
    $ for file in * ; do
    >  echo "got a file.  it was named:"
    >  echo "$file"
    > done
    got a file.  it was named:
    file with a
    newline in its name
    $
    As you can see, when I dump the results of "ls -1" to a file, then read them back, the newline in the filename screws stuff up (the "read" executes twice, not once). But when I use "for file in *", it works fine.

  7. #7
    Join Date
    Oct 2003
    Posts
    193
    Thanks, I understand what you're saying, but I'm a little confused about how I would set file as the variable in the loop. Would I still have to redirect it into the loop?

  8. #8
    Join Date
    Apr 2001
    Location
    SF Bay Area, CA
    Posts
    14,936
    Quote Originally Posted by Linux_cat
    I'm a little confused about how I would set file as the variable in the loop.
    You don't.

    First, the shell will expand * into a list of words, with each word containing one filename. (Below, I will refer to this list as [word-list].)

    Then, the shell will execute "for file in [word-list] ; do", which starts a for loop. When running that loop, the shell sets $file equal to each word in [word-list] in turn. So you don't have to set it manually; the shell does it for you.

    And since the list has a separate item for each filename, it will work with any combination of legal filenames.

    Would I still have to redirect it into the loop??
    No, because the data is coming from the shell, not the extra files (${TRACS_LIST} and ${MQ_LIST} in your script). Those files won't need to be created anymore.
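
    As a rough sketch of what that could look like for the script in post #3: the list files are gone, the glob drives the loop, and each pair gets its own report file. The function name compare_sorted, the status messages, and the use of mktemp -d (widely available, but not required by POSIX) are assumptions made for illustration.

```shell
#!/bin/sh
# compare_sorted TRACS_DIR MQ_DIR RESULTS_DIR (hypothetical helper):
# sort each pair of same-named files and write the diff of the sorted
# copies to RESULTS_DIR/<name>.txt.
compare_sorted() {
    tracs_dir=$1
    mq_dir=$2
    results_dir=$3
    mkdir -p "$results_dir" || return 1
    tmp=$(mktemp -d) || return 1
    for file in "$mq_dir"/*; do
        [ -f "$file" ] || continue
        base=${file##*/}
        if [ ! -f "$tracs_dir/$base" ]; then
            echo "no matching file for $base in $tracs_dir" >&2
            continue
        fi
        sort "$file" > "$tmp/mq"
        sort "$tracs_dir/$base" > "$tmp/tracs"
        # diff exits 0 when the sorted copies are identical
        if diff "$tmp/mq" "$tmp/tracs" > "$results_dir/$base.txt"; then
            echo "$base: identical"
        else
            echo "$base: differs, see $results_dir/$base.txt"
        fi
    done
    rm -rf "$tmp"
}
```

    It could be called as: compare_sorted TRACS_delta MQ_delta results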

  9. #9
    Join Date
    Oct 2003
    Posts
    193
    Right, I get you. So in order for the shell to pick up the word list, it needs to be run inside a directory with one set of the files that I need to compare?

    I'm getting it (I think). Thank you for the advice, it will come in very handy.

  10. #10
    Join Date
    Oct 2003
    Posts
    193
    Also, is there an option in diff that lets you ignore the order in which the lines in the file are placed? I want to run the comparison and just check for missing lines and lines that differ, as opposed to lines that are in both files but in a different order.

    Is that possible?

    I have read the manpage and it doesn't give a very detailed account of how to read the standard output from the diff command. Could you point me to a resource on the web that might have this information?

    Thanks Again.

  11. #11
    Join Date
    Apr 2001
    Location
    SF Bay Area, CA
    Posts
    14,936
    Yikes.... no, I don't know how to do that.

    Perhaps run both the files through "sort" first, and then run diff on them? That'd require at least one pair of temp files...
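
    One way to sketch the sort-first idea is with comm, which compares two sorted files line by line and can print only the lines unique to each side. This is a swapped-in alternative to diff, not something suggested in the thread; the function name unordered_diff and the output prefixes are illustrative, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# unordered_diff FILE1 FILE2 (hypothetical helper): list lines that appear
# in only one of the two files, ignoring the order of lines.
unordered_diff() {
    tmp=$(mktemp -d) || return 2
    # Use the same collation for sort and comm so they agree on ordering
    LC_ALL=C sort "$1" > "$tmp/a"
    LC_ALL=C sort "$2" > "$tmp/b"
    # -23 keeps lines unique to the first file, -13 lines unique to the second
    LC_ALL=C comm -23 "$tmp/a" "$tmp/b" | sed 's/^/only in first:  /'
    LC_ALL=C comm -13 "$tmp/a" "$tmp/b" | sed 's/^/only in second: /'
    rm -rf "$tmp"
}
```

    Two files with the same lines in a different order produce no output at all. A changed line shows up twice, once under each prefix.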

    it needs to be run inside a directory with one set of the files that I need to compare?
    Yes, it does. Or, you can put the path in front of the *, like so:

    for file in /home/you/whatever/dir1/* ; do

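    This makes every word in the list longer, which matters if the expanded list is later passed to an external command, since that is where the system limit on a command's length applies. The for loop itself is expanded inside the shell and should not hit that limit.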
