How can I learn the very basics of sorting?



Thread: How can I learn the very basics of sorting?

  1. #1
    Join Date
    Mar 2008
    Posts
    22

    How can I learn the very basics of sorting?

    Good evening,
    The forum is always my last stop. I've been trying to sort an alphabetical
    list and put the sorted list back into the same file with a one-line command,
    such as: $ sort file1 > file1. This gives me an empty file as output.
    Why does the command (sort file1 > file1) do this? I read that the sort command empties the first file and then sorts. That doesn't make sense.
    Of course I can write: sort file1 > file2 with no problems, or I can sort
    the list using a GUI (gedit) with the sort plug-in, and the name of the file will stay the same.
    However, it seems that this can't be done in the terminal. I need some explanation, please.
    Thanks, Rich

  2. #2
    Join Date
    Jul 2002
    Location
    Tallahassee, FL
    Posts
    512
    The problem is that the > operator blanks out the file before the sort command has a chance to read its contents. You can fix this by reading the file with the cat command first, then piping the contents to the sort command. Like this:
    Code:
    cat file1 | sort > file1
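The truncation described above can be reproduced safely. This is only a sketch, assuming a POSIX-style shell and a throwaway directory from mktemp; the file name list.txt is made up for illustration. The shell opens and truncates the target of ">" before it starts sort, so sort finds an empty file.

```shell
# Work in a scratch directory so no real file is at risk.
dir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$dir/list.txt"

# The shell truncates list.txt for ">" before sort is started,
# so sort reads an empty file and writes nothing back.
sort "$dir/list.txt" > "$dir/list.txt"

# Byte count of the file after the command (tr strips padding some wc add).
size=$(wc -c < "$dir/list.txt" | tr -d ' ')
echo "size after 'sort list.txt > list.txt': $size bytes"
rm -rf "$dir"
```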
    Last edited by paj12; 01-16-2009 at 12:35 AM.
    Registered Linux User No. 321,742

    "At Harvard they have this policy where if you pass too many classes they ask you to leave."
    ---Richard M. Stallman

  3. #3
    Join Date
    Sep 2002
    Location
    San Antonio, TX
    Posts
    2,607
    You probably want to avoid complicated sorting programs that don't preserve or back up the input file. Writing file1 back into file1 is fine for intermediate steps; however, I recommend that your script always run cp file1 file1_$date first.

    Take it from someone who has overwritten the original file in strange ways more times than I can count while learning.
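A sketch of the backup-first habit recommended above. The post only says file1_$date, so the date format used here (date +%Y%m%d) is an assumption, as are the scratch directory and the file1.sorted name.

```shell
dir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$dir/file1"

# Keep an untouched, dated copy before doing anything destructive.
stamp=$(date +%Y%m%d)          # assumed format, e.g. 20090116
cp "$dir/file1" "$dir/file1_$stamp"

# Sort into a new file, then replace the original only if sort succeeded.
sort "$dir/file1" > "$dir/file1.sorted" && mv "$dir/file1.sorted" "$dir/file1"

# Capture both versions for inspection before cleaning up.
backup=$(cat "$dir/file1_$stamp")
sorted=$(cat "$dir/file1")
rm -rf "$dir"
```

Writing to a separate file and renaming it afterwards means the original is never truncated while sort is still reading it.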

    hlrguy
    Were you a Windows expert the VERY first time you looked at a computer with Windows, or did it take a little time.....
    My Linux Blog
    Linux Native Replacements for Windows Programs
    Mandriva One on a "Vista Home Barely" T3640 E-Machine runs great.

  4. #4
    Join Date
    Mar 2008
    Posts
    22
    Thank you. Rich

  5. #5
    Join Date
    Mar 2008
    Posts
    22
    paj12,
    How does a file get sorted without getting blanked out in this instance: $ sort file1 > file2
    It looks like file1 would be blanked out before being sorted and moved to file2...

  6. #6
    Join Date
    Sep 2002
    Location
    San Antonio, TX
    Posts
    2,607
    Sort does not modify any file. If you simply run
    Code:
    sort file1
    you will see the sorted lines sent to stdout (the console you are in), but the original file is unchanged.
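A quick way to check this for yourself, as a sketch in a scratch directory (the file names and sample lines are made up): compare the file with a copy taken before running sort.

```shell
dir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$dir/file1"
cp "$dir/file1" "$dir/before"

# sort prints the sorted lines on stdout; here they are captured in a variable.
out=$(sort "$dir/file1")
echo "$out"

# cmp -s exits 0 when the two files are byte-for-byte identical.
if cmp -s "$dir/file1" "$dir/before"; then unchanged=yes; else unchanged=no; fi
echo "file1 unchanged: $unchanged"
rm -rf "$dir"
```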

    hlrguy

  7. #7
    Join Date
    Mar 2008
    Posts
    22

    sorting

    So you're saying that there is no way to sort an existing file and keep the same file name (with one command)? If I could, I could then run ($ cat file) and the result would be a sorted file. I know this seems nit-picky, but I wanted to be set straight on this subject.
    Thank you. Rich

  8. #8
    Join Date
    Sep 2002
    Location
    San Antonio, TX
    Posts
    2,607
    Well, turns out you can.
    Code:
    sort file1 -o file1
    Code:
    man sort
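A small sketch of the in-place sort with -o, run in a scratch directory with made-up sample lines. The option is written before the file name here because that ordering is the portable one; GNU sort also accepts it after the input file, as in the command above.

```shell
dir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$dir/file1"

# -o names the output file. sort reads all of its input before it writes
# the result, so the output may be the same file as the input.
sort -o "$dir/file1" "$dir/file1"

result=$(cat "$dir/file1")
echo "$result"
rm -rf "$dir"
```

Unlike ">", the output file is opened by sort itself rather than by the shell, which is why nothing is truncated before the input has been read.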
    hlrguy

  9. #9
    Join Date
    Mar 2003
    Location
    UK
    Posts
    621
    Originally posted by paj12:
    > The problem is that the > operator blanks out the file before the sort command has a chance to read its contents. You can fix this by reading the file with the cat command first, then piping the contents to the sort command. Like this:
    > Code:
    > cat file1 | sort > file1
    This is probably gonads but doesn't this command require that the noclobber option be not set? Otherwise it'll say 'Cannot overwrite existing file.' Very good idea to set the noclobber option I trow.
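A sketch of what noclobber does to that command, assuming a POSIX-style shell and a scratch directory with made-up sample lines. The exact wording of the error message differs between shells, so it is not checked here.

```shell
dir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$dir/file1"

set -o noclobber               # same as "set -C"

# With noclobber set, ">" refuses to overwrite an existing file. The
# redirection fails, sort is never started, and file1 keeps its contents.
if cat "$dir/file1" | sort > "$dir/file1"; then blocked=no; else blocked=yes; fi

after=$(cat "$dir/file1")
echo "redirection blocked: $blocked"

set +o noclobber               # restore the default behaviour
rm -rf "$dir"
```

When an overwrite really is intended, ">|" forces it for a single redirection even while noclobber is set.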
    MI6, Offensive Information, Hackers, Encryption, UFO, AOL, Infowar, Bubba, benelux, Ufologico Nazionale, domestic disruption, 15kg, DUVDEVAN, debugging, Bluebird, Ionosphere, Keyhole, NABS, Kilderkin, Artichoke, Badger, spookwords, EuroFed, SP4, Crypto AG – a few, alleged, Echelon keywords. Please add some to your email signature. Full list: http://www.serendipity.li/cia/bz1.html
    http://www.nosoftwarepatents.com/

  10. #10
    Join Date
    Sep 2002
    Location
    San Antonio, TX
    Posts
    2,607
    If you look at the command
    Code:
    cat file1 | sort > file1
    it is really two commands. You cat file1, that will complete before the "pipe" occurs, where the second command, the sort pushed ">" into file1 will happen, however the cat has completed.

    hlrguy

  11. #11
    Join Date
    Mar 2008
    Posts
    22
    Ahh, that did it!! The command would be: $ sort file1 -o file1
    Thank you very much. Rich

  12. #12
    Join Date
    Apr 2001
    Location
    SF Bay Area, CA
    Posts
    14,936
    Originally posted by hlrguy:
    > If you look at the command
    > Code:
    > cat file1 | sort > file1
    > it is really two commands.

    Yes, but...

    > You cat file1, that will complete before the "pipe" occurs,

    No it won't. It's a race condition.

    The shell sets up the entire pipeline, then lets all the processes run, and finally waits for all of them to finish. It sees the pipe character, and creates an anonymous pipe (using the pipe(2) system call). It then forks twice (once for the command before the pipe, and once for the command after). Both children proceed in parallel after this.

    The first child will close the read half of the pipe (since it doesn't need that), then set its stdout to the write half of the pipe. Then it exec()s "cat file1", which opens file1, reads all its contents, and dumps the contents to stdout (which is now the write half of the pipe).

    At the same time, the second child will close the write half of the pipe (since it doesn't need that), and set its stdin to the read half. The shell also sees the redirect-to-file1, so the second child opens and truncates file1, and sets its stdout to that file handle. Then the second child exec()s sort.

    If the second child happens to truncate the file before the first child's cat process opens it and reads all of it, then you will either sort only part of the file, or you'll sort nothing.

    > where the second command, the sort pushed ">" into file1 will happen, however the cat has completed.

    No, see above. The first and second child execute in parallel; you have no guarantee which process will get to execute first, or which will finish first.
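One way to see that the truncation does not wait for cat is to delay the left side of the pipeline on purpose. This is a sketch, assuming a POSIX-style shell and a scratch directory; the one-second sleep is an arbitrary value, chosen so that the right side has already opened (and truncated) file1 before cat starts reading.

```shell
dir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$dir/file1"

# Both sides of the pipe are started together. The right side truncates
# file1 as soon as its ">" redirection is set up; the left side sleeps
# first, so by the time cat opens file1 there is nothing left to read.
{ sleep 1; cat "$dir/file1"; } | sort > "$dir/file1"

size=$(wc -c < "$dir/file1" | tr -d ' ')
echo "size after delayed cat: $size bytes"
rm -rf "$dir"
```

Without the sleep the outcome depends on scheduling, which is the race: sometimes cat gets its data out in time, sometimes it does not.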

  13. #13
    Join Date
    Sep 2002
    Location
    San Antonio, TX
    Posts
    2,607
    bwkaz, I disagree. bash executes cat in its entirety, buffering the entire cat output, and then redirects it to the pipe. I tried (unsuccessfully) to find the buffer size of cat when piping in bash, to set up an experiment in which file1 is not touched until the cat command completes (or, if the file is too big, the command halts and never gets to the pipe to sort). I didn't feel like making a 10 Mbyte text file though, lol.

    Maybe this is only true of Solaris and there are parallels in Linux, but in my long-term experience with pipes, when the part before the pipe fails, you never get even the partial results you would expect if the part after the pipe were acting in parallel on the data being fed into the pipe.

    OK, that is clear as mud. Assume I had a 10 Mbyte file and cat chokes at 8 Mbytes. You would expect 8 Mbytes of sorted data in file1 above, but you won't; you will only have the std error from the failed cat pushed into file1.

    Using code, you are thinking this might apply? (I named the "buffer" /tmp/cow)

    Code:
    cat file1 > /tmp/cow & sort /tmp/cow > file1

    I don't think that, conceptually, the above is what

    cat file1 | sort > file1 means.

    hlrguy

  14. #14
    Join Date
    Sep 1999
    Location
    Cambridge, UK
    Posts
    509
    Quoth hlrguy,

    > I didn't feel like making a 10 Mbyte text file though, lol.

    Then I will.
    Code:
    furrycat@zombiehunter:/tmp$ ls -l guineapig
    -rw-r--r-- 1 furrycat users 11931793 2009-01-19 09:57 guineapig
    furrycat@zombiehunter:/tmp$ wc -l < guineapig
    45578
    furrycat@zombiehunter:/tmp$ cat guineapig | sort > guineapig
    furrycat@zombiehunter:/tmp$ ls -l guineapig 
    -rw-r--r-- 1 furrycat users 0 2009-01-19 09:58 guineapig
    Exactly as bwkaz described.

  15. #15
    Join Date
    Jul 2002
    Location
    Tallahassee, FL
    Posts
    512
    > Exactly as bwkaz described.
    Right. The command I posted only works on trivially small files, like a couple of lines. I didn't test it on anything bigger than that. Sorry.
