-
How can I learn the very basics of sorting?
Good evening,
The forum is always my last stop. I've been trying to sort an alphabetical
list and put the sorted list back into the same file with a one-line command,
such as $ sort file1 > file1. This gives me an empty file as output.
Why does the command (sort file1 > file1) do this? I read that the sort command empties the file first, then sorts. That doesn't make sense.
Of course I can write sort file1 > file2 with no problems, or I can sort
the list in a GUI editor (gedit) with the sort plug-in, and the name of the file will stay the same.
However, it seems that this can't be done in the terminal. I need some explanation, please.
Thanks Rich
-
The problem is that the > operator blanks out the file before the sort command has a chance to read its contents. You can fix this by reading the file with the cat command first, then piping the contents to the sort command. Like this:
Code:
cat file1 | sort > file1
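To watch the truncation on its own, here is a minimal sketch, assuming a POSIX shell with mktemp available. The scratch directory and the names demo_dir and bytes_left are made up for illustration. It runs the original sort file1 > file1 form on a throwaway file and counts what is left afterwards.

```shell
# Scratch directory so no real file is touched (names are illustrative only).
demo_dir=$(mktemp -d)
printf 'pear\napple\nmango\n' > "$demo_dir/file1"

# The shell opens and truncates file1 for ">" before sort is started,
# so sort reads an already-empty file.
sort "$demo_dir/file1" > "$demo_dir/file1"

# Count the bytes left in the file after the command.
bytes_left=$(wc -c < "$demo_dir/file1" | tr -d ' ')
echo "bytes left in file1: $bytes_left"
```

In this single-command form the redirection is always applied before sort runs, so the count should come out as 0 every time.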
Last edited by paj12; 01-16-2009 at 12:35 AM.
Registered Linux User No. 321,742
"At Harvard they have this policy where if you pass too many classes they ask you to leave."
---Richard M. Stallman
-
You probably want to avoid complicated sorting commands that don't preserve or back up the input file. Writing from file1 back into file1 is fine for all intermediate steps; however, I recommend your script always run cp file1 file1_$date first.
Take it from someone who, while learning, has overwritten the original file in strange ways more times than I can count.
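A rough sketch of that backup-first habit, assuming a POSIX shell. The timestamp format and the names demo_dir, stamp and backup are my own choices for illustration, not anything required.

```shell
# Throwaway directory and sample file (illustrative names only).
demo_dir=$(mktemp -d)
printf 'pear\napple\nmango\n' > "$demo_dir/file1"

# Timestamp suffix for the backup; the format is an arbitrary choice.
stamp=$(date +%Y%m%d_%H%M%S)
backup="$demo_dir/file1_$stamp"

# Copy first; only run the sort if the copy succeeded.
# Sorting FROM the backup means ">" truncates a file sort is not reading.
cp "$demo_dir/file1" "$backup" &&
    sort "$backup" > "$demo_dir/file1"

cat "$demo_dir/file1"
```

If anything goes wrong, the untouched original is still sitting in the timestamped copy.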
hlrguy
-
paj12,
How does a file get sorted without getting blanked out in this instance: $ sort file1 > file2
It looks like file1 would be blanked out before being sorted and moved to file2...
-
Sort does not modify any file. If you simply run sort file1,
you would see the contents sent to stdout (the console you are in) sorted, but the original file is unchanged.
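Here is a small sketch of that, assuming a POSIX shell. The directory and variable names (demo_dir, before, after) are invented for the example. It sorts to the terminal and then to a second file, and checks that file1 is the same before and after.

```shell
# Throwaway directory and sample file (illustrative names only).
demo_dir=$(mktemp -d)
printf 'pear\napple\nmango\n' > "$demo_dir/file1"
before=$(cat "$demo_dir/file1")

# Output goes to the terminal; file1 is only read.
sort "$demo_dir/file1"

# Output goes to a different file; ">" truncates file2, not file1.
sort "$demo_dir/file1" > "$demo_dir/file2"

after=$(cat "$demo_dir/file1")
[ "$before" = "$after" ] && echo "file1 unchanged"
```

The only file the shell empties is the one named after ">", which is why writing to file2 is safe.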
hlrguy
-
sorting
So you're saying that there is no way to sort an existing file and keep the same file name (with one command)? If I could, I could then run $ cat file and the result would be a sorted file. I know this seems nit-picky, but I wanted to be set straight on this subject.
thank you ....Rich
-
Well, turns out you can.
Code:
sort file1 -o file1
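A quick way to check this on a scratch file, assuming a POSIX shell (demo_dir is just a name picked for the example). In this sketch -o is written before the input file. As far as I know, the GNU documentation recommends that order for portable scripts, since some systems do not accept options after the file name.

```shell
# Throwaway directory and sample file (illustrative names only).
demo_dir=$(mktemp -d)
printf 'pear\napple\nmango\n' > "$demo_dir/file1"

# -o names the output file. POSIX allows it to be one of the input files,
# so no shell redirection (and no early truncation) is involved.
sort -o "$demo_dir/file1" "$demo_dir/file1"

cat "$demo_dir/file1"
```

The difference from the > form is that sort itself opens the output file, so it can read its input first.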
hlrguy
-
Originally Posted by paj12
The problem is that the > operator blanks out the file before the sort command has a chance to read its contents. You can fix this by reading the file with the cat command first, then piping the contents to the sort command. Like this:
Code:
cat file1 | sort > file1
This is probably gonads, but doesn't this command require that the noclobber option not be set? Otherwise it'll say 'cannot overwrite existing file'. Very good idea to set the noclobber option, I trow.
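For what it's worth, a minimal sketch of noclobber in action, assuming a POSIX shell. The names demo_dir and overwritten are invented here, and the exact error text differs between shells, so it is sent to /dev/null rather than shown.

```shell
# Throwaway directory and sample file (illustrative names only).
demo_dir=$(mktemp -d)
printf 'pear\napple\n' > "$demo_dir/file1"

# Run in a subshell so noclobber does not leak into the calling shell.
if (
    set -o noclobber
    # With noclobber set, ">" refuses to overwrite an existing regular file,
    # so the redirection fails and sort is never run.
    sort "$demo_dir/file1" > "$demo_dir/file1"
) 2>/dev/null; then
    overwritten=yes
else
    overwritten=no
fi

echo "overwritten: $overwritten"
cat "$demo_dir/file1"
```

With the option set, the file keeps its original, unsorted contents, and >| is the explicit way to overwrite anyway.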
-
If you look at the command
Code:
cat file1 | sort > file1
it is really two commands. You cat file1, that will complete before the "pipe" occurs, and then the second command, the sort redirected by ">" into file1, will happen; however, the cat has completed.
hlrguy
-
Ahh, that did it!! The command would be.... $ sort file1 -o file1.
Thank you very much..Rich
-
Originally Posted by hlrguy
If you look at the command
Code:
cat file1 | sort > file1
it is really two commands.
Yes, but...
You cat file1, that will complete before the "pipe" occurs,
No it won't. It's a race condition.
The shell sets up the entire pipeline, then lets all the processes run, and then waits for all of them to finish. It sees the pipe character, and creates an anonymous pipe (using the pipe(2) system call). It then forks twice (once for the command before the pipe, and once for the command after). Both children proceed in parallel after this.
The first child will close the read half of the pipe (since it doesn't need that), then set its stdout to the write half of the pipe. Then it exec()s "cat file1", which opens file1, reads all its contents, and dumps the contents to stdout (which is now the write half of the pipe).
At the same time, the second child will close the write half of the pipe (since it doesn't need that), and set its stdin to the read half. The shell also sees the redirect-to-file1, so the second child opens and truncates file1, and sets its stdout to that file handle. Then the second child exec()s sort.
If the second child happens to truncate the file before the first child's cat process opens it and reads all of it, then you will either sort only part of the file, or you'll sort nothing.
and then the second command, the sort redirected by ">" into file1, will happen; however, the cat has completed.
No, see above. The first and second child execute in parallel; you have no guarantee which process will get to execute first, or which will finish first.
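One way to see the two sides starting together, as a rough sketch: it assumes a POSIX shell and a date that supports +%s (GNU and BSD date do, but it is not strictly POSIX). The names start and right_started_at are made up for the example. The left side sleeps for 2 seconds before writing anything, while the right side records how long after launch it began.

```shell
start=$(date +%s)

# Left side sleeps 2 seconds before writing. If the right side only started
# after the left side had finished, its start offset would be about 2.
right_started_at=$(
    { sleep 2; echo "left done"; } |
    { echo $(( $(date +%s) - start )); cat > /dev/null; }
)

echo "right side started ${right_started_at}s after launch"
```

The right side reports its start time well before the left side has produced any output, which is the same window in which a redirect on the right side can truncate a file the left side has not read yet.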
-
bwkaz, I disagree. bash executes cat in its entirety, buffering the entire cat results, and then redirects to the pipe. I tried (unsuccessfully) to find the buffer size of cat when piping using bash, to set up an experiment where file1 is not touched until the cat command completes (or, in the case of a file too big for the buffer, the command halts and never gets to the pipe to sort). I didn't feel like making a 10 Mbyte text file though, lol.
Maybe this is only true of Solaris and there are merely parallels in Linux, but in my long-term experience with pipes, when the part before the pipe fails, you never get even the partial results you would expect if the part after the pipe were acting in parallel on the data being fed into the pipe.
OK, that is clear as mud. Assume I had a 10 Mbyte file and cat chokes at 8 Mbytes. You would expect 8 Mbytes of sorted data in file1 above, but you won't; you will only have the std error from the failed cat pushed into file1.
Using code, are you thinking this might apply? (I named the "buffer" /tmp/cow)
Code:
cat file1 > /tmp/cow & sort /tmp/cow > file1
I don't think, conceptually, the above is what
cat file1 | sort > file1 means
hlrguy
-
Quoth hlrguy,
> I didn't feel like making a 10 Mbyte text file though, lol.
Then I will.
Code:
furrycat@zombiehunter:/tmp$ ls -l guineapig
-rw-r--r-- 1 furrycat users 11931793 2009-01-19 09:57 guineapig
furrycat@zombiehunter:/tmp$ wc -l < guineapig
45578
furrycat@zombiehunter:/tmp$ cat guineapig | sort > guineapig
furrycat@zombiehunter:/tmp$ ls -l guineapig
-rw-r--r-- 1 furrycat users 0 2009-01-19 09:58 guineapig
Exactly as bwkaz described.
-
Exactly as bwkaz described.
Right. The command I posted only works on trivially small files, like a couple of lines. I didn't test it on anything bigger than that. Sorry.