cs.utexas.edu!convex!convex!tchrist Thu Jul 16 18:12:32 CDT 1992

>From the keyboard of nlane@well.sf.ca.us (Nathan D. Lane):
:Hello all,
:   Having struggled with this all weekend and not figured it out
:even with the help of the Camel Book, the manpage, or the FAQ, I've
:decided it's time to post. Could someone please tell me how to remove
:characters such as the bullet and foreign characters from a file? I'm
:trying to convert files from a CPT 8525 word processing system into a
:format that makes more sense on an IBM RS/6000 220 or 340 or a Sun 3
:or Sparc. I need to remove invalid control characters and characters with
:the 8th bit set. I don't want to remove linefeed (^J), however. I'd
:LOVE to have it convert the codes to troff (or so my husband tells me :-)
:..any ideas? If this is too trivial a question to answer with a post, I'd
:still really appreciate email. Thanks in advance for *any* replies!

In general, it's hard to know how to fix up a file with wordprocessing
magic in it unless you've specialized in said magic. But if all you
want to do is throw away the stuff you don't recognize, you can
in-place edit files using perl this way:

    perl -i.bak -p -e 'y/\000-\200-\377//d' file1 file2 file3 ...

which strips high-bit characters. If you want to remove all the
nonprintables except for tab and newline, you could do this:

    y/\000-\010\013-\037\177-\377//d;

I skipped characters 011 and 012 because they're \t and \n.

--tom
--
    Tom Christiansen      tchrist@convex.com      convex!tchrist
    signal(i, SIG_DFL); /* crunch, crunch, crunch */
        --Larry Wall in doarg.c from the perl source code

cs.utexas.edu!convex!convex!tchrist Thu Jul 16 18:12:45 CDT 1992

I wrote:
:    perl -i.bak -p -e 'y/\000-\200-\377//d' file1 file2 file3 ...

But that won't even compile. I meant to write something more like:

    perl -i.bak -p -e 'y/\000-\037\200-\377//d' file1 file2 file3 ...

--tom
--
    Tom Christiansen      tchrist@convex.com      convex!tchrist
    Real programmers can write assembly code in any language.  :-)
        --Larry Wall in <8571@jpl-devvax.JPL.NASA.GOV>
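
For anyone comparing the two translations above, here is a minimal
sketch of what each one does to a sample string. The helper names
strip_keep_tab_nl and strip_all_controls and the sample text are made
up for illustration; they are not from the original posts. One thing
the sketch makes visible: the range \000-\037 in the corrected
one-liner covers tab (011) and newline (012), so that form deletes
line endings as well, while the longer translation keeps them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original posts): delete control
# characters, DEL, and high-bit bytes, keeping tab (011) and newline (012).
sub strip_keep_tab_nl {
    my ($s) = @_;
    $s =~ y/\000-\010\013-\037\177-\377//d;
    return $s;
}

# Hypothetical helper: the translation from the corrected one-liner.
# Its first range, \000-\037, includes tab and newline, so both go too.
sub strip_all_controls {
    my ($s) = @_;
    $s =~ y/\000-\037\200-\377//d;
    return $s;
}

# Assumed sample input: \351 is a high-bit byte, \007 is BEL.
my $sample = "caf\351\tau lait\007\n";

print strip_keep_tab_nl($sample);          # tab and newline survive
print strip_all_controls($sample), "\n";   # everything runs together
```

If the line structure of the file matters, as it did for the original
poster, the first helper's translation is probably the one to put
after -p -e in the in-place edit.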