Older releases in the 0.92.x, 0.91.x and 0.90.x series are still available,
but there is no reason to prefer them to the current one.
Browse the CVS repository
MD5 checksums:
c021d2e30318bea063133191122676e5 catdoc-0.93.3.tar.gz
afbde32d1593c7e8eaf42f4ba5460b90 catdoc-0.93.3.zip
Forget about release 0.93.2; it was buggy.
e08eb3c709de8d6dc54df03cd79a3192 catdoc-0.93.1.tar.gz
90d2fba000463f12a267e758fd2fb35d catdoc-0.93.1.zip
23c98aa829cf69aeb5e96d81a70cb84f catdoc-0.93.tar.gz
bdd96bc3629dc6400dc3e4da093f2807 catdoc-0.93.zip
460ee1aaaa34363b2cfb56748a14a55d catdoc-0.91.6.tar.gz
67974a635c143b03124987889cd6434c catdoc-0.91.6.zip
1314239a3d9c9c7bfda608dbcdc33e3f catdoc-0.91.5.tar.gz
bfc146724ad45ba1287eb5466882670a catdoc-0.91.4.tar.gz
84f9fea198f71bec66c6bed2a612e86c catdoc-0.91.3.tar.gz
edfaedb7b60ff6336b03f67c16dd4c60 catdoc-0.91.2.tar.gz
6d44fb20f2fb2365fbc26e5753b4a8bf catdoc-0.91.1.tar.gz
13fc1cafdd7f2733a28ff4b0e28a52bd catdoc-0.91.tar.gz
54f3b3789d241a346378a13023f624b7 catdoc-0.90.3.tar.gz
2d90577df365408051e489ab93c051c2 catdoc-0.90.3.zip
If you are verifying checksums under DOS or Windows, please make sure
that your md5sum utility opens files in binary mode. DOS-based md5sum
utilities typically have a special command-line switch for this.
Documentation
The catdoc distribution includes the man page in troff source form, along
with PostScript and plain-text versions of it.
HTML versions of the man pages for catdoc(1), wordview and xls2csv are available here.
See also the catdoc FAQ below.
Status of this release
Catdoc 0.90 is a complete rewrite from scratch. It has been tested
at least on MS-DOS, Linux, BSDI and Solaris.
I kindly ask users to contribute replacement and substitution maps for
their beloved characters.
The current version of the substitution maps can be downloaded separately.
So, if you are using catdoc beta1, you don't need to download the whole
distribution or even recompile. Just get ascii.rpl and tex.rpl and replace
the ones provided in the distribution.
Bug and success reports are also welcome.
What to do if catdoc doesn't read your Word file correctly
- Q: I've compiled catdoc and decided to test it before installing,
but it complains about a missing charset file although all the files are
in place.
A: Catdoc is not designed to work without proper installation. You can
work around this by creating a ${HOME}/.catdocrc file and specifying the
path to the charsets in it (see the manual for the syntax), but you'll also
need to create symlinks to the ascii.spc and ascii.rpl files in that
directory, named ascii.specchars and ascii.replchars respectively.
It is simpler to let make install do it for you.
- Q: Catdoc does something strange with my accented characters.
A: Have you specified the correct input and output charsets? By default,
catdoc comes with Cyrillic charsets configured in, which is probably
not what you want if you are not Russian. See the charset correspondence
table. Note also that Word files almost never use ISO8859-* charsets. They
use cp* charsets, which have additional punctuation characters in the range
0x80-0x9F. Catdoc will probably find a reasonable substitution for them,
provided it knows the proper charset of the document.
- Q: I've successfully compiled catdoc-0.90.2 on SunOS 4.x, but it
doesn't output any meaningful text.
A: Catdoc uses the %x format specifier to read charsets and substitution
maps, but on SunOS 4.x %x doesn't handle the leading 0x in hex numbers.
It should be replaced by %i wherever it occurs in the functions
read_charset and read_substmap. This was addressed in 0.90.3.
- Q: Catdoc breaks lines in arbitrary places and eats characters at the
end of the line.
A: Running MS-DOS, aren't you? This is a bug in the isspace
implementation in Turbo C, which thinks that all characters with the eighth
bit set are spaces. This is (hopefully) fixed in 0.90.3.
- Q: Catdoc doesn't work at all - it just complains about some missing file,
but the file is in the same directory as the executable.
A: You are running an MS-DOS system, aren't you? On MS-DOS, pkunzip has the
crazy default behavior of putting all the files from the archive into one
directory, without reproducing the directory structure stored in the
archive. Unpack with pkunzip -d catdoc.zip and all will be OK. Support files
should go into a special subdirectory, not where the executable resides.
If you don't agree with me, you can override this in the catdoc.rc file.
- If there are a few garbage lines on screen, try the -u
switch. Catdoc doesn't detect Word 8 automatically (suggestions
welcome).
- If there is a lot of garbage on screen (it seems that catdoc just
dumps the file to stdout), try the -b switch. Maybe you are
trying to read a broken file or a file from a very old version of Word,
which doesn't have a correct OLE signature.
- If catdoc segfaults mysteriously, first try to recompile it
without optimization (remove -O from FLAGS in the Makefile). There are known
problems with some versions of gcc on some platforms, HP/UX 9.x being one
example. If that doesn't help, it is a bug. Write a bug report if
you cannot find it yourself, or send me a patch if you have fixed it.
- If you get a screen full of question marks, or text where letters are
mixed in random order but words and paragraphs look sensible, you
are probably using an incorrect input or output charset. Play with the
-s and -d options, perhaps using wordview.
- Catdoc replaces some non-alphanumeric characters with question
marks.
  - Find out which Unicode characters are unsupported. You can do so
  by comparing the second column in the charset files for your input and
  output charsets. Typically all characters with codes above 0x2000
  are suspicious.
  - Edit your substitution map file (ascii.replchars/ascii.rpl or
  tex.replchars/tex.rpl) and add the correct replacement sequences,
  according to the Unicode name of each character.
  - Send me a patch to be included in the next beta version.
- Catdoc produces incorrect TeX commands.
  - Find out which substitution map contains the incorrect sequence.
  There are only two files named tex.something in the catdoc library
  directory, so it should be easy.
  - Edit this file with your favorite text editor and fix it.
  - Send me a patch.
If you don't have access to the catdoc library directory, copy these
files into your home directory and override the substitution map location
in your ~/.catdocrc.
After sending me a patch, persuade your sysadmin to upgrade catdoc.
Where to get additional charset definitions
- ISO to Unicode mappings, directly usable by catdoc
- Various Microsoft codepages, which you can expect to find in Word files
- Apple codepages, for those troubled with files from Word for Macintosh
- KOI8-R and KOI8-U mappings to Unicode, which are not available on the
unicode.org site for some reason, so they are provided locally
A list of Unicode character names, which can be helpful for those
who wish to extend the substitution and replacement maps, can be obtained
from ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData-Latest.txt.
The format of this file is described in the corresponding ReadMe.