Compiling OSRA on Ubuntu Jaunty
This is a brief HOWTO on compiling OSRA (Optical Structure Recognition Application) on Ubuntu Jaunty. To quote the OSRA home page, OSRA
… is a utility designed to convert graphical representations of chemical structures, as they appear in journal articles, patent documents, textbooks, trade magazines etc., into SMILES (Simplified Molecular Input Line Entry Specification – see http://en.wikipedia.org/wiki/SMILES) or SD file – a computer recognizable molecular structure format. OSRA can read a document in any of the over 90 graphical formats parseable by ImageMagick – including GIF, JPEG, PNG, TIFF, PDF, PS etc., and generate the SMILES or SDF representation of the molecular structure images encountered within that document …
Make a directory to compile the source:
mkdir /tmp/OSRA; cd /tmp/OSRA;
Be careful: /tmp is cleaned on reboot, so this directory may be removed.
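If you want the build tree to survive a reboot, a minimal sketch along these lines could be used instead. The helper name prepare_build_dir and the default path $HOME/src/OSRA are my own placeholders, not anything OSRA requires, and every later /tmp/OSRA path in this HOWTO would need the same substitution.

```shell
# Hypothetical helper: create a persistent build directory and change into it.
# $1 = directory to use (defaults to $HOME/src/OSRA, an arbitrary choice)
prepare_build_dir() {
    build_dir=${1:-"$HOME/src/OSRA"}
    # -p creates missing parent directories and is harmless if it already exists.
    mkdir -p "$build_dir" && cd "$build_dir"
}

# Example: prepare_build_dir "$HOME/src/OSRA"
```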
Install the build dependencies from the Ubuntu repositories:
sudo apt-get install libgraphicsmagick1-dev libmagick++-dev libgraphicsmagick++1-dev potrace gocr libtclap-dev libopenbabel-dev libopenbabel3 openbabel libnetpbm10 libnetpbm10-dev
Don’t install ocrad and remove it if it’s on your system (you can probably reinstall if you need to after you get Osra to compile):
sudo apt-get remove --purge ocrad;
Source Code:
Instead of fetching the source packages manually, download the sources used to build the Ubuntu packages where they are available. Make sure the deb-src lines in your /etc/apt/sources.list are uncommented. The following command then downloads and extracts the code into the current directory:
cd /tmp/OSRA; apt-get source gocr ocrad potrace;
This downloads GOCR 0.46, which the OSRA docs say may not work:
- GOCR/JOCR, optical character recognition library, version 0.43 or later (version 0.45 recommended, do not use 0.46! See special instructions for 0.47 compilation below)
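The apt-get source command above only works when at least one deb-src entry is enabled. A quick check could look like the sketch below; has_deb_src is a hypothetical helper name, and it only inspects the single file you pass it, so entries under /etc/apt/sources.list.d/ would need a separate call.

```shell
# Hypothetical helper: succeed if FILE has at least one uncommented deb-src line.
# $1 = path to a sources.list style file
has_deb_src() {
    # Lines starting with "#" are comments, so require deb-src as the first word.
    grep -Eq '^[[:space:]]*deb-src[[:space:]]' "$1"
}

# Example:
#   has_deb_src /etc/apt/sources.list || echo "enable a deb-src line first"
```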
Get the OSRA source and extract it:
cd /tmp/OSRA;
wget http://cactus.nci.nih.gov/osra/osra-1.2.1.tgz;
tar xzvf osra-1.2.1.tgz
cd /tmp/OSRA/osra-1.2.1;
Make a backup copy of the OSRA Makefile:
cp Makefile Makefile.bak;
Edit the Makefile
Change the following lines:
GOCR=../gocr-0.45/
to
GOCR=../gocr-0.46/
OPENBABEL=/usr/local/
to
OPENBABEL=/usr/
TCLAPINC=-I/usr/local/include/tclap/
to
TCLAPINC=-I/usr/include/tclap/
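Leave GOCR pointing at ../gocr-0.46/, the directory that apt-get source extracted. Only change it back to ../gocr-0.45/ if you downloaded the GOCR 0.45 source separately.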
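The Makefile edits above can also be applied with sed rather than by hand. This is only a sketch: patch_osra_makefile is a hypothetical helper, and it assumes each variable is assigned at the start of a line with no spaces around the equals sign, as in the lines shown above. The GOCR directory is passed as an argument so it can match whichever source tree you actually have.

```shell
# Hypothetical helper: rewrite the GOCR, OPENBABEL and TCLAPINC lines in place.
# $1 = path to the OSRA Makefile
# $2 = GOCR source directory as seen from the Makefile, e.g. ../gocr-0.46/
patch_osra_makefile() {
    makefile=$1
    gocr_dir=$2
    tmp=$(mktemp) || return 1
    # Write to a temporary file first, then copy the content back so the
    # Makefile keeps its original permissions.
    sed -e "s|^GOCR=.*|GOCR=$gocr_dir|" \
        -e 's|^OPENBABEL=.*|OPENBABEL=/usr/|' \
        -e 's|^TCLAPINC=.*|TCLAPINC=-I/usr/include/tclap/|' \
        "$makefile" > "$tmp" && cat "$tmp" > "$makefile"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example: patch_osra_makefile /tmp/OSRA/osra-1.2.1/Makefile ../gocr-0.46/
```

Running diff Makefile.bak Makefile afterwards shows exactly which lines changed.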
Compiling
Compile, but don’t install the potrace source:
cd /tmp/OSRA/potrace-1.8;
./configure;
make;
Compile the OSRA source:
cd /tmp/OSRA/osra-1.2.1;
make;
This produces a working OSRA binary:
./osra
./osra [-f <can/smi/sdf>] [-g] [-p] [-s <dimensions, 300x400>] [-n] [-r
<default: auto>] [-o <filename prefix>] [-t <0.2..0.8>] [--]
[--version] [-h] <filename>
Now I just need a file to test it against to see if it will run correctly.
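Once you have an image of a chemical structure, a smoke test might look like the sketch below. The helper osra_smoke_test and the file name structure.png are placeholders of mine; the only OSRA option used is -f smi, taken from the usage text above, and I have not verified the output against a known structure.

```shell
# Hypothetical helper: run an OSRA binary on one image and print its SMILES output.
# $1 = path to the osra binary, $2 = image file to recognise
osra_smoke_test() {
    osra_bin=$1
    image=$2
    if [ ! -x "$osra_bin" ]; then
        echo "osra binary not found or not executable: $osra_bin" >&2
        return 1
    fi
    if [ ! -r "$image" ]; then
        echo "cannot read image: $image" >&2
        return 2
    fi
    # -f smi selects SMILES output, per the usage text printed by ./osra.
    "$osra_bin" -f smi "$image"
}

# Example: osra_smoke_test /tmp/OSRA/osra-1.2.1/osra structure.png
```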
If you want to build with Gocr 0.47 this step is required:
cd /tmp/OSRA/gocr-0.47;
./configure CPPFLAGS=-fPIC LDFLAGS=-fPIC;
make libs;









I followed your instructions but I get compilation errors. I tried with both osra-1.3.5 and osra-1.2.1.
patching file ../ocrad-0.17//character.h
g++ -g -O2 -fPIC -I../ocrad-0.17/ -D_LIB -D_MT -Wall -I../potrace-1.8//src/ -I../gocr-0.45//src/ -I../gocr-0.45//include/ -I/usr//include/openbabel-2.0/ -I/usr/include/tclap/ -I/usr/include/ImageMagick -g -O2 -Wall -W -c osra_ocr.cpp
In file included from /usr/include/c++/4.3/cwchar:52,
from /usr/include/c++/4.3/bits/postypes.h:47,
from /usr/include/c++/4.3/bits/char_traits.h:47,
from /usr/include/c++/4.3/string:47,
from ../ocrad-0.17/common.h:18,
from osra_ocr.cpp:34:
/usr/include/wchar.h:140: error: declaration of ‘wchar_t* wcscpy(wchar_t*, const wchar_t*) throw ()’ throws different exceptions
pgm2asc.h:36: error: from previous declaration ‘wchar_t* wcscpy(wchar_t*, const wchar_t*)’
/usr/include/wchar.h:208: error: declaration of ‘wchar_t* wcsdup(const wchar_t*) throw ()’ throws different exceptions
pgm2asc.h:40: error: from previous declaration ‘wchar_t* wcsdup(const wchar_t*)’
/usr/include/wchar.h:214: error: declaration of ‘wchar_t* wcschr(const wchar_t*, wchar_t) throw ()’ throws different exceptions
pgm2asc.h:35: error: from previous declaration ‘wchar_t* wcschr(const wchar_t*, wchar_t)’
/usr/include/wchar.h:249: error: declaration of ‘size_t wcslen(const wchar_t*) throw ()’ throws different exceptions
pgm2asc.h:37: error: from previous declaration ‘size_t wcslen(const wchar_t*)’
osra_ocr.cpp: In function ‘char get_atom_label(Magick::Image, Magick::ColorGray, int, int, int, int, double, int, int)’:
osra_ocr.cpp:56: warning: deprecated conversion from string constant to ‘char*’
osra_ocr.cpp: In function ‘bool detect_bracket(int, int, unsigned char*)’:
osra_ocr.cpp:202: warning: deprecated conversion from string constant to ‘char*’
make: *** [osra_ocr.o] Error 1
What am I doing wrong? Please help.
What version of Ubuntu are you on?
I just tried rebuilding 1.3.5 on Ubuntu Karmic and I had to make some changes to the steps above. Note, I had to use ocrad 0.19 from GNU and not the source package that Ubuntu has. The Osra website says you have to use 0.19 for Osra 1.3.5 to compile.
So do the following to compile 1.3.5 on Ubuntu Karmic:
cd /tmp/OSRA; wget http://cactus.nci.nih.gov/osra/osra-1.3.5.tgz; tar xzvf osra-1.3.5.tgz
cd /tmp/OSRA; wget http://ftp.gnu.org/gnu/ocrad/ocrad-0.19.tar.gz
tar xzvf ocrad-0.19.tar.gz; cd ocrad-0.19/; ./configure; make;
Make the following changes to the OSRA Makefile (in /tmp/OSRA/osra-1.3.5):
POTRACE=../potrace-1.8/
GOCR=../gocr-0.46/
OCRAD=../ocrad-0.19/
OPENBABEL=/usr/
Then run make and it should work.
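If make fails with missing headers, one thing worth checking is whether the four variables above point at directories that really exist. The sketch below does that; check_makefile_dirs is a hypothetical helper, it assumes plain NAME=value lines with no spaces or nested variables, and it resolves relative paths from the directory containing the Makefile.

```shell
# Hypothetical helper: report Makefile path variables whose directory is missing.
# $1 = path to the OSRA Makefile
check_makefile_dirs() {
    makefile=$1
    base=$(dirname "$makefile")
    missing=0
    for name in POTRACE GOCR OCRAD OPENBABEL; do
        # Take the value of the first "NAME=value" line, if there is one.
        value=$(sed -n "s|^$name=||p" "$makefile" | head -n 1)
        [ -n "$value" ] || continue
        case $value in
            /*) dir=$value ;;          # absolute path, use as is
            *)  dir=$base/$value ;;    # relative to the Makefile's directory
        esac
        if [ ! -d "$dir" ]; then
            echo "$name points at a missing directory: $value" >&2
            missing=1
        fi
    done
    return $missing
}

# Example: check_makefile_dirs /tmp/OSRA/osra-1.3.5/Makefile && make
```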
Let me know if this works for you and if it does I’ll do a new posting on how to do this on Karmic.
If you're not on Karmic, let me know.
Also, this is late at night and there might be some typos or errors …
Thanks, it worked, however there was one more thing I had to do. I had to search and replace all the /usr/local/… occurrences in the Makefile and change them to /usr/ not just the 4 variables above.
BTW: I use Ubuntu 9.04 Jaunty, but with the Karmic instructions it works fine.
Thank you very very much you are a life saver
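The search and replace described in this comment can be done in one pass with sed. This is a sketch only: usr_local_to_usr is a hypothetical helper name, and it rewrites every /usr/local/ in the file, so it is only appropriate when nothing the build needs actually lives under /usr/local/.

```shell
# Hypothetical helper: replace every /usr/local/ prefix in a Makefile with /usr/.
# $1 = path to the OSRA Makefile
usr_local_to_usr() {
    makefile=$1
    tmp=$(mktemp) || return 1
    # The g flag replaces all occurrences on each line, not just the first.
    sed 's|/usr/local/|/usr/|g' "$makefile" > "$tmp" && cat "$tmp" > "$makefile"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example: cp Makefile Makefile.bak; usr_local_to_usr Makefile
```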
Hi AM,
Glad you were able to get it to compile.
Cheers
Mick