Tag Archives: gocr

Build & Install Osra 1.3.8 on Ubuntu 11.10

Summary

Osra is a utility, created by Igor Filippov at the National Cancer Institute, that converts graphical representations of chemical structures into SMILES or SD files. This page documents how to compile and install Osra on Ubuntu Linux 11.10. These instructions may work on other versions of Ubuntu Linux and on Debian Linux. Please leave a comment if you have compiled Osra using these instructions on a different version of Ubuntu, or on another Linux distribution.

Overview

At the time of writing this doc, March 2012, the Osra version is 1.3.8 and is available at:

I copied all the source to a directory in /tmp. If you will need the source code later, don’t use /tmp, as files in /tmp are deleted upon reboot (if they’re older than 14 days). Also, at the time of writing, Osra requires a patched version of Gocr to work. You need to install Gocr before you try to compile and install Osra, and you also need to install the other packages Osra requires to compile. Most are listed below, but see the Osra Homepage for more details.

An overview of the steps:

  1. Install required Ubuntu packages
  2. Compile and Install Gocr
  3. Compile and Install Osra

I’ve also written instructions on how to install Osra 1.2.1 on Ubuntu 9.04; however, that was written in 2009.

Compiling Osra on Ubuntu Jaunty

This is a brief HOWTO on compiling OSRA (Optical Structure Recognition) on Ubuntu Jaunty. To quote the OSRA home page:

… is a utility designed to convert graphical representations of chemical structures, as they appear in journal articles, patent documents, textbooks, trade magazines etc., into SMILES (Simplified Molecular Input Line Entry Specification – see http://en.wikipedia.org/wiki/SMILES) or SD file – a computer recognizable molecular structure format. OSRA can read a document in any of the over 90 graphical formats parseable by ImageMagick – including GIF, JPEG, PNG, TIFF, PDF, PS etc., and generate the SMILES or SDF representation of the molecular structure images encountered within that document …

Update: I’ve a newer document that shows how to install Osra on Ubuntu 11.10 (Oneiric):

Make a directory to compile the source:

mkdir -p /tmp/OSRA && cd /tmp/OSRA

Be careful: /tmp is cleaned upon reboot, so the directory may be removed.
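If the sources need to survive a reboot, a directory outside /tmp can be used instead. The following is a minimal sketch, not part of the original steps: the helper name make_build_dir and the default path $HOME/src/OSRA are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: create a build directory that is not cleaned on reboot.
# make_build_dir and the default path are hypothetical, not from the post.
make_build_dir() {
    # Use the first argument if given and non-empty, else a path under $HOME.
    dir="${1:-$HOME/src/OSRA}"
    # -p creates missing parents and does not fail if the directory exists.
    mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Example use (left commented so the block has no side effects when run):
#   cd "$(make_build_dir)"
```

Any later step that refers to /tmp/OSRA would then need the chosen path substituted in.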

Install the dependencies needed to build OSRA:

sudo apt-get install libgraphicsmagick1-dev libmagick++-dev libgraphicsmagick++1-dev potrace gocr  libtclap-dev libopenbabel-dev libopenbabel3 openbabel libnetpbm10 libnetpbm10-dev

Don’t install ocrad, and remove it if it’s already on your system (you can probably reinstall it if you need to after you get Osra to compile):
sudo apt-get remove --purge ocrad

Source Code:

Instead of fetching the source packages manually, download the sources used to build the Ubuntu packages, if available. Make sure the deb-src lines in your /etc/apt/sources.list are uncommented. The following command automatically downloads and extracts the code into the current directory:

cd /tmp/OSRA; apt-get source gocr ocrad potrace;
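If apt-get source fails with an error about missing source URIs, the deb-src lines are probably still commented out. A quick way to check is to count the enabled deb-src lines. This is a minimal sketch; the helper name count_deb_src is hypothetical, and it only inspects the single file it is given, not anything under /etc/apt/sources.list.d/.

```shell
#!/bin/sh
# Sketch: count enabled (uncommented) deb-src lines in an apt sources file.
# count_deb_src is a hypothetical helper name, not part of apt.
count_deb_src() {
    file="${1:-/etc/apt/sources.list}"
    # Match lines beginning with optional whitespace, then "deb-src".
    # Lines beginning with "#" are commented out and do not match.
    grep -c -E '^[[:space:]]*deb-src[[:space:]]' "$file"
}

# Warn only when the default file exists and has no enabled deb-src lines.
if [ -r /etc/apt/sources.list ] && [ "$(count_deb_src)" = "0" ]; then
    echo "No enabled deb-src lines; 'apt-get source' will likely fail." >&2
fi
```

After uncommenting the deb-src lines, run sudo apt-get update before retrying apt-get source.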

This downloads Gocr 0.46, which the OSRA docs say may not work:

– GOCR/JOCR, optical character recognition library, version 0.43 or later (version 0.45 recommended, do not use 0.46! See special instructions for 0.47 compilation below)
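To see which Gocr version apt would fetch before building against it, the version string can be checked against the note quoted above. This is a rough sketch: the helper name gocr_version_status and its labels are made up for illustration, and it only knows the versions the OSRA note mentions.

```shell
#!/bin/sh
# Sketch: classify a GOCR version according to the OSRA note quoted above
# (0.43 or later required, 0.45 recommended, 0.46 to be avoided,
# 0.47 needs special compilation instructions).
# gocr_version_status is a hypothetical helper; it accepts a plain or
# Debian-style version string such as "0.45" or "0.46-2".
gocr_version_status() {
    upstream="${1%%-*}"   # drop a Debian revision suffix such as "-2"
    case "$upstream" in
        0.46)      echo "avoid" ;;
        0.45)      echo "recommended" ;;
        0.43|0.44) echo "ok" ;;
        0.47)      echo "needs-special-instructions" ;;
        *)         echo "unknown" ;;
    esac
}

# If apt-cache is available, report the first source version it lists.
if command -v apt-cache >/dev/null 2>&1; then
    candidate="$(apt-cache showsrc gocr 2>/dev/null | awk '/^Version:/ {print $2; exit}')"
    if [ -n "$candidate" ]; then
        echo "gocr source $candidate: $(gocr_version_status "$candidate")"
    fi
fi
```

Versions not named in the note are reported as "unknown" rather than guessed at.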