Reading images in Linux via OCR desktop

Optical character recognition (OCR) is one of those crucial things that we, as blind computer users, cannot do without. You never know when you will need it.
Most screen readers on Windows have the ability to recognize text on controls or in an image. The Orca screen reader on Linux does not have this capability. However, there is a script called ocrdesktop that does this.

The script grabs an image of the current window, magnifies it and then runs optical character recognition on it. It then displays the results in a read-only window which appears on top of the existing window. You read the results with the arrow keys and standard reading commands. Once you have finished reading the results, you can dismiss them by pressing the Escape key.
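
To make the capture, magnify and recognize steps more concrete, here is a rough sketch of the same idea in shell. This is not how ocrdesktop itself is written. It is only an approximation that assumes ImageMagick (for the import and convert commands) and the tesseract command line tool are installed. The function name ocr_screen and the 300% scale factor are my own choices for illustration, and the sketch grabs the whole screen rather than just the current window.

```shell
#!/bin/sh
# Rough approximation of capture -> magnify -> recognize.
# Assumes ImageMagick (import, convert) and the tesseract CLI; not ocrdesktop's own code.
ocr_screen() {
    # Stop early with a clear message if any tool is missing.
    for tool in import convert tesseract; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool" >&2
            return 127
        fi
    done

    tmpdir=$(mktemp -d) || return 1

    # 1. capture the whole screen, 2. enlarge it, 3. print recognized text to stdout
    if import -window root "$tmpdir/shot.png" &&
       convert "$tmpdir/shot.png" -resize 300% "$tmpdir/big.png" &&
       tesseract "$tmpdir/big.png" stdout 2>/dev/null; then
        status=0
    else
        status=1
    fi

    rm -rf "$tmpdir"
    return "$status"
}
```

If you save this to a file and source it from a terminal on the desktop, calling ocr_screen should print whatever text tesseract finds. Enlarging the image first tends to help because text on screen is often small compared with what tesseract handles best.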

The ocrdesktop script is present in the Arch Linux repositories. On Raspberry Pi OS, we need to install some dependencies before the script will work.
At a terminal, run the following commands to install the dependencies for ocrdesktop.
sudo apt install -y tesseract-ocr
pip3 install pillow
pip3 install tesserwrap
sudo apt install libtesseract-dev
sudo apt install -y libwnck-3-dev

Note

If you do not have pip3 installed, then run
sudo apt install -y python3-pip
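
If you want to confirm that the installs worked before going further, a small check like the one below can help. It is only a sketch. The helper names have_cmd, have_pymod and report_deps are made up for this example, and it checks just the pieces that are easy to test from the shell. PIL is the name under which the pillow package is imported in Python.

```shell
#!/bin/sh
# Report which of the pieces installed above are visible from the shell.
# Helper names are hypothetical; only a subset of the dependencies is checked.

# Succeeds if a command is on the PATH.
have_cmd() { command -v "$1" >/dev/null 2>&1; }

# Succeeds if python3 can import the named module.
have_pymod() { python3 -c "import $1" >/dev/null 2>&1; }

report_deps() {
    missing=0
    for cmd in tesseract python3 pip3; do
        if have_cmd "$cmd"; then
            echo "ok: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    # pillow is imported under the module name PIL.
    if have_pymod PIL; then
        echo "ok: pillow"
    else
        echo "missing: pillow"
        missing=1
    fi
    return "$missing"
}
```

Calling report_deps prints one line per item, so it reads well with a screen reader, and it returns a non-zero status if anything is missing.
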
You can then download the ocrdesktop repository by cloning it from GitHub.
git clone https://github.com/chrys87/ocrdesktop.git
Switch to the ocrdesktop folder and run
./ocrdesktop

If you run the script from an SSH prompt, you will get an error about being unable to initialize the xinit server. That is good because it tells you that there are no unmet dependencies. This error occurs because your desktop is not active in the SSH session.
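
One way to tell in advance whether a session has a desktop attached is to look at the DISPLAY environment variable, which X11 programs use to find the screen. The sketch below assumes an X11 desktop. The function names has_display and explain_session are my own and are not part of ocrdesktop.

```shell
#!/bin/sh
# Check whether this session has an X11 display to capture from.
# Function names are hypothetical; assumes an X11 desktop.

# Succeeds when DISPLAY is set to a non-empty value.
has_display() {
    [ -n "${DISPLAY:-}" ]
}

explain_session() {
    if has_display; then
        echo "display found: $DISPLAY"
    else
        echo "no display in this session"
    fi
}
```

In a plain SSH session, explain_session will normally report that there is no display, which matches the error described above.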

You can now navigate to the Pi’s desktop and, if you are using MATE, use the keyboard button from the control center to assign a shortcut to the script. Other desktop environments have their own mechanisms to assign keyboard shortcuts to commands. In MATE, which is what I am using, do the following.

Navigate to the option to add keyboard shortcuts and activate it.
You will be on a page where there is a tree and a table of corresponding shortcuts.
Tab to the button labeled “add” and activate it.
You will be asked for a name and, if you tab, the command you want to run.
You then populate these values. The name can be anything you want. For the command, you will need to type the full path to the ocrdesktop script. On my Raspberry Pi, the path is as follows.
/home/pi/ocrdesktop/ocrdesktop
Once you confirm this dialogue box, you will be placed back in that table and the ocrdesktop shortcut will be present in the table under the custom shortcuts section.
When you reach the entry of the shortcut, you will hear Orca tell you that it is disabled.
Hit the right arrow once to land on the column containing the term disabled and press the Enter key.
You may not hear any feedback from Orca, but this is where you press the shortcut you want to assign to ocrdesktop.
Once you have done so, and assuming there are no errors, the script will be assigned that shortcut, which you will be able to hear.
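
Since the shortcut needs the full path to the script, it is worth checking that the path you typed is absolute, exists and is executable. Otherwise the hotkey will silently do nothing. The function below is a small sketch for that. The name check_target is hypothetical, and the path in the usage note is the one from my Raspberry Pi, so yours may differ.

```shell
#!/bin/sh
# Verify that a path is suitable as the command for a keyboard shortcut.
# check_target is a made-up helper name for illustration.
check_target() {
    target=$1
    # The shortcut needs a full path, so reject relative ones.
    case $target in
        /*) ;;
        *)
            echo "not an absolute path: $target"
            return 1
            ;;
    esac
    if [ ! -f "$target" ]; then
        echo "no such file: $target"
        return 1
    fi
    if [ ! -x "$target" ]; then
        echo "not executable: $target"
        return 1
    fi
    echo "ok: $target"
}
```

For example, check_target /home/pi/ocrdesktop/ocrdesktop should say ok if the repository was cloned into the home folder of the pi user.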

You can now open any image, hit the assigned hotkey and have the text in that image recognized.
If you are starting from scratch, you may want to install a program to view images. There are plenty of them, but a simple one I am using at the time of this writing is Shotwell.
