Modify Plain-Prose OCR-Text
Modify Plain-Prose OCR-Text (screenshot below) is a tiny freeware program, developed by Rituraj Kalita, that automatically strips uncommon symbols from OCR-generated text. Such symbols are unlikely to belong in the plain English text that is usually subjected to optical character recognition (OCR). The user is then offered an editor window (working on edit_lines.txt) for further manual correction of the text, line by line, after which the freeware attempts to form continuous paragraphs as best it can. The generated paragraphs may then be copied from take_text.txt and formatted as desired in the target word processor (e.g., Microsoft Word).
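For readers curious about how such a clean-up might work, here is a minimal Python sketch of the two automatic steps described above: dropping uncommon symbols and merging lines into paragraphs. It is not the freeware's actual code. The set of allowed characters, the hyphen-rejoining rule and the function names (clean_line, join_paragraphs, modify_ocr_text) are all assumptions made for illustration, and the manual editing step in edit_lines.txt is left out.

```python
import re
import string

# Assumption: the characters considered "common" in plain English prose.
# The real freeware's symbol list is not documented here.
ALLOWED = set(string.ascii_letters + string.digits + " .,;:!?'\"()-&%/$")


def clean_line(line):
    """Drop every character outside ALLOWED, then collapse repeated spaces."""
    kept = "".join(ch for ch in line if ch in ALLOWED)
    return re.sub(r" {2,}", " ", kept).strip()


def join_paragraphs(lines):
    """Merge consecutive non-empty lines; an empty line ends a paragraph."""
    paragraphs = []
    current = ""
    for line in lines:
        if not line:
            # Blank line: close the paragraph collected so far, if any.
            if current:
                paragraphs.append(current)
                current = ""
        elif current.endswith("-") and line[0].islower():
            # Hypothetical rule: rejoin a word hyphenated across a line break.
            current = current[:-1] + line
        else:
            current = current + " " + line if current else line
    if current:
        paragraphs.append(current)
    return paragraphs


def modify_ocr_text(raw):
    """Clean each OCR line, then return paragraphs separated by blank lines."""
    cleaned = [clean_line(line) for line in raw.splitlines()]
    return "\n\n".join(join_paragraphs(cleaned))
```

In this sketch, the text pasted into put_text.txt would be passed to modify_ocr_text, and the returned string would correspond to what ends up in take_text.txt. A whitelist of allowed characters was chosen here only because it is simple to read; a list of symbols to delete would serve equally well.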
To use it, paste the output of the OCR package (say, Softi Free OCR) into the put_text.txt window of this package.
To work through a set of text images, it is advisable to keep the OCR package, this freeware, the target word processor and the image viewer all open at the same time, and to modify and save the texts one by one.
Download Modify Plain-Prose OCR-Text (only 11 kB; requires Shabda-Brahma, Abhibhavak or Assam-Calcu to be pre-installed).
Note: Extract the downloaded .zip file to C:\ so as to have C:\ModifOCR as the working folder, then copy the link file Modify Plain Prose OCR to your desktop.