How To Install Tesseract OCR For Nextant And Nextcloud

http://installion.co.uk/ubuntu/xenial/universe/t/tesseract-ocr/install/index.html

https://github.com/tesseract-ocr/tesseract/wiki

1. Install tesseract-ocr

$ sudo apt-get update
$ sudo apt-get install tesseract-ocr
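
To confirm the installation worked, a quick check along these lines can help. This is a minimal sketch, assuming the package put the tesseract binary on your PATH; check_tesseract is a hypothetical helper name, not something the package provides.

```shell
#!/bin/sh
# check_tesseract is a hypothetical helper, not part of the tesseract-ocr package.
# It reports the installed version and the languages Tesseract can find.
check_tesseract() {
    if command -v tesseract >/dev/null 2>&1; then
        # Some releases print the version to stderr, so merge the streams.
        tesseract --version 2>&1 | head -n 1
        tesseract --list-langs 2>&1
    else
        echo "tesseract not found on PATH"
        return 1
    fi
}

# Run the check without aborting the shell if Tesseract is missing.
check_tesseract || true
```

If eng does not appear in the language list, the training data from step 2 is probably missing or in a different directory.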

To uninstall tesseract-ocr

http://installion.co.uk/ubuntu/xenial/universe/t/tesseract-ocr/uninstall/index.html

$ sudo apt-get remove tesseract-ocr

This will remove just the tesseract-ocr package itself.

To uninstall tesseract-ocr and its dependencies

$ sudo apt-get remove --auto-remove tesseract-ocr

This will remove the tesseract-ocr package and any dependent packages that are no longer needed.

Purging your config/data too

If you also want to delete the local configuration and data files for tesseract-ocr, purge the package. Caution! Purged config/data cannot be restored by reinstalling the package.

$ sudo apt-get purge tesseract-ocr

Or, to purge tesseract-ocr together with any dependencies that are no longer needed:

$ sudo apt-get purge --auto-remove tesseract-ocr
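
The difference between remove and purge shows up in dpkg's package status. The following is a sketch, assuming a Debian/Ubuntu system where dpkg-query is available; describe_status is a hypothetical helper, and the status strings are the ones dpkg normally reports.

```shell
#!/bin/sh
# describe_status is a hypothetical helper that maps a dpkg status string
# to a short description.
describe_status() {
    case "$1" in
        "install ok installed")
            echo "installed" ;;
        "deinstall ok config-files")
            echo "removed, config files still present (not purged)" ;;
        *)
            echo "not installed, purged, or status unavailable" ;;
    esac
}

# Query the real status; the result is empty if dpkg-query is missing
# or does not know the package.
status=$(dpkg-query -W -f='${Status}' tesseract-ocr 2>/dev/null || true)
describe_status "$status"
```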

2. Download the appropriate training data

https://github.com/tesseract-ocr/tessdata

Download the latest training data file (e.g., ‘eng.traineddata’) into the ‘tessdata’ directory at ‘/usr/share/tesseract-ocr/tessdata’:

$ cd /usr/share/tesseract-ocr/tessdata

We will delete the current training data file before we get the latest available:

$ sudo rm eng.traineddata

Then download the latest file:

$ sudo wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
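
Deleting the old file before downloading leaves Tesseract without English data if the download fails. A more cautious variant, sketched here, downloads to a temporary location first and keeps the old file as a backup. install_traineddata is a hypothetical helper and the /tmp path is an assumption, not something the steps above require.

```shell
#!/bin/sh
# install_traineddata NEW_FILE DEST_DIR
# Hypothetical helper: copies an already-downloaded .traineddata file into
# DEST_DIR, keeping any existing file of the same name as NAME.bak.
install_traineddata() {
    new_file=$1
    dest_dir=$2
    name=$(basename "$new_file")

    # Refuse empty or missing files, which usually mean a failed download.
    if [ ! -s "$new_file" ]; then
        echo "refusing to install empty or missing file: $new_file" >&2
        return 1
    fi

    # Keep the previous file as a backup instead of deleting it.
    if [ -f "$dest_dir/$name" ]; then
        mv "$dest_dir/$name" "$dest_dir/$name.bak"
    fi

    cp "$new_file" "$dest_dir/$name"
}

# Example (run as root so the tessdata directory is writable):
#   wget -O /tmp/eng.traineddata https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
#   install_traineddata /tmp/eng.traineddata /usr/share/tesseract-ocr/tessdata
```

If the new file causes problems, the .bak copy can be renamed back to restore the previous training data.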

If necessary, force Nextcloud to rescan files:

$ cd /var/www/html/nextcloud
$ sudo -u www-data php console.php files:scan --all
