First of all, I am not a skilled package maintainer, and I maintain neither xsane nor tesseract.

When running OCR (Optical Character Recognition) with tesseract on a page scanned with Xsane, the result is not good. This can nevertheless be fixed with a little trick.

In the Xsane configuration window, in the OCR tab, set [OCR command] to:

    xsane2tess -l xxx

where xxx is the 3-letter code of the language tesseract should use, depending on which language data is installed (fra for French, eng for English, grc for Greek, ara for Arabic ...).

The same settings can be written directly in the ocr section of a user's xsane.rc file:

    "ocr-command"
    "xsane2tess -l fra"
    "ocr-inputfile-option"
    "-i"
    "ocr-outputfile-options"
    "-o"
    "ocr-use-gui-pipe"
    0
    "ocr-gui-outfd-option"
    "-x"
    "ocr-progress-keyword"
    ""

xsane2tess.pl is a little script (I will attach it) that improves the result hugely.

But how can a user be enabled to use it? We could add this script to the tesseract package, but where should it be installed? Would /usr/share/tesseract/configs/ perhaps be a good place?

And once it is installed, how would a user know what to do with it? Maybe by adding a notice to tesseract explaining what to do? Or could it be done magically by a conditional post-install script that modifies xsane.rc if it exists?

I hope the maintainers will find it useful.

Thanks
Created attachment 10834: script to use tesseract with xsane

This script lets tesseract work correctly with xsane. It has to be called by xsane, so the OCR command must be set in the OCR configuration tab.
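For readers who cannot open the attachment, here is a rough sketch of what a wrapper of this kind might do. It is written in Python for illustration and is not the attached Perl script. It assumes that xsane calls the OCR command with `-i <image>` and `-o <text file>` as configured above, that tesseract can read the image format xsane hands over, and that tesseract appends ".txt" to the output base name it is given. The function names are hypothetical.

```python
import argparse
import os
import subprocess
import sys


def parse_args(argv):
    """Parse the options xsane is configured to pass to the OCR command."""
    parser = argparse.ArgumentParser(description="xsane to tesseract wrapper (sketch)")
    parser.add_argument("-l", dest="language", default="eng",
                        help="tesseract language code, e.g. fra, eng, grc, ara")
    parser.add_argument("-i", dest="input_file", required=True,
                        help="image file written by xsane")
    parser.add_argument("-o", dest="output_file", required=True,
                        help="text file xsane expects to read back")
    # Accepted only so the call does not fail if xsane passes it; ignored here.
    parser.add_argument("-x", dest="gui_fd", default=None)
    return parser.parse_args(argv)


def build_tesseract_command(image, output_base, language):
    """Return the tesseract call: tesseract <image> <output base> -l <lang>."""
    return ["tesseract", image, output_base, "-l", language]


def main(argv=None):
    args = parse_args(sys.argv[1:] if argv is None else argv)
    cmd = build_tesseract_command(args.input_file, args.output_file, args.language)
    subprocess.run(cmd, check=True)
    # Assumption: tesseract wrote "<output_file>.txt"; move it to the name
    # that xsane asked for.
    os.replace(args.output_file + ".txt", args.output_file)
    return 0


# Run only when invoked with arguments, as xsane would do.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main())
```

With such a wrapper installed somewhere on the PATH and made executable, the "ocr-command" entry would point to it in the same way as it points to xsane2tess.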
Summary: OCR badly done with xsane and tesseract it needs a little trick => OCR badly done with xsane and tesseract : to be OK it needs a little trick
> We can add this script in the tesseract package ...
> But where to install it
> /usr/share/tesseract/configs/ would be perhaps a good place ?

Sorry, my mistake! I installed it in /usr/bin after making it executable, and it is found automatically when it is called from the xsane config file.

> But if it's installed, how to know what to do with it ?
>
> Maybe adding a notice to tesseract explaining what to do ?
>
> Or can it be magically done by using a conditional post install script that
> modify the xsane.rc if it exists ?
>
> Hoping that the maintainers will find it useful
>
> Thanks
Assigning to the xsane maintainer and CC'ing the tesseract maintainer. Feel free to set the Severity back to enhancement, as the reporter had it; I just felt this issue shouldn't happen!
Severity: enhancement => normal
Assignee: bugsquad => lists.jjorge
Keywords: (none) => PATCH
Summary: OCR badly done with xsane and tesseract : to be OK it needs a little trick => OCR badly done with xsane and tesseract : to be OK it needs a little Perl script
CC: (none) => marja11, zen25000
I feel this should be upstreamed to the xsane project, as it is an enhancement to its OCR plugin.
I think this is just an Xsane affair, not tesseract at all. I have played with OCR & tesseract (on existing scanned images); did not know you could do it directly from Xsane!
CC: (none) => lewyssmith
Assignee: lists.jjorge => pkg-bugs
CC: (none) => doktor5000
CC: (none) => fri