Bug 2379 - PDFtoHTML missed in Calibre
Summary: PDFtoHTML missed in Calibre
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: New RPM package request
Version: 1
Hardware: i586 Linux
Priority: Normal
Severity: normal
Target Milestone: ---
Assignee: QA Team
QA Contact:
URL:
Whiteboard:
Keywords: Junior_job, validated_update
Depends on: 2317
Blocks:
Reported: 2011-08-05 19:58 CEST by Alejandro Cobo
Modified: 2014-05-08 18:04 CEST

See Also:
Source RPM:
CVE:
Status comment:


Attachments

Description Alejandro Cobo 2011-08-05 19:58:55 CEST
When I try to convert an ebook file from PDF to MOBI format, I get this error log:

ERROR: Conversion Error: <b>Failed</b>: Convert book 1 of 1 (Antoine De Saint Exupéry - Principito)

Convert book 1 of 1 (Antoine De Saint Exupéry - Principito)
Resolved conversion options
calibre version: 0.7.32
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0.0,
 'book_producer': None,
 'change_justification': u'original',
 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part|prologue|epilogue\\s+', 'i')) or @class = 'chapter']",
 'chapter_mark': u'pagebreak',
 'comments': None,
 'cover': '/tmp/calibre_0.7.32_tmp_arNsPv/calibre_0.7.32_bgevrx.jpeg',
 'debug_pipeline': None,
 'disable_font_rescaling': False,
 'dont_compress': False,
 'extra_css': None,
 'font_size_mapping': None,
 'footer_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'header_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'html_unwrap_factor': 0.4,
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0xb004fcc>,
 'insert_blank_line': False,
 'insert_metadata': False,
 'isbn': None,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'max_toc_links': 50,
 'new_pdf_engine': False,
 'no_chapters_in_toc': False,
 'no_images': False,
 'no_inline_navbars': True,
 'no_inline_toc': False,
 'output_profile': <calibre.customize.profiles.KindleOutput object at 0xb005aac>,
 'page_breaks_before': u"//*[name()='h1' or name()='h2']",
 'personal_doc': u'[PDOC]',
 'prefer_author_sort': False,
 'prefer_metadata_cover': False,
 'preprocess_html': False,
 'pretty_print': False,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': '/tmp/calibre_0.7.32_tmp_arNsPv/calibre_0.7.32_pWmhab.opf',
 'remove_first_image': False,
 'remove_footer': False,
 'remove_header': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'rescale_images': False,
 'series': None,
 'series_index': None,
 'smarten_punctuation': False,
 'tags': None,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'toc_title': None,
 'unwrap_factor': 0.45,
 'use_auto_toc': False,
 'verbose': 2}
InputFormatPlugin: PDF Input running
on /home/alex/Biblioteca de calibre/Shimeria/Antoine De Saint Exupery - Principito (20)/Antoine De Saint Exupery - Principito - Shimeria.pdf
Converting file to html...
Traceback (most recent call last):
  File "/usr/bin/calibre-parallel", line 19, in <module>
    sys.exit(main())
  File "/usr/lib/calibre/calibre/utils/ipc/worker.py", line 106, in main
    result = func(*args, **kwargs)
  File "/usr/lib/calibre/calibre/gui2/convert/gui_conversion.py", line 24, in gui_convert
    plumber.run()
  File "/usr/lib/calibre/calibre/ebooks/conversion/plumber.py", line 836, in run
    accelerators, tdir)
  File "/usr/lib/calibre/calibre/customize/conversion.py", line 216, in __call__
    log, accelerators)
  File "/usr/lib/calibre/calibre/ebooks/pdf/input.py", line 50, in convert
    pdftohtml(os.getcwd(), stream.name, options.no_images)
  File "/usr/lib/calibre/calibre/ebooks/pdf/pdftohtml.py", line 55, in pdftohtml
    raise ConversionError(_('Could not find pdftohtml, check it is in your PATH'))
calibre.ebooks.ConversionError 


I tried installing an old PDFtoHTML package from Mandriva 2007.1, but I got this log:

ERROR: Conversion Error: <b>Failed</b>: Convert book 1 of 1 (Antoine De Saint Exupéry - Principito)

Convert book 1 of 1 (Antoine De Saint Exupéry - Principito)
Resolved conversion options
calibre version: 0.7.32
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0.0,
 'book_producer': None,
 'change_justification': u'original',
 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part|prologue|epilogue\\s+', 'i')) or @class = 'chapter']",
 'chapter_mark': u'pagebreak',
 'comments': None,
 'cover': '/tmp/calibre_0.7.32_tmp_arNsPv/calibre_0.7.32_cPibSj.jpeg',
 'debug_pipeline': None,
 'disable_font_rescaling': False,
 'dont_compress': False,
 'extra_css': None,
 'font_size_mapping': None,
 'footer_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'header_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'html_unwrap_factor': 0.4,
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0xabb8fcc>,
 'insert_blank_line': False,
 'insert_metadata': False,
 'isbn': None,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'max_toc_links': 50,
 'new_pdf_engine': False,
 'no_chapters_in_toc': False,
 'no_images': False,
 'no_inline_navbars': True,
 'no_inline_toc': False,
 'output_profile': <calibre.customize.profiles.KindleOutput object at 0xabb9aac>,
 'page_breaks_before': u"//*[name()='h1' or name()='h2']",
 'personal_doc': u'[PDOC]',
 'prefer_author_sort': False,
 'prefer_metadata_cover': False,
 'preprocess_html': False,
 'pretty_print': False,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': '/tmp/calibre_0.7.32_tmp_arNsPv/calibre_0.7.32_SBgRr7.opf',
 'remove_first_image': False,
 'remove_footer': False,
 'remove_header': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'rescale_images': False,
 'series': None,
 'series_index': None,
 'smarten_punctuation': False,
 'tags': None,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'toc_title': None,
 'unwrap_factor': 0.45,
 'use_auto_toc': False,
 'verbose': 2}
InputFormatPlugin: PDF Input running
on /home/alex/Biblioteca de calibre/Shimeria/Antoine De Saint Exupery - Principito (20)/Antoine De Saint Exupery - Principito - Shimeria.pdf
Converting file to html...
Traceback (most recent call last):
  File "/usr/bin/calibre-parallel", line 19, in <module>
    sys.exit(main())
  File "/usr/lib/calibre/calibre/utils/ipc/worker.py", line 106, in main
    result = func(*args, **kwargs)
  File "/usr/lib/calibre/calibre/gui2/convert/gui_conversion.py", line 24, in gui_convert
    plumber.run()
  File "/usr/lib/calibre/calibre/ebooks/conversion/plumber.py", line 836, in run
    accelerators, tdir)
  File "/usr/lib/calibre/calibre/customize/conversion.py", line 216, in __call__
    log, accelerators)
  File "/usr/lib/calibre/calibre/ebooks/pdf/input.py", line 50, in convert
    pdftohtml(os.getcwd(), stream.name, options.no_images)
  File "/usr/lib/calibre/calibre/ebooks/pdf/pdftohtml.py", line 72, in pdftohtml
    raise ConversionError(out)
calibre.ebooks.ConversionError: pdftohtml version 0.39 http://pdftohtml.sourceforge.net/, based on Xpdf version 3.00
Copyright 1999-2003 Gueorgui Ovtcharov and Rainer Dorsch
Copyright 1996-2004 Glyph & Cog, LLC

Usage: pdftohtml [options] <PDF-file> [<html-file> <xml-file>]
  -f <int>          : first page to convert
  -l <int>          : last page to convert
  -q                : don't print any messages or errors
  -h                : print usage information
  -help             : print usage information
  -p                : exchange .pdf links by .html
  -c                : generate complex document
  -i                : ignore images
  -noframes         : generate no frames
  -stdout           : use standard output
  -zoom <fp>        : zoom the pdf document (default 1.5)
  -xml              : output for XML post-processing
  -hidden           : output hidden text
  -nomerge          : do not merge paragraphs
  -enc <string>     : output text encoding name
  -dev <string>     : output device name for Ghostscript (png16m, jpeg etc)
  -v                : print copyright and version info
  -opw <string>     : owner password (for encrypted files)
  -upw <string>     : user password (for encrypted files)

I think Calibre requires a proper (current) version of PDFtoHTML.
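
For anyone comparing pdftohtml builds, the first line of the output above carries the version. The following is a minimal sketch, not calibre's own code, that extracts it from such a banner. The function name `parse_version` and the regular expression are assumptions made for illustration. The report does not state which version calibre expects, so the sketch only reads the number and does not judge it.

```python
import re

# Matches the leading "pdftohtml version X.Y" part of the banner line.
BANNER_RE = re.compile(r"pdftohtml version (\d+(?:\.\d+)*)")


def parse_version(banner):
    """Return the version in `banner` as a tuple of ints, or None if absent.

    Hypothetical helper for illustration; not part of calibre.
    """
    match = BANNER_RE.search(banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


# Banner line copied from the log in this report.
old_banner = (
    "pdftohtml version 0.39 http://pdftohtml.sourceforge.net/, "
    "based on Xpdf version 3.00"
)
print(parse_version(old_banner))  # (0, 39)
```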
Comment 1 Ahmad Samir 2011-08-05 20:59:35 CEST
calibre-0.8.6-0.1.mga1 has been in the core/updates_testing repo for some time; you may want to give it a shot.
Comment 2 Dave Hodgins 2011-08-05 21:34:58 CEST
Try installing poppler.

May just be a missing requires.
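
The first traceback fails because no `pdftohtml` executable is found on the PATH. A quick way to confirm that before and after installing poppler is sketched below. This is an illustration only, not what calibre does internally. It assumes Python 3.3 or later for `shutil.which`, and the helper name `find_converter` is hypothetical.

```python
import shutil


def find_converter(name="pdftohtml"):
    """Return the full path of `name` if it is on PATH, otherwise None.

    Hypothetical helper for illustration; not part of calibre.
    """
    return shutil.which(name)


path = find_converter()
if path is None:
    print("pdftohtml not found on PATH; the package providing it may be missing")
else:
    print("pdftohtml found at", path)
```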

CC: (none) => davidwhodgins

Manuel Hiebel 2011-08-17 12:37:12 CEST

Keywords: (none) => NEEDINFO

Comment 3 Alejandro Cobo 2011-08-17 13:20:07 CEST
Sorry for the wait. I have installed poppler and it works fine.

Thanks.
Comment 4 Dave Hodgins 2011-08-17 21:23:49 CEST
Then a requires for poppler should be added to calibre, but adding a requires
is currently blocked by bug 2317.

Depends on: (none) => 2317

Samuel Verschelde 2011-09-10 00:11:30 CEST

Keywords: (none) => Junior_job
CC: (none) => stormi

Samuel Verschelde 2011-09-10 00:12:59 CEST

Assignee: bugsquad => bertauxx

Comment 5 Samuel Verschelde 2011-09-10 00:13:47 CEST
Xavier, here is a task for you: the calibre package in Mageia 1 (and maybe in cauldron) has a missing dependency.
Comment 6 Nicolas Vigier 2011-09-15 15:14:22 CEST
I don't think it's related to bug 2317 as there is no urpmi error.

CC: (none) => boklm

Comment 7 Dave Hodgins 2011-09-15 18:58:28 CEST
There is a missing dependency on poppler in calibre.

When a new dependency is added to calibre, there will be urpmi errors for
anyone using mgaapplet to update calibre, unless poppler is also in Updates.
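
The concern above can be illustrated with a toy model. This is not how urpmi or mgaapplet actually resolve dependencies: it ignores transitive requirements and other media, and the function name `missing_from_updates` is hypothetical. The package names are the ones mentioned in this report.

```python
def missing_from_updates(requires, updates_media):
    """Return the required package names that are absent from the updates media.

    Toy model for illustration only; real resolvers also follow
    transitive dependencies and consult every enabled medium.
    """
    return sorted(set(requires) - set(updates_media))


# After the fix, calibre requires poppler.
calibre_requires = ["poppler"]

# Updates medium with only the new calibre pushed.
updates_before_linking = ["calibre"]
# Updates medium once the poppler packages are linked from Core Release.
updates_after_linking = ["calibre", "poppler", "libpoppler-glib6", "poppler-gir0.16"]

print(missing_from_updates(calibre_requires, updates_before_linking))  # ['poppler']
print(missing_from_updates(calibre_requires, updates_after_linking))   # []
```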
Manuel Hiebel 2011-10-25 11:25:23 CEST

Keywords: NEEDINFO => (none)

Comment 8 Samuel Verschelde 2011-11-06 23:04:12 CET
Adding the maintainer in CC now that there is one. To the maintainer: Xavier BERTAUX is on this bug report as part of the mentoring process, but just say so if you prefer to handle it yourself.

CC: (none) => supp

Comment 9 Tomas Kindl 2011-11-07 16:32:35 CET
This would be a nice tutoring example, so let your apprentice make those changes.
Comment 10 Xavier BERTAUX 2011-11-07 23:00:45 CET
Added "Requires: poppler" to the spec file to fix the bug.

Assignee: bertauxx => qa-bugs

Comment 11 Dave Hodgins 2011-11-07 23:59:06 CET
Testing complete on i586 for the srpm
calibre-0.8.6-0.2.mga1.src.rpm

Tested calibre using a local newspaper.

The following packages will have to be linked from Core Release
to Core Updates according to the depchecks script.
libpoppler-glib6
poppler
poppler-gir0.16
Comment 12 Derek Jennings 2011-11-10 00:39:03 CET
calibre-0.8.6-0.2.mga1.x86_64.rpm
Tested OK on x86_64

Confirmed bug when converting from PDF to MOBI and confirmed fixed.

CC: (none) => derekjenn

Comment 13 Dave Hodgins 2011-11-10 05:02:05 CET
Validating the update.

Can someone from the sysadmin team push the srpm
calibre-0.8.6-0.2.mga1.src.rpm
from Core Updates Testing to Core Updates?

Advisory: This update to the calibre package adds a requirement for
the poppler package, which is required for format conversions
involving pdf files.

https://bugs.mageia.org/show_bug.cgi?id=2379

Also the packages
libpoppler-glib6
poppler
poppler-gir0.16
need to be linked from Core Release to Core Updates.
Comment 14 Dave Hodgins 2011-11-10 05:05:41 CET
Forgot to add the keyword and sysadmin-bugs mailing list address.

See Comment 13 for the advisory, srpm, and three packages to be linked.

Keywords: (none) => validated_update
CC: (none) => sysadmin-bugs

Comment 15 Thomas Backlund 2011-11-10 22:22:24 CET
Update pushed.

Status: NEW => RESOLVED
CC: (none) => tmb
Resolution: (none) => FIXED

Nicolas Vigier 2014-05-08 18:04:40 CEST

CC: boklm => (none)

