calibre-web/cps/book_formats.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import logging
import uploader
import os
from flask_babel import gettext as _

__author__ = 'lemmsh'

logger = logging.getLogger("book_formats")

try:
    from wand.image import Image
    from wand import version as ImageVersion
    use_generic_pdf_cover = False
except ImportError as e:
    logger.warning('cannot import Image, generating pdf covers for pdf uploads will not work: %s', e)
    use_generic_pdf_cover = True
try:
    from PyPDF2 import PdfFileReader
    from PyPDF2 import __version__ as PyPdfVersion
    use_pdf_meta = True
except ImportError as e:
    logger.warning('cannot import PyPDF2, extracting pdf metadata will not work: %s', e)
    use_pdf_meta = False

try:
    import epub
    use_epub_meta = True
except ImportError as e:
    logger.warning('cannot import epub, extracting epub metadata will not work: %s', e)
    use_epub_meta = False

try:
    import fb2
    use_fb2_meta = True
except ImportError as e:
    logger.warning('cannot import fb2, extracting fb2 metadata will not work: %s', e)
    use_fb2_meta = False


def process(tmp_file_path, original_file_name, original_file_extension):
    """Extract metadata for an uploaded file, falling back to default_meta()."""
    try:
        # Normalise once so the comparison is case-insensitive
        extension = original_file_extension.upper()
        if extension == ".PDF":
            return pdf_meta(tmp_file_path, original_file_name, original_file_extension)
        if extension == ".EPUB" and use_epub_meta:
            return epub.get_epub_info(tmp_file_path, original_file_name, original_file_extension)
        if extension == ".FB2" and use_fb2_meta:
            return fb2.get_fb2_info(tmp_file_path, original_file_extension)
    except Exception as e:
        logger.warning('cannot parse metadata, using default: %s', e)
    # Unknown extension, missing parser module, or parser failure
    return default_meta(tmp_file_path, original_file_name, original_file_extension)


def default_meta(tmp_file_path, original_file_name, original_file_extension):
    return uploader.BookMeta(
        file_path=tmp_file_path,
        extension=original_file_extension,
        title=original_file_name,
        author=u"Unknown",
        cover=None,
        description="",
        tags="",
        series="",
        series_id="",
        languages="")


def pdf_meta(tmp_file_path, original_file_name, original_file_extension):
    # Fallback values, used when PyPDF2 is missing or a field is absent
    author = u"Unknown"
    title = original_file_name
    subject = ""

    if use_pdf_meta:
        # Read the fields while the file is open so the handle is closed afterwards
        with open(tmp_file_path, 'rb') as pdf_file:
            doc_info = PdfFileReader(pdf_file).getDocumentInfo()
            if doc_info is not None:
                if doc_info.author is not None:
                    author = doc_info.author
                if doc_info.title is not None:
                    title = doc_info.title
                if doc_info.subject is not None:
                    subject = doc_info.subject

    return uploader.BookMeta(
        file_path=tmp_file_path,
        extension=original_file_extension,
        title=title,
        author=author,
        cover=pdf_preview(tmp_file_path, original_file_name),
        description=subject,
        tags="",
        series="",
        series_id="",
        languages="")


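def pdf_preview(tmp_file_path, tmp_dir):
    # tmp_dir is unused; it is kept so existing callers keep working
    if use_generic_pdf_cover:
        return None
    # The cover is written next to the uploaded file
    cover_file_name = os.path.splitext(tmp_file_path)[0] + ".cover.jpg"
    # "[0]" selects the first page of the PDF
    with Image(filename=tmp_file_path + "[0]", resolution=150) as img:
        img.compression_quality = 88
        # Save to the same path that is returned to the caller
        img.save(filename=cover_file_name)
    return cover_file_name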


def get_versions():
    if not use_generic_pdf_cover:
        IVersion = ImageVersion.MAGICK_VERSION
    else:
        IVersion = _(u'not installed')
    if use_pdf_meta:
        PVersion = PyPdfVersion
    else:
        PVersion = _(u'not installed')
    return {'ImageVersion': IVersion, 'PyPdfVersion': PVersion}