Convert PPTX Document to JPEG Images on Ubuntu

Changelog
  • 2020-12-24: Add how to convert pptx to pdf using unoconv.

In this post, I will show how to convert a PPTX file to JPEG images on Ubuntu. There are two steps: first convert the PPTX file to PDF, then convert the PDF to JPEG.

First, we need to install LibreOffice:

apt update && apt install libreoffice

To install the latest version of LibreOffice from its PPA, run the following commands instead:

# software-properties-common provides the add-apt-repository command
apt install software-properties-common
add-apt-repository -y ppa:libreoffice/ppa
apt update && apt install libreoffice

Step one: from PPTX to PDF

To convert a PPTX file to images, we first need to convert it to PDF.

Use libreoffice directly

We can use soffice, the command-line program provided by LibreOffice, to convert PPTX to PDF directly:

soffice --headless --convert-to pdf test.pptx

This will create a file named test.pdf.
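For more than one file, the same command can be wrapped in a small loop. The following is only a sketch: the helper names pdf_path_for and convert_all_pptx and the directory pdf_out are made up for illustration, while --outdir is the soffice option that selects the output directory.

```shell
# Hypothetical helper: predict the PDF path soffice writes for an input file.
pdf_path_for() {
    # $1: input file, $2: output directory (defaults to the current directory)
    base=$(basename "$1")
    printf '%s/%s.pdf\n' "${2:-.}" "${base%.*}"
}

# Hypothetical helper: convert every PPTX file in the current directory.
convert_all_pptx() {
    outdir="${1:-pdf_out}"
    for f in ./*.pptx; do
        [ -e "$f" ] || continue   # skip when the glob matches nothing
        soffice --headless --convert-to pdf --outdir "$outdir" "$f"
        echo "expected output: $(pdf_path_for "$f" "$outdir")"
    done
}
```

Calling convert_all_pptx pdf_out in a directory that contains test.pptx should leave the result in pdf_out/test.pdf.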

Use unoconv

Apart from soffice, we can also use unoconv. Install the related packages first:

apt update && apt install python3-uno unoconv

Then use unoconv to convert pptx to pdf:

unoconv -f pdf demo.pptx

I got the following error when I ran unoconv:

unoconv: Cannot find a suitable pyuno library and python binary combination in /usr/lib64/libreoffice
ERROR: Please locate this library and send your feedback to:
http://github.com/dagwieers/unoconv/issues
No module named uno
unoconv: Cannot find a suitable office installation on your system.
ERROR: Please locate your office installation and send your feedback to:
http://github.com/dagwieers/unoconv/issues

In my case, the error occurred because unoconv was using the wrong Python interpreter. In fact, unoconv is just a Python script with the following shebang:

#!/usr/bin/env python3

Since I had also installed python3 via Anaconda and added it to PATH, this shebang resolves to the python3 from Anaconda, which cannot import the uno module.

The uno module is located at /usr/lib/python3/dist-packages/uno.py, so we need to use the system python3.
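A quick way to confirm this diagnosis is to ask each interpreter whether it can import uno. This is a minimal sketch; can_import_uno is a hypothetical helper name, and the output depends on what is installed on your machine.

```shell
# Hypothetical helper: succeeds only if the given interpreter can import uno.
can_import_uno() {
    "$1" -c 'import uno' >/dev/null 2>&1
}

# "python3" is resolved through PATH, just like the "env python3" shebang;
# /usr/bin/python3 is the system interpreter.
for py in python3 /usr/bin/python3; do
    if can_import_uno "$py"; then
        echo "$py: uno is importable"
    else
        echo "$py: uno is NOT importable"
    fi
done
```

If only /usr/bin/python3 reports that uno is importable, the shebang is picking up the wrong interpreter.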

So we need to change the shebang of unoconv to:

#!/usr/bin/python3

Following a suggestion from a comment, we can also use sed to do this:

sed -i 's|#!/usr/bin/env python3|#!/usr/bin/python3|' /usr/bin/unoconv
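Before touching /usr/bin/unoconv, it may be worth trying the substitution on a scratch file. The sketch below is a slightly more cautious variant of the command above, not the original one: it restricts the match to line 1 and uses -i.bak so that sed keeps a backup copy. The scratch file and its contents are made up for illustration.

```shell
# Try the shebang rewrite on a scratch file instead of /usr/bin/unoconv.
scratch=$(mktemp)
printf '%s\n' '#!/usr/bin/env python3' 'import uno' > "$scratch"

# Only line 1 is considered, and the original is saved as "$scratch.bak".
sed -i.bak '1s|^#!/usr/bin/env python3$|#!/usr/bin/python3|' "$scratch"

head -n 1 "$scratch"   # prints #!/usr/bin/python3
```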

Step two: from PDF to image

To turn a PDF into images, we can use ImageMagick or Poppler.

With ImageMagick

We need to install imagemagick:

apt install imagemagick

Then we can convert the PDF file to images using convert, which writes one JPEG per page:

convert -density 150 test.pdf -quality 80 output-%3d.jpg

The -density option controls the resolution (in DPI) at which the PDF is rasterized, and -quality sets the JPEG compression quality.
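To get a feel for what a given density means, the pixel size of the output can be estimated from the page size, since PDF page sizes are measured in points (1/72 inch). The sketch below assumes a 4:3 slide exported as a 720 x 540 pt page (10 x 7.5 inches); your slides may use a different size, and pixels_at_density is a hypothetical helper name.

```shell
# Hypothetical helper: pixels = points * density / 72 (integer arithmetic).
pixels_at_density() {
    # $1: length in points, $2: density in DPI
    echo $(( $1 * $2 / 72 ))
}

# Assumed page size: 720 x 540 pt, rendered with -density 150.
echo "$(pixels_at_density 720 150)x$(pixels_at_density 540 150)"   # prints 1500x1125
```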

Possible issues

During conversion, I ran into two errors after running the convert command:

convert-im6.q16: not authorized `multiple_img.pdf' @ error/constitute.c/ReadImage/412.
convert-im6.q16: no images defined `output-%3d.jpg' @ error/convert.c/ConvertImageCommand/3258.

For the first error, you can edit /etc/ImageMagick-6/policy.xml and change the following line:

<policy domain="coder" rights="none" pattern="PDF" />

to

<policy domain="coder" rights="read|write" pattern="PDF" />
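The policy change can also be scripted with sed. The sketch below runs against a scratch copy with made-up contents rather than the real /etc/ImageMagick-6/policy.xml, so the real file may look different; check the result before applying the same command to the system file as root.

```shell
# Work on a scratch copy so the real policy.xml stays untouched.
policy=$(mktemp)
cat > "$policy" <<'EOF'
<policymap>
  <policy domain="coder" rights="none" pattern="PS" />
  <policy domain="coder" rights="none" pattern="PDF" />
</policymap>
EOF

# Only the line mentioning pattern="PDF" is rewritten; a .bak backup is kept.
sed -i.bak '/pattern="PDF"/ s/rights="none"/rights="read|write"/' "$policy"

# Show the rewritten PDF policy line.
grep 'pattern="PDF"' "$policy"
```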

The second error occurs because Ghostscript, which ImageMagick uses to read PDF files, is not installed on the system. Install it with:

apt install ghostscript
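A small pre-flight check can catch a missing tool before the conversion starts. This is only a sketch; require_tool is a hypothetical helper, and the tool list simply mirrors the commands used in this post (gs is the Ghostscript executable).

```shell
# Hypothetical helper: report whether a command is available on PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check the tools used in this post; keep going even if one is missing.
for tool in soffice convert gs; do
    require_tool "$tool" || true
done
```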

After that, you should be able to generate JPG/PNG images from a PPTX file.

With poppler

We need to install poppler-utils:

apt-get update && apt-get install -y poppler-utils

For further steps, refer to this post.
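As a rough idea of what the conversion looks like with poppler-utils, the pdftoppm tool can render each page to a JPEG. The wrapper below is a sketch: pdf_to_jpeg is a hypothetical helper name and 150 DPI is an arbitrary default, while -jpeg and -r are pdftoppm options for the output format and resolution.

```shell
# Hypothetical wrapper around pdftoppm from poppler-utils.
pdf_to_jpeg() {
    # $1: input PDF, $2: output prefix, $3: resolution in DPI (defaults to 150)
    if [ ! -f "$1" ]; then
        echo "no such file: $1" >&2
        return 1
    fi
    if ! command -v pdftoppm >/dev/null 2>&1; then
        echo "pdftoppm not found; install poppler-utils" >&2
        return 127
    fi
    pdftoppm -jpeg -r "${3:-150}" "$1" "$2"
}
```

For example, pdf_to_jpeg test.pdf output should write one file per page, named with the output prefix followed by the page number.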
