Generate a thumbnail image from the first page of a PDF using batch processing

This tutorial explains how to use batch processing to automatically create a thumbnail image from the first page of a PDF file when the PDF is registered as content.

Before you start

The thumbnail image is saved in GCS (Google Cloud Storage), so you need to integrate your Kuroco account with Firebase in advance. For the procedure, refer to "Cloud storage integration with Firebase".

Creating content structure

First, create a content structure to register the PDF file. Click [Content structure] from the menu to open the content structure list screen, and then click [Add].

For this tutorial, configure it as follows.

| Field Name | Setting |
| --- | --- |
| Name | Auto thumbnail generation |

Also, add the following two fields:

| ID | Field Name | Identifier | Field Settings |
| --- | --- | --- | --- |
| 1 | Thumbnail image | image | Image (upload to KurocoFiles) |
| 2 | PDF file | pdf | File (upload to KurocoFiles) |

Once you have made the settings, click [Add] to save the content structure.

Tip: For more information on creating a content structure, refer to the tutorial "Creating content structure".

API Configuration

Next, configure the API to convert the PDF to an image. Click [API] -> [Default] from the menu to open the endpoint list screen, and then click [Add].

On the API creation screen, configure the following settings and click [Add].

| Field | Setting |
| --- | --- |
| Title | pdf-to-thumbs |
| Version | 1.0 |
| Description | auto Thumbnail generation |

The endpoint list screen for the created API will be displayed.

API Security Configuration

Configure the API security settings. For this API, select "Dynamic Access Token" to prevent requests from external sources.

From the endpoint list screen, click on [Security].

Select [Dynamic Access Token] from the security settings, and then click [Save].

Next, you need to create endpoints.

Creating Endpoints

You will create two endpoints:

  • Get contents that have no thumbnail images registered.
  • Update thumbnail image.

The "Get contents that have no thumbnail images registered" endpoint returns the content items that have a PDF but no thumbnail image. The "Update thumbnail image" endpoint then registers the generated thumbnail image for each of those items.

Creating the "get contents without thumbnail images" endpoint

First, let's create an endpoint to "get the contents with no thumbnail images". Click [Add new endpoint] on the endpoint list screen.

We will create the endpoint as follows:

| Field | Value |
| --- | --- |
| Path | no-thumb-list |
| Category | Content |
| Model | Topics |
| Operation | list |
| Basic settings: Filter | image="" and pdf!="" (Note: this searches by the presence or absence of text in the PDF file name and the image description) |
| Basic settings: topics_group_id | Enter the ID of the "Auto thumbnail generation" content structure created earlier |
| Basic settings: cnt | 0 |

After the settings are configured, click the [Add] button at the bottom to save your changes.
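
To make the filter concrete, the following Python sketch mimics what image="" and pdf!="" is meant to select. It is only an illustration under assumptions: each dictionary stands in for a content item reduced to the text the filter sees, and the helper name needs_thumbnail is hypothetical. The real filtering is performed by Kuroco on the server.

```python
def needs_thumbnail(item: dict) -> bool:
    """Illustrative mirror of the filter image="" and pdf!=""."""
    return item.get("image", "") == "" and item.get("pdf", "") != ""


# Sample items (made up for this sketch).
contents = [
    {"topics_id": 1, "image": "", "pdf": "manual.pdf"},        # PDF only: selected
    {"topics_id": 2, "image": "cover", "pdf": "catalog.pdf"},  # thumbnail exists: skipped
    {"topics_id": 3, "image": "", "pdf": ""},                  # no PDF: skipped
]

targets = [c["topics_id"] for c in contents if needs_thumbnail(c)]
print(targets)
```

Only the first item qualifies, because it is the only one with a PDF and without an image.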

Creating the "Update Thumbnail Image" endpoint

Next, we will create the "Update Thumbnail Image" endpoint. As before, click [Add new endpoint] on the endpoint list screen, and configure it as follows:

| Item | Content |
| --- | --- |
| Path | thumb-update |
| Category | Content |
| Model | Topics |
| Operation | update |
| Basic settings: topics_group_id | Specify the ID of the "Auto thumbnail generation" content structure created earlier |
| Basic settings: use_columns | image |

After the settings are configured, click the [Add] button at the bottom to save your changes.

That completes the creation of the endpoints.

Creating a temporary storage folder

Next, create a folder to store the files. This folder temporarily holds the PDF and the generated thumbnail image until the image is registered to the content.

Click [File manager] from the menu.

Click on [GCS(Private)] and then click on [New subfolder] to create a new folder.

Enter "pdf_thumb" as the folder name, and click [OK].

The "pdf_thumb" folder has been created.

Creating batch process

Next, we will create the batch processes that convert PDF files into images. We will create the following two batch processes:

  • Generate a thumbnail image from a PDF
    This batch process converts the first page of a PDF into an image.
  • Register the generated thumbnail images as content
    This batch process registers the generated thumbnail image to the target content.
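
The hand-off between the two batch processes can be pictured with the short Python simulation below. It is a sketch under assumptions, not Kuroco code: the registry BATCHES, the function fake_convert, and the in-memory dictionary registered are hypothetical stand-ins, and the real conversion runs asynchronously rather than as a direct function call. The identifiers create_thumb and update_pdf_thumb_bat match the ones configured later in this tutorial.

```python
from typing import Callable, Dict, List

# Hypothetical registry standing in for the batch process list, keyed by identifier.
BATCHES: Dict[str, Callable[[dict], None]] = {}
# topics_id -> image path "registered" by the second batch (in-memory stand-in for content).
registered: Dict[int, str] = {}


def update_pdf_thumb_bat(ext_data: dict) -> None:
    """Second batch: receives the conversion result and registers the image."""
    registered[ext_data["data"]["topics_id"]] = ext_data["destPath"]


BATCHES["update_pdf_thumb_bat"] = update_pdf_thumb_bat


def fake_convert(dest_path: str, callback_batch: str, data: dict) -> None:
    """Stand-in for the conversion step: pretend the PNG exists, then run the callback batch."""
    BATCHES[callback_batch]({"destPath": dest_path, "data": data})


def create_thumb(topics_ids: List[int]) -> None:
    """First batch: request one conversion per content item and name the callback batch."""
    for topics_id in topics_ids:
        fake_convert(f"pdf_thumb/{topics_id}.png", "update_pdf_thumb_bat",
                     {"topics_id": topics_id})


create_thumb([10, 11])
print(registered)
```

The point of the design is that the first batch never registers anything itself. It only passes along enough data (here, the content ID) for the callback batch to finish the job once the image exists.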

Creating the "Generate Thumbnail Image from PDF" Batch Process

First, let's create the "Generate Thumbnail Image from PDF" batch process. Click [Add] under [Operation] -> [Batch process].

Configure the following on the batch process editor.

| Item | Setting |
| --- | --- |
| Title | Thumbnail images generation from PDF |
| Identifier | create_thumb |
| Batch | Hourly |

Next, enter the following in the execution contents.

{*Get a list of contents without thumbnails that have been registered as PDF.*}
{api_internal member_id=1 endpoint='/rcms-api/2/no-thumb-list' query='' method='GET' var='contents_list' status_var='status'}
{if $status == 1 && $contents_list.list|@count > 0}
{foreach from=$contents_list.list key=idx item=item}
{if !$item.image.url && $item.pdf.url}{*Image not set.*}
{get_file url=$item.pdf.url var=temp_path save=1}
{if $temp_path}
{assign var=gcp_pdf_path value='files/g/private/pdf_thumb/'|cat:$item.topics_id|cat:'.pdf'}
{assign var=gcp_img_path value='files/g/private/pdf_thumb/'|cat:$item.topics_id|cat:'.png'}
{*Save the PDF file to a temporary directory on GCS.*}
{put_file tmp_path=$temp_path path=$gcp_pdf_path}
{assign var=data value=null}
{assign_array var=data values=''}
{assign var=data.topics_id value=$item.topics_id}
{assign var=data.pdf value=$item.pdf}
{*Generate thumbnails using the functionality of Cloud Functions.*}
{make_pdf_thumb pdfPath=$gcp_pdf_path destPath=$gcp_img_path callback_batch='update_pdf_thumb_bat' data=$data}
{/if}
{/if}
{/foreach}
{/if}

Caution: Replace the 2 in /rcms-api/2/no-thumb-list with the ID of the API you created earlier. You can confirm the API ID in the URL of the endpoint list page.
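
As a reading aid for the execution contents above, the Python sketch below reproduces only the decision logic and string building: the endpoint path for a given API ID, the guards on the response, and the GCS paths derived from each content ID. The function names (list_endpoint, thumb_paths, plan_jobs) are hypothetical, and the file download and upload steps are deliberately left out, so treat it as an approximation of the flow rather than a replacement for the batch.

```python
def list_endpoint(api_id: int) -> str:
    """Endpoint path with the API ID substituted (the batch above uses 2)."""
    return f"/rcms-api/{api_id}/no-thumb-list"


def thumb_paths(topics_id: int, folder: str = "files/g/private/pdf_thumb/") -> tuple:
    """Return the (PDF path, PNG path) pair used for one content item."""
    return (f"{folder}{topics_id}.pdf", f"{folder}{topics_id}.png")


def plan_jobs(status: int, contents: list) -> list:
    """Apply the same guards as the batch and list the conversions it would request."""
    jobs = []
    if status != 1:  # the batch only proceeds when the API call succeeded
        return jobs
    for item in contents:
        image_url = (item.get("image") or {}).get("url")
        pdf_url = (item.get("pdf") or {}).get("url")
        if image_url or not pdf_url:  # skip items with an image or without a PDF
            continue
        pdf_path, img_path = thumb_paths(item["topics_id"])
        jobs.append({
            "pdfPath": pdf_path,
            "destPath": img_path,
            "callback_batch": "update_pdf_thumb_bat",
            "data": {"topics_id": item["topics_id"], "pdf": item["pdf"]},
        })
    return jobs


# Sample response data (made up for this sketch).
sample = [
    {"topics_id": 7, "image": {}, "pdf": {"url": "https://example.com/a.pdf", "desc": "a.pdf"}},
    {"topics_id": 8, "image": {"url": "https://example.com/a.png"}, "pdf": {"url": "https://example.com/b.pdf"}},
]
jobs = plan_jobs(1, sample)
print(list_endpoint(2), [job["destPath"] for job in jobs])
```

Naming both files after the content ID keeps the PDF and PNG paths predictable, which is what lets the callback batch later derive the PDF path from the image path.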

Once you have finished configuring the settings, click [Add] to save the batch process.

Creating the "Registering Generated Thumbnail Images to Content" batch process

Next, create the "Registering Generated Thumbnail Images to Content" batch process.
As in the previous step, add it in the batch process editor with the following settings.

| Item | Setting |
| --- | --- |
| Title | Register image to content |
| Identifier | update_pdf_thumb_bat |
| Batch | Batch Template |

Next, enter the following information in the execution contents.

{*Set the data obtained from Cloud Functions.*}
{assign var=topics_id value=$ext_data.data.topics_id}
{assign var=image_name value=$ext_data.data.pdf.desc|replace:'.pdf':''}
{assign var=dest_path value=$ext_data.destPath}
{assign var=file_id value='files/temp/pdf_thumb/'|cat:$topics_id|cat:'.png'}
{assign var=save_path value='/files/temp/pdf_thumb/'|cat:$topics_id|cat:'.png'}

{*Get the image file from GCS to "files/temp".*}
{get_file path=$dest_path save_path=$save_path save=1}

{*Upload the obtained image to the content.*}
{assign_array var=post_data values=''}
{assign_array var=post_data.image values=''}
{assign var=post_data.image.file_id value=$file_id}
{assign var=post_data.image.file_nm value=$image_name|cat:'.png'}
{assign var=post_data.image.desc value=$image_name}
{api_internal endpoint='/rcms-api/2/thumb-update/'|cat:$topics_id member_id=1 method='POST' queries=$post_data var='resp' status_var='status'}
{if $status==1}
{*Delete the PDF and thumbnail upon successful processing.*}
{remove_file path='/'|cat:$dest_path}
{remove_file path='/'|cat:$dest_path|replace:'.png':'.pdf'}
{/if}

Caution: Replace the 2 in /rcms-api/2/thumb-update with the ID of the API you created earlier. You can confirm the API ID in the URL of the endpoint list page.
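
The values that the callback batch derives from the data it receives can be traced with the following Python sketch. It assumes the received data has the shape used in the execution contents above (destPath plus data.topics_id and data.pdf.desc). The function name build_update_request and the returned dictionary layout are hypothetical and exist only to show the string handling.

```python
def build_update_request(ext_data: dict, api_id: int = 2) -> dict:
    """Derive the endpoint, request body, and cleanup targets for one thumbnail."""
    topics_id = ext_data["data"]["topics_id"]
    # Image name = PDF description with the ".pdf" text removed.
    image_name = ext_data["data"]["pdf"]["desc"].replace(".pdf", "")
    file_id = f"files/temp/pdf_thumb/{topics_id}.png"
    dest_path = ext_data["destPath"]
    return {
        "endpoint": f"/rcms-api/{api_id}/thumb-update/{topics_id}",
        "save_path": "/" + file_id,
        "body": {"image": {"file_id": file_id,
                           "file_nm": image_name + ".png",
                           "desc": image_name}},
        # Files removed after a successful update: the PNG and its source PDF.
        "cleanup": ["/" + dest_path, ("/" + dest_path).replace(".png", ".pdf")],
    }


# Sample callback data (made up for this sketch).
request = build_update_request(
    {"destPath": "files/g/private/pdf_thumb/7.png",
     "data": {"topics_id": 7, "pdf": {"desc": "manual.pdf"}}},
    api_id=5,
)
print(request["endpoint"], request["body"]["image"]["file_nm"])
```

Because the cleanup paths are computed from destPath by swapping the extension, the PDF and PNG must share the same base name, as they do in the first batch process.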

The batch process configurations are now complete.

Operational verification

Finally, let's verify that the settings work.

Create content from the "Auto thumbnail generation" content structure by uploading only the PDF file without uploading any images.

After uploading the PDF, click on [Add] to create the content.

Next, let's run the batch process.
Although the settings are configured to run the batch process every hour, for testing purposes, we will run it manually.

From the batch process list, click [Thumbnail images generation from PDF], which you created earlier.

Click on [Run now] next to the title.

When the alert appears, click [OK].

The batch has been executed.

Next, let's check if the images have been uploaded to the content.

Click [Content structure], and then click [List] for "Auto thumbnail generation".

Click on the content that was created earlier.

You should be able to confirm that an image has been registered in the "Thumbnail image" field.

Tip: It may take several minutes for the image to be created. If the image has not been registered yet, wait a while and check again.


Support

If you have any other questions, please contact us or check out Our Slack Community.