WCC: Indexing issue in 11.1.1.6

Corrupt PDF or PPTX files can block the indexing queue in WebCenter Content 11.1.1.6 when DATABASE.FULLTEXT is used as the search engine.

How to identify it

Before starting a Collection Rebuild, set some traces to debug level, as follows:

  • Configure the IDC tracers: indexer, indexermonitor, indexerprocess, systemdatabase, taskmanager. In addition, enable Full Verbose Tracing.

     

    [Screenshot: IDC Tracers]

     

  • The new traces will contain the following information:
    • indexer, indexermonitor, indexerprocess: show the indexing task traces.
    • systemdatabase: shows all the SQL queries involved in the indexing process.
    • taskmanager: shows information about the processes and tasks involved in the indexing process. For example, the TextExport task is responsible for converting content into the files that get indexed.
  • Before starting a full Collection Rebuild, it is advisable to stop the automatic indexer.

     

    [Screenshot: Stop Automatic Indexer]

     

  • Configure the Collection Rebuild to generate traces.

     

    [Screenshot: Configure Collection Rebuild to generate traces]

     

  • When the indexer stops or blocks while indexing a corrupt file, a trace like the following appears in the log:

    (internal)/6 06.25 22:17:22.961 TextExport_0 Process 'TextExport' timed out.

    This means a timeout occurred while converting the content being processed into an index file. The timeout can sometimes be resolved by adding the following variables to config.cfg:

    IndexerTextExtractionTimeout: the default is 15 seconds (raise it to 60 seconds).
    TextExtractorTimeoutSec: the default is 15 seconds (raise it to 60 seconds).

  • However, after increasing the timeout and enabling the taskmanager tracer in the IDC traces, the log shows the following:

    taskmanager/6 06.25 23:56:10.636 TextExport_0 Task failed with output: 1.
    (internal)/7 06.25 23:56:10.636 TextExport_0 Unexpected abort by process 'TextExport'.
    taskmanager/6 06.25 23:56:10.636 TextExport_0 Removing launcher for task: TextExport that has been marked as terminated
    indexer/6 06.25 23:56:10.636 TextExport_0 Extracted file contains zero bytes.
    taskmanager/6 06.25 23:56:10.652 TextExport_0 task Monitor <intradoc.taskmanager.TaskMonitor$1@130a6d30> exiting
    taskmanager/7 06.25 23:56:10.652 TaskLauncher_TextExport_stderr__0 Finish reading.
    taskmanager/7 06.25 23:56:10.652 TaskLauncher_TextExport_stderr__0 Finish reading.

    In other words, TextExport was aborted while processing a file.

    This error is due to a bug in Oracle WebCenter Content 11.1.1.6. Applying the latest WebCenter Content 11.1.1.6 patch solves it: the indexer no longer gets stuck when it encounters a problem of this type, so the indexing process can run through to completion.
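When the trace output is large, it can help to filter it for the TextExport messages quoted above. The following Python script is only a minimal sketch. The line layout it parses (section/level, date, time, thread, message) is inferred from the sample traces in this post, not from any official format description, and the names `scan_trace` and `summarize` are hypothetical helpers, not part of WebCenter Content.

```python
import re
from collections import Counter

# Message fragments copied from the traces quoted in this post.
PATTERNS = {
    "timeout": "Process 'TextExport' timed out",
    "abort": "Unexpected abort by process 'TextExport'",
    "zero_bytes": "Extracted file contains zero bytes",
    "task_failed": "Task failed with output",
}

# Assumed layout, inferred from the samples:
#   <section>/<level> <MM.DD> <HH:MM:SS.mmm> <thread> <message>
LINE_RE = re.compile(
    r"^\s*(?P<section>\S+)/(?P<level>\d+)\s+"
    r"(?P<date>\d{2}\.\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<thread>\S+)\s+(?P<message>.*)$"
)


def scan_trace(lines):
    """Return (kind, timestamp, thread, message) for each TextExport failure line."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not a trace line in the assumed layout
        message = match.group("message")
        for kind, fragment in PATTERNS.items():
            if fragment in message:
                timestamp = f"{match.group('date')} {match.group('time')}"
                hits.append((kind, timestamp, match.group("thread"), message))
                break
    return hits


def summarize(hits):
    """Count how many times each kind of failure was seen."""
    return Counter(kind for kind, *_ in hits)


if __name__ == "__main__":
    # Sample lines taken from the traces shown above.
    sample = [
        "(internal)/6 06.25 22:17:22.961 TextExport_0 Process 'TextExport' timed out.",
        "taskmanager/6 06.25 23:56:10.636 TextExport_0 Task failed with output: 1.",
        "(internal)/7 06.25 23:56:10.636 TextExport_0 Unexpected abort by process 'TextExport'.",
        "indexer/6 06.25 23:56:10.636 TextExport_0 Extracted file contains zero bytes.",
    ]
    found = scan_trace(sample)
    for kind, timestamp, thread, message in found:
        print(f"{timestamp} [{thread}] {kind}: {message}")
    print(dict(summarize(found)))
```

To use it on a real trace, pass an open file object to `scan_trace` instead of the sample list. Matching these messages against the surrounding indexer and systemdatabase traces is what points to the specific file that blocked the queue; the script itself only locates the failure lines.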
