Analysis Platform Support

Run Scanner can parse analysis data generated by an on-instrument analysis suite and serve that data via its REST API. This feature must be enabled per processor. Please see Appendix A: Processor Definitions for details on configuration.

If enabled, the analysisExpected field of the run data will be set to true when Run Scanner detects that a run directory contains analysis information. The pipelineRuns field of the /run endpoint will contain an array of data structures specific to the configured analysis suite. See below for platform-specific information. If a run's analysisExpected field is false and you are expecting analysis information for that run, please see Troubleshooting Analysis Output.

If disabled, the analysisExpected field of the run data will always be set to false, and the pipelineRuns field of the run data will always be an empty array.
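
As a rough illustration of how a client might interpret these two fields, the sketch below classifies a run-data record. It assumes the record has already been fetched from the /run endpoint and decoded from JSON into a Python dict. The function name describe_analysis and the strings it returns are hypothetical; only analysisExpected and pipelineRuns come from the description above.

```python
def describe_analysis(run: dict) -> str:
    """Classify a decoded run-data record by its analysis fields."""
    # Read both fields defensively (assumption: older records may lack them).
    expected = run.get("analysisExpected", False)
    pipeline_runs = run.get("pipelineRuns", [])

    if not expected:
        # Either the feature is disabled for this processor, or no analysis
        # information was detected in the run directory.
        return "no analysis expected"
    if not pipeline_runs:
        # Assumption: handled here as "nothing to report yet".
        return "analysis expected, no pipeline runs reported"
    return f"{len(pipeline_runs)} pipeline run(s) reported"


print(describe_analysis({"analysisExpected": False, "pipelineRuns": []}))
```

A client cannot tell from these two fields alone whether the feature is disabled or analysis output is simply missing, which is why the troubleshooting page is the next stop in that case.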

About Pipeline Status

The pipelineRuns field of the run data is an array of pipeline runs, which are data structures specific to the analysis suite. Each pipeline run represents one attempt at running an entire analysis suite. Each individual pipeline run has a pipelineStatus, which will be one of the following values:

  • INCOMPLETE: Run Scanner has not yet found all of the information it needs to serve analysis data. Scanning is ongoing.
  • COMPLETE: Run Scanner has either found all of the information it needs to serve analysis data, or has evidence of a failure. Inspect the data within the pipeline run for more information.
  • UNSUPPORTED: All of the analysis configured is analysis that Run Scanner does not currently support.
  • SCAN_ERROR: Run Scanner has found a file that it needs to parse, but there is a problem with the file contents. This error requires developer attention. For more information, please see Pipeline run has a pipelineStatus of SCAN_ERROR.
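
The four values above map naturally onto an enumeration on the client side. The following is a minimal sketch, assuming a pipeline run has been decoded into a Python dict. The PipelineStatus class, the NEXT_ACTION table and its action strings are hypothetical client-side helpers, not part of Run Scanner.

```python
from enum import Enum


class PipelineStatus(Enum):
    """The pipelineStatus values listed above."""
    INCOMPLETE = "INCOMPLETE"
    COMPLETE = "COMPLETE"
    UNSUPPORTED = "UNSUPPORTED"
    SCAN_ERROR = "SCAN_ERROR"


# Hypothetical client-side reactions, one per status.
NEXT_ACTION = {
    PipelineStatus.INCOMPLETE: "wait and poll again",
    PipelineStatus.COMPLETE: "inspect the pipeline run",
    PipelineStatus.UNSUPPORTED: "skip: no supported analysis configured",
    PipelineStatus.SCAN_ERROR: "report to a developer",
}


def next_action(pipeline_run: dict) -> str:
    # Looking up an Enum by value raises ValueError for an unknown string,
    # so an unexpected status is surfaced instead of silently ignored.
    status = PipelineStatus(pipeline_run["pipelineStatus"])
    return NEXT_ACTION[status]


print(next_action({"pipelineStatus": "INCOMPLETE"}))
```

Note that COMPLETE alone does not imply success, so the sketch directs the client to inspect the pipeline run rather than assume the analysis finished cleanly.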

About Workflow Run Status

Each element of the pipelineRuns array in the run data is a pipeline run, a data structure specific to the analysis suite. Each pipeline run contains workflowRuns, an array of data structures that each represent one workflow within the analysis suite.

Each individual workflow run has a workflowRunStatus which will be one of the following values:

  • PENDING: Run Scanner has not yet found all of the information it needs to serve analysis data. Scanning is ongoing.
  • COMPLETED: Run Scanner has found all of the information it needs to serve analysis data.
  • FAILED: Run Scanner has evidence of a failure of the workflow run.
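
Because a pipeline run with a pipelineStatus of COMPLETE may still contain failed workflows, it can help to tally the workflow statuses. The sketch below assumes a pipeline run decoded into a Python dict; the function names are hypothetical, and the example record contains only the workflowRuns and workflowRunStatus fields named above.

```python
from collections import Counter


def workflow_status_counts(pipeline_run: dict) -> Counter:
    """Count workflow runs per workflowRunStatus within one pipeline run."""
    return Counter(
        wf["workflowRunStatus"] for wf in pipeline_run.get("workflowRuns", [])
    )


def has_failed_workflow(pipeline_run: dict) -> bool:
    # A Counter returns 0 for keys it has never seen, so this is safe
    # even when no workflow run has failed.
    return workflow_status_counts(pipeline_run)["FAILED"] > 0


example = {
    "workflowRuns": [
        {"workflowRunStatus": "COMPLETED"},
        {"workflowRunStatus": "FAILED"},
    ]
}
print(workflow_status_counts(example), has_failed_workflow(example))
```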

Illumina DRAGEN

Requirements:

  • NovaSeq X Plus
  • DRAGEN >= 4.1.7

Partial support is available for parsing and serving analysis data generated on-instrument by the Illumina DRAGEN secondary analysis suite. This partial support has been tested against output from the NovaSeq X Plus. Other Illumina sequencers which offer DRAGEN secondary analysis may be supported, but these are untested.

Only the workflows listed below are supported by Run Scanner at this time. Pipeline runs which are only configured to run workflows that Run Scanner does not yet support will be marked with a pipelineStatus of UNSUPPORTED. Pipeline runs which are configured to run a combination of workflows which Run Scanner does and does not support will have only the supported workflows scanned.
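
The rule in the previous paragraph can be sketched as a small filter. This is an illustration of the described behaviour, not Run Scanner's implementation: the function name select_workflows is hypothetical, the supported set is passed in by the caller, and "OtherWorkflow" is a placeholder name for some workflow Run Scanner does not support.

```python
def select_workflows(
    configured: list[str], supported: set[str]
) -> tuple[bool, list[str]]:
    """Return (unsupported_only, workflows_to_scan) for one pipeline run."""
    # Keep only the configured workflows that are supported, in order.
    to_scan = [name for name in configured if name in supported]
    # UNSUPPORTED applies only when workflows are configured and none of
    # them is supported; a mixed configuration is scanned partially.
    unsupported_only = bool(configured) and not to_scan
    return unsupported_only, to_scan


supported = {"BCLConvert"}
print(select_workflows(["BCLConvert", "OtherWorkflow"], supported))
print(select_workflows(["OtherWorkflow"], supported))
```

In the first call the unsupported workflow is dropped and scanning proceeds for BCLConvert only; in the second, nothing is left to scan, which corresponds to a pipelineStatus of UNSUPPORTED.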

The NovaSeq X Plus supports requeueing a finished sequencing run for DRAGEN analysis multiple times. This information is served in the run data as the attempt field of an individual pipeline run. Please note that by Illumina's convention, this number does not necessarily start at 0. If a run that has been scanned by Run Scanner is requeued for analysis on-instrument, Run Scanner will not serve the new analysis data until the cache for the run is invalidated. Please see the API Docs for more information.
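
When a run has been analysed more than once, a client will usually want the most recent pipeline run. A minimal sketch follows, assuming that attempt is an integer and that a larger value means a later attempt; neither assumption is confirmed above, and the function name latest_pipeline_run is hypothetical.

```python
from typing import Optional


def latest_pipeline_run(pipeline_runs: list[dict]) -> Optional[dict]:
    """Pick the pipeline run with the highest attempt number, if any."""
    if not pipeline_runs:
        return None
    # Compare attempts by value instead of assuming the first attempt is 0
    # or that the array is ordered.
    return max(pipeline_runs, key=lambda pr: pr["attempt"])


runs = [{"attempt": 2}, {"attempt": 1}]
print(latest_pipeline_run(runs))
```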

Samplesheet

The DRAGEN Samplesheet is required to queue analysis on-instrument, and Run Scanner uses this same file to determine which workflows to look for in the run directory. It also uses the information in this file to determine whether a workflow is COMPLETED: if sufficient information is not yet present for every sample represented in the samplesheet, Run Scanner determines that the workflow is PENDING.
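
The completeness rule can be pictured as a set comparison between the samples named in the samplesheet and the samples for which output has been found. This is only a sketch of that idea: it does not parse a real samplesheet, it does not model the FAILED status, and the function name and the sample names S1 and S2 are hypothetical.

```python
def workflow_status(
    samplesheet_samples: set[str], samples_with_output: set[str]
) -> str:
    """Return PENDING until every samplesheet sample has output."""
    # Samples listed in the samplesheet but not yet seen in the output.
    missing = samplesheet_samples - samples_with_output
    return "PENDING" if missing else "COMPLETED"


print(workflow_status({"S1", "S2"}, {"S1"}))
print(workflow_status({"S1", "S2"}, {"S1", "S2"}))
```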

BCLConvert

The BCLConvert workflow is parsed and served in units of:

  • Sample name
  • Barcode index 1
  • Barcode index 2
  • Lane
  • Override Cycles

Each unit has two fastq files, one per read, consisting of:

  • Format (fastq)
  • Filesystem path
  • crc32 Checksum
  • File size in bytes
  • Creation and modification times (Currently these will always be the same time)
  • Read count
  • Read number (1 or 2)
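
To verify a served file against its checksum and size, a client could recompute both locally. The sketch below uses Python's standard zlib.crc32 and reads the file in chunks so that large fastq files are not loaded into memory. It rests on assumptions: that the checksum covers the file's bytes exactly as stored on disk, and that an eight-digit lowercase hexadecimal string is a useful form for comparison. The function name describe_fastq and the dictionary keys are hypothetical and do not reflect the field names Run Scanner serves.

```python
import os
import zlib


def describe_fastq(path: str, chunk_size: int = 1 << 20) -> dict:
    """Compute the CRC32 and size in bytes of a file on disk."""
    checksum = 0
    with open(path, "rb") as handle:
        # Feed the running checksum back in so that the result equals
        # the CRC32 of the whole file.
        while chunk := handle.read(chunk_size):
            checksum = zlib.crc32(chunk, checksum)
    return {
        "path": path,
        # Mask to 32 bits and render as hex (assumed comparison format).
        "crc32": format(checksum & 0xFFFFFFFF, "08x"),
        "size_bytes": os.path.getsize(path),
    }
```

If the served checksum uses a different encoding, such as a decimal integer, convert one side before comparing.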

Run Scanner requires that the fastq files generated by BCLConvert remain on disk. Some downstream DRAGEN workflows have an option to remove BCLConvert's fastq files once they have completed. Run Scanner is not compatible with this option, and the BCLConvert workflow run will be marked as FAILED if this option is enabled.