Drop fastq.gz files here.

Each file name must contain _R1 or _R2. If multiple files are uploaded, they are processed as R1/R2 pairs.

FASTQ Subsampling and Filtering Pipeline Overview

  • Accepts paired-end FASTQ files (.fastq.gz format)
  • Automatically pairs R1 and R2 files based on naming convention
  • Supports multiple sample processing in batch
  • File size validation and compression optimization
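
The pairing step described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function name pair_fastq_files, the regular expression, and the rule that the sample name is everything before the _R1/_R2 tag are all assumptions.

```python
import re
from collections import defaultdict

# Assumption: the read tag is "_R1" or "_R2" somewhere in the file name.
_READ_TAG = re.compile(r"_R([12])")


def pair_fastq_files(filenames):
    """Group .fastq.gz file names into (R1, R2) pairs keyed by sample name.

    Hypothetical helper: the sample name is taken to be the text before
    the read tag. Files without a mate are left out of the result.
    """
    groups = defaultdict(dict)
    for name in filenames:
        if not name.endswith(".fastq.gz"):
            continue  # not a gzipped FASTQ file
        match = _READ_TAG.search(name)
        if match is None:
            continue  # no _R1/_R2 tag in the name
        sample = name[:match.start()]
        groups[sample]["R" + match.group(1)] = name

    pairs = {}
    for sample, reads in groups.items():
        if "R1" in reads and "R2" in reads:
            pairs[sample] = (reads["R1"], reads["R2"])
    return pairs
```

In this sketch, files that lack a mate are dropped silently; a real upload form would more likely report them to the user.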

  • Bowtie2 alignment against M. tuberculosis H37Rv reference genome
  • Local alignment with dovetail option for better sensitivity
  • Extracts only M.tb-specific sequences
  • Generates unaligned reads for non-M.tb sequences
  • Maintains paired-end read integrity
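
As a rough sketch of the filtering step above, the function below builds a Bowtie2 command line for a paired-end local alignment with the dovetail option. The flags are standard Bowtie2 options, but which ones the pipeline actually uses is an assumption. The index prefix, output names, thread count, and the use of concordant alignment to separate M.tb reads from unaligned reads are also hypothetical.

```python
def bowtie2_filter_command(r1, r2, index_prefix, out_prefix, threads=4):
    """Build a Bowtie2 argument list for paired-end filtering.

    Hypothetical helper. Bowtie2 replaces "%" in the --al-conc-gz and
    --un-conc-gz paths with 1 or 2, so both mates are written to
    separate files and stay paired.
    """
    return [
        "bowtie2",
        "--local",      # local alignment mode
        "--dovetail",   # treat dovetailed mates as concordant
        "-p", str(threads),
        "-x", index_prefix,  # assumed: prebuilt H37Rv index prefix
        "-1", r1,
        "-2", r2,
        # Pairs that aligned concordantly (taken here as M.tb reads)
        "--al-conc-gz", f"{out_prefix}_mtb_R%.fastq.gz",
        # Pairs that did not align concordantly
        "--un-conc-gz", f"{out_prefix}_unaligned_R%.fastq.gz",
        "-S", "/dev/null",   # discard the SAM output in this sketch
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) on a machine where Bowtie2 and the index are installed.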

  • Automatic file size monitoring (98MB threshold)
  • Seqtk-based random subsampling with fixed seed (1987)
  • Adaptive read count reduction for optimal file sizes
  • Maintains paired-end read synchronization
  • 7z compression for efficient storage
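
One way the size check and subsampling above might fit together is sketched below. Only the 98MB threshold, the use of seqtk, and the seed 1987 come from the overview. The proportional fraction rule, the reading of 98MB as 98 * 1024 * 1024 bytes, and both function names are assumptions, since the pipeline's actual adaptive strategy is not described here.

```python
# Assumption: "98MB" means 98 * 1024 * 1024 bytes.
THRESHOLD_BYTES = 98 * 1024 * 1024
SEED = 1987  # fixed seed stated in the overview


def subsample_fraction(size_bytes, threshold_bytes=THRESHOLD_BYTES):
    """Return the fraction of reads to keep so the file fits the threshold.

    Hypothetical rule: scale the read count in proportion to file size.
    Compressed size is only roughly proportional to read count, so a real
    pipeline may need to re-check the size and shrink again.
    """
    if size_bytes <= threshold_bytes:
        return 1.0
    return threshold_bytes / size_bytes


def seqtk_sample_command(fastq_path, fraction, seed=SEED):
    """Build a `seqtk sample` argument list for one mate file.

    Running this with the same seed and fraction on both R1 and R2 selects
    the same read positions in each file, which keeps the mates in sync.
    """
    return ["seqtk", "sample", f"-s{seed}", fastq_path, str(fraction)]
```

seqtk writes the sampled reads to standard output, so the caller would redirect that output to a new file for each mate and skip the step entirely when the fraction is 1.0.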

  • Processed FASTQ files with consistent naming
  • Optional M.tb-filtered sequences
  • Unaligned reads (when filtering is applied)
  • Compressed output for efficient transfer
  • Results delivered via email notification