$ dnbc4tools vdj run -h
usage: dnbc4tools vdj run [OPTIONS]
optional arguments:
-h, --help show this help message and exit
Input Files:
Choose ONE input method: either --fastqs (directory) OR individual FASTQ files (-1 and -2).
--fastqs <DIR> Input directory containing paired-end FASTQ files. The pipeline automatically detects Read1/Read2 files. Example: ./fastq_dir
-1, --fastq1 <FILE> [<FILE> ...]
Read1 FASTQ file(s) (supports wildcards and comma-separated lists). Example: sample1_L01_R1.fastq.gz,sample1_L02_R1.fastq.gz
-2, --fastq2 <FILE> [<FILE> ...]
Read2 FASTQ file(s) (supports wildcards and comma-separated lists). Must match --fastq1 order. Example: sample1_L01_R2.fastq.gz,sample1_L02_R2.fastq.gz
Basic Settings:
-n, --name <STR> Unique identifier for the sample (e.g., sample1). Used for naming output files and reports.
-r, --ref <REF> Reference database: 'human'/'mouse' (case-insensitive) or path to a custom reference directory containing reference.json. Examples: human | mouse | ./custom_vdj_ref
-c, --chain <STR> VDJ receptor type: 'IG' (B-cell receptors) or 'TR' (T-cell receptors).
-o, --outdir <DIR> Output directory for results and reports [default: current directory]. Example: ./output
-t, --threads <INT> Number of CPU threads for parallel processing [default: all available cores] (e.g., 16).
-s, --beadstrans <FILE>
RNA analysis singlecell.csv file for filtering cells and merging beads information. When not provided, all cells will be kept by default (equivalent to --keep_all_cells).
Library Settings:
Auto-detection is recommended for dark cycles. Available modes include "R1" and "unset".
For multiple files, ensure consistent settings across all inputs.
customize: Specify sequence structure patterns for parsing.
--darkreaction <STR> Dark cycle setting for VDJ library [default: auto]. Use 'R1' if dark cycles occur in Read1; otherwise leave as 'auto' or 'unset'.
--customize <STR> Sequence structure patterns, format: <type>,<read>:<start>-<end> separated by ';'. Types include: cb (cell barcode), umi (UMI) R1/R2 (sequence). Example:
"cb,R1:1-10;cb,R1:11-20;umi,R1:21-30;R1,R1:31-120;R2,R2:1-150"
--enrichment_primers <FILE>
Custom inner enrichment primers file (one primer sequence per line). Required when using a custom reference database.
Analysis Settings:
--keep_all_cells Keep all cells in analysis without RNA data filtering. If --beadstrans is not provided, this behavior is enabled by default.
--r2_only Only use R2 reads for VDJ assembly. Manual setting required because Read1 assembly requirements cannot be auto-detected.
--sample_read_pairs <INT>
Subsample the specified number of read pairs from the input FASTQ files (e.g., 1000000).
⚠️ Essential parameters that must be specified for a successful analysis
-n, --name (Required) Provide a unique name for this analysis run.
Default: None
Example:
--name sample_VDJ_001
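-r, --ref (Required) Specify the reference database to be used for VDJ analysis.
Built-in databases: human (human) and mouse (mouse); the names are case-insensitive.
Custom database: the path to a directory containing a reference.json file can be provided.
Default: None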
Examples:
# Use the built-in human reference database
--ref human
# Use a custom reference database
--ref ./custom_vdj_ref
-c, --chain (Required) Specify the type of immune receptor to be analyzed.
TR: T-cell Receptor, for T-cell studies.
IG: Immunoglobulin, for B-cell studies.
Default: None
Examples:
# Analyze T-cell receptors
--chain TR
# Analyze B-cell receptors
--chain IG
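The rules for --ref and --chain can be checked before a long run is launched. The following is a minimal sketch, not part of dnbc4tools: check_ref and check_chain are hypothetical helper names, and treating --chain as case-sensitive is an assumption, since only the upper-case values IG and TR are documented.

```shell
#!/bin/sh
# Hypothetical pre-flight checks; these helpers are NOT part of dnbc4tools.

# Succeeds if $1 is a built-in name (case-insensitive) or a directory
# that contains a reference.json file.
check_ref() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        human|mouse) return 0 ;;
    esac
    [ -f "$1/reference.json" ]
}

# Succeeds only for the two documented chain values.
# Assumption: the value is matched exactly as written (upper case).
check_chain() {
    case "$1" in
        IG|TR) return 0 ;;
        *) return 1 ;;
    esac
}

check_ref "Human" && echo "ref ok: Human"
check_chain "TR" && echo "chain ok: TR"
check_chain "BCR" || echo "chain rejected: BCR"
```

A custom reference passes only if the directory exists and holds reference.json, which mirrors the requirement stated for --ref.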
Choose one input method: Directory-based OR specify individual files
--fastqs (Method 1) Specify the path to the directory containing all FASTQ files.
The pipeline automatically detects the Read1/Read2 files. Cannot be combined with --fastq1 / --fastq2.
Default: None
Example:
--fastqs ./VDJ_fastq_dir
-1, --fastq1 (Method 2A) Specify one or more Read1 FASTQ files for the VDJ library individually.
Supports wildcards (*) to match files or a comma-separated list for multiple files.
Must be used together with the --fastq2 parameter, and the file order must match exactly.
Default: None
Example:
--fastq1 sample1_L01_R1.fastq.gz,sample1_L02_R1.fastq.gz
-2, --fastq2 (Method 2B) Specify one or more Read2 FASTQ files for the VDJ library individually.
Supports wildcards (*) to match files or a comma-separated list for multiple files.
Must be used together with the --fastq1 parameter, and the file order must match exactly.
Default: None
Example:
--fastq2 sample1_L01_R2.fastq.gz,sample1_L02_R2.fastq.gz
⚠️ Input Method Selection:
- 🔸 Method 1: Use --fastqs to specify a directory containing paired files.
- 🔸 Method 2: Use -1, --fastq1 and -2, --fastq2 to specify R1 and R2 files respectively.
⚠️ Important Note: All files under a parameter must come from the same library, with consistent sequencing mode and dark reaction settings. Data from different libraries cannot be merged for analysis.
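The two input methods can be sketched as dry-run commands that are printed rather than executed. The helper pairs_match is hypothetical and not part of dnbc4tools; it assumes file names differ only by "_R1." versus "_R2.", as in the examples above, and uses that to confirm that both lists contain the same files in the same order.

```shell
#!/bin/sh
# Dry run: commands are printed, not executed. Remove "echo" to run them.

# Method 1: directory input.
echo dnbc4tools vdj run --name sample_VDJ_001 --ref human --chain TR \
    --fastqs ./VDJ_fastq_dir

# Method 2: explicit, comma-separated lists in matching order.
r1="sample1_L01_R1.fastq.gz,sample1_L02_R1.fastq.gz"
r2="sample1_L01_R2.fastq.gz,sample1_L02_R2.fastq.gz"

# Hypothetical helper: succeeds when swapping "_R1." for "_R2." in the R1
# list reproduces the R2 list exactly (same files, same order).
pairs_match() {
    expected=$(printf '%s' "$1" | sed 's/_R1\./_R2./g')
    [ "$expected" = "$2" ]
}

if pairs_match "$r1" "$r2"; then
    echo dnbc4tools vdj run --name sample_VDJ_001 --ref human --chain TR \
        --fastq1 "$r1" --fastq2 "$r2"
else
    echo "R1 and R2 lists do not pair up" >&2
fi
```

Comparing the whole transformed string catches both a missing file and a swapped order in one step. Libraries with a different naming scheme would need a different substitution.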
-o, --outdir (Optional) Specify the output directory for all analysis results and reports.
Default: ./ (current directory)
Example:
--outdir ./VDJ_analysis_output
-t, --threads (Optional) Set the number of CPU threads to be used during the analysis.
Default: Use all available CPU cores
Example:
--threads 16
-s, --beadstrans (Optional) Provide the singlecell.csv file from a scRNA analysis for cell filtering and information integration.
Use the singlecell.csv output file from a 5' scRNA analysis of the same sample.
When not provided, all cells are kept by default (equivalent to --keep_all_cells).
Default: None
Example:
--beadstrans ./RNA_analysis_output/outs/singlecell.csv
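The choice between RNA-based filtering and keeping all cells can be made explicit in a wrapper script. This is a sketch under stated assumptions: cell_filter_args is a hypothetical helper, the path is the example path from above, and the command is only printed.

```shell
#!/bin/sh
# Dry run: the command is printed, not executed.
beads=./RNA_analysis_output/outs/singlecell.csv

# Hypothetical helper: prints the cell-filtering arguments to use.
cell_filter_args() {
    if [ -f "$1" ]; then
        # RNA result available: filter cells against it.
        printf '%s %s\n' --beadstrans "$1"
    else
        # No RNA result: state the documented default explicitly.
        printf '%s\n' --keep_all_cells
    fi
}

# The substitution is left unquoted on purpose so that it splits into two
# arguments; this simple form does not support paths containing spaces.
echo dnbc4tools vdj run --name sample_VDJ_001 --ref human --chain TR \
    --fastqs ./VDJ_fastq_dir $(cell_filter_args "$beads")
```

Passing --keep_all_cells explicitly is redundant when --beadstrans is absent, but it records the intent in the command line.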
--darkreaction (Optional) Configure the dark cycle settings for the VDJ library.
Options: auto (automatic detection), R1 (dark cycle in Read1), or unset (no dark cycle).
Default: auto
Example:
# Dark cycle present in Read1
--darkreaction R1
⚠️ Important Note: Incorrect settings may lead to cell barcode identification failure. Specify manually only if you know the library structure or if auto-detection fails.
--customize (Advanced) Precisely define the extraction structure for barcodes, UMIs, and effective sequences (reads) for non-standard libraries. This is an advanced feature that overrides --darkreaction settings.
Format: "<type>,<read>:<start>-<end>", with multiple segments separated by semicolons (;).
Types: cb (cell barcode), umi (UMI), R1/R2 (effective sequence).
Example:
# Example of a standard VDJ library configuration
--customize "cb,R1:1-10;cb,R1:11-20;umi,R1:21-30;R1,R1:31-120;R2,R2:1-150"
⚠️ Risk Warning: Incorrect custom configurations can lead to data loss or analysis failure. Use only when standard configurations do not meet your needs.
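Because a malformed --customize string can cause the failures described above, it may help to check the string against the documented format first. The sketch below is not part of dnbc4tools: validate_customize is a hypothetical helper, and the rule that start must not exceed end is an assumption about how the ranges are interpreted.

```shell
#!/bin/sh
# Hypothetical validator for a --customize string; NOT part of dnbc4tools.
validate_customize() {
    # Put one segment per line, then require every line to match
    # <type>,<read>:<start>-<end> with the documented type names.
    printf '%s\n' "$1" | tr ';' '\n' |
    awk -F'[,:-]' '
        BEGIN { bad = 0 }
        !/^(cb|umi|R1|R2),R[12]:[0-9]+-[0-9]+$/ { bad = 1 }
        # Assumption: start must not be greater than end.
        $3 + 0 > $4 + 0 { bad = 1 }
        END { exit bad }
    '
}

spec="cb,R1:1-10;cb,R1:11-20;umi,R1:21-30;R1,R1:31-120;R2,R2:1-150"
if validate_customize "$spec"; then
    echo "customize string is well-formed"
else
    echo "customize string is malformed" >&2
fi
```

The check is purely syntactic. It cannot tell whether the positions match the actual library structure.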
--enrichment_primers (Optional) Specify a file containing inner enrichment primers for VDJ region-specific amplification, one primer sequence per line. Required when using a custom reference database.
Default: None
Example file content:
GTCCTCGGTGGCCTCCACGTG
AGCACCTGGGGCCTCGGCCAC
CCTGGACTCCTGGGCCCCAG
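A primers file in this layout can be sanity-checked before it is passed to the pipeline. In this sketch check_primers is a hypothetical helper, the restriction to the characters A, C, G and T is an assumption (the source only specifies one sequence per line), and the --chain value in the printed command is illustrative only.

```shell
#!/bin/sh
# Hypothetical check, NOT part of dnbc4tools: the file must be non-empty and
# every line must consist only of A/C/G/T (the alphabet is an assumption).
check_primers() {
    [ -s "$1" ] || return 1
    # grep -v -q succeeds if any line fails the pattern; "!" inverts that.
    ! grep -v -E -q '^[ACGT]+$' "$1"
}

# Write the example primers from above to a temporary file.
primers=$(mktemp)
printf '%s\n' GTCCTCGGTGGCCTCCACGTG AGCACCTGGGGCCTCGGCCAC \
    CCTGGACTCCTGGGCCCCAG > "$primers"

if check_primers "$primers"; then
    # Dry run: the command is printed, not executed.
    echo dnbc4tools vdj run --name sample_VDJ_001 --ref ./custom_vdj_ref \
        --chain TR --fastqs ./VDJ_fastq_dir --enrichment_primers "$primers"
else
    echo "primers file is empty or contains unexpected characters" >&2
fi
```

Blank lines and lower-case sequences are rejected by this check; relax the pattern if the pipeline turns out to accept them.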
--keep_all_cells (Flag) Enable this parameter to retain all detected cells without filtering based on RNA data.
Enabled automatically when the --beadstrans parameter is not provided. It is suitable for standalone VDJ analysis or when maximizing cell recovery is desired.
Default: Not set (but enabled by default if --beadstrans is absent)
--r2_only (Flag) Enable this parameter to use only Read2 sequences for VDJ assembly. It must be set manually because Read1 assembly requirements cannot be auto-detected.
Default: Not set
--sample_read_pairs (Optional) Subsample a specified number of read pairs from the input FASTQ files for analysis.
Default: None (uses all data)
Example:
--sample_read_pairs 10000000
💡 Tip
This document is continuously updated. If you find any errors or have information to add, your feedback is welcome.
Document Version: 3.0 beta | Last Updated: 2025
🧬 DNBelab C Series HT scVDJ Analysis Software
High-performance single-cell immune repertoire data analysis pipeline