Lesson 1
Differential expression analysis: DESeq2, edgeR, limma-voom; model design, contrasts, and multiple-testing correction
This section details differential expression workflows using DESeq2, edgeR, and limma-voom, focusing on model design, contrasts, dispersion estimation, and multiple-testing correction to obtain reliable gene lists and effect size estimates.
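To make the multiple-testing correction mentioned above concrete, here is a minimal pure-Python sketch of the Benjamini-Hochberg adjustment. The p-values are made up for illustration, and the function name is hypothetical; in practice the adjusted values reported by DESeq2, edgeR, or limma would be used directly.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative p-values only (not from any real experiment).
pvals = [0.001, 0.008, 0.039, 0.041, 0.20]
padj = benjamini_hochberg(pvals)
significant = [p <= 0.05 for p in padj]
print(padj)
print(significant)
```

Note that the third and fourth genes have raw p-values below 0.05 but do not pass a 5% FDR threshold after adjustment.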
Designing experimental models and covariates
Setting contrasts for complex comparisons
Running the DESeq2 end-to-end workflow
Using edgeR and limma-voom pipelines
Multiple-testing correction and FDR control
Interpreting log2 fold changes and shrinkage

Lesson 2
Data organization and file naming conventions: sample sheets, raw/processed separation, consistent identifiers
This section describes best practices for organizing RNA‑seq project files, including sample sheets, directory layouts, raw versus processed data separation, and consistent identifiers that simplify scripting, tracking, and reproducibility.
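As one way to enforce consistent identifiers, the sketch below validates a small sample sheet. The column names (`sample_id`, `condition`, `fastq_1`), the `raw/` prefix rule, and the lowercase-with-underscores naming pattern are all assumptions chosen for illustration, not a required convention.

```python
import csv
import io
import re

# Hypothetical sample sheet; columns and paths are illustrative only.
SHEET = """sample_id,condition,fastq_1
ctrl_rep1,control,raw/ctrl_rep1_R1.fastq.gz
ctrl_rep2,control,raw/ctrl_rep2_R1.fastq.gz
treat_rep1,treated,raw/treat_rep1_R1.fastq.gz
treat_rep1,treated,raw/treat_rep1b_R1.fastq.gz
Treat rep3,treated,raw/treat_rep3_R1.fastq.gz
"""

# Assumed naming rule: lowercase alphanumeric chunks joined by underscores.
ID_PATTERN = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*$")


def check_sample_sheet(text):
    """Return a list of human-readable problems found in the sheet."""
    problems = []
    seen = set()
    for row in csv.DictReader(io.StringIO(text)):
        sid = row["sample_id"]
        if not ID_PATTERN.match(sid):
            problems.append(f"bad identifier: {sid!r}")
        if sid in seen:
            problems.append(f"duplicate identifier: {sid!r}")
        seen.add(sid)
        # Assumed layout: raw reads live under a raw/ directory.
        if not row["fastq_1"].startswith("raw/"):
            problems.append(f"FASTQ outside raw/: {row['fastq_1']!r}")
    return problems


problems = check_sample_sheet(SHEET)
for problem in problems:
    print(problem)
```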
Designing a clear directory hierarchy
Separating raw and processed data
Creating robust sample sheets and metadata
Consistent sample and library identifiers
Versioning reference genomes and indices
Backing up and archiving project data

Lesson 3
Gene-level quantification strategies: featureCounts, htseq-count, tximport for transcript-to-gene summarization
This section explains gene-level quantification from aligned or pseudo-aligned reads, comparing featureCounts and htseq-count, and detailing how tximport aggregates transcript-level estimates into robust gene-level matrices for downstream statistical analysis.
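The core idea of transcript-to-gene summarization can be sketched as summing transcript-level estimates per gene. This is a deliberately simplified illustration with made-up identifiers and counts; tximport itself is an R package and also handles transcript lengths and offsets, which are not modelled here.

```python
from collections import defaultdict

# Hypothetical transcript-to-gene map and per-sample estimated counts.
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
tx_counts = {
    "sample1": {"tx1": 10.5, "tx2": 4.5, "tx3": 20.0},
    "sample2": {"tx1": 0.0, "tx2": 8.0, "tx3": 15.0},
}


def summarize_to_gene(tx_counts, tx2gene):
    """Sum transcript-level estimates into gene-level totals per sample."""
    gene_counts = {}
    for sample, counts in tx_counts.items():
        per_gene = defaultdict(float)
        for tx, value in counts.items():
            per_gene[tx2gene[tx]] += value
        gene_counts[sample] = dict(per_gene)
    return gene_counts


gene_counts = summarize_to_gene(tx_counts, tx2gene)
print(gene_counts)
```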
Counting reads with featureCounts options
Using htseq-count modes and annotations
Handling strandedness and multimapping reads
Importing Salmon and kallisto with tximport
Building gene-level count matrices
Assessing quantification quality and coverage

Lesson 4
Tools for data download and organization: SRA Toolkit (prefetch/fastq-dump), ENA FTP/Aspera, wget/rsync, and recommended inputs/outputs
This section covers reliable strategies for downloading and organizing RNA‑seq data, focusing on SRA Toolkit, ENA access, command-line transfer tools, and defining consistent input and output structures that support automation and reproducibility.
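Checksum verification is one simple safeguard after a download. The sketch below computes an MD5 digest in chunks and compares it with an expected value; the helper names and the demo file are hypothetical, and the expected digest would normally come from the archive's own metadata.

```python
import hashlib
import os
import tempfile


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file without loading it all at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True when the file's digest matches the published checksum."""
    return md5sum(path) == expected_md5.lower()


# Demo on a small temporary file standing in for a downloaded FASTQ.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.fastq")
    with open(demo, "wb") as handle:
        handle.write(b"hello")
    ok = verify(demo, "5d41402abc4b2a76b9719d911017c592")
    bad = verify(demo, "0" * 32)

print(ok, bad)
```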
Using SRA Toolkit prefetch and fasterq-dump
Accessing ENA via FTP and Aspera
Downloading with wget and rsync safely
Choosing raw and processed file formats
Documenting download metadata and checksums
Automating downloads with scripts and logs

Lesson 5
Quality control tools and outputs: FastQC, MultiQC, key metrics to inspect (per-base quality, adapter content, duplication, GC)
This section focuses on RNA‑seq quality control, using FastQC and MultiQC to summarize key metrics such as per-base quality, adapter contamination, duplication, and GC content, and to decide whether trimming or resequencing is required.
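To illustrate what a per-base quality profile measures, the following sketch decodes Phred+33 quality strings and averages the scores at each read position. The quality strings and the threshold of 20 are invented for the example; FastQC reports richer per-position distributions than a plain mean.

```python
def per_base_mean_quality(quality_strings, offset=33):
    """Mean Phred score at each read position (Phred+33 encoding assumed)."""
    length = max(len(q) for q in quality_strings)
    means = []
    for pos in range(length):
        scores = [ord(q[pos]) - offset for q in quality_strings if pos < len(q)]
        means.append(sum(scores) / len(scores))
    return means


# Hypothetical quality strings: 'I' is Q40, '5' is Q20, '#' is Q2.
qualities = ["IIII#", "II5I#"]
means = per_base_mean_quality(qualities)

# Assumed QC threshold for the example: flag positions with mean quality < 20.
low_positions = [i for i, q in enumerate(means) if q < 20]
print(means)
print(low_positions)
```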
Running FastQC on raw and trimmed reads
Interpreting per-base quality profiles
Detecting adapters and overrepresented sequences
Evaluating duplication and GC content
Aggregating reports with MultiQC
Defining QC thresholds and actions

Lesson 6
Read trimming and filtering: when to trim, tools (Trim Galore/Cutadapt/fastp), main parameters and outputs
This section explains when and how to trim RNA‑seq reads, covering adapter and quality trimming, length filtering, and key parameters in tools such as Trim Galore, Cutadapt, and fastp, while avoiding over-trimming that harms downstream analyses.
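Quality trimming at the 3' end can be sketched with the partial-sum approach described in the Cutadapt documentation: subtract the cutoff from each quality, accumulate sums from the read end, and cut where the sum is smallest. This is a simplified illustration with a hypothetical function name and made-up bases, not Cutadapt's actual code.

```python
def quality_trim_3prime(sequence, qualities, cutoff):
    """Trim low-quality bases from the 3' end using running partial sums."""
    running = 0
    best = 0
    cut = len(qualities)
    # Scan from the last base toward the first.
    for i in range(len(qualities) - 1, -1, -1):
        running += qualities[i] - cutoff
        if running > 0:
            # Enough good quality seen; earlier positions are kept.
            break
        if running < best:
            best = running
            cut = i
    return sequence[:cut], qualities[:cut]


# Illustrative read with quality dropping toward the 3' end.
seq = "ACGTACGTAC"
quals = [42, 40, 26, 27, 8, 7, 11, 4, 2, 3]
trimmed_seq, trimmed_quals = quality_trim_3prime(seq, quals, cutoff=10)
print(trimmed_seq, trimmed_quals)
```

A minimum-length filter would then typically discard reads that become too short after trimming.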
Deciding whether trimming is necessary
Adapter detection and removal strategies
Quality-based trimming thresholds
Minimum length and complexity filters
Using Trim Galore and Cutadapt options
Fastp for integrated QC and trimming

Lesson 7
Basic downstream analyses: GO/KEGG enrichment (clusterProfiler), GSEA preranked, pathway visualization, and gene set selection
This section introduces downstream functional analyses after differential expression, including GO and KEGG enrichment with clusterProfiler, preranked GSEA, pathway visualization, and principled strategies for selecting and filtering gene sets.
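A preranked GSEA needs a single score per gene. One commonly used choice, shown here purely as an assumption, is the signed value of -log10(p), with the sign taken from the log2 fold change; other metrics (for example a test statistic or shrunken fold change) are equally plausible. The gene names and values are invented.

```python
import math

# Hypothetical DE results: gene -> (log2 fold change, p-value).
results = {
    "geneA": (2.0, 1e-6),
    "geneB": (-1.5, 1e-4),
    "geneC": (0.2, 0.5),
}


def rank_score(log2fc, pvalue):
    """Signed -log10(p); assumes pvalue > 0 (zeros would need special handling)."""
    return math.copysign(-math.log10(pvalue), log2fc)


# Most strongly up-regulated genes first, most strongly down-regulated last.
ranked = sorted(results, key=lambda g: rank_score(*results[g]), reverse=True)
print(ranked)
```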
Preparing ranked gene lists for GSEA
GO and KEGG enrichment with clusterProfiler
Choosing appropriate gene set databases
Visualizing enriched pathways and networks
Filtering and prioritizing gene sets
Reporting functional results reproducibly

Lesson 8
High-level pipeline layout: data download, QC, trimming, alignment/pseudo-alignment, quantification, differential expression, downstream analysis
This section presents the overall RNA‑seq pipeline structure, from data acquisition and QC through trimming, alignment or pseudo-alignment, quantification, normalization, differential expression, and downstream functional analysis, emphasizing modular, scripted workflows.
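The stage ordering described above can be expressed as a small dependency graph. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to derive a valid execution order; the stage names are shorthand for the steps listed in this lesson, and a real project would more likely rely on a workflow manager.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on.
stages = {
    "download": set(),
    "qc": {"download"},
    "trimming": {"qc"},
    "alignment": {"trimming"},
    "quantification": {"alignment"},
    "differential_expression": {"quantification"},
    "downstream_analysis": {"differential_expression"},
}

# static_order() yields stages so that dependencies always come first.
order = list(TopologicalSorter(stages).static_order())
print(" -> ".join(order))
```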
Defining pipeline stages and dependencies
Planning inputs, outputs, and file flow
Integrating QC, trimming, and alignment
Linking quantification to DE analysis
Connecting DE to enrichment workflows
Documenting the pipeline with diagrams

Lesson 9
Normalization and exploratory data analysis: TPM/FPKM limits, DESeq2 normalization, PCA, sample-sample distance heatmaps
This section covers normalization and exploratory analysis of RNA‑seq data, discussing limitations of TPM and FPKM, DESeq2-based normalization, variance stabilization, principal component analysis, and sample distance heatmaps for detecting batch effects.
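DESeq2-style normalization rests on the median-of-ratios idea: each sample's size factor is the median, over genes, of its count divided by the gene's geometric mean across samples. The pure-Python sketch below illustrates that idea on made-up counts; it is not DESeq2's implementation, and genes containing a zero are simply skipped here.

```python
import math
from statistics import median

# Hypothetical raw counts: gene -> counts in [sample1, sample2].
counts = {
    "geneA": [100, 200],
    "geneB": [50, 100],
    "geneC": [10, 20],
    "geneD": [0, 5],  # contains a zero, so it is excluded below
}


def size_factors(counts):
    """Median-of-ratios size factors, one per sample."""
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        if any(c == 0 for c in row):
            continue  # geometric mean would be zero
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


sf = size_factors(counts)
# Dividing raw counts by the size factors puts samples on a common scale.
normalized = {g: [c / s for c, s in zip(row, sf)] for g, row in counts.items()}
print(sf)
```

In this toy example the second sample is sequenced twice as deeply as the first, so its size factor is twice as large.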
Limitations of TPM and FPKM measures
DESeq2 size factors and normalization
Variance-stabilizing and rlog transforms
Principal component analysis of samples
Sample-sample distance heatmaps
Detecting batch effects and outliers

Lesson 10
Basic visualization best practices: MA plots, volcano plots, heatmaps, pathway dotplots, and interactive report options (R Markdown, Jupyter)
This section introduces effective visualization strategies for RNA‑seq results, emphasizing clear communication of differential expression, sample structure, and pathway changes using static plots and interactive, reproducible reports built in R Markdown or Jupyter.
Constructing and interpreting MA plots
Designing clear volcano plots for DE genes
Building publication-quality heatmaps
Pathway dotplots for enrichment results
Interactive R Markdown RNA-seq reports
Jupyter-based exploratory visualization

Lesson 11
Alignment vs pseudo-alignment: STAR, HISAT2, Salmon, kallisto; tradeoffs and outputs (BAM, transcript/gene counts)
This section compares alignment-based tools such as STAR and HISAT2 with pseudo-alignment tools like Salmon and kallisto, highlighting tradeoffs in speed, accuracy, resource use, and outputs including BAM files and transcript or gene-level counts.
When to choose STAR or HISAT2 aligners
Configuring genome indexes and annotations
Using Salmon in quasi-mapping mode
Running kallisto for rapid quantification
Comparing BAM and quant.sf style outputs
Benchmarking speed, memory, and accuracy