Lesson 1. Differential expression analysis with DESeq2, edgeR, and limma-voom: model design, contrasts, and multiple-testing correction
This section covers differential expression analysis with DESeq2, edgeR, and limma-voom, focusing on model design, contrasts, dispersion estimation, and multiple-testing correction to obtain reliable gene lists and effect-size estimates.
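As a small illustration of the multiple-testing correction named in this lesson, the Benjamini-Hochberg adjustment can be sketched in plain Python. This is a minimal sketch rather than the code DESeq2, edgeR, or limma run internally; the function name `benjamini_hochberg` and the example p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases (monotonicity).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values for four genes.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p={p:.3f}  adjusted={q:.3f}")
```

Genes whose adjusted value falls below a chosen threshold (0.05 or 0.1 are common choices) would then be reported as significant at that false discovery rate.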
- Designing experimental models and covariates
- Setting contrasts for complex comparisons
- Running the DESeq2 end-to-end workflow
- Using edgeR and limma-voom pipelines
- Multiple-testing correction and FDR control
- Interpreting log2 fold changes and shrinkage

Lesson 2. Data organization and file naming conventions: sample sheets, raw/processed separation, consistent identifiers
This section describes best practices for organizing RNA-seq project files, including sample sheets, directory layouts, separation of raw and processed data, and consistent identifiers that simplify scripting, tracking, and reproducibility.
- Designing a clear directory hierarchy
- Separating raw and processed data
- Creating robust sample sheets and metadata
- Consistent sample and library identifiers
- Versioning reference genomes and indices
- Backing up and archiving project data

Lesson 3. Gene-level quantification strategies: featureCounts, htseq-count, tximport for transcript-to-gene summarization
This section explains gene-level quantification from aligned or pseudo-aligned reads, comparing featureCounts and htseq-count, and showing how tximport aggregates transcript-level estimates into robust gene-level tables for downstream statistical analysis.
- Counting reads with featureCounts options
- Using htseq-count modes and annotations
- Handling strandedness and multimapping reads
- Importing Salmon and kallisto with tximport
- Building gene-level count matrices
- Assessing quantification quality and coverage

Lesson 4. Tools for data download and organization: SRA Toolkit (prefetch/fastq-dump), ENA FTP/Aspera, wget/rsync, and recommended inputs/outputs
This section covers reliable ways to obtain and organize RNA-seq data, focusing on the SRA Toolkit, ENA retrieval, command-line transfer tools, and consistent input and output layouts that support automation and reproducibility.
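One piece of reliable data retrieval, which this lesson's topic on checksums points to, is confirming that a downloaded file matches its published checksum. The sketch below uses only Python's standard `hashlib`; the helper names `md5sum` and `verify_download`, and the assumption that an MD5 string is available for each file (as ENA lists for its FASTQ files), are illustrative rather than part of any specific tool.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so large FASTQ files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 matches the expected value."""
    return md5sum(path) == expected_md5.strip().lower()
```

Logging the result of each check next to the accession and download date gives a simple record of which files were verified and when.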
- Using SRA Toolkit prefetch and fasterq-dump
- Accessing ENA via FTP and Aspera
- Downloading with wget and rsync safely
- Choosing raw and processed file formats
- Documenting download metadata and checksums
- Automating downloads with scripts and logs

Lesson 5. Quality control tools and outputs: FastQC, MultiQC, key metrics to inspect (per-base quality, adapter content, duplication, GC)
This section focuses on RNA-seq quality control, using FastQC and MultiQC to summarize key metrics such as per-base quality, adapter contamination, duplication, and GC content, and to decide whether trimming or re-sequencing is needed.
- Running FastQC on raw and trimmed reads
- Interpreting per-base quality profiles
- Detecting adapters and overrepresented sequences
- Evaluating duplication and GC content
- Aggregating reports with MultiQC
- Defining QC thresholds and actions

Lesson 6. Read trimming and filtering: when to trim, tools (Trim Galore/Cutadapt/fastp), main parameters and outputs
This section explains when and how to trim RNA-seq reads, covering adapter and quality trimming, length filtering, and key parameters in tools such as Trim Galore, Cutadapt, and fastp, while avoiding over-trimming that harms downstream analyses.
- Deciding whether trimming is necessary
- Adapter detection and removal strategies
- Quality-based trimming thresholds
- Minimum length and complexity filters
- Using Trim Galore and Cutadapt options
- Fastp for integrated QC and trimming

Lesson 7. Basic downstream analyses: GO/KEGG enrichment (clusterProfiler), GSEA preranked, pathway visualization, and gene set selection
This section introduces downstream functional analyses after differential expression, including GO and KEGG enrichment with clusterProfiler, preranked GSEA, pathway visualization, and sound approaches to selecting and filtering gene sets.
- Preparing ranked gene lists for GSEA
- GO and KEGG enrichment with clusterProfiler
- Choosing appropriate gene set databases
- Visualizing enriched pathways and networks
- Filtering and prioritizing gene sets
- Reporting functional results reproducibly

Lesson 8. High-level pipeline layout: data download, QC, trimming, alignment/pseudo-alignment, quantification, differential expression, downstream analysis
This section presents the overall RNA-seq pipeline layout, from data download and QC through trimming, alignment or pseudo-alignment, quantification, normalization, differential expression, and downstream functional analysis, emphasizing modular, scripted workflows.
- Defining pipeline stages and dependencies
- Planning inputs, outputs, and file flow
- Integrating QC, trimming, and alignment
- Linking quantification to DE analysis
- Connecting DE to enrichment workflows
- Documenting the pipeline with diagrams

Lesson 9. Normalization and exploratory data analysis: TPM/FPKM limits, DESeq2 normalization, PCA, sample-sample distance heatmaps
This section covers normalization and exploratory analysis of RNA-seq data, discussing the limitations of TPM and FPKM, DESeq2-based normalization, variance stabilization, principal component analysis, and sample-distance heatmaps for detecting batch effects.
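The idea behind DESeq2-based normalization, the median-of-ratios size factor, can be sketched in a few lines of Python. This is an illustrative reimplementation of the general method, not DESeq2's own code; the function name `size_factors` and the gene-by-sample list layout are assumptions made for the example.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts per sample.
    Returns one size factor per sample.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        # Genes with a zero in any sample have no finite geometric
        # mean on the log scale, so they are skipped.
        if any(c <= 0 for c in row):
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has non-zero counts in every sample")
    # The per-sample median ratio to the pseudo-reference is the size factor.
    return [math.exp(median(r)) for r in log_ratios]


if __name__ == "__main__":
    # Hypothetical counts: sample 2 was sequenced about twice as deeply.
    toy = [[10, 20], [100, 200], [5, 10], [0, 7]]
    print(size_factors(toy))
```

Dividing each sample's raw counts by its size factor puts samples on a comparable scale. Unlike TPM or FPKM, the median is not dominated by a handful of very highly expressed genes.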
- Limitations of TPM and FPKM measures
- DESeq2 size factors and normalization
- Variance-stabilizing and rlog transforms
- Principal component analysis of samples
- Sample-sample distance heatmaps
- Detecting batch effects and outliers

Lesson 10. Basic visualization best practices: MA plots, volcano plots, heatmaps, pathway dotplots, and interactive report options (R Markdown, Jupyter)
This section introduces effective visualization practices for RNA-seq results, emphasizing clear communication of differential expression, sample structure, and pathway changes using static plots and interactive, reproducible reports built in R Markdown or Jupyter.
- Constructing and interpreting MA plots
- Designing clear volcano plots for DE genes
- Building publication-quality heatmaps
- Pathway dotplots for enrichment results
- Interactive R Markdown RNA-seq reports
- Jupyter-based exploratory visualization

Lesson 11. Alignment vs pseudo-alignment with STAR, HISAT2, Salmon, and kallisto: tradeoffs and outputs (BAM, transcript/gene counts)
This section compares alignment-based tools such as STAR and HISAT2 with pseudo-alignment tools such as Salmon and kallisto, highlighting trade-offs in speed, accuracy, resource use, and outputs such as BAM files and transcript- or gene-level counts.
- When to choose STAR or HISAT2 aligners
- Configuring genome indexes and annotations
- Using Salmon in quasi-mapping mode
- Running kallisto for rapid quantification
- Comparing BAM and quant.sf style outputs
- Benchmarking speed, memory, and accuracy