Lesson 1. Memory analysis: free -m, /proc/meminfo, slabtop, smem; understanding used versus available memory and swap behavior
In this section, you will analyze memory usage with free, /proc/meminfo, slabtop, and smem. It covers Linux caching, buffers, and reclaim mechanisms, and shows how to interpret swap usage and spot memory leaks, fragmentation, and misconfigured limits.
- Reading free -m and understanding cached memory
- Key fields in /proc/meminfo for diagnosis
- Using slabtop to inspect kernel slab usage
- Using smem to attribute memory per process
- Detecting swap thrashing and OOM risks

Lesson 2. Network usage and bottlenecks: iftop, nload, ss, netstat, ip -s link, tc, tcpdump; finding network saturation and faulty connections
This section covers finding network usage and bottlenecks with iftop, nload, ss, ip, tc, and tcpdump. You will learn to spot saturation, noisy neighbours, connection states, and packet-level problems that slow applications down.
- Watching live bandwidth with iftop and nload
- Checking sockets and connection states with ss
- Using ip -s link to see interface errors
- Basics of tc for traffic shaping and rate limits
- Targeted packet capture with tcpdump

Lesson 3. Storage latency and deep I/O analysis: blktrace, bpftrace (simple scripts), fio for benchmarking; measuring and interpreting latency and throughput
This section covers storage latency and deep I/O analysis using blktrace, simple bpftrace scripts, and fio benchmarks. You will learn to measure latency and throughput, interpret queue depth, and distinguish device limits from workload problems.
- Understanding latency, IOPS, and throughput
- Using blktrace to examine block I/O patterns
- Simple bpftrace scripts for disk latency
- Building fio workloads that resemble production
- Reading fio reports and spotting bottlenecks

Lesson 4. Process inspection: ps, top/htop filters, pgrep, pidstat, nice/renice; finding CPU- and memory-heavy processes
You will learn to inspect processes with ps, top or htop filters, pgrep, pidstat, and nice or renice. This section shows how to spot CPU- and memory-heavy tasks, track I/O per process, and adjust priorities to reduce contention.
- Listing and filtering processes with ps
- Using pgrep and pkill safely and precisely
- Using pidstat for per-process CPU and I/O
- Filtering top and htop by user or resource
- Adjusting priorities with nice and renice

Lesson 5. System resource overview: top, htop, vmstat, mpstat, dstat; what each shows and normal output patterns
Here you will learn to read whole-system resource snapshots using tools like top, htop, vmstat, mpstat, and dstat. This section focuses on understanding CPU, memory, and load metrics, and on telling normal usage patterns from problematic ones.
- Key CPU, load, and memory fields in top
- Using htop for interactive process inspection
- vmstat for run queue, swap, and I/O insight
- mpstat for per-CPU usage and steal time
- dstat for combined multi-resource timelines

Lesson 6. Disk I/O and filesystem checks: iostat, iotop, sar -d, lsblk, df -h, du -sh, tune2fs, xfs_info; spotting I/O bottlenecks and low disk space
This section focuses on disk I/O and filesystem health using iostat, iotop, sar -d, lsblk, df, du, tune2fs, and xfs_info. You will learn to spot saturation, queue build-up, filesystem errors, and low disk space that harm performance.
- Using iostat to spot busy and slow devices
- Using iotop to find I/O-heavy processes
- sar -d for historical disk usage trends
- Checking layout and filesystem types with lsblk and df
- Finding space consumers with du and inode checks

Lesson 7. System logs and journaling: journalctl (systemd), /var/log/messages, /var/log/syslog, auth logs; what to search for and why
This section explains how to use systemd's journalctl and traditional log files such as /var/log/messages, /var/log/syslog, and auth logs. You will learn which patterns to search for, how to filter noise, and how logs support root cause analysis.
- journalctl basics and useful filter options
- Reading /var/log/messages and /var/log/syslog
- Finding errors, warnings, and rate-limited events
- Checking auth and sudo-related logs
- Correlating log timestamps with incidents

Lesson 8. Time-based and historical monitoring: sar, sysstat, collectl; collecting and reading historical metrics to correlate events
You will learn how to collect and read historical metrics using sar, sysstat, and collectl. This section explains how to plan data collection, read time series reports, and correlate performance anomalies with configuration changes or deployments.
- Enabling and configuring sysstat collection
- Using sar for CPU, memory, and I/O history
- Reading sar network and load average trends
- Using collectl for multi-resource timelines
- Correlating metrics with change times

Lesson 9. Kernel and scheduler insights: dmesg, sysctl -a, /proc/sys/vm parameters; what kernel messages and tunables reveal
Here you will explore kernel and scheduler insights using dmesg, sysctl, and /proc/sys/vm parameters. This section explains how kernel messages, tunables, and scheduler behavior reveal hardware problems, misconfigurations, and tuning options.
- Reading dmesg for hardware and driver issues
- Listing and querying sysctl tunable values
- Key /proc/sys/vm parameters for memory
- Overview of scheduler-related kernel parameters
- Safely persisting kernel tuning changes

Lesson 10. Root cause methodology: a step-by-step decision tree for classifying issues as CPU, RAM, disk I/O, or network
This section presents a practical decision tree for root cause analysis. You will learn how to classify incidents as CPU, memory, disk I/O, or network bound, which tools to use in each branch, and how to refine hypotheses using the evidence gathered.
- Initial triage and problem statement
- Distinguishing CPU-bound from I/O-bound symptoms
- Telling memory pressure from leaks
- Distinguishing network bottlenecks from local ones
- Step-by-step hypothesis testing with metrics
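
As a rough illustration of the Lesson 10 decision tree idea, the sketch below classifies one set of readings as memory, disk I/O, CPU, or network bound. It is only a sketch under stated assumptions: the Snapshot fields, the classify function, and every threshold are hypothetical placeholders chosen for illustration, not values taken from the course, and real triage would tune them per system.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One set of readings. Field names and units are illustrative assumptions."""
    cpu_busy_pct: float        # user + system CPU time, e.g. from mpstat
    iowait_pct: float          # CPU time spent waiting on I/O, e.g. vmstat "wa"
    run_queue_per_cpu: float   # runnable tasks (vmstat "r") divided by CPU count
    mem_available_pct: float   # MemAvailable / MemTotal * 100 from /proc/meminfo
    swap_in_out_per_s: float   # vmstat "si" + "so"
    net_util_pct: float        # interface throughput relative to link speed
    net_errors_per_s: float    # errors + drops, e.g. from ip -s link


def classify(s: Snapshot) -> str:
    """Walk a fixed sequence of checks and return the first matching category.

    All thresholds below are placeholder assumptions, not recommendations.
    """
    # Check memory first: active swapping also raises I/O wait, so testing
    # disk first could misattribute swap thrashing to the storage device.
    if s.mem_available_pct < 10 and s.swap_in_out_per_s > 0:
        return "memory"
    if s.iowait_pct > 20:
        return "disk I/O"
    if s.cpu_busy_pct > 85 or s.run_queue_per_cpu > 1.0:
        return "CPU"
    if s.net_util_pct > 80 or s.net_errors_per_s > 0:
        return "network"
    return "inconclusive"


if __name__ == "__main__":
    sample = Snapshot(
        cpu_busy_pct=30, iowait_pct=45, run_queue_per_cpu=0.4,
        mem_available_pct=6, swap_in_out_per_s=1200,
        net_util_pct=5, net_errors_per_s=0,
    )
    print(classify(sample))
```

The order of the checks is the main design choice here: a result from a function like this is a starting hypothesis to confirm with the per-resource tools from the earlier lessons, not a verdict.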