Lesson 1
When to kill, restart, or throttle a process: safe kill practices, systemctl restart, and using cgroups and nice/renice
Know when to terminate, restart, or throttle a process, and how to do each safely. Learn the signal types, safe ways to kill a process, how systemctl restart behaves, and how to use cgroups and nice or renice to limit a process's impact on servers in Zambia.
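The escalation from a polite signal to a forced one, which this lesson treats as safe kill practice, can be sketched in Python. This is a minimal sketch rather than a prescribed procedure: the helper name `stop_gracefully` and the 5-second grace period are assumptions chosen for illustration.

```python
import subprocess
import sys


def stop_gracefully(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Send SIGTERM, wait, and fall back to SIGKILL only if needed.

    Returns "SIGTERM" if the process exited within the grace period,
    otherwise "SIGKILL". (Hypothetical helper; the name and the default
    grace period are illustrative.)
    """
    proc.terminate()  # SIGTERM on POSIX: the process may clean up and exit
    try:
        proc.wait(timeout=grace_seconds)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL: cannot be caught or ignored
        proc.wait()  # reap the child so it does not linger as a zombie
        return "SIGKILL"


if __name__ == "__main__":
    # Start a throwaway child process and stop it.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_gracefully(child))
```

Trying SIGTERM first gives the process a chance to flush data and release locks. SIGKILL is kept as the last resort because the process gets no opportunity to clean up.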
- Choosing SIGTERM, SIGKILL, and others
- Using kill and pkill with safeguards
- Restarting services with systemctl
- Throttling CPU with nice and renice
- Limiting resources using cgroups
- Documenting and automating remedies

Lesson 2
Analysing swap usage and OOM events: dmesg, kernel OOM killer logs, and /var/log/kern.log
Investigate swap usage and out-of-memory (OOM) events with free, dmesg, kernel OOM killer logs, and /var/log/kern.log. Spot signs of thrashing, tune swappiness, and decide when to add RAM or adjust limits on Zambian servers.
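As a rough illustration of reading OOM killer entries, the sketch below pulls the victim PID and process name out of kernel log text. The sample log lines are made up for the example, and the exact message wording varies between kernel versions, so treat the pattern as an assumption to adapt. The function name `find_oom_kills` is hypothetical.

```python
import re

# Matches the "Killed process <pid> (<name>)" part of a kernel OOM message.
# Assumption: the message contains this phrase; wording differs across kernels.
OOM_KILL_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text):
    """Return a list of (pid, process_name) tuples for OOM kills in log_text."""
    kills = []
    for line in log_text.splitlines():
        match = OOM_KILL_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


# Illustrative sample only, not output from a real system.
SAMPLE_LOG = """\
[12345.678901] Out of memory: Killed process 4321 (python3) total-vm:2048000kB, anon-rss:1024000kB, file-rss:0kB, shmem-rss:0kB
[12350.000000] eth0: link up
[22345.111111] Out of memory: Killed process 987 (java) total-vm:8192000kB, anon-rss:4096000kB, file-rss:0kB, shmem-rss:0kB
"""

if __name__ == "__main__":
    for pid, name in find_oom_kills(SAMPLE_LOG):
        print(pid, name)
```

The same function could be fed the text of /var/log/kern.log or captured dmesg output, provided the reader has permission to read them.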
- Checking swap usage with free and /proc
- Recognising swap thrashing symptoms
- Reading dmesg for OOM killer entries
- Parsing /var/log/kern.log details
- Tuning swappiness and vm overcommit
- Deciding when to add RAM or adjust limits

Lesson 3
Identifying hot processes: ps, ps aux --sort, pgrep, pidstat and mapping PIDs to services
Quickly find hot or misbehaving processes using ps (with its sort options), pgrep, and pidstat. Map PIDs to services, systemd units, and containers to attribute resource use to its source in Zambian environments.
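One way to map a PID to its systemd unit is to read /proc/<pid>/cgroup and take the path component that ends in `.service` or `.scope`. The sketch below works on the text of that file. The function name `unit_from_cgroup` is hypothetical, and it assumes a systemd-managed host where unit names appear in the cgroup path.

```python
def unit_from_cgroup(cgroup_text):
    """Return the systemd unit name found in /proc/<pid>/cgroup text, or None.

    Each line has the form "hierarchy-ID:controller-list:cgroup-path".
    """
    for line in cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        path = parts[2]
        # Walk from the deepest path component upward.
        for component in reversed(path.split("/")):
            if component.endswith((".service", ".scope")):
                return component
    return None


if __name__ == "__main__":
    import os

    # Inspect the current process; this only works on Linux.
    with open(f"/proc/{os.getpid()}/cgroup") as handle:
        print(unit_from_cgroup(handle.read()))
```

On a systemd host, `systemctl status <PID>` reports the owning unit directly. The parsing approach is mainly useful inside scripts that already walk /proc.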
- Sorting ps output by CPU and memory
- Using pgrep and pkill name filters
- Monitoring per-process stats with pidstat
- Mapping PIDs to systemd units
- Relating PIDs to containers or cgroups
- Tracking short-lived bursty processes

Lesson 4
Identifying recurring resource spikes: inspecting cron, systemd timers, at jobs, and application schedulers
Detect recurring CPU, memory, and I/O spikes by correlating metrics with scheduled tasks. Inspect cron, systemd timers, at jobs, and application schedulers to fix noisy or conflicting jobs on systems in Zambia.
- Listing and reading user and system crontabs
- Inspecting systemd timers and calendar units
- Reviewing at jobs and one-off schedules
- Tracing app-level schedulers and workers
- Correlating spikes with job execution times
- Refining or staggering noisy recurring jobs

Lesson 5
Memory troubleshooting: free, /proc/meminfo, smem, pmap and checking for memory leaks
Build skills for troubleshooting memory problems using free, /proc/meminfo, smem, and pmap. Distinguish page cache from genuine memory pressure, measure per-process usage, and spot signs of memory leaks or fragmentation on Zambian systems.
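The difference between free memory and available memory can be illustrated by parsing /proc/meminfo. This is a minimal sketch: the sample values are invented, and the helper names `parse_meminfo` and `available_percent` are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> integer value.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def available_percent(meminfo):
    """Share of total memory the kernel estimates is available, in percent."""
    return 100.0 * meminfo["MemAvailable"] / meminfo["MemTotal"]


# Invented sample: little free memory, but much of the rest is reclaimable cache.
SAMPLE_MEMINFO = """\
MemTotal:        8000000 kB
MemFree:          400000 kB
MemAvailable:    4000000 kB
Buffers:          200000 kB
Cached:          3600000 kB
HugePages_Total:       0
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE_MEMINFO)
    print(f"free: {100.0 * info['MemFree'] / info['MemTotal']:.1f}%")
    print(f"available: {available_percent(info):.1f}%")
```

In the sample, only 5% of memory is free but 50% is available, because cache can be reclaimed on demand. A low MemFree on its own is therefore not evidence of memory pressure.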
- Interpreting free and available memory
- Reading /proc/meminfo key fields
- Using smem for per-process breakdowns
- Inspecting process maps with pmap
- Spotting memory leak growth patterns
- Differentiating cache from real pressure

Lesson 6
Integrating with monitoring data (Prometheus, Grafana) and using historical metrics to determine trends
Combine local troubleshooting with data from Prometheus and Grafana. Use historical metrics, dashboards, and alerts to spot trends, regressions, and gradual drift, and to verify the impact of performance fixes in Zambian monitoring setups.
- Reviewing key CPU and load dashboards
- Inspecting memory, cache, and swap panels
- Analysing disk and network latency graphs
- Using PromQL to slice historical metrics
- Correlating deploys with metric changes
- Validating fixes with before and after views

Lesson 7
Load vs CPU saturation: uptime, load average interpretation and relation to CPU cores
Demystify system load averages and how they relate to CPU cores and run queues. Distinguish benign high load from CPU saturation, and correlate load with I/O wait, context switches, and latency in Zambian contexts.
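Relating load to core count comes down to a simple ratio, sketched below. The helper name `load_per_core` is hypothetical. Note that the Linux load average also counts tasks in uninterruptible sleep (typically waiting on I/O), so a ratio above 1 suggests contention but does not by itself prove CPU saturation.

```python
import os


def load_per_core(load_average, cores):
    """Return the load average divided by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores


if __name__ == "__main__":
    # os.getloadavg() returns the 1-, 5-, and 15-minute load averages.
    load1, load5, load15 = os.getloadavg()
    cores = os.cpu_count() or 1
    print(f"1-minute load per core: {load_per_core(load1, cores):.2f}")
```

For example, a load of 8 on a 4-core machine gives a ratio of 2.0. Checking the `r` (runnable) and `b` (blocked) columns of vmstat helps tell CPU-bound load from I/O-bound load.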
- Reading uptime and load averages
- Relating load to CPU core counts
- Separating runnable and blocked tasks
- Identifying CPU-bound saturation cases
- Recognising I/O wait driven load
- Using vmstat and mpstat to confirm

Lesson 8
Collecting live system metrics: top, htop, vmstat, mpstat, iostat and how to interpret outputs
Collect and interpret live Linux performance metrics with top, htop, vmstat, mpstat, and iostat. Understand the CPU, memory, and I/O views, their key fields and refresh intervals, and spot bottlenecks in real time on systems in Zambia.
- Reading CPU usage in top and htop
- Monitoring memory and swap in top
- Using vmstat for system-wide snapshots
- Analysing CPU stats with mpstat
- Checking disk I/O patterns with iostat
- Choosing sampling intervals and filters

Lesson 9
Using perf, strace, and ltrace for deep process analysis and when to use each
Know when and how to use perf, strace, and ltrace for in-depth process analysis. Profile CPU hot spots, trace system calls, inspect library calls, and keep overhead low while still capturing useful diagnostics in Zambian applications.
- Profiling CPU hotspots with perf record
- Viewing perf reports and call graphs
- Tracing syscalls with strace safely
- Filtering noisy strace output
- Inspecting library calls using ltrace
- Choosing the right tool for each symptom

Lesson 10
Using lightweight profiling and tracing tools (py-spy, gdb, flamegraphs) for Python apps
Apply lightweight profiling and tracing to Python apps with py-spy, gdb, and flamegraphs. Capture stack samples in production, find hot code paths, and read flamegraphs without stopping services in Zambia.
- Sampling Python stacks with py-spy
- Generating and reading flamegraphs
- Attaching gdb safely to live Python
- Handling stripped or optimized buildsa
- Profiling async and multithreaded code
- Reducing profiler overhead in production
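Reading a flamegraph amounts to asking which frames account for the most samples. The sketch below answers that for the folded stack format commonly fed to flamegraph tooling, where each line is a semicolon-separated stack followed by a sample count. It assumes the input is in that form (py-spy's `--format raw` output is assumed to match it). The stacks are invented and the function name `hottest_leaf_frames` is hypothetical.

```python
from collections import Counter


def hottest_leaf_frames(folded_text, top_n=3):
    """Return the top_n (frame, samples) pairs, counting only leaf frames.

    The leaf frame is the last one in each stack, i.e. the code that was
    actually running when the sample was taken.
    """
    counts = Counter()
    for line in folded_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Frames may contain spaces, so split on the last space only.
        stack, _, count = line.rpartition(" ")
        if not stack or not count.isdigit():
            continue
        leaf = stack.split(";")[-1]
        counts[leaf] += int(count)
    return counts.most_common(top_n)


# Invented sample stacks for illustration.
SAMPLE_FOLDED = """\
main;handle_request;parse_json 40
main;handle_request;query_db 25
main;handle_request;render;parse_json 10
main;idle 5
"""

if __name__ == "__main__":
    for frame, samples in hottest_leaf_frames(SAMPLE_FOLDED):
        print(frame, samples)
```

Counting leaf frames corresponds to the widest plateaus at the top of a flamegraph. Here `parse_json` dominates even though it is reached through two different call paths.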