Lesson 1. Creating derived fields: rolling demand (7/30/90-day), lead time deviation (actual - standard), shipments per unit, cost per unit shipped

This section covers creating derived metrics that enhance supply chain analysis. You will compute rolling demand windows, lead time deviations, shipment intensity, and cost per unit shipped for monitoring and modelling, using examples relevant to local trade.
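As a rough illustration, the sketch below derives these fields in pandas on a tiny made-up SKU-warehouse table; column names such as demand_qty, units_shipped, and shipment_cost are assumptions to adapt to your own schema.

```python
import pandas as pd

# Hypothetical SKU-warehouse daily table; column names are illustrative only.
df = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=10, freq="D"),
    "sku": ["SKU1"] * 10,
    "warehouse_id": ["KLA-01"] * 10,
    "demand_qty": [5, 7, 6, 0, 9, 4, 8, 3, 6, 7],
    "actual_lead_time_days": [12, 14, 11, 13, 15, 12, 10, 16, 13, 12],
    "standard_lead_time_days": 12,
    "units_shipped": [5, 7, 6, 0, 9, 4, 8, 3, 6, 7],
    "shipment_cost": [100, 140, 110, 0, 180, 90, 150, 60, 120, 130],
})
df = df.sort_values(["sku", "warehouse_id", "date"])

# Rolling demand over 7/30/90-day windows per SKU-warehouse, using time-based windows.
grouped = df.set_index("date").groupby(["sku", "warehouse_id"])["demand_qty"]
for window in ("7D", "30D", "90D"):
    df[f"demand_{window.lower()}"] = grouped.rolling(window).sum().to_numpy()

# Lead time deviation: actual minus standard.
df["lead_time_deviation"] = df["actual_lead_time_days"] - df["standard_lead_time_days"]

# Cost per unit shipped; zero-unit rows become NaN rather than infinity.
units = df["units_shipped"].where(df["units_shipped"] > 0)
df["cost_per_unit_shipped"] = df["shipment_cost"] / units
```

Writing these derived columns back to a versioned table, rather than recomputing them ad hoc, is what makes later audits of the figures practical.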
Topics:
- Rolling 7/30/90-day demand calculations
- Lead time deviation and variability metrics
- Shipments per order line and per unit
- Cost per unit shipped and per lane
- Storing derived fields for reuse and audits

Lesson 2. Normalising categorical fields: supplier_id, transport_mode, warehouse_id; mapping synonyms and encoding for analysis

This section explains how to normalise categorical fields such as supplier, transport mode, and warehouse identifiers. You will map synonyms, standardise codes, and encode categories for modelling and reporting, considering Ugandan naming conventions.
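A minimal pandas sketch of the cleaning and alias-mapping idea, using invented supplier_id and transport_mode values; in practice the alias table would be maintained as its own reference file.

```python
import pandas as pd

# Invented raw labels; a real alias table would live in a versioned reference file.
shipments = pd.DataFrame({
    "supplier_id": [" sup-001", "SUP001", "Sup_001 ", "SUP-002"],
    "transport_mode": ["Road", "ROAD FREIGHT", "truck", "Air"],
})

# Basic string hygiene: trim whitespace, uppercase, drop separators.
shipments["supplier_id_clean"] = (
    shipments["supplier_id"].str.strip().str.upper().str.replace(r"[-_\s]", "", regex=True)
)

# Alias table mapping observed labels to canonical transport modes.
mode_aliases = {
    "ROAD": "ROAD", "ROAD FREIGHT": "ROAD", "TRUCK": "ROAD",
    "AIR": "AIR", "AIR FREIGHT": "AIR",
}
shipments["transport_mode_clean"] = (
    shipments["transport_mode"].str.strip().str.upper().map(mode_aliases)
)

# Anything the alias table did not cover is surfaced for review rather than silently kept.
unmapped = shipments.loc[shipments["transport_mode_clean"].isna(), "transport_mode"].unique()
print("unmapped modes:", unmapped)

# One-hot encode the cleaned category for modelling.
encoded = pd.get_dummies(shipments["transport_mode_clean"], prefix="mode")
```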
Topics:
- Standardising supplier and warehouse IDs
- Cleaning transport_mode and route labels
- Building synonym and alias mapping tables
- Handling slowly changing categorical values
- Encoding categories for ML and BI tools

Lesson 3. Reading large CSVs reliably in Excel, Python (pandas) and R (data.table/readr): parsing dates and types

This section shows how to reliably read large CSVs in Excel, Python, and R without corrupting types or dates. You will handle delimiters, encodings, memory limits, chunked loading, and schema definitions for repeatable ingestion in resource-constrained settings.
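A minimal pandas sketch of a schema-controlled, chunked read, assuming a hypothetical shipments.csv with columns like order_date, ship_date, and demand_qty.

```python
import pandas as pd

# Explicit schema: declaring dtypes avoids silent type guessing on large files.
dtypes = {
    "sku": "string",
    "warehouse_id": "category",
    "supplier_id": "category",
    "demand_qty": "float64",
    "unit_cost": "float64",
}

chunks = pd.read_csv(
    "shipments.csv",                       # hypothetical file path
    sep=",",
    encoding="utf-8",
    dtype=dtypes,
    parse_dates=["order_date", "ship_date"],
    dayfirst=True,                         # many local extracts use DD/MM/YYYY
    chunksize=100_000,                     # stream the file instead of loading it at once
)

# Aggregate chunk by chunk so memory use stays bounded.
partials = []
for chunk in chunks:
    partials.append(chunk.groupby("warehouse_id", observed=True)["demand_qty"].sum())
demand_by_warehouse = pd.concat(partials).groupby(level=0).sum()
print(demand_by_warehouse.head())
```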
Topics:
- Configuring delimiters, headers, and encodings
- Parsing dates, times, and time zones correctly
- Controlling column types in pandas and R
- Chunked and incremental CSV loading
- Dealing with Excel row limits and crashes

Lesson 4. Outlier detection for time series and cross-sectional fields: z-score, IQR, rolling median filters, and domain thresholds

This section teaches methods to detect and handle outliers in time series and cross-sectional supply chain data. You will apply z-score, IQR, rolling statistics, and domain thresholds, then decide whether to cap, correct, or exclude values, adapted for African market data.
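A minimal sketch of the main techniques on a synthetic daily demand series; the cut-offs used (|z| > 3, 1.5 × IQR, 5 × rolling MAD) are conventional starting points rather than fixed rules.

```python
import numpy as np
import pandas as pd

# Synthetic daily demand for one SKU-warehouse combination, with two injected outliers.
rng = np.random.default_rng(42)
demand = pd.Series(
    rng.poisson(20, 120).astype(float),
    index=pd.date_range("2024-01-01", periods=120, freq="D"),
)
demand.iloc[[30, 75]] = [400.0, -5.0]

# Z-score flag (the mean and std are themselves pulled by the outliers, hence "sensitive").
z = (demand - demand.mean()) / demand.std()
flag_z = z.abs() > 3

# IQR fences (robust to a handful of extreme values).
q1, q3 = demand.quantile([0.25, 0.75])
iqr = q3 - q1
flag_iqr = (demand < q1 - 1.5 * iqr) | (demand > q3 + 1.5 * iqr)

# Rolling median and rolling MAD filter for time series.
med = demand.rolling(14, center=True, min_periods=7).median()
mad = (demand - med).abs().rolling(14, center=True, min_periods=7).median()
flag_rolling = (demand - med).abs() > 5 * mad

# Domain rule: demand cannot be negative; cap at zero, then replace flagged spikes
# with the local rolling median rather than dropping the rows.
cleaned = demand.clip(lower=0).where(~flag_rolling, med)
print(flag_z.sum(), flag_iqr.sum(), flag_rolling.sum())
```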
Topics:
- Visual screening of outliers in time series
- Z-score and modified z-score approaches
- IQR fences and robust spread measures
- Rolling median and rolling MAD filters
- Domain-based thresholds and capping rules

Lesson 5. Data consistency checks: duplicate rows, negative quantities, mismatched units/currencies, date continuity per SKU-warehouse

This section focuses on validating transactional consistency in supply chain CSVs. You will detect duplicate records, negative or impossible quantities, unit and currency mismatches, and gaps or overlaps in SKU–warehouse time series, all of which is vital for Ugandan import data.
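A minimal pandas sketch of these checks on an invented transactional extract; the qty and currency column names are assumptions.

```python
import pandas as pd

# Hypothetical transactional extract; qty and currency are illustrative column names.
tx = pd.DataFrame({
    "sku": ["A", "A", "A", "A", "B"],
    "warehouse_id": ["KLA-01"] * 5,
    "date": pd.to_datetime(["2024-03-01", "2024-03-01", "2024-03-02",
                            "2024-03-05", "2024-03-01"]),
    "qty": [10, 10, -3, 7, 4],
    "currency": ["UGX", "UGX", "USD", "UGX", "UGX"],
})

# 1. Exact duplicate rows.
dupes = tx[tx.duplicated(keep=False)]

# 2. Negative or impossible quantities.
bad_qty = tx[tx["qty"] < 0]

# 3. Currency consistency per SKU-warehouse (expect a single code per combination).
ccy_counts = tx.groupby(["sku", "warehouse_id"])["currency"].nunique()
mixed_ccy = ccy_counts[ccy_counts > 1]

# 4. Date continuity: calendar days missing between first and last record per SKU-warehouse.
gaps = {}
for (sku, wh), grp in tx.groupby(["sku", "warehouse_id"]):
    full = pd.date_range(grp["date"].min(), grp["date"].max(), freq="D")
    missing = full.difference(grp["date"])
    if len(missing):
        gaps[(sku, wh)] = list(missing.date)

print(len(dupes), len(bad_qty), mixed_ccy.to_dict(), gaps, sep="\n")
```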
Topics:
- Detecting and resolving duplicate rows
- Flagging negative or impossible quantities
- Validating units of measure and conversions
- Checking currency codes and FX alignment
- Ensuring date continuity per SKU–warehouse

Lesson 6. Time zone and business calendar adjustments: handling holidays, cutoffs, and business days for lead time calculations

This section covers aligning timestamps with business calendars so lead times, service levels, and cutoffs are computed on comparable business days. You will adjust for weekends, regional holidays, and warehouse-specific operating schedules, including Ugandan public holidays.
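A minimal sketch of business-day lead times with a 16:00 order cutoff in Kampala local time and two example Ugandan public holidays; the cutoff hour and the holiday list are assumptions to replace with your own calendar.

```python
import numpy as np
import pandas as pd

# Hypothetical order and receipt timestamps recorded in UTC by the source systems.
orders = pd.DataFrame({
    "order_ts": pd.to_datetime(["2024-05-30 16:45", "2024-06-01 09:10"], utc=True),
    "receipt_ts": pd.to_datetime(["2024-06-07 08:00", "2024-06-12 11:30"], utc=True),
})

# Convert to warehouse local time (Kampala) before applying any cutoff logic.
for col in ("order_ts", "receipt_ts"):
    orders[col] = orders[col].dt.tz_convert("Africa/Kampala")

# Orders placed after the assumed 16:00 cutoff effectively start the next calendar day.
after_cutoff = orders["order_ts"].dt.hour >= 16
effective_start = orders["order_ts"].dt.normalize() + pd.to_timedelta(
    after_cutoff.astype(int), unit="D"
)

# Business-day lead time, excluding weekends and example Ugandan public holidays
# (Martyrs' Day 3 June, Heroes' Day 9 June); maintain the real list separately.
holidays = np.array(["2024-06-03", "2024-06-09"], dtype="datetime64[D]")
orders["lead_time_bdays"] = np.busday_count(
    effective_start.dt.date.to_numpy().astype("datetime64[D]"),
    orders["receipt_ts"].dt.date.to_numpy().astype("datetime64[D]"),
    holidays=holidays,
)
print(orders[["order_ts", "receipt_ts", "lead_time_bdays"]])
```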
Topics:
- Standardising time zones across systems
- Building business day and holiday calendars
- Modelling shipping and order cutoff times
- Converting calendar days to business days
- Lead time calculation examples in Python

Lesson 7. Column semantics & metadata mapping: interpreting date, SKU, warehouse, supplier, demand, forecast, inventory, shipments, lead times, costs, flags

This section focuses on defining column semantics and metadata for supply chain CSVs. You will map fields to business concepts, document units and grain, and create dictionaries that support governance and reuse, suited to local supply chains.
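One way to keep such a dictionary machine-readable is sketched below; the columns, grains, and units shown are illustrative, not a prescribed standard.

```python
import pandas as pd

# A minimal machine-readable data dictionary; entries are illustrative only.
data_dictionary = pd.DataFrame([
    {"column": "date",         "meaning": "Calendar day of the record",    "grain": "day",               "unit": "date",  "nullable": False},
    {"column": "sku",          "meaning": "Stock keeping unit identifier", "grain": "SKU",               "unit": "code",  "nullable": False},
    {"column": "warehouse_id", "meaning": "Fulfilling warehouse",          "grain": "warehouse",         "unit": "code",  "nullable": False},
    {"column": "demand_qty",   "meaning": "Units demanded that day",       "grain": "SKU-warehouse-day", "unit": "units", "nullable": True},
    {"column": "unit_cost",    "meaning": "Landed cost per unit",          "grain": "SKU-day",           "unit": "UGX",   "nullable": True},
])

def check_against_dictionary(df: pd.DataFrame, dictionary: pd.DataFrame) -> list:
    """List columns present in the data but undocumented, and documented columns that are missing."""
    documented, actual = set(dictionary["column"]), set(df.columns)
    issues = [f"undocumented column: {c}" for c in sorted(actual - documented)]
    issues += [f"documented but missing: {c}" for c in sorted(documented - actual)]
    return issues

# Example: an extract arrives with an extra, undocumented flag column.
extract = pd.DataFrame(columns=["date", "sku", "warehouse_id", "demand_qty", "unit_cost", "qc_flag"])
print(check_against_dictionary(extract, data_dictionary))
```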
Topics:
- Identifying grain: SKU, location, and time
- Defining business meaning of key columns
- Documenting units, currencies, and calendars
- Creating and maintaining data dictionaries
- Tagging quality flags and status indicators

Lesson 8. Automated data profiling: distributions, missingness matrix, unique counts, value ranges, cardinality

This section introduces automated profiling of supply chain CSVs. You will compute distributions, missingness matrices, unique counts, ranges, and cardinality to quickly assess data quality and prioritise cleaning work in Ugandan contexts.
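A minimal profiling sketch in pandas; the profile helper and the sample extract are assumptions for illustration, not a standard library function.

```python
import pandas as pd

def profile(df: pd.DataFrame) -> pd.DataFrame:
    """One row per column: dtype, percentage missing, cardinality, and numeric range."""
    numeric = df.select_dtypes("number")
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing_pct": (df.isna().mean() * 100).round(1),
        "n_unique": df.nunique(dropna=True),
        "min": numeric.min().reindex(df.columns),
        "max": numeric.max().reindex(df.columns),
    })

# Hypothetical extract to illustrate the output.
sample = pd.DataFrame({
    "sku": ["A", "A", "B", None],
    "warehouse_id": ["KLA-01", "KLA-01", "GUL-02", "GUL-02"],
    "demand_qty": [5.0, None, 12.0, 7.0],
    "unit_cost": [3200.0, 3200.0, 4100.0, None],
})
print(profile(sample))

# A simple missingness matrix: 1 where a value is absent, ready to plot as a heatmap.
missing_matrix = sample.isna().astype(int)
```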
Topics:
- Generating summary statistics at scale
- Visualising missingness matrices and heatmaps
- Analysing value ranges and out-of-bounds data
- Cardinality checks for keys and categories
- Automated profiling with pandas and R tools

Lesson 9. Detecting and handling missing values: imputation strategies per column (demand, forecast, inventory, costs) and when to drop rows

This section explains how to analyse and treat missing values in demand, forecast, inventory, and cost fields. You will compare imputation options, design column-specific rules, and decide when dropping rows or segments is safer for local data.
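A minimal sketch of column-specific imputation rules in pandas; the rule choices (zero-fill demand, interpolate forecasts, forward-fill inventory, median-fill costs) are illustrative defaults to validate against your own data.

```python
import pandas as pd

# Hypothetical daily SKU table with gaps in several fields.
df = pd.DataFrame({
    "date": pd.date_range("2024-04-01", periods=6, freq="D"),
    "sku": ["A"] * 6,
    "demand_qty": [4, None, 6, None, 5, 3],
    "forecast_qty": [5, 5, None, 6, 6, None],
    "inventory_on_hand": [100, None, None, 88, 80, 77],
    "unit_cost": [3200, 3200, None, None, 3300, 3300],
})

# Column-specific rules rather than one blanket imputation:
df["demand_qty"] = df["demand_qty"].fillna(0)                       # missing demand often means no orders recorded
df["forecast_qty"] = df["forecast_qty"].interpolate()               # forecasts change smoothly, so interpolate
df["inventory_on_hand"] = df["inventory_on_hand"].ffill()           # carry the last known stock position forward
df["unit_cost"] = df["unit_cost"].fillna(df["unit_cost"].median())  # costs: fall back to a robust central value

# Drop rows only when the fields that define the record's identity are missing.
df = df.dropna(subset=["date", "sku"])
print(df)
```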
Topics:
- Profiling missingness patterns and mechanisms
- Simple and advanced numeric imputations
- Imputing categorical and flag variables
- Column-specific rules for supply chain fields
- Criteria for dropping rows or time segments