Lesson 1
Schema Validation: Required Fields, Data Types, Date Parsing, and Timezone Handling
Learn how to define and enforce robust schemas for order-level data, checking required fields, data types, and date formats while correctly handling time zones, late-arriving data, and schema changes across the different sources feeding our Liberian systems.
Defining required order-level fields
Checking numeric and string data types
Parsing dates and timestamps safely
Standardizing time zones and offsets
Catching schema drift and evolution
Automated schema checks in pipelines

Lesson 2
Documenting Data Lineage and Assumptions for Reproducibility and Auditability
Learn how to document data lineage, business rules, and modeling assumptions for retail order pipelines, making them easy to reproduce, govern, and audit across teams, tools, and changing source systems in Liberia.
Capturing source-to-target mappings
Recording business transformation rules
Tracking metric definitions over time
Maintaining data dictionaries
Versioning pipelines and schemas
Audit trails for regulatory reviews

Lesson 3
Loading CSVs into Analytical Tools and Environment Setup (Excel, SQL, Python, R, BI Tools)
Gain practical skills for loading CSV order files into Excel, SQL databases, Python, R, and BI tools, configuring encodings, delimiters, data types, and project environments to ensure repeatable, scalable analytical work in Liberian retail.
Configuring CSV import options
Managing encodings and delimiters
Bulk loading into SQL warehouses
Python and R data ingestion scripts
Connecting BI tools to raw tables
Versioning and environment management

Lesson 4
Temporal Derivations: Extracting Date Parts, Rolling Windows, Fiscal Calendars, Week/Month Boundaries
Explore ways to derive temporal features from order timestamps, including calendar attributes, fiscal periods, rolling windows, and custom week or month boundaries that match Liberian retail trading patterns and reporting needs.
Extracting standard date parts
Building fiscal calendars and periods
Custom retail week and month boundaries
Rolling windows for KPIs
Lag and lead features for orders
Seasonality and holiday flags

Lesson 5
Data Partitioning and Sampling for Efficient Exploration and Reproducible Analysis
Learn how to partition and sample large retail order datasets for fast exploration, model building, and testing, while preserving temporal structure, seasonality, and key business segments for reproducible analysis in Liberia.
Partitioning by date and store
Train, validation, and test splits
Stratified sampling by segment
Downsampling and upsampling tactics
Creating reproducible random samples
Managing partitions in data warehouses

Lesson 6
Detecting and Handling Missing Values: Strategies and Imputation Specific to Transactional Data
Learn robust methods to detect, assess, and handle missing values in transactional retail data, choosing appropriate imputation or exclusion strategies that preserve revenue, quantity, and customer behavior signals without biasing analyses in our markets.
Profiling missingness patterns
MCAR, MAR, and MNAR in retail data
Imputing prices, discounts, and costs
Handling missing customer identifiers
Dealing with incomplete order lines
Documenting imputation decisions

Lesson 7
Outlier Detection and Treatment for Price, Quantity, Discount, and Revenue Fields
Learn to detect, diagnose, and treat outliers in price, quantity, discount, and revenue fields, distinguishing data errors from genuine extreme behavior to protect model stability and the accuracy of business reporting in Liberian retail.
Profiling distributions and extremes
Rule-based outlier thresholds
Statistical and robust detection methods
Separating errors from rare events
Capping, trimming, and winsorizing
Monitoring outliers over time

Lesson 8
Standardizing Categorical Fields: Region, Product_Category, Product_Subcategory, Marketing_Channel, Device_Type
Learn how to standardize key categorical attributes in retail orders so that regions, product groups, marketing channels, and device types are consistent, analyzable, and ready for segmentation, attribution, and performance reporting in Liberia.
Designing canonical code lists
Normalizing region and market labels
Standardizing product category hierarchies
Cleaning marketing_channel values
Harmonizing device_type and platform
Handling legacy and deprecated values

Lesson 9
Creating Derived Fields: Gross_Margin, Margin_Rate, Average_Order_Value, Unit_Cost, Order_Value Components
Master creating core financial and behavioral derived metrics from order data, including gross margin, margin rate, average order value, unit costs, and decomposed order value components that support profitability and pricing analysis in our shops.
Calculating gross_margin and net_revenue
Computing margin_rate and markups
Average_order_value and basket size
Unit_cost and unit_price derivations
Decomposing order_value components
Validating derived metric consistency