Lesson 1. Schema Validation: Required Fields, Data Types, Date Parsing, and Timezone Handling
Learn how to define and apply robust schemas for order-level data: checking required fields, data types, and date formats while reliably handling time zones, late-arriving data, and schema changes across source systems.
- Defining required order-level fields
- Checking numeric and text data types
- Safely parsing dates and timestamps
- Standardising time zones and offsets
- Detecting schema drift and evolution
- Automated schema checks in data pipelines

Lesson 2. Documenting Data Lineage and Assumptions for Reproducibility and Auditability
Learn to record data origins, business rules, and modelling assumptions for retail order processes so that analyses can be reproduced, governed, and audited across teams, tools, and changing source systems.
- Recording source-to-target mappings
- Documenting business transformation rules
- Tracking metric definitions over time
- Keeping data dictionaries up to date
- Versioning pipelines and schemas
- Audit trails for regulatory checks

Lesson 3. Loading CSVs into Analytical Tools and Environment Setup (Excel, SQL, Python, R, BI Tools)
Gain hands-on skills for importing CSV order files into Excel, SQL databases, Python, R, and business intelligence tools, configuring encodings, delimiters, data types, and project workspaces so that analysis workflows are repeatable, scalable, and workable in resource-limited settings.
- Configuring CSV import options
- Handling encodings and delimiters
- Bulk loading into SQL storage
- Python and R data import scripts
- Connecting BI tools to raw tables
- Versioning and workspace management

Lesson 4. Temporal Derivations: Extracting Date Parts, Rolling Windows, Fiscal Calendars, Week/Month Boundaries
Learn methods for deriving time features from order timestamps, including calendar attributes, fiscal periods, rolling windows, and custom week or month boundaries that match retail trading patterns and local reporting needs.
- Extracting standard date parts
- Building fiscal calendars and periods
- Custom retail week and month boundaries
- Rolling windows for key measures
- Lag and lead features for orders
- Seasonal patterns and holiday flags

Lesson 5. Data Partitioning and Sampling for Efficient Exploration and Reproducible Analysis
Learn to partition and sample large retail order datasets for fast exploration, model building, and testing while preserving time structure, seasonal patterns, and key business segments, so that experiments stay reproducible and save time and resources.
- Partitioning by date and store
- Training, validation, and test splits
- Stratified sampling by group
- Downsampling and upsampling methods
- Creating reproducible random samples
- Managing partitions in data storage

Lesson 6. Detecting and Handling Missing Values: Strategies and Imputation Specific to Transactional Data
Learn systematic ways to find, examine, and handle missing values in transactional retail data, choosing imputation or removal strategies that preserve revenue, quantity, and customer behaviour signals without skewing analyses.
- Examining missingness patterns
- MCAR, MAR, and MNAR in retail data
- Imputing prices, discounts, and costs
- Handling missing customer IDs
- Dealing with incomplete order lines
- Documenting imputation choices

Lesson 7. Outlier Detection and Treatment for Price, Quantity, Discount, and Revenue Fields
Learn to find, diagnose, and treat outliers in price, quantity, discount, and revenue fields, distinguishing data errors from genuine extreme behaviour to protect model stability and the accuracy of business reporting.
- Examining distributions and extremes
- Rule-based outlier thresholds
- Statistical and robust detection methods
- Separating errors from rare events
- Capping, trimming, and winsorising
- Monitoring outliers over time

Lesson 8. Standardising Categorical Fields: Region, Product Category, Product Subcategory, Marketing Channel, Device Type
Learn to standardise key categorical attributes in retail orders so that regions, product hierarchy levels, marketing channels, and device types are consistent, easy to analyse, and ready for grouping, joining, and performance reporting across diverse markets.
- Designing standard code lists
- Normalising region and market labels
- Standardising product category levels
- Cleaning marketing channel values
- Harmonising device type and platform
- Handling legacy and deprecated values

Lesson 9. Creating Derived Fields: Gross Margin, Margin Rate, Average Order Value, Unit Cost, Order Value Components
Learn to build core financial and behavioural derived measures from order data, including gross margin, margin rate, average order value, unit costs, and order value components that support profitability and pricing analysis.
- Calculating gross margin and net revenue
- Computing margin rate and markups
- Average order value and basket size
- Unit cost and unit price derivations
- Breaking down order value components
- Validating derived measure consistency
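
As a taste of the Lesson 9 material, the derived measures can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the `OrderLine` fields and the sample values are hypothetical, and it assumes net revenue is quantity times unit price minus an absolute line discount, gross margin is net revenue minus cost, margin rate is gross margin divided by net revenue, and average order value is net revenue divided by the number of distinct orders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrderLine:
    # Hypothetical order-line schema used only for this illustration.
    order_id: str
    quantity: int
    unit_price: float
    unit_cost: float
    discount: float = 0.0  # absolute discount applied to the whole line


def net_revenue(line: OrderLine) -> float:
    """Revenue after the line discount (assumed definition)."""
    return line.quantity * line.unit_price - line.discount


def gross_margin(line: OrderLine) -> float:
    """Net revenue minus the cost of the units sold."""
    return net_revenue(line) - line.quantity * line.unit_cost


def margin_rate(lines: list[OrderLine]) -> Optional[float]:
    """Total gross margin as a share of total net revenue."""
    total_revenue = sum(net_revenue(l) for l in lines)
    if total_revenue == 0:
        return None  # undefined when there is no revenue
    return sum(gross_margin(l) for l in lines) / total_revenue


def average_order_value(lines: list[OrderLine]) -> Optional[float]:
    """Total net revenue divided by the number of distinct orders."""
    order_ids = {l.order_id for l in lines}
    if not order_ids:
        return None  # undefined when there are no orders
    return sum(net_revenue(l) for l in lines) / len(order_ids)


# Made-up sample data: two lines on order A1, one line on order B7.
lines = [
    OrderLine("A1", quantity=2, unit_price=10.0, unit_cost=6.0, discount=2.0),
    OrderLine("A1", quantity=1, unit_price=5.0, unit_cost=3.0),
    OrderLine("B7", quantity=4, unit_price=2.5, unit_cost=1.0),
]

print(average_order_value(lines))  # 16.5 (33.0 net revenue over 2 orders)
print(margin_rate(lines))          # 14.0 / 33.0, roughly 0.424
```

Computing the margin rate from totals rather than averaging per-line rates keeps large and small lines correctly weighted, which is one of the consistency checks the last topic above refers to.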