Lesson 1Data validation rules: duplicates, referential integrity (customers/products), out-of-range values, negative prices/quantitiesSet up strong validation rules to ensure sales CSVs remain trustworthy. You'll spot duplicates, check referential integrity, and mark out-of-range or negative values before they spoil dashboards and further models.
Detecting duplicate orders and order linesChecking referential integrity keysValidating numeric ranges and thresholdsHandling negative prices and quantitiesBuilding reusable validation checklistsLesson 2Understanding column semantics: order_id, order_date, customer_id, customer_region, product_id, product_category, product_subcategory, quantity, unit_price, discount, revenue, cost, channelGrasp the meaning and purpose of key sales columns for dashboards. You'll map identifiers, dates, product details, quantities, and money fields, ensuring uniform meaning across models and visuals.
Order-level identifiers and grain selectionCustomer and region identification fieldsProduct, category, and subcategory rolesQuantity, unit_price, discount, and revenueCost, channel, and margin-related fieldsLesson 3Handling discounts and price calculations: recomputing revenue from unit_price, quantity, and discount and reconciling with reported revenueLearn to recalculate and verify revenue and price metrics accurately. You'll compute line revenue from unit price, quantity, and discount, match with reported totals, and flag mismatches for checking.
Revenue formulas from unit_price and quantityApplying percentage and absolute discountsReconciling computed and reported revenueDetecting inconsistent discount patternsDocumenting pricing and discount logicLesson 4Time-based transformations: extracting year, quarter, month, week, weekday, rolling windows, and fiscal calendarsMaster transforming order dates into useful time features for analysis. You'll extract calendar and fiscal details, create rolling windows, and prepare steady time fields for dashboards and time-series models.
Extracting year, quarter, month, and weekDeriving weekday and weekend indicatorsBuilding rolling and moving window metricsImplementing fiscal calendars and offsetsAligning time grains for dashboardsLesson 5Data cleaning transformations: trimming, case normalisation, standardising region and channel labelsDive into practical cleaning steps to make raw sales CSVs consistent and ready for analysis. You'll trim extra spaces, normalise case, and standardise region and channel labels to prevent duplicates and faulty dashboard filters.
Trimming whitespace and invisible charactersCase normalization for text dimensionsStandardizing region and channel taxonomiesMerging near-duplicate label variantsDocumenting cleaning rules for reuseLesson 6Derived metrics and transformations: profit = revenue - cost, profit_margin = profit / revenue, gross_margin, AOV = revenue / order_count, unit_total = quantity * unit_priceLearn to calculate essential sales metrics from raw CSV data. You'll figure profit, margins, AOV, and unit totals, making sure formulas are consistent, well-noted, and match business definitions.
Computing profit and gross marginCalculating profit_margin safelyDeriving AOV from revenue and ordersUnit totals from quantity and unit_priceAligning metrics with business definitionsLesson 7Techniques for reproducible ETL: documented steps, scripts, notebooks, and checksums for CSV import integrityDesign repeatable ETL pipelines for sales CSVs. You'll script changes, track versions, use notebooks for checking, and apply checksums plus validation to ensure import reliability over time.
Scripting repeatable CSV transformationsUsing notebooks for exploratory ETLVersioning ETL code and configurationChecksums and file integrity validationAutomated ETL runs and loggingLesson 8Missing values and null patterns: detection methods, imputation strategies, and when to drop rowsHandle missing or null values in sales CSVs expertly. You'll profile null patterns, pick imputation methods, decide row drops, and note assumptions to safeguard downstream metrics.
Profiling missingness across key columnsVisualizing null patterns and correlationsImputation strategies for numeric fieldsImputation strategies for categorical fieldsRules for safely dropping rows or columnsLesson 9Data types and parsing: date formats, numeric types, categorical encoding, handling string vs numeric valuesParse dates, numbers, and categories in sales CSVs correctly. You'll separate text from numeric fields, use locale-aware parsing, and create stable categorical encodings for repeated refreshes.
Detecting column data types in CSV importsParsing dates with multiple locale formatsHandling numeric separators and currency symbolsDesigning stable categorical encodingsConverting mixed-type columns safelyLesson 10Dealing with multi-line orders and aggregation at order vs order-line levelManage orders spanning multiple lines in sales CSVs. You'll differentiate order and order-line levels, aggregate properly, and avoid double-counting revenue, quantity, and discounts in dashboards.
Identifying order vs order-line grainAggregating revenue at order levelSummarizing discounts across linesAvoiding double counting in rollupsChoosing grain for dashboard metrics