Lesson 1Data validation rules: duplicates, referential integrity (customers/products), out-of-range values, negative prices/quantitiesSet up strong validation rules to keep your sales CSVs reliable. You'll spot duplicates, ensure links between customers and products hold up, and catch out-of-range or negative values before they mess up your dashboards and other models.
Detecting duplicate orders and order linesChecking referential integrity keysValidating numeric ranges and thresholdsHandling negative prices and quantitiesBuilding reusable validation checklistsLesson 2Understanding column semantics: order_id, order_date, customer_id, customer_region, product_id, product_category, product_subcategory, quantity, unit_price, discount, revenue, cost, channelGet clear on what each key sales column means for dashboards. You'll map out IDs, dates, product details, quantities, and money fields to keep meanings consistent across your models and charts.
Order-level identifiers and grain selectionCustomer and region identification fieldsProduct, category, and subcategory rolesQuantity, unit_price, discount, and revenueCost, channel, and margin-related fieldsLesson 3Handling discounts and price calculations: recomputing revenue from unit_price, quantity, and discount and reconciling with reported revenueLearn to recalculate and check revenue and price figures properly. You'll work out line revenue from unit price, quantity, and discount, match it against totals, and flag any issues for fixing.
Revenue formulas from unit_price and quantityApplying percentage and absolute discountsReconciling computed and reported revenueDetecting inconsistent discount patternsDocumenting pricing and discount logicLesson 4Time-based transformations: extracting year, quarter, month, week, weekday, rolling windows, and fiscal calendarsSee how to turn order dates into useful time features for analysis. You'll pull out calendar and fiscal details, set up rolling windows, and prep time fields ready for dashboards and time-series work.
Extracting year, quarter, month, and weekDeriving weekday and weekend indicatorsBuilding rolling and moving window metricsImplementing fiscal calendars and offsetsAligning time grains for dashboardsLesson 5Data cleaning transformations: trimming, case normalisation, standardising region and channel labelsDive into practical cleaning steps to make raw sales CSVs consistent and ready for analysis. You'll trim extra spaces, fix case issues, and standardise region and channel names to dodge duplicates and faulty filters.
Trimming whitespace and invisible charactersCase normalization for text dimensionsStandardizing region and channel taxonomiesMerging near-duplicate label variantsDocumenting cleaning rules for reuseLesson 6Derived metrics and transformations: profit = revenue - cost, profit_margin = profit / revenue, gross_margin, AOV = revenue / order_count, unit_total = quantity * unit_pricePick up how to calculate key sales metrics from raw CSV data. You'll figure profit, margins, AOV, and unit totals, making sure formulas match business needs and are well-noted.
Computing profit and gross marginCalculating profit_margin safelyDeriving AOV from revenue and ordersUnit totals from quantity and unit_priceAligning metrics with business definitionsLesson 7Techniques for reproducible ETL: documented steps, scripts, notebooks, and checksums for CSV import integrityFind out how to build repeatable ETL pipelines for sales CSVs. You'll script changes, track versions, use notebooks for checks, and add checksums plus validation to ensure imports stay solid over time.
Scripting repeatable CSV transformationsUsing notebooks for exploratory ETLVersioning ETL code and configurationChecksums and file integrity validationAutomated ETL runs and loggingLesson 8Missing values and null patterns: detection methods, imputation strategies, and when to drop rowsGet the hang of spotting and fixing missing or null values in sales CSVs. You'll check null patterns, pick imputation methods, decide on dropping rows, and note assumptions to safeguard your metrics.
Profiling missingness across key columnsVisualizing null patterns and correlationsImputation strategies for numeric fieldsImputation strategies for categorical fieldsRules for safely dropping rows or columnsLesson 9Data types and parsing: date formats, numeric types, categorical encoding, handling string vs numeric valuesLearn to parse dates, numbers, and categories right in sales CSVs. You'll tell text from numbers, use locale-smart parsing, and set up stable categorical codes that hold up on refreshes.
Detecting column data types in CSV importsParsing dates with multiple locale formatsHandling numeric separators and currency symbolsDesigning stable categorical encodingsConverting mixed-type columns safelyLesson 10Dealing with multi-line orders and aggregation at order vs order-line levelHandle orders spread over multiple lines in sales CSVs. You'll separate order from line level, aggregate properly, and avoid counting revenue, quantity, or discounts twice in dashboards.
Identifying order vs order-line grainAggregating revenue at order levelSummarizing discounts across linesAvoiding double counting in rollupsChoosing grain for dashboard metrics