Lesson 1Data validation rules: duplicates, referential integrity (customers/products), out-of-range values, negative prices/quantitiesSet up strong validation rules to ensure sales CSVs remain reliable. You'll spot duplicates, maintain referential integrity, and identify out-of-range or negative values to prevent errors in dashboards and further analyses.
Spotting duplicate orders and linesVerifying referential integrity keysChecking numeric ranges and limitsManaging negative prices and quantitiesCreating reusable validation listsLesson 2Understanding column semantics: order_id, order_date, customer_id, customer_region, product_id, product_category, product_subcategory, quantity, unit_price, discount, revenue, cost, channelGrasp the meaning and purpose of essential sales columns in dashboards. You'll map identifiers, dates, product details, quantities, and financial fields, ensuring uniform semantics across models and visuals.
Order identifiers and grain choiceCustomer and region fieldsProduct, category, subcategory rolesQuantity, unit price, discount, revenueCost, channel, margin fieldsLesson 3Handling discounts and price calculations: recomputing revenue from unit_price, quantity, and discount and reconciling with reported revenueLearn to recalculate and verify revenue and pricing metrics. You'll compute line revenue using unit price, quantity, and discount, match it with reported totals, and highlight discrepancies for checking.
Revenue formulas from unit price and quantityApplying percentage and fixed discountsMatching calculated and reported revenueSpotting irregular discount patternsRecording pricing and discount rulesLesson 4Time-based transformations: extracting year, quarter, month, week, weekday, rolling windows, and fiscal calendarsTransform order dates into detailed time features for analysis. You'll extract calendar and fiscal details, create rolling windows, and prepare consistent time fields for dashboards and time-series work.
Extracting year, quarter, month, weekDeriving weekday and weekend markersCreating rolling and moving metricsSetting up fiscal calendars and shiftsMatching time levels for dashboardsLesson 5Data cleaning transformations: trimming, case normalization, standardizing region and channel labelsApply practical cleaning to make raw sales CSVs consistent and ready for analysis. You'll remove extra spaces, standardise case, and unify region and channel labels to prevent duplicates and filter issues.
Removing whitespace and hidden charactersStandardising case for text fieldsUnifying region and channel categoriesCombining similar label variationsRecording cleaning rules for reuseLesson 6Derived metrics and transformations: profit = revenue - cost, profit_margin = profit / revenue, gross_margin, AOV = revenue / order_count, unit_total = quantity * unit_priceDerive important sales metrics from basic CSV data. You'll calculate profit, margins, AOV, and unit totals, ensuring formulas are consistent, documented, and match business standards.
Calculating profit and gross marginSafely computing profit marginDeriving AOV from revenue and ordersUnit totals from quantity and priceMatching metrics to business needsLesson 7Techniques for reproducible ETL: documented steps, scripts, notebooks, and checksums for CSV import integrityDesign repeatable ETL pipelines for sales CSVs. You'll script changes, track versions, use notebooks for testing, and apply checksums and validations to ensure data integrity over time.
Scripting repeatable CSV changesUsing notebooks for ETL explorationVersioning ETL code and settingsChecksums for file validationAutomated ETL runs and recordsLesson 8Missing values and null patterns: detection methods, imputation strategies, and when to drop rowsHandle missing or null values in sales CSVs effectively. You'll profile patterns, select imputation methods, decide on dropping rows, and document choices to safeguard downstream metrics.
Profiling missing data in columnsVisualising null patterns and linksImputation for numeric fieldsImputation for category fieldsGuidelines for dropping rows safelyLesson 9Data types and parsing: date formats, numeric types, categorical encoding, handling string vs numeric valuesParse dates, numbers, and categories correctly in sales CSVs. You'll separate text from numbers, use locale-sensitive parsing, and create stable categorical codes for ongoing refreshes.
Detecting data types in CSV loadsParsing dates in various formatsHandling numeric separators and symbolsCreating stable category codesSafely converting mixed columnsLesson 10Dealing with multi-line orders and aggregation at order vs order-line levelManage orders spanning multiple lines in sales CSVs. You'll differentiate order and line levels, aggregate properly, and prevent double-counting revenue, quantity, and discounts in dashboards.
Distinguishing order vs line levelsAggregating revenue at order levelSumming discounts across linesAvoiding double counts in summariesSelecting level for dashboard metrics