Lesson 1
Design of Transaction Tables: Orders, Order Items, Returns, Lifetime Value Signals and Field Choices
Learn how to design core transaction tables that capture orders, line items, returns, and lifetime value signals. We discuss key fields, normalisation choices, and how to support downstream analytics and recommendation workloads for local businesses.
Order header vs line item schema design
Modelling returns, refunds, and cancellations
Capturing discounts, coupons, and taxes
Storing lifetime value and margin signals
Keys, indexes, and partitioning choices

Lesson 2
Handling Noisy and Sparse Behavioural Data: Sessionisation, Bot Filtering, Deduplication, Event Weighting
Explore techniques to clean noisy behavioural logs and make sparse data usable. You will learn sessionisation rules, bot and scraper filtering, deduplication logic, and event weighting strategies tailored to recommendation training in varied data scenarios.
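As an illustration of the sessionisation rules this lesson refers to, here is a minimal sketch of timeout-based sessionisation. The function name `sessionise` and the 30-minute inactivity timeout are assumptions chosen for the example, not values taken from the course.

```python
from datetime import datetime, timedelta

def sessionise(timestamps, timeout=timedelta(minutes=30)):
    """Group one user's event timestamps into sessions split on inactivity gaps.

    The 30-minute default is an assumed, commonly used value.
    """
    sessions = []
    for ts in sorted(timestamps):
        # Start a new session when none exists yet or the gap exceeds the timeout.
        if not sessions or ts - sessions[-1][-1] > timeout:
            sessions.append([ts])
        else:
            sessions[-1].append(ts)
    return sessions

# Hypothetical events for a single user.
events = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 10),
    datetime(2024, 1, 1, 10, 30),  # 80 minutes after the previous event
]
sessions = sessionise(events)
```

With these sample events the first two fall into one session and the third starts a new one. A real pipeline would apply the rule per user, and possibly per device.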
Sessionisation rules and timeouts
Detecting and filtering bots and scrapers
Click, view, and purchase deduplication
Event weighting for model training
Handling sparse users and cold starts

Lesson 3
Design of Product Catalogue Table: Product ID, Title, Category Hierarchy, Attributes, Price, Brand, Stock, Images, Canonical Text, Embeddings
Learn how to structure a product catalogue table that supports fast retrieval and rich recommendations. We cover identifiers, attributes, pricing, stock, media, canonical text, and embeddings, plus strategies for updates and denormalisation in local catalogues.
Stable product and variant identifiers
Category hierarchy and attributes
Price, stock, and availability fields
Images, media, and canonical text
Storing and updating item embeddings

Lesson 4
Feature Engineering Principles for Recommendations: Recency, Frequency, Monetary, Item Popularity, Category Affinity, User Embeddings
Discover core feature engineering principles for recommender systems. We detail recency, frequency, monetary value, popularity, category affinity, and user embeddings, including aggregation windows and leakage-safe computation patterns for practical use.
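To make the idea of leakage-safe computation concrete, the sketch below derives recency, frequency, and monetary features for one user from only the orders placed strictly before a cutoff time. The function name `rfm_features`, the `as_of` parameter, and the sample orders are assumptions made for illustration.

```python
from datetime import datetime

def rfm_features(orders, as_of):
    """Compute RFM features for one user.

    orders: list of (order_time, amount) tuples.
    Only orders strictly before `as_of` are used, so no future
    information leaks into the features.
    """
    past = [(t, amount) for t, amount in orders if t < as_of]
    if not past:
        return {"recency_days": None, "frequency": 0, "monetary": 0.0}
    last_order = max(t for t, _ in past)
    return {
        "recency_days": (as_of - last_order).days,
        "frequency": len(past),
        "monetary": sum(amount for _, amount in past),
    }

# Hypothetical order history for a single user.
orders = [
    (datetime(2024, 1, 1), 20.0),
    (datetime(2024, 1, 10), 35.0),
    (datetime(2024, 2, 1), 50.0),  # after the cutoff, so it is ignored
]
features = rfm_features(orders, as_of=datetime(2024, 1, 15))
```

Here the February order is excluded because it falls after the cutoff. Applying the same cutoff to every feature computed for a training example is one way to keep the labels' future out of the inputs.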
Recency, frequency, and monetary features
Item and category popularity signals
User–category and brand affinity scores
Sequence-based and session features
User and item embedding generation

Lesson 5
Auxiliary Datasets: Item Metadata, Taxonomy, Promotions, Content (Descriptions), Supplier Data
Understand how auxiliary datasets enrich recommendations beyond raw clicks and orders. We cover item metadata, taxonomy, promotions, content, and supplier feeds, plus how to keep them consistent, versioned, and joinable at scale in regional supply chains.
Designing item metadata schemas
Maintaining product taxonomy hierarchies
Modelling promotions and price rules
Storing rich content and descriptions
Integrating supplier and feed data

Lesson 6
Data Cleaning and Imputation Strategies: Missing Attributes, Price Anomalies, Invalid Timestamps
Learn practical data cleaning and imputation methods for e-commerce. We address missing attributes, anomalous prices, invalid timestamps, and inconsistent currencies, focusing on rules, heuristics, and impact on recommendation quality in local currencies.
Detecting and fixing missing attributes
Handling outlier and zero prices
Correcting invalid or noisy timestamps
Currency, tax, and unit normalisation
Documenting cleaning rules and impacts

Lesson 7
Design of Event Stream and Interaction Table: Event ID, User ID/Session ID, Event Type, Product ID, Timestamp, Context (Referrer, Page Type), Device, Event Value
Design a unified interaction table and event stream that captures user behaviour across channels. Learn event schemas, identifiers, context fields, and how to support both real-time streaming and offline batch recommendation pipelines for diverse devices.
Choosing event and user identifiers
Modelling event types and properties
Capturing context, device, and referrer
Event time, ingestion time, and ordering
Streaming vs batch storage patterns

Lesson 8
Design of User Profiles Table: Essential Fields (User ID, Signup Timestamp, Email Hash, Demographics, Lifecycle Stage, Segments, Opt-In Flags) and Rationale
Design a user profiles table that balances personalisation power with privacy and compliance. We cover essential fields, lifecycle and segments, opt-in flags, hashing sensitive data, and how profiles feed recommendation models in privacy-conscious settings.
Core identifiers and signup metadata
Demographics and lifecycle stages
Behavioural and marketing segments
Consent, opt-in, and preference flags
Privacy, hashing, and retention rules
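
As a small illustration of the email hash field listed for the user profiles table, the following sketch stores a keyed hash of a normalised email address instead of the raw value. The choice of HMAC-SHA256, the `SECRET_KEY` placeholder, and the function name `email_hash` are assumptions for this example; the course itself may use a different scheme.

```python
import hashlib
import hmac

# Hypothetical placeholder; in practice the key would come from a secrets manager.
SECRET_KEY = b"replace-with-a-managed-secret"

def email_hash(email, key=SECRET_KEY):
    """Return a keyed SHA-256 hash of a normalised email address."""
    # Normalise so that case and surrounding whitespace do not change the hash.
    normalised = email.strip().lower()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()

h1 = email_hash("  Alice@Example.com ")
h2 = email_hash("alice@example.com")
```

Both calls produce the same 64-character hex digest, so the hash can still serve as a join key across datasets. A keyed hash is used here because an unkeyed hash of an email address is easy to reverse by hashing lists of known addresses.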