Lesson 1
Design of transactional tables: orders, order_items, returns, lifetime_value signals and field choices
Learn to design the core transactional tables that record orders, line items, refunds, and customer value indicators. We cover the essential fields, schema design options, and how these tables support analysis and recommendation tasks.
Order header vs line item schema design
Modeling returns, refunds, and cancellations
Capturing discounts, coupons, and taxes
Storing lifetime value and margin signals
Keys, indexes, and partitioning choices

Lesson 2
Handling noisy and sparse behavioural data: sessionisation, bot filtering, deduplication, event weighting
Explore ways to clean noisy behavioural records and make use of sparse data. You will cover sessionisation rules, bot filtering, deduplication, and event weighting tailored to recommendation data preparation.
Sessionization rules and timeouts
Detecting and filtering bots and scrapers
Click, view, and purchase deduplication
Event weighting for model training
Handling sparse users and cold starts

Lesson 3
Design of product catalogue table: product_id, title, category hierarchy, attributes, price, brand, stock, images, canonical_text, embeddings
Understand how to structure a product catalogue table for fast lookup and rich recommendations. We discuss identifiers, attributes, prices, availability, images, canonical text, and embeddings, along with update and simplification tactics.
Stable product and variant identifiers
Category hierarchy and attributes
Price, stock, and availability fields
Images, media, and canonical text
Storing and updating item embeddings

Lesson 4
Feature engineering principles for recommendations: recency, frequency, monetary, item popularity, category affinity, user embeddings
Learn the core feature engineering principles for recommendation systems. We explain recency, frequency, monetary value, item popularity, category affinity, and user embeddings, covering aggregation periods and safe computation methods.
Recency, frequency, and monetary features
Item and category popularity signals
User–category and brand affinity scores
Sequence‑based and session features
User and item embedding generation

Lesson 5
Auxiliary datasets: item metadata, taxonomy, promotions, content (descriptions), supplier data
Understand how auxiliary datasets enrich recommendations beyond basic interactions. We cover item metadata, taxonomies, promotions, descriptions, and supplier data, plus keeping these datasets consistent, versioned, and joinable at scale.
Designing item metadata schemas
Maintaining product taxonomy hierarchies
Modeling promotions and price rules
Storing rich content and descriptions
Integrating supplier and feed data

Lesson 6
Data cleaning and imputation strategies: missing attributes, price anomalies, invalid timestamps
Learn hands-on data cleaning and imputation techniques for e-commerce. We tackle missing attributes, anomalous prices, invalid timestamps, and currency mismatches, emphasising rules, heuristics, and their effects on recommendation quality.
Detecting and fixing missing attributes
Handling outlier and zero prices
Correcting invalid or noisy timestamps
Currency, tax, and unit normalization
Documenting cleaning rules and impacts

Lesson 7
Design of event stream and interaction table: event_id, user_id/session_id, event_type, product_id, timestamp, context (referrer, page_type), device, event_value
Design a unified interaction table and event stream that capture user actions across platforms. Learn about event schemas, identifiers, context fields, and how to support both real-time streams and batch recommendation processes.
Choosing event and user identifiers
Modeling event types and properties
Capturing context, device, and referrer
Event time, ingestion time, and ordering
Streaming vs batch storage patterns

Lesson 8
Design of user profiles table: essential fields (user_id, signup_ts, email_hash, demographics, lifecycle stage, segments, opt-in flags) and rationale
Build a user profiles table that balances personalisation with privacy requirements. We cover key fields, lifecycle stages, segments, consent flags, data protection, and how profiles feed recommendation models.
Core identifiers and signup metadata
Demographics and lifecycle stages
Behavioral and marketing segments
Consent, opt‑in, and preference flags
Privacy, hashing, and retention rules
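
As a rough illustration of the Lesson 8 fields, the sketch below models a user profile record in Python. The field names user_id, signup_ts, and email_hash come from the lesson title; everything else (the default values, the two separate opt-in flags, the age_band field, the optional salt, and the helper names hash_email and can_personalise) is a hypothetical choice made for this example, not a prescribed schema.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


def hash_email(email: str, salt: str = "") -> str:
    """Return a SHA-256 hex digest of the trimmed, lower-cased email.

    The optional salt is an assumption of this sketch; a real system
    would manage it as a secret.
    """
    normalised = email.strip().lower()
    return hashlib.sha256((salt + normalised).encode("utf-8")).hexdigest()


@dataclass
class UserProfile:
    # Core identifiers and signup metadata
    user_id: str
    signup_ts: datetime
    email_hash: str                      # store the hash, never the raw email
    # Demographics and lifecycle stage (values here are illustrative)
    age_band: Optional[str] = None
    lifecycle_stage: str = "new"
    # Behavioural and marketing segments
    segments: List[str] = field(default_factory=list)
    # Consent and opt-in flags default to False (no consent assumed)
    marketing_opt_in: bool = False
    personalisation_opt_in: bool = False


def can_personalise(profile: UserProfile) -> bool:
    """Gate recommendation features on explicit personalisation consent."""
    return profile.personalisation_opt_in


profile = UserProfile(
    user_id="u-001",
    signup_ts=datetime(2024, 1, 15, tzinfo=timezone.utc),
    email_hash=hash_email("  Alice@Example.com "),
)
```

Normalising the email before hashing means the same address always maps to the same hash regardless of casing or stray whitespace, which keeps joins on email_hash stable.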