Lesson 1
Design of transactional tables: orders, order_items, returns, lifetime_value signals and field choices
Learn to design the core purchase tables for orders, line items, refunds, and customer value indicators. The lesson covers the essential fields, how the data is organised, and how the tables support analysis and recommendation tasks.
Order header vs line item schema design
Modeling returns, refunds, and cancellations
Capturing discounts, coupons, and taxes
Storing lifetime value and margin signals
Keys, indexes, and partitioning choices

Lesson 2
Handling noisy and sparse behavioural data: sessionisation, bot filtering, deduplication, event weighting
Turn noisy, sparse user activity logs into usable training data. Learn sessionisation, bot filtering, deduplication, and event weighting for effective recommendation training.
Sessionization rules and timeouts
Detecting and filtering bots and scrapers
Click, view, and purchase deduplication
Event weighting for model training
Handling sparse users and cold starts

Lesson 3
Design of product catalogue table: product_id, title, category hierarchy, attributes, price, brand, stock, images, canonical_text, embeddings
Structure a product catalogue table for fast lookup and good recommendations. Covers identifiers, attributes, prices, stock, images, canonical text, and embeddings, with update and simplification tips.
Stable product and variant identifiers
Category hierarchy and attributes
Price, stock, and availability fields
Images, media, and canonical text
Storing and updating item embeddings

Lesson 4
Feature engineering principles for recommendations: recency, frequency, monetary, item popularity, category affinity, user embeddings
Core feature engineering for recommenders: recency, frequency, monetary value, item popularity, category affinity, and user embeddings, with safe timing and computation methods.
Recency, frequency, and monetary features
Item and category popularity signals
User–category and brand affinity scores
Sequence‑based and session features
User and item embedding generation

Lesson 5
Auxiliary datasets: item metadata, taxonomy, promotions, content (descriptions), supplier data
Enrich recommendations with auxiliary data such as item metadata, taxonomy, promotions, descriptions, and supplier data. Keep these datasets synced, tracked, and ready for large-scale linking.
Designing item metadata schemas
Maintaining product taxonomy hierarchies
Modeling promotions and price rules
Storing rich content and descriptions
Integrating supplier and feed data

Lesson 6
Data cleaning and imputation strategies: missing attributes, price anomalies, invalid timestamps
Practical fixes for e‑commerce data issues such as missing attributes, price anomalies, invalid timestamps, and currency mismatches, using rules and checks to safeguard recommendation performance.
Detecting and fixing missing attributes
Handling outlier and zero prices
Correcting invalid or noisy timestamps
Currency, tax, and unit normalization
Documenting cleaning rules and impacts

Lesson 7
Design of event stream and interaction table: event_id, user_id/session_id, event_type, product_id, timestamp, context (referrer, page_type), device, event_value
Build a central activity table and stream for user actions across platforms. Covers event details, IDs, context, and support for live and batch recommendation processes.
Choosing event and user identifiers
Modeling event types and properties
Capturing context, device, and referrer
Event time, ingestion time, and ordering
Streaming vs batch storage patterns

Lesson 8
Design of user profiles table: essential fields (user_id, signup_ts, email_hash, demographics, lifecycle stage, segments, opt‑in flags) and rationale
Design user profile tables that balance personalisation with privacy rules. Covers key fields, lifecycle stages, segments, consent flags, secure hashing, and links to recommendation engines.
Core identifiers and signup metadata
Demographics and lifecycle stages
Behavioral and marketing segments
Consent, opt‑in, and preference flags
Privacy, hashing, and retention rules