Lesson 1
Design of transactional tables: orders, order_items, returns, lifetime_value signals, and field choices
Learn how to design core transactional tables that capture orders, line items, returns, and lifetime value signals. We discuss key fields, normalization choices, and how to support downstream analytics and recommendation workloads.
Order header vs line item schema design
Modeling returns, refunds, and cancellations
Capturing discounts, coupons, and taxes
Storing lifetime value and margin signals
Keys, indexes, and partitioning choices

Lesson 2
Handling noisy and sparse behavioral data: sessionization, bot filtering, deduplication, event weighting
Explore techniques to clean noisy behavioral logs and make sparse data usable. You will learn sessionization rules, bot and scraper filtering, deduplication logic, and event weighting strategies tailored to recommendation model training.
Sessionization rules and timeouts
Detecting and filtering bots and scrapers
Click, view, and purchase deduplication
Event weighting for model training
Handling sparse users and cold starts

Lesson 3
Design of product catalog table: product_id, title, category hierarchy, attributes, price, brand, stock, images, canonical_text, embeddings
Learn how to structure a product catalog table that supports fast retrieval and rich recommendations. We cover identifiers, attributes, pricing, stock, media, canonical text, and embeddings, plus update and denormalization strategies for efficiency.
Stable product and variant identifiers
Category hierarchy and attributes
Price, stock, and availability fields
Images, media, and canonical text
Storing and updating item embeddings

Lesson 4
Feature engineering principles for recommendations: recency, frequency, monetary, item popularity, category affinity, user embeddings
Discover core feature engineering principles for recommender systems. We detail recency, frequency, monetary value, popularity, category affinity, and user embeddings, including aggregation windows and leakage-safe computation patterns.
Recency, frequency, and monetary features
Item and category popularity signals
User–category and brand affinity scores
Sequence-based and session features
User and item embedding generation

Lesson 5
Auxiliary datasets: item metadata, taxonomy, promotions, content (descriptions), supplier data
Understand how auxiliary datasets enrich recommendations beyond raw clicks and orders. We cover item metadata, taxonomy, promotions, content, and supplier feeds, plus how to keep them consistent, versioned, and joinable at scale.
Designing item metadata schemas
Maintaining product taxonomy hierarchies
Modeling promotions and price rules
Storing rich content and descriptions
Integrating supplier and feed data

Lesson 6
Data cleaning and imputation strategies: missing attributes, price anomalies, invalid timestamps
Learn practical data cleaning and imputation methods for e-commerce. We address missing attributes, anomalous prices, invalid timestamps, and inconsistent currencies, focusing on rules, heuristics, and their impact on recommendation quality.
Detecting and fixing missing attributes
Handling outlier and zero prices
Correcting invalid or noisy timestamps
Currency, tax, and unit normalization
Documenting cleaning rules and impacts

Lesson 7
Design of event stream and interaction table: event_id, user_id/session_id, event_type, product_id, timestamp, context (referrer, page_type), device, event_value
Design a unified interaction table and event stream that captures user behavior across channels. Learn event schemas, identifiers, context fields, and how to support both real-time streaming and offline batch recommendation pipelines.
Choosing event and user identifiers
Modeling event types and properties
Capturing context, device, and referrer
Event time, ingestion time, and ordering
Streaming vs batch storage patterns

Lesson 8
Design of user profiles table: essential fields (user_id, signup_ts, email_hash, demographics, lifecycle stage, segments, opt-in flags) and rationale
Design a user profiles table that balances personalization power with privacy and compliance. We cover essential fields, lifecycle stages and segments, opt-in flags, hashing of sensitive data, and how profiles feed recommendation models.
Core identifiers and signup metadata
Demographics and lifecycle stages
Behavioral and marketing segments
Consent, opt-in, and preference flags
Privacy, hashing, and retention rules
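
As a rough illustration of the Lesson 8 fields, here is a minimal Python sketch of a user profile record that stores a hashed email instead of the raw address. The field names follow the lesson title; everything else is an assumption made for this example, including the keyed SHA-256 hash, the email normalization, the `may_personalize` helper, the flag names, and the sample values.

```python
import hashlib
import hmac
from dataclasses import dataclass, field
from datetime import datetime, timezone


def hash_email(email: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest of a normalized email address.

    Normalization (strip + lowercase) and the keyed hash are assumptions
    for this sketch, not rules taken from the lesson.
    """
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


@dataclass
class UserProfile:
    user_id: str
    signup_ts: datetime
    email_hash: str                       # never store the raw email here
    lifecycle_stage: str = "new"          # hypothetical stage label
    demographics: dict[str, str] = field(default_factory=dict)
    segments: set[str] = field(default_factory=set)
    opt_in_flags: dict[str, bool] = field(default_factory=dict)

    def may_personalize(self) -> bool:
        """Treat a missing consent flag as 'not opted in'."""
        return self.opt_in_flags.get("personalization", False)


# Example record; the key and all values are placeholders.
profile = UserProfile(
    user_id="u_123",
    signup_ts=datetime(2024, 1, 15, tzinfo=timezone.utc),
    email_hash=hash_email("  Alice@Example.com ", secret_key=b"demo-key"),
    segments={"newsletter"},
    opt_in_flags={"personalization": True, "marketing_email": False},
)
```

Defaulting a missing flag to `False` keeps personalization opt-in rather than opt-out, which is one common way to align the table with consent requirements. A real deployment would load the secret key from a secrets store rather than hard-coding it.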