Lesson 1: Designing Transaction Tables: Orders, Order Items, Returns, Lifetime Value Signals, and Field Choices
Learn to build the core transaction tables that capture orders, line items, returns, and lifetime value signals. We discuss key fields, normalization choices, and how to support downstream analytics and recommendation tasks.
Order header vs line item schema design
Modeling returns, refunds, and cancellations
Capturing discounts, coupons, and taxes
Storing lifetime value and margin signals
Keys, indexes, and partitioning choices

Lesson 2: Handling Noisy and Sparse Behavioral Data: Sessionization, Bot Filtering, Deduplication, Event Weighting
Explore ways to clean noisy behavioral logs and make sparse data useful. You will learn sessionization rules, bot and scraper filtering, deduplication logic, and event weighting schemes suited to recommendation training in high-traffic marketplaces.
Sessionization rules and timeouts
Detecting and filtering bots and scrapers
Click, view, and purchase deduplication
Event weighting for model training
Handling sparse users and cold starts

Lesson 3: Designing the Product Catalog Table: Product ID, Title, Category Hierarchy, Attributes, Price, Brand, Stock, Images, Canonical Text, Embeddings
Learn to structure a product catalog table for fast retrieval and rich recommendations. We cover identifiers, attributes, prices, stock, media, canonical text, and embeddings, plus update and denormalization strategies for fast access.
Stable product and variant identifiers
Category hierarchy and attributes
Price, stock, and availability fields
Images, media, and canonical text
Storing and updating item embeddings

Lesson 4: Feature Engineering for Recommendations: Recency, Frequency, Monetary Value, Item Popularity, Category Affinity, User Embeddings
Explore the main feature engineering patterns for recommender systems. We detail recency, frequency, monetary value, popularity, category affinity, and user embeddings, with aggregation windows and leakage-safe computation patterns for trustworthy results.
Recency, frequency, and monetary features
Item and category popularity signals
User–category and brand affinity scores
Sequence-based and session features
User and item embedding generation

Lesson 5: Auxiliary Datasets: Item Metadata, Category Taxonomy, Promotions, Content (Descriptions), Supplier Information
Understand how auxiliary datasets improve recommendations beyond clicks and orders alone. We cover item metadata, category taxonomy, promotions, content, and supplier feeds, plus how to keep them consistent, versioned, and joinable at scale.
Designing item metadata schemas
Maintaining product taxonomy hierarchies
Modeling promotions and price rules
Storing rich content and descriptions
Integrating supplier and feed data

Lesson 6: Data Cleaning and Imputation Strategies: Missing Attributes, Price Anomalies, Invalid Timestamps
Learn practical data cleaning and imputation methods for e-commerce. We handle missing attributes, anomalous prices, invalid timestamps, and inconsistent currencies, focusing on rules, heuristics, and their effect on recommendation quality.
Detecting and fixing missing attributes
Handling outlier and zero prices
Correcting invalid or noisy timestamps
Currency, tax, and unit normalization
Documenting cleaning rules and impacts

Lesson 7: Designing the Event Stream and Interaction Table: Event ID, User ID/Session ID, Event Type, Product ID, Timestamp, Context (Referrer, Page Type), Device, Event Value
Build a unified interaction table and event stream that capture user actions across channels. Learn event schemas, identifiers, context fields, and how to support both real-time streaming and offline batch recommendation pipelines.
Choosing event and user identifiers
Modeling event types and properties
Capturing context, device, and referrer
Event time, ingestion time, and ordering
Streaming vs batch storage patterns

Lesson 8: Designing the User Profiles Table: Key Fields (User ID, Signup Time, Email Hash, Demographics, Lifecycle Stage, Segments, Opt-In Flags) and Rationale
Build a user profiles table that balances personalization power with privacy and compliance. We cover key fields, lifecycle stages and segments, opt-in flags, hashing of sensitive data, and how profiles feed recommendation models safely.
Core identifiers and signup metadata
Demographics and lifecycle stages
Behavioral and marketing segments
Consent, opt-in, and preference flags
Privacy, hashing, and retention rules
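
To make the Lesson 8 ideas of hashing sensitive data and storing opt-in flags concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the course's own schema: the `UserProfile` fields, the `hash_email` and `build_profile` helpers, and the use of a keyed HMAC-SHA-256 with a secret key are all hypothetical choices made for this example. A keyed hash is used because an unkeyed hash of an email address can be reversed by hashing candidate addresses.

```python
import hashlib
import hmac
from dataclasses import dataclass
from datetime import datetime, timezone


def hash_email(email: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 hex digest of the trimmed, lowercased email.

    secret_key is a hypothetical application secret kept outside the table.
    """
    normalized = email.strip().lower()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


@dataclass(frozen=True)
class UserProfile:
    # Field names are assumptions chosen to mirror the lesson's key fields.
    user_id: str
    signup_time: datetime
    email_hash: str
    lifecycle_stage: str
    marketing_opt_in: bool


def build_profile(
    user_id: str,
    email: str,
    secret_key: bytes,
    lifecycle_stage: str = "new",
    marketing_opt_in: bool = False,
) -> UserProfile:
    """Build a profile row that stores only the hashed email, never the raw one."""
    return UserProfile(
        user_id=user_id,
        signup_time=datetime.now(timezone.utc),
        email_hash=hash_email(email, secret_key),
        lifecycle_stage=lifecycle_stage,
        marketing_opt_in=marketing_opt_in,
    )
```

Normalizing the address before hashing means the same user maps to the same hash regardless of casing or stray whitespace, which keeps the hash usable as a join key. Defaulting `marketing_opt_in` to `False` reflects the choice that consent must be recorded explicitly.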