Lesson 1
Dimensionality reduction and selection: mutual information, correlation filtering, recursive feature elimination, stability selection
We examine methods to reduce feature dimensionality and select robust predictors, using mutual information, correlation filtering, recursive feature elimination, and stability selection tailored to noisy, high-frequency equity features.

Mutual information for feature ranking
Correlation and redundancy filtering
Recursive feature elimination workflows
Stability selection across time splits
Balancing parsimony and performance

Lesson 2
Cross-sectional and universe features: z-scores, rank transforms, winsorization, normalization (demeaning vs scaling)
We design cross-sectional and universe-level transformations, including z-scores, ranks, winsorization, and normalization, to make features comparable across stocks and robust to outliers and scale differences.
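
As a rough illustration of the cross-sectional transformations named in Lesson 2, the sketch below winsorizes one date's feature values at chosen percentiles and then z-scores them. The function name `winsorize_zscore`, the percentile defaults, and the sample values are assumptions made for illustration; they are not taken from the course material.

```python
import numpy as np

def winsorize_zscore(values, lower=1.0, upper=99.0):
    """Clip one cross-section to [lower, upper] percentiles, then z-score it.

    `values` holds one feature for all stocks on a single date.
    NaNs are ignored when computing statistics and stay NaN in the output.
    """
    x = np.asarray(values, dtype=float)
    lo, hi = np.nanpercentile(x, [lower, upper])
    clipped = np.clip(x, lo, hi)          # winsorization: cap the tails
    mu = np.nanmean(clipped)              # demeaning across the universe
    sd = np.nanstd(clipped)               # scaling by cross-sectional dispersion
    if sd == 0:
        return clipped - mu               # degenerate cross-section: demean only
    return (clipped - mu) / sd

# Hypothetical cross-section with one extreme outlier.
raw = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 1000.0])
z = winsorize_zscore(raw, lower=10.0, upper=90.0)
```

Applying the function date by date (rather than over the whole panel) keeps each day's scores comparable across stocks without borrowing information from other dates.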

Cross-sectional z-scoring of features
Rank transforms and percentile scores
Winsorization and outlier handling
Demeaning by sector or universe
Scaling for heterogeneous price levels

Lesson 3
Feature construction sanity checks: leakage avoidance, lookahead prevention, timestamp hygiene
This section focuses on sanity checks that prevent data leakage in feature construction, including correct lagging, lookahead prevention, timestamp alignment, and ensuring that only information available at prediction time is used.

Identifying and preventing label leakage
Correct lagging of features and targets
Timestamp alignment and market calendars
Handling corporate actions and revisions
Backtest-time vs trade-time information

Lesson 4
Momentum and mean-reversion signals: short-window momentum, RSI, and interpretation for 1–10 day horizons
This section develops short-horizon momentum and mean-reversion signals, including short-window momentum and RSI, and explains how to interpret, normalize, and combine them for 1–10 day equity forecasts.

Defining short-window momentum signals
RSI settings for 1–10 day trading
Mean-reversion in intraday gaps
Combining momentum and reversal cues
Regime-dependent signal behavior

Lesson 5
Price-based technical indicators: simple and exponential moving averages, MACD, Bollinger Bands, and parameter choices for short horizons
This section covers short-horizon price-based indicators, focusing on how to parameterize moving averages, MACD, and Bollinger Bands for 1–10 day forecasts, and how to avoid overfitting when tuning lookback windows and thresholds.

Choosing SMA and EMA lookback windows
Short-horizon MACD parameter settings
Bollinger Bands for 1–10 day signals
Combining overlapping price indicators
Avoiding overfitting in indicator tuning

Lesson 6
Volume and liquidity features: On-Balance Volume, volume-weighted average price (VWAP) approximations, turnover and bid-ask spread proxies
This section develops volume and liquidity features such as OBV, VWAP approximations, turnover, and bid-ask spread proxies, explaining how they capture order flow, trading frictions, and short-term price impact for equity forecasts.

On-Balance Volume for short-term flow
Approximating VWAP from daily data
Turnover and dollar volume measures
Bid-ask spread and illiquidity proxies
Filtering illiquid and microcap stocks

Lesson 7
Market-relative features: stock minus SPY returns, beta estimation over rolling windows, residuals from market model
This section builds market-relative features such as excess returns over SPY, rolling beta estimates, and residuals from a market model, clarifying how to separate idiosyncratic signals from broad market moves.

Stock minus index excess returns
Rolling beta estimation choices
Single-factor market model residuals
Using sector and style factor controls
Hedging market exposure in signals

Lesson 8
Volatility and risk features: rolling realized volatility, GARCH basics for short horizons, intraday volatility proxies from OHLC (range-based estimators)
Here we build volatility and risk features suited to 1–10 day horizons, including rolling realized volatility, simple GARCH-style measures, and range-based estimators from OHLC data, with emphasis on robustness and microstructure noise.
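
One common range-based estimator is the Parkinson estimator, which uses only daily highs and lows: the variance estimate is the average of ln(High/Low)^2 divided by 4 ln 2. The sketch below computes it over a rolling window. The choice of Parkinson as the example, the function name `parkinson_vol`, and the sample prices are assumptions for illustration; the lesson may use a different estimator.

```python
import numpy as np

def parkinson_vol(high, low, window):
    """Rolling Parkinson volatility (per-day units) from daily high/low prices.

    Output at index t uses days t-window+1 .. t; earlier entries are NaN.
    """
    high = np.asarray(high, dtype=float)
    low = np.asarray(low, dtype=float)
    if window < 1:
        raise ValueError("window must be at least 1")
    sq_log_range = np.log(high / low) ** 2
    out = np.full(high.shape, np.nan)
    for t in range(window - 1, len(high)):
        mean_sq = sq_log_range[t - window + 1 : t + 1].mean()
        out[t] = np.sqrt(mean_sq / (4.0 * np.log(2.0)))
    return out

# Hypothetical daily highs and lows for one stock.
highs = np.array([101.0, 102.5, 101.8, 103.0, 104.2, 103.5])
lows = np.array([99.5, 100.0, 100.2, 101.1, 102.0, 101.9])
vol_3d = parkinson_vol(highs, lows, window=3)
```

Because the window ends at day t, the value at t uses only information available at that day's close, which keeps the feature free of lookahead.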

Rolling realized volatility construction
GARCH-style short-horizon volatility
Range-based OHLC volatility estimators
Volatility scaling of returns and signals
Handling jumps and volatility clustering

Lesson 9
Feature stability: autocorrelation, information coefficient (IC) computation, decay of predictive power over horizon
We analyze feature stability over time using autocorrelation, information coefficient (IC), and decay curves, learning how predictive power changes with horizon and how to monitor signal degradation in production.
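
To make the IC computation concrete, the following is a minimal sketch of a cross-sectional rank IC: for each date, the Spearman-style correlation between a feature and the subsequent return across stocks. The helper `rank_ic` and the sample arrays are hypothetical. Note one simplifying assumption: ranks come from a double `argsort`, which breaks ties by position rather than averaging them.

```python
import numpy as np

def rank_ic(feature, fwd_ret):
    """Rank correlation between one date's feature values and forward returns.

    Pairs with a NaN in either input are dropped. Returns NaN when fewer
    than three valid pairs remain. Ties are NOT averaged (assumption).
    """
    f = np.asarray(feature, dtype=float)
    r = np.asarray(fwd_ret, dtype=float)
    valid = ~(np.isnan(f) | np.isnan(r))
    if valid.sum() < 3:
        return float("nan")
    f_rank = np.argsort(np.argsort(f[valid]))   # 0-based ranks
    r_rank = np.argsort(np.argsort(r[valid]))
    return float(np.corrcoef(f_rank, r_rank)[0, 1])

# Hypothetical panel: rows are dates, columns are stocks.
feature = np.array([[0.5, -1.0, 1.2, 0.1],
                    [0.3, 0.9, -0.4, -1.1]])
fwd_ret = np.array([[0.01, -0.02, 0.03, 0.00],
                    [0.02, 0.00, -0.01, -0.03]])

daily_ic = np.array([rank_ic(f, r) for f, r in zip(feature, fwd_ret)])
mean_ic = daily_ic.mean()
```

Repeating the calculation with forward returns at several horizons (for example 1, 5, and 10 days) and plotting the mean IC against the horizon gives a simple decay curve.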

Autocorrelation of features and signals
Cross-sectional IC and rank IC
IC decay across forecast horizons
Rolling IC and t-stat monitoring
Detecting regime shifts in features

Lesson 10
Lagged returns and cumulative returns: constructing 1, 5, 10-day targets and predictors, log vs arithmetic returns
We construct lagged and cumulative return features and targets for 1, 5, and 10 day horizons, comparing log and arithmetic returns, compounding conventions, and how to align predictors with forecast windows.

Defining 1, 5, and 10 day return targets
Log vs arithmetic returns trade-offs
Rolling cumulative return features
Overlapping vs non-overlapping windows
Adjusting for dividends and splits
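
The alignment of targets and predictors in Lesson 10 can be sketched as follows: forward returns over the next h days serve as targets at day t, while trailing returns over the previous h days serve as predictors at day t. The function names and the price series are hypothetical, and the prices are assumed to be already adjusted for dividends and splits.

```python
import numpy as np

def forward_returns(prices, horizon, log=False):
    """Return from day t to day t+horizon, stored at index t (the target).

    The last `horizon` entries are NaN because their window is incomplete.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    p = np.asarray(prices, dtype=float)
    out = np.full(p.shape, np.nan)
    if horizon < len(p):
        ratio = p[horizon:] / p[:-horizon]
        out[:-horizon] = np.log(ratio) if log else ratio - 1.0
    return out

def trailing_returns(prices, lookback, log=False):
    """Return from day t-lookback to day t, stored at index t (a predictor).

    The first `lookback` entries are NaN.
    """
    if lookback < 1:
        raise ValueError("lookback must be at least 1")
    p = np.asarray(prices, dtype=float)
    out = np.full(p.shape, np.nan)
    if lookback < len(p):
        ratio = p[lookback:] / p[:-lookback]
        out[lookback:] = np.log(ratio) if log else ratio - 1.0
    return out

# Hypothetical adjusted closing prices for one stock.
adj_close = np.array([100.0, 101.0, 99.0, 102.0, 103.0, 105.0,
                      104.0, 106.0, 108.0, 107.0, 110.0, 111.0])

targets = {h: forward_returns(adj_close, h) for h in (1, 5, 10)}
predictors = {h: trailing_returns(adj_close, h) for h in (1, 5, 10)}

# Log returns add across days; arithmetic returns compound instead.
log_1d = forward_returns(adj_close, 1, log=True)
log_5d = forward_returns(adj_close, 5, log=True)
```

At any index t, the predictor uses prices up to t and the target uses prices after t, so the two never share a future observation. Consecutive 5-day and 10-day targets do overlap with each other, which matters when estimating standard errors.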