Lesson 1. Observability stack: metrics, distributed tracing, structured logging, SLOs and alerting design
Design an observability stack that combines metrics, tracing, and structured logs with SLOs and alerts, enabling fast incident detection, root-cause analysis, and continuous improvement of reliability and performance.
- Key metrics and RED/USE methodologies
- Distributed tracing and trace sampling
- Structured logging and correlation IDs
- SLOs, SLIs, and error budget policies
- Alert design, routing, and runbooks

Lesson 2. Designing a modular-monolith-first migration plan (bounded contexts, vertical slicing, API boundaries)
Define a modular-monolith-first roadmap, using bounded contexts, vertical slices, and clear API boundaries to reduce risk, enable parallel work, and prepare the codebase for selective, low-friction extraction into services later.
- Identifying domains and bounded contexts
- Vertical slicing of features and workflows
- Defining internal and external API boundaries
- Strangler patterns inside a monolith
- Governance for shared libraries and modules

Lesson 3. Introducing AI features: capability scoping, model hosting, inference architecture, data governance and safety guardrails
Scope AI capabilities, choose hosting and inference architectures, and define data governance and safety guardrails so AI features add value while respecting privacy, compliance, and reliability constraints.
- Prioritizing AI use cases and ROI
- Model selection and hosting options
- Online, batch, and streaming inference
- Prompt, feature, and embedding stores
- Data governance, safety, and red-teaming

Lesson 4. CI/CD and release engineering: automated pipelines, environment promotion, and deployment strategies (blue-green, canary)
Design CI/CD pipelines, environment promotion flows, and deployment strategies such as blue-green and canary, enabling frequent, low-risk releases with strong controls, observability, and rollback mechanisms.
- Pipeline stages and quality gates
- Build artifacts and dependency hygiene
- Environment promotion and approvals
- Blue-green and canary deployments
- Progressive delivery and rollbacks

Lesson 5. Service decomposition criteria, interface patterns, and backward compatibility strategies
Establish criteria for splitting services, choose interface patterns, and design for backward compatibility so teams can evolve contracts safely, avoid breaking changes, and support gradual client migration across releases.
- Business and technical drivers for decomposition
- Service size, cohesion, and coupling heuristics
- Synchronous versus asynchronous interfaces
- Versioning and contract evolution patterns
- Feature flags and compatibility shims

Lesson 6. Automated testing strategy: unit, integration, contract testing, end-to-end testing, and test-data management
Define an automated testing strategy that balances unit, integration, contract, and end-to-end tests, with robust test data management, to keep delivery fast while protecting reliability and enabling safe refactoring.
- Test pyramid and coverage goals
- Unit and component test practices
- Integration and contract testing
- End-to-end and nonfunctional tests
- Test data, fixtures, and environments

Lesson 7. Data storage strategy: multi-tenant schema patterns, row/DB-per-tenant tradeoffs, and hybrid isolation criteria
Define a data storage strategy for multi-tenant systems, comparing shared schemas, row-per-tenant, and database-per-tenant models, and using hybrid isolation to balance security, noisy neighbours, cost, and operational complexity.
- Tenant identification and routing design
- Shared schema and row-per-tenant patterns
- Database-per-tenant pros and cons
- Hybrid isolation and tiered tenants
- Backup, restore, and tenant migration

Lesson 8. Comparing architecture options: monolith, modular monolith, and incremental microservices migration
Compare monolith, modular monolith, and microservices options, and define an incremental migration path that aligns with team maturity, product risk, and scaling needs instead of following trends blindly.
- Monolith strengths, limits, and anti-patterns
- Modular monolith design and governance
- When microservices are justified
- Incremental extraction and coexistence
- Architecture decision records and reviews

Lesson 9. Cloud approach: multi-account/tenant isolation, region strategy, and cost/availability tradeoffs
Plan a cloud architecture that uses accounts, regions, and isolation boundaries to manage tenants, resilience, and compliance while controlling cost, optimizing availability, and simplifying operations and incident response.
- Multi-account patterns and guardrails
- Tenant isolation and blast radius limits
- Region selection and latency tradeoffs
- Disaster recovery and failover design
- Cost allocation, tagging, and showback