Hands On FullStack Development

Week 3: Server Fleet Management

Jun 24, 2026

What We Are Building

This lesson delivers a production-grade server fleet control plane for InfraWatch:

  1. Inventory domain — Rich server models with lifecycle state, health telemetry, tags, dependencies, and audit history

  2. Operations API — CRUD with pagination, soft-delete, advanced search, bulk actions, groups, and templates

  3. Validation layer — IP/hostname checks, TCP connectivity, HTTP health, SSL expiry, and network discovery

  4. Credential plane — SSH key generation, encrypted storage, rotation, and connection testing

  5. Live console — Datadog-style React dashboard with real-time WebSocket health updates
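
To make the validation layer concrete, here is a minimal sketch of the first two checks (IP/hostname syntax and TCP connectivity) using only the Python standard library. The function names `is_valid_ip`, `is_valid_hostname`, and `tcp_reachable` are illustrative assumptions, not the lesson's actual service API, and the hostname rule is a simplified RFC 1123 check.

```python
import ipaddress
import re
import socket

# One DNS label: letters, digits, hyphens; must not start or end with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_ip(value: str) -> bool:
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def is_valid_hostname(value: str) -> bool:
    """Simplified RFC 1123 hostname check (assumption: no IDN support)."""
    if not value or len(value) > 253:
        return False
    if value.endswith("."):  # tolerate a fully qualified trailing dot
        value = value[:-1]
    labels = value.split(".")
    # An all-numeric last label means this is a malformed IP, not a hostname.
    if labels[-1].isdigit():
        return False
    return all(_LABEL.fullmatch(label) for label in labels)


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running the syntax checks before any network call keeps obviously bad input from ever reaching a socket, which matters once bulk imports feed hundreds of rows through the same path.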


Core Concept: Inventory as the Source of Truth

Every metrics pipeline, alert rule, and automation workflow in a platform like Datadog or PagerDuty starts with one question: which servers exist right now, and are they reachable? Server inventory is not a spreadsheet. It is the authoritative graph that downstream systems query before they act.

The insight most tutorials skip: lifecycle state and health state are different axes. A server can be active in provisioning terms but degraded in health terms. Mixing them into one status field breaks autoscaling logic (which cares about lifecycle) and paging logic (which cares about health). InfraWatch Fleet keeps both dimensions explicit.
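
One way to keep the two axes explicit is to model them as separate enums on the server record. The sketch below is an assumption about how that could look: only "active" and "degraded" come from the text above, and the other state names, the `Server` dataclass, and the two helper functions are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class LifecycleState(Enum):
    # Provisioning axis: where the server is in its life.
    PROVISIONING = "provisioning"
    ACTIVE = "active"
    DECOMMISSIONED = "decommissioned"


class HealthState(Enum):
    # Health axis: what the latest checks say.
    UNKNOWN = "unknown"
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    UNREACHABLE = "unreachable"


@dataclass
class Server:
    hostname: str
    lifecycle: LifecycleState
    health: HealthState = HealthState.UNKNOWN


def counts_toward_capacity(server: Server) -> bool:
    """Autoscaling view: reads only the lifecycle axis."""
    return server.lifecycle is LifecycleState.ACTIVE


def needs_attention(server: Server) -> bool:
    """Paging view: reads only the health axis."""
    return server.health in (HealthState.DEGRADED, HealthState.UNREACHABLE)


web = Server("web-01", LifecycleState.ACTIVE, HealthState.DEGRADED)
print(counts_toward_capacity(web), needs_attention(web))
```

With a single status field, `web-01` would have to be either "active" or "degraded", and one of the two consumers would get the wrong answer. With two fields, both questions can be answered independently.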


Where This Fits in the Overall System

Authentication Foundation     identity gate for all API calls
        │
Server Fleet Management       ◀ this lesson
        │
Metrics Collection            time-series from known inventory
        │
Alerts & Notifications        rules evaluated per server identity

You are building the inventory spine that metrics agents attach to and alert engines evaluate against. Without it, alerts fire on hostnames nobody registered or, worse, on hosts that were decommissioned last week.
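
As a rough illustration of why downstream layers consult inventory first, the sketch below partitions hostnames reported by a metrics stream against a plain dictionary standing in for the inventory. The function name `partition_reported_hosts` and the lifecycle strings are assumptions for illustration only.

```python
from typing import Dict, Iterable, List


def partition_reported_hosts(
    reported: Iterable[str], inventory: Dict[str, str]
) -> Dict[str, List[str]]:
    """Split reported hostnames by what the inventory says about them.

    inventory maps hostname -> lifecycle string (hypothetical values:
    "provisioning", "active", "decommissioned").
    """
    result: Dict[str, List[str]] = {
        "evaluate": [],        # registered and active: alert rules apply
        "unregistered": [],    # nobody registered this hostname
        "not_active": [],      # known, but provisioning or decommissioned
    }
    for host in reported:
        lifecycle = inventory.get(host)
        if lifecycle is None:
            result["unregistered"].append(host)
        elif lifecycle == "active":
            result["evaluate"].append(host)
        else:
            result["not_active"].append(host)
    return result


inventory = {"web-01": "active", "db-01": "decommissioned"}
print(partition_reported_hosts(["web-01", "db-01", "ghost-99"], inventory))
```

Only the `evaluate` bucket should reach the alert engine. The other two are worth logging, since an unregistered host sending metrics usually points to a gap in onboarding.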


Component Architecture

Presentation — Dark-theme React console: fleet overview metrics, searchable inventory grid, validation tools, credential manager. The console polls the REST API every 15 seconds, and a WebSocket channel pushes health deltas as they occur.

API — Flask blueprints under /api/v1. Thin routes delegate to services: server_service, health_service, ssh_service, bulk_service.

State — PostgreSQL stores servers, tags, groups, templates, audit logs, SSH keys. Redis caches health snapshots and increments check counters.

Runtime — Background health monitor runs every 30 seconds. Docker Compose orchestrates Postgres 16, Redis 7, API, and UI.
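
The background monitor can be pictured as a simple loop that checks every host, stores a snapshot, and sleeps until the next cycle. The following is a minimal sketch under stated assumptions: a plain dictionary stands in for the Redis snapshot cache, and `run_health_monitor`, its parameters, and the status strings are hypothetical names rather than the lesson's real `health_service`.

```python
import threading
import time
from typing import Callable, Dict, Iterable, Optional


def run_health_monitor(
    hosts: Iterable[str],
    check: Callable[[str], str],
    snapshots: Dict[str, dict],
    stop: threading.Event,
    interval: float = 30.0,
    max_cycles: Optional[int] = None,
) -> int:
    """Check every host once per interval until stop is set.

    snapshots is an in-memory stand-in for the Redis cache. max_cycles
    exists only so the loop can be exercised in tests. Returns the number
    of completed cycles.
    """
    hosts = list(hosts)
    cycles = 0
    while not stop.is_set():
        for host in hosts:
            try:
                status = check(host)
            except Exception:
                # A failing check must not kill the monitor loop.
                status = "unreachable"
            previous = snapshots.get(host, {})
            snapshots[host] = {
                "status": status,
                "checked_at": time.time(),
                "check_count": previous.get("check_count", 0) + 1,
            }
        cycles += 1
        if max_cycles is not None and cycles >= max_cycles:
            break
        # Event.wait doubles as an interruptible sleep.
        stop.wait(interval)
    return cycles
```

In a running service this would typically be started in a daemon thread, for example `threading.Thread(target=run_health_monitor, args=(hosts, check, snapshots, stop), daemon=True).start()`, with `stop.set()` called on shutdown. Using `stop.wait(interval)` instead of `time.sleep` lets shutdown interrupt the 30-second pause immediately.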
