Systems / Production Engineer

Building reliable, real-time distributed systems for production environments.

I support high-performance platforms across Linux, Kubernetes, AWS, Kafka, and observability stacks. My work spans automation, incident response, production engineering, and performance tuning for data-intensive systems.

  • 7+ years supporting production systems
  • 99.99% uptime across Kubernetes and AWS workloads
  • Strong in Python, Bash, Kafka, Prometheus, Grafana, and Linux

Core Focus

Site reliability, cloud operations, observability, and low-latency distributed systems.

  • 45% MTTR reduction through automation
  • 60% on-call paging reduction
  • 1M+ events/day in data pipelines
  • 100K+ users supported in production

About

Operationally calm, technically deep, and outcome-focused.

I am a systems and production engineer with experience supporting high-performance, real-time distributed systems on Linux, Kubernetes, and AWS. I build operational tooling in Python and Bash, investigate complex incidents, and work closely with engineers, traders, and research teams to improve platform reliability, speed, and resilience.

My background combines software engineering, infrastructure, automation, observability, and production support. I enjoy turning noisy operational problems into repeatable systems, robust runbooks, and measurable reliability gains.

Skills

Technologies and production strengths

SRE & Production Engineering

Incident response, on-call support, root-cause analysis, SLO/SLI design, runbooks, chaos and load testing, production support during market hours.

Scripting & Automation

Python, Bash, Go (familiar), Ruby, operational tooling, workflow automation, Ansible, Terraform, deployment and recovery scripting.

Linux & Kubernetes

Linux systems administration, Kubernetes debugging, Helm, operators, Docker, systemd, networking, DNS, firewalls, and performance tuning.

Observability & Messaging

Prometheus, Grafana, Splunk, OpenTelemetry, ELK, Apache Kafka, alerting, dashboards, and log aggregation.

Cloud & Data

AWS, Azure, GCP, PostgreSQL, Oracle, MySQL, MongoDB, Redis, CI/CD with Jenkins and GitHub Actions.

Trading Domain

Market data feed handlers, CME MDP 3.0, FIX protocol, tick and bar data, low-latency trading workflows, market microstructure.

Experience

Selected professional experience

Software Engineer · Sprouts AI

Chicago, IL

Mar 2025 – Present
  • Support 20+ production microservices on AWS EKS during live operating hours with 99.99% uptime.
  • Built Python and Bash automation for deployments, log collection, and incident response, reducing MTTR by 45%.
  • Resolved issues across Kafka pipelines and Kubernetes workloads, cutting recurring incidents by 40% through postmortems and platform hardening.

Software Engineer · Resilience Inc

Chicago, IL

Aug 2023 – Mar 2025
  • Operated a real-time market-data pipeline across Kafka, Kubernetes, and PostgreSQL, maintaining sub-second data freshness SLOs during market hours.
  • Authored Python/Bash automation and Kubernetes operators for auto-remediation, reducing on-call paging by 60%.
  • Partnered with traders, quants, and engineers to investigate latency spikes using Prometheus, Grafana, perf, strace, and tcpdump, improving tick-to-trade latency by 30%.
  • Built self-service tooling for safe deployment and monitoring of strategy jobs, eliminating 10+ weekly tickets.

Software Engineer / IT Project Manager · Tata Steel

Remote

Apr 2019 – Aug 2023
  • Operated cloud-native Java microservices on Kubernetes, maintaining 99.99% uptime for 100K+ users.
  • Built Kafka- and PostgreSQL-backed data pipelines processing 1M+ events/day, supported by Python/Bash runbooks and Ansible playbooks.
  • Led Agile SDLC adoption using Git, Jenkins, JUnit, and Pytest, reducing release lead time by 60%, and mentored engineers on reliability practices.

Projects

Work that reflects my systems mindset

CME MDP 3.0 Multicast Market-Data Feed Handler

C++20 · Low latency · Trading infra

A low-latency feed handler that decodes live CME tick data using lock-free queues, per-instrument threading, and seqlock snapshots to support real-time trading-system workflows.

TradeSentinel

Observability · Kubernetes · Kafka · Python

An observability platform for trading systems showing order flow, latency distributions, pod health, Kafka lag, and Prometheus alerts, with auto-remediation utilities for common production failures.

Publication

Research contribution

Comparative Study of Movie Recommendation System Using Feature Engineering and Improved Error Function

IEEE INCOFT 2021 · DOI: 10.1109/INCOFT55651.2022.10094480

Education

Academic background

M.S. in Computer Science

Illinois Institute of Technology

B.Tech in Computer Science

National Institute of Technology, Calicut

Contact

Let’s connect.

I’m open to software engineering, systems, SRE, production engineering, and trading infrastructure opportunities.