System Performance Engineer at Microsoft
I'm a System Performance Engineer at Microsoft, specializing in benchmarking, profiling, and optimizing large-scale systems that power experiences for millions of users worldwide. My work centers on extracting maximum performance from infrastructure: identifying bottlenecks, measuring what matters, and making systems faster, more efficient, and more reliable.
With deep expertise in performance analysis and benchmarking methodologies, I've optimized critical systems that process billions of requests daily, reducing latency, cutting costs, and improving resource utilization. My approach combines rigorous measurement with data-driven insights to deliver tangible improvements. Beyond performance engineering, I maintain a strong interest in machine learning, particularly in how ML can inform and enhance performance optimization strategies.
Microsoft
Designed and implemented a comprehensive benchmarking platform for evaluating system performance across distributed services. The framework automates load generation, resource monitoring, and statistical analysis, enabling teams to measure throughput, latency, and resource utilization under various workloads. Benchmarking results drove optimization efforts that improved overall system performance by 45% and identified critical bottlenecks before production deployment.
Profiled and optimized a globally distributed caching layer serving 150K+ RPS, reducing P95 latency from 45ms to 8ms. Through systematic benchmarking, identified hot keys, analyzed memory access patterns, and optimized serialization protocols. Implemented advanced monitoring and alerting that reduced cache-related incidents by 85%. The optimizations saved $1.5M annually in infrastructure costs while improving user experience across all regions.
Built an ML-powered system to detect performance anomalies and predict capacity issues before they impact users. Applied statistical analysis and machine learning models to historical performance metrics, identifying subtle degradation patterns that traditional monitoring missed. The system provides early warnings for performance regressions, enabling proactive optimization and reducing P95 incident response time by 60%. This project combines my performance engineering expertise with curiosity-driven exploration of ML applications.
Load Testing, Profiling, Performance Analysis, Capacity Planning, Latency Optimization, Throughput Tuning
Prometheus, Grafana, Azure Monitor, Application Insights, Distributed Tracing, Metrics Analysis
C#, Python, SQL, PowerShell, Bash, Performance Testing Frameworks, Statistical Analysis
Distributed Systems, Azure, Kubernetes, Machine Learning, Anomaly Detection, Data Analysis
I'm always open to discussing performance optimization challenges, benchmarking methodologies, and opportunities to make systems faster and more efficient. Whether it's tackling complex performance bottlenecks, designing benchmarking frameworks, or exploring ML applications in performance engineering, let's connect.