Juvenal Diaz

Senior DevOps / MLOps / Platform Engineer

Contact: +52 449 217 6833, juvenaldiaz522@gmail.com

Professional Summary

Senior infrastructure and reliability engineer with 12+ years of experience across Linux, Kubernetes-based platforms, cloud operations, incident response, automation, and production support. Currently operating a Kubernetes/Terraform PaaS used by 20,000+ internal developers, with hands-on work in maintenance, emergency changes, tooling, documentation, GitOps-style delivery, and continuous improvement. Targeting senior remote DevOps, SRE, platform, and MLOps roles where reliability, automation, and clear operations matter.

Impact Snapshot

20,000+ internal users supported through a Kubernetes/Terraform platform as a service.

10,000+ external Oracle Analytics customers supported through incident response, Linux troubleshooting, SQL tuning, automation, and runbook work.

4M+ requests per minute and 30+ microservices supported on a PCI-compliant platform using containers, orchestration, alerting, and DevOps practices.

Track record as a top performer across Oracle and Rackspace teams, including onboarding, automation epics, and on-call process improvement work.

DevOps / SRE / MLOps Skill Matrix

Platform engineering

Kubernetes, kubeadm, Helm, Argo CD, Kyverno, OpenTofu/Terraform, Docker Buildx, local registries, Linux, container runtimes, storage, and worker placement.

SRE and operations

Incident response, emergency changes, planned maintenance, monitoring, alert tuning, Prometheus, Grafana, Loki, node-exporter, runbooks, RCA, and ITIL process experience.

Automation and delivery

Bash, Python, Ansible, REST APIs, GitOps delivery loops, Gitea Actions, Bitbucket workflows, secret scanning, image scanning, and repeatable infrastructure scripts.

MLOps direction

FastAPI inference service patterns, model metrics, drift detection, canary/rollback workflows, model-serving platform design, MLflow/KServe/Ray learning path, and Kubernetes-based ML operations.

Employment History

August 2024 → Current

Site Reliability Developer – Oracle | Spectra

Operate a Kubernetes/Terraform platform as a service that lets 20,000+ internal developers build, run, and operate cloud applications. Daily work includes planned maintenance, emergency changes, tooling improvement, documentation, operational guardrails, and reliability-focused platform support.

June 2022 → July 2024

Site Reliability Developer – Oracle | Analytics

Resolved Oracle Analytics Cloud incidents for 10,000+ external customers, including Linux troubleshooting, SQL query tuning, service/job configuration, and usage issues. Built internal automation with Bash, Python, Ansible, and REST APIs, maintained SOPs, led continuous improvement and automation epics, supported new-hire onboarding, and proposed on-call rotation improvements.

July 2021 → June 2022

Linux Support Engineer – Rackspace

Handled multi-client Linux incidents through phone and ticketing channels across MySQL, Apache, NGINX, Varnish, PHP, VMware, DoS events, storage, backups, and firewalls. Ranked as a top performer by case volume across MX and US teams and helped onboard new hires.

March 2020 → July 2021

Linux Support Engineer – Softtek | Electronic Arts

Provided infrastructure support for a PCI-compliant platform handling 4M+ requests per minute across 30+ microservices. Supported container and orchestration technologies, DevOps operating practices, alert creation, and alert tuning.

August 2017 → March 2020

Cross-Functional Manager – Softtek | Electronic Arts

Implemented ITIL-based Incident, Problem, and Asset Management processes along with supporting automation, and carried out continuous improvement assessments.

September 2015 → August 2017

Linux Support Engineer / Tech Lead – Softtek | General Electric

Handled incident management, change management, and monitoring for internal applications. Promoted to tech lead after one year in the support position.

February 2013 → August 2015

Customer Support Agent – Teleperformance | Comcast

Provided phone support to customers in the US Southwest, troubleshooting cable, phone, and internet services.