As an accomplished software engineer with a background in systems administration, I have a proven track record of success in both fast-paced startups and large corporate environments. I enjoy building tooling that increases developer velocity and improves the overall developer experience.

Skills

Programming

Fluent
  • Golang
Efficient
  • JavaScript
  • Python

Infrastructure

Fluent
  • Terraform
  • Ansible
  • Postgres
  • Kafka
  • Redis
  • AWS
  • GCP
Efficient
  • Kubernetes
  • HAProxy
  • gRPC
  • BigQuery
  • Aerospike

Work Experience

Staff Software Engineer
Wayfair
https://www.wayfair.com/

I joined Wayfair as a Senior Software Engineer in the Platform Engineering organization with a focus on Observability. I write software that abstracts infrastructure away from both software engineers at Wayfair and the operations teams that manage it. I'm currently a Staff Software Engineer and lead a team of three software engineers.

  • Designed and implemented an event-driven Cloud Run service in Go, using Eventarc and GCS, to stream Cloudflare Logpush events into the Google Cloud Logging ecosystem (a minimal sketch of this pattern follows this list).

  • Developed a custom Backstage plugin in TypeScript and React to streamline access to self-service options, giving developers an intuitive interface for quickly discovering data points in the logging infrastructure. This improved developer efficiency and productivity while reducing the burden on the organization's support teams.

  • Designed and implemented a job scheduler in Go to perform daily tasks such as monitoring and correcting infrastructure components, loading statistical data into Google BigQuery, and hydrating our API caching layers.

  • Wrote a REST API in Golang, backed by Google BigQuery, that enabled the Observability team to obtain information about an application's telemetry usage (volume, cost, utilization, etc.).

  • Wrote a gRPC API in Golang that enabled our customers to perform Observability-related operational tasks without opening tickets with the operations teams.

  • Wrote a Kafka consumer in Golang that enriched metrics from our telemetry pipeline to gain visibility into the volume of signals we receive and drop.

  • Wrote a template configuration engine in Golang, backed by Consul KV and GCS buckets with notifications via GCP Pub/Sub, that allows software engineers to dynamically configure their telemetry pipelines.
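
For the Logpush pipeline above, a minimal sketch of the Eventarc-to-Cloud-Run pattern is shown below. It is illustrative only: the log name, environment variables, and payload handling are assumptions, not the production implementation.

    // Illustrative sketch only: an Eventarc-triggered Cloud Run handler that copies
    // a Cloudflare Logpush object from GCS into Cloud Logging. Real code would
    // decompress and split the individual Logpush records.
    package main

    import (
        "context"
        "encoding/json"
        "io"
        "log"
        "net/http"
        "os"

        "cloud.google.com/go/logging"
        "cloud.google.com/go/storage"
    )

    // gcsEvent holds the fields of the GCS object notification we care about.
    type gcsEvent struct {
        Bucket string `json:"bucket"`
        Name   string `json:"name"`
    }

    func main() {
        ctx := context.Background()

        gcs, err := storage.NewClient(ctx)
        if err != nil {
            log.Fatalf("storage client: %v", err)
        }
        logClient, err := logging.NewClient(ctx, os.Getenv("GOOGLE_CLOUD_PROJECT"))
        if err != nil {
            log.Fatalf("logging client: %v", err)
        }
        logger := logClient.Logger("cloudflare-logpush") // hypothetical log name

        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            var ev gcsEvent
            if err := json.NewDecoder(r.Body).Decode(&ev); err != nil {
                http.Error(w, err.Error(), http.StatusBadRequest)
                return
            }
            // Read the Logpush batch Cloudflare wrote to the bucket.
            rc, err := gcs.Bucket(ev.Bucket).Object(ev.Name).NewReader(r.Context())
            if err != nil {
                http.Error(w, err.Error(), http.StatusInternalServerError)
                return
            }
            defer rc.Close()
            body, err := io.ReadAll(rc)
            if err != nil {
                http.Error(w, err.Error(), http.StatusInternalServerError)
                return
            }
            // Forward the payload into Cloud Logging.
            logger.Log(logging.Entry{Payload: string(body)})
            w.WriteHeader(http.StatusOK)
        })

        port := os.Getenv("PORT")
        if port == "" {
            port = "8080"
        }
        log.Fatal(http.ListenAndServe(":"+port, nil))
    }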

Senior Site Reliability Engineer
Magic Leap
https://www.magicleap.com/en-us

Magic Leap is a mixed reality startup unicorn. I was a member of the core SRE team and an embedded SRE on the Identity team, focused on evangelizing tooling that makes developers' lives easier. I spent most of my day helping architect, design, and write new features for the Identity (AuthZ/N) ecosystem that all services at Magic Leap interact with.

  • Maintained a developer-focused Go CLI tool used by the development teams at Magic Leap. I contributed features around bootstrapping net-new AWS/GCP projects and worked toward delivering industry-standard templates for specific workloads (serverless functions, ETL workloads, static websites, etc.).

  • Worked with the Identity team to adopt Datadog APM for tracing and used that trace data to better understand performance bottlenecks in the codebase.

  • Worked with the Identity team to emit custom Datadog metrics and define SLIs for our ecosystem, which enabled a well-defined SLA model for the rest of the organization and SLOs for our team (a minimal metrics sketch follows this list).

  • Identified single points of failure and vulnerabilities in our ecosystem and hardened them with fixes in the codebase.

  • Contributed new features and patched bugs in the 20+ Terraform provider forks we maintained.
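
For the SLI work above, below is a minimal sketch of emitting custom metrics to the Datadog agent over DogStatsD from Go; the client library choice, metric names, tags, and operation are assumptions for illustration.

    // Illustrative sketch only: emitting SLI-style custom metrics to the local
    // Datadog agent via DogStatsD. Metric names, tags, and the operation are
    // hypothetical.
    package main

    import (
        "log"
        "time"

        "github.com/DataDog/datadog-go/v5/statsd"
    )

    func main() {
        // DogStatsD listener on the Datadog agent, typically a local daemon/sidecar.
        client, err := statsd.New("127.0.0.1:8125")
        if err != nil {
            log.Fatal(err)
        }
        defer client.Close()

        start := time.Now()
        err = handleTokenRequest() // hypothetical identity-service operation

        tags := []string{"service:identity", "operation:token"}
        // Success/error counters become the good/total events behind an
        // availability SLI; the timing feeds a latency SLI.
        if err != nil {
            client.Incr("identity.requests.error", tags, 1)
        } else {
            client.Incr("identity.requests.success", tags, 1)
        }
        client.Timing("identity.request.duration", time.Since(start), tags, 1)
    }

    // handleTokenRequest stands in for real identity-service work.
    func handleTokenRequest() error { return nil }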

Software Engineer
American Express
https://www.americanexpress.com/

At American Express I was brought in as an SRE with a focus on writing software to 'Automate the Toil'. I was embedded within the business vertical responsible for Fraud, and frequently collaborated with greenfield teams responsible for modernizing software at AMEX.

  • Wrote a distributed job scheduler in Golang, backed by Redis, to perform automated failovers for applications and validate their ability to fail over to different datacenters with no downtime (a minimal sketch of the locking pattern follows this list).

  • Wrote Prometheus exporters that scraped our OpenShift environment to gain visibility into applications with frequent crash loops, and applied that data toward an error budget.

  • Taught an Introduction to Python course for the operations organization as part of a company-wide upskilling initiative.

  • Mentored a group of four interns, giving them foundational knowledge of the Go programming language and guiding them through the internal AMEX TypeScript/React stack. Through hands-on support, helped them complete high-priority projects that led to full-time job offers at the end of the program.
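
For the scheduler above, below is a minimal sketch of the coordination pattern: a Redis SET NX lock with a TTL so only one instance runs a given failover validation at a time. The go-redis client, key name, TTL, and job body are assumptions for illustration.

    // Illustrative sketch only: a Redis SET NX lock with a TTL so that only one
    // scheduler instance runs a given failover validation at a time.
    package main

    import (
        "context"
        "log"
        "time"

        "github.com/redis/go-redis/v9"
    )

    func main() {
        ctx := context.Background()
        rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

        const lockKey = "scheduler:lock:failover-validation"
        instanceID := "scheduler-1" // would normally be a unique per-instance ID

        // SET key value NX EX <ttl>: only one instance acquires the lock, and the
        // TTL releases it even if the holder crashes mid-run.
        ok, err := rdb.SetNX(ctx, lockKey, instanceID, 5*time.Minute).Result()
        if err != nil {
            log.Fatalf("redis: %v", err)
        }
        if !ok {
            log.Println("another instance holds the lock; skipping this run")
            return
        }
        defer rdb.Del(ctx, lockKey) // release promptly once the run finishes

        runFailoverValidation(ctx)
    }

    // runFailoverValidation stands in for the real work: drain a datacenter and
    // verify the application serves traffic from its peers with no downtime.
    func runFailoverValidation(ctx context.Context) {
        log.Println("running failover validation job")
    }

Scoping the lock key per job lets independent jobs run concurrently while each individual validation stays single-flight.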

Senior Site Reliability Engineer
Chewy
https://www.chewy.com/

I joined Chewy when it was still in its startup stage. I was part of the core SRE team and was responsible for everything in the ecommerce environments. I maintained a handful of open-source technologies, running on VMs in VMware and fully managed by Ansible, that made up the ecommerce environment, and became well versed in HAProxy, Postgres, Redis, and gRPC. I deployed the first of many Kubernetes clusters for the warehouse teams' greenfield applications. I later moved to the Cloud Engineering team, which was tasked with establishing a footprint for deploying into AWS; there I spent most of my time shaping our CI/CD pipelines in AWS and onboarding new technology into the company.

  • Maintained 20+ Ansible roles and playbooks that managed the entire ecommerce stack.

  • Introduced a test-driven development (TDD) pattern for writing Ansible, improving confidence in our changes and catching change-related issues early.

  • Worked with the Big Data teams to implement Aerospike as the datastore for Chewy's recommendation engine (a minimal access-pattern sketch follows this list).

  • Worked with vendors such as Datadog, AWS, Splunk, and Armory to onboard new products that accelerated time-to-productivity for product teams across the company.
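
For the recommendation-engine datastore above, below is a minimal sketch of an Aerospike write/read path using the Aerospike Go client; the language choice, namespace, set, and bin layout are assumptions for illustration rather than the production schema.

    // Illustrative sketch only: the write/read path a recommendation service might
    // use against Aerospike. The namespace, set, bin name, and data shape are
    // hypothetical.
    package main

    import (
        "log"

        as "github.com/aerospike/aerospike-client-go/v6"
    )

    func main() {
        client, err := as.NewClient("127.0.0.1", 3000)
        if err != nil {
            log.Fatalf("aerospike: %v", err)
        }
        defer client.Close()

        // Key = (namespace, set, user key); here we store recommendations per customer.
        key, err := as.NewKey("ecom", "recommendations", "customer:12345")
        if err != nil {
            log.Fatal(err)
        }

        // Write a precomputed recommendation list as a single bin.
        bins := as.BinMap{"skus": []interface{}{"SKU-1", "SKU-2", "SKU-3"}}
        if err := client.Put(nil, key, bins); err != nil {
            log.Fatalf("put: %v", err)
        }

        // Low-latency point read on the same key.
        rec, err := client.Get(nil, key)
        if err != nil {
            log.Fatalf("get: %v", err)
        }
        log.Printf("recommendations: %v", rec.Bins["skus"])
    }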

Senior DevOps Engineer
Matomy
https://matomy.com/

I joined a small team at the startup Matomy as its first DevOps engineer, working remotely with the rest of the team in Israel. The startup built a domain monetization bidding platform backed by Kafka. I was responsible for modernizing its infrastructure tooling with open-source technologies (CI/CD, metrics, logging, etc.) and for helping the companies it acquired rearchitect onto an event-driven ecosystem.

  • Deployed and managed Kafka and worked with development teams to create an event-driven architecture that let them migrate off their legacy systems (a minimal producer sketch follows this list).

  • Deployed and managed an ELK (Elasticsearch, Logstash, Kibana) stack used for application and infrastructure logging.

  • Deployed and managed a TIG (Telegraf, InfluxDB, Grafana) stack used for infrastructure metrics.

  • Deployed and managed Kubernetes clusters in AWS/VMware to deploy greenfield applications.
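
For the event-driven migration above, below is a minimal sketch of publishing a domain event to Kafka from Go with segmentio/kafka-go; the library, broker address, topic, and event shape are assumptions for illustration.

    // Illustrative sketch only: publishing a domain event to Kafka with
    // segmentio/kafka-go. The broker address, topic, and event shape are
    // hypothetical.
    package main

    import (
        "context"
        "encoding/json"
        "log"
        "time"

        "github.com/segmentio/kafka-go"
    )

    // bidPlaced is an example event emitted by the bidding platform.
    type bidPlaced struct {
        Domain string    `json:"domain"`
        Amount float64   `json:"amount_usd"`
        At     time.Time `json:"at"`
    }

    func main() {
        w := &kafka.Writer{
            Addr:     kafka.TCP("localhost:9092"),
            Topic:    "bids.placed",
            Balancer: &kafka.LeastBytes{},
        }
        defer w.Close()

        payload, err := json.Marshal(bidPlaced{Domain: "example.com", Amount: 1.25, At: time.Now()})
        if err != nil {
            log.Fatal(err)
        }

        // Keying by domain keeps all events for one domain ordered within a partition.
        err = w.WriteMessages(context.Background(),
            kafka.Message{Key: []byte("example.com"), Value: payload},
        )
        if err != nil {
            log.Fatalf("kafka write: %v", err)
        }
    }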

DevOps Engineer
Ultimate Software
https://www.ultimatesoftware.com/

I joined Ultimate Software as a DevOps engineer; this was my first job out of a traditional systems administration role. I worked on a team focused on creating tooling for our sales team to consume and sell the core product, UltiPro. I served as the DevOps engineer on this project and architected, designed, and implemented a solution using open-source components from our PaaS (OpenStack). I later moved onto a new company initiative to replace all of the logging infrastructure with an open-source solution (ELK).

  • Designed and implemented a solution, built on RabbitMQ, HAProxy, and MongoDB, that our sales team used to give demos and training.

  • Contributed to a Ruby gem we maintained that provided abstractions for modifying resources in our OpenStack environments.

  • Deployed and managed a series of large-scale ELK stacks across four datacenters used for company-wide logging.

  • Designed and implemented multiple CI/CD pipelines around the utilities and tooling developers used to iterate on the core UltiPro product.

Systems Administrator
Synergistix
https://www.synergistix.com/

I joined Synergistix as a Systems Administrator. I was responsible for a VMware and Dell physical server footprint along with core Dell network switches/routers, SonicWall firewalls, VoIP phones, and all of the related equipment in the office. I spent a lot of time writing Python scripts to automate my daily administrative tasks, and this is where I discovered the power of automation.

  • Rearchitected and migrated our datacenter from Miami Data Vault to the AT&T IDC with zero downtime.

  • Automated the troubleshooting of datacenter issues by scraping information from iDRAC, PDU, and network equipment APIs.

  • Automated server patching and software installs through a configuration management system called Kaseya.

  • Performed various physical-to-virtual (P2V) migrations to better utilize our resources and reduce datacenter costs.