About Me
Hi, my name’s Dan and I’m a Site Reliability Engineer. My work spans Infrastructure, Software Development, and Site Reliability.
I enjoy managing distributed systems and infrastructure, but I also have a wealth of development experience, including designing software architectures, implementing CI/CD, and writing libraries, tooling, and webapps.
Passionate about:
- Kubernetes
- AWS
- TLS, DNS, HTTP
- Linux
Great with:
- Ruby
- Golang
- Terraform
- Google Cloud
- Prometheus
- HashiCorp Products
- SIP and Telephony
Fluent with:
- Python
- Jsonnet
- Helm
- GitLab CI
- Chef
Can do:
- CloudFormation
- Java
- PHP
- Ansible
Experience
Work within a small, rebuilding SRE team.
Administer a full spectrum of Google Cloud resources, across multiple regions.
Consult and assist across ~3 feature development teams, a telephony team, and a dedicated machine learning team.
Provide and execute on a vision for reliability and infrastructure across the organization.
Consult, assist, and mentor across ~14 feature development teams and 2 reliability teams.
Lead multiple projects and efforts across reliability and feature development teams.
Administer 150+ node Kubernetes clusters along with centralized logging, metrics, and alerting.
Provide orbital support and management of Kafka and Elasticsearch for feature development teams.
Administer critical Service Discovery infrastructure.
Consult, assist, and mentor across ~7 feature development teams and 2 reliability teams.
Administer Chef-managed instances, Kubernetes clusters, SIP endpoints/trunks, and Consul datacenters across AWS and GCP.
Scoped and performed a blue/green upgrade from Kubernetes 1.9 to 1.13, then assisted with subsequent migrations and upgrades.
Designed and implemented a hub-and-spoke network between multiple Cloud Provider networks.
Develop and maintain multiple Ruby applications and libraries.
Consult and assist ~6 feature development teams.
As three companies merged and shifted into AWS, redesigned another company’s legacy Liferay-based architecture using CloudFormation + Ansible + CodeDeploy, with heavy use of Varnish and Fastly/CloudFront.
Designed SolrCloud and Tomcat8 SOA/Microservice infrastructure.
Maintained and developed multiple Ruby on Rails applications for internal use.
Designed ECS + Service Discovery infrastructure via Terraform to replace the Tomcat8 infrastructure above.
Administered SoftLayer + Xen + Java + Puppet + Akamai environments, the full Atlassian suite, and the company network.
Level I/II/III Help Desk; showed enough proficiency to pull double duty with, and later join, the System Administration team.
Supported River Centre, Xcel Energy Center, and Roy Wilkins events with telephone and data service, while also providing Level I/II Help Desk support for all MSE staff, including desktop, remote, and network support.
Open Source Contributions
Add prometheus-disable-exporter-metrics flag in cert-exporter
joe-elliott/cert-exporter/pull/104
At a certain scale, the go_ metrics emitted by the Prometheus Golang client library add meaningful scrape and storage overhead.
node_exporter and other exporters support disabling the metrics of the exporter itself via some form of --web.disable-exporter-metrics; I added that behavior to this exporter.
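A minimal sketch of the general approach, assuming nothing about cert-exporter's internals: serve metrics from a dedicated registry and only register the Go/process collectors when the flag allows it (the flag name, metric name, and port below are illustrative, not the actual cert-exporter code).

```go
package main

import (
	"flag"
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/collectors"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// Illustrative flag; the actual cert-exporter flag is --prometheus-disable-exporter-metrics.
	disableExporterMetrics := flag.Bool("disable-exporter-metrics", false,
		"exclude go_* and process_* metrics about the exporter itself")
	flag.Parse()

	// A dedicated registry avoids prometheus.DefaultRegisterer, which comes
	// with the Go and process collectors pre-registered.
	reg := prometheus.NewRegistry()
	if !*disableExporterMetrics {
		reg.MustRegister(
			collectors.NewGoCollector(),
			collectors.NewProcessCollector(collectors.ProcessCollectorOpts{}),
		)
	}

	// The exporter's own application metrics stay on the same registry either way.
	certExpiry := prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Name: "example_cert_expires_in_seconds", // hypothetical metric name
		Help: "Seconds until certificate expiry.",
	}, []string{"path"})
	reg.MustRegister(certExpiry)

	http.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	log.Fatal(http.ListenAndServe(":8080", nil)) // port is arbitrary for this sketch
}
```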
Fix Route53 implementation in Lemur
An initiative came my way to improve SSL certificate handling within our infrastructure, and Lemur is a unique Python project that checked a lot of boxes. However, the Route53 implementation was broken, so I fixed it for use within a Proof of Concept.
sshPublicKey via self-service-password
ltb-project/self-service-password/pull/81
I had a project to implement LDAP authentication within AWS, and wanted most elements to be self-service.
To support self-service of SSH Public Keys into LDAP, I added the functionality to an existing PHP project I was using in a Proof of Concept.
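The change itself is PHP inside self-service-password, but the underlying directory write is small. Purely as an illustration in Go with go-ldap (the DN, server, and key are placeholders, and it assumes the entry already has an objectClass that permits sshPublicKey), the operation the self-service form performs on the user's behalf looks roughly like:

```go
package main

import (
	"log"

	"github.com/go-ldap/ldap/v3"
)

func main() {
	// Placeholder LDAP server and credentials.
	conn, err := ldap.DialURL("ldaps://ldap.example.com:636")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	userDN := "uid=jdoe,ou=people,dc=example,dc=com" // placeholder DN
	if err := conn.Bind(userDN, "user-password"); err != nil {
		log.Fatal(err)
	}

	// Replace the user's sshPublicKey attribute with the submitted key.
	req := ldap.NewModifyRequest(userDN, nil)
	req.Replace("sshPublicKey", []string{"ssh-ed25519 AAAAC3Nza... jdoe@laptop"})
	if err := conn.Modify(req); err != nil {
		log.Fatal(err)
	}
}
```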
Projects
How fast can you pipe data between TCP connections with Go?
We needed to move a client of ours from Akamai NetStorage to Amazon S3 behind Fastly in a matter of months; the client was roughly 61 TV/radio stations totaling more than 16 TB of data. A co-worker and I worked out a process where CloudFormation spun up instances per station and a Golang application we had written streamed directly from HTTP GETs into S3 PUTs with very high concurrency, pushing more than 500 Mbps per instance.
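A minimal sketch of the core of that pipeline, assuming nothing beyond the public AWS SDK: the GET body is an io.Reader, so it can be handed straight to the S3 uploader without ever touching disk (the URL, bucket, and key are placeholders; the real tool layered per-station concurrency on top).

```go
package main

import (
	"log"
	"net/http"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
	// Placeholder source object; the migration walked each station's NetStorage tree.
	resp, err := http.Get("https://origin.example.com/path/to/object")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	sess := session.Must(session.NewSession(aws.NewConfig().WithRegion("us-east-1")))
	uploader := s3manager.NewUploader(sess)

	// Upload accepts any io.Reader, so the GET body streams directly into a
	// multipart PUT without buffering the whole object in memory or on disk.
	_, err = uploader.Upload(&s3manager.UploadInput{
		Bucket: aws.String("example-destination-bucket"),
		Key:    aws.String("path/to/object"),
		Body:   resp.Body,
	})
	if err != nil {
		log.Fatal(err)
	}
}
```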
We all understand that naming your cattle is bad (they shouldn't be pets), but hostnames are handy and readable, and [insert reason].
When tasked with transforming a legacy “lifted and shifted” AWS architecture into an AWS auto-scaling architecture, my co-workers and I found the need for something to automagically name instances as they come and go. Given that need, I took the dive and wrote a Ruby gem which, when baked into an AMI, could name an instance when called from cfn-init or User-Data. I was able to pump out a working .gem in about 4 days, and it is still in use with production infrastructure some 4 years later.
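The gem itself is Ruby and wraps internal conventions; purely as an illustration of the idea in Go (the naming scheme and region are made up), the boot-time hook boils down to reading the instance ID from the metadata service and applying a derived Name:

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"strings"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	// EC2 instance metadata service (IMDSv1 shown for brevity).
	resp, err := http.Get("http://169.254.169.254/latest/meta-data/instance-id")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	instanceID := strings.TrimSpace(string(raw))

	// Hypothetical naming scheme: role prefix plus the instance ID suffix.
	name := fmt.Sprintf("web-%s", strings.TrimPrefix(instanceID, "i-"))

	sess := session.Must(session.NewSession(aws.NewConfig().WithRegion("us-east-1")))
	svc := ec2.New(sess)

	// Tag the instance with Name so it reads well in the console; the real gem
	// also set the OS hostname when invoked from cfn-init or User-Data.
	_, err = svc.CreateTags(&ec2.CreateTagsInput{
		Resources: []*string{aws.String(instanceID)},
		Tags:      []*ec2.Tag{{Key: aws.String("Name"), Value: aws.String(name)}},
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(name)
}
```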
Education
Anoka Ramsey
AS Computer Networking and Telecommunications
2011 - 2013
The coursework was Cisco and Windows focused, but built on existing self-taught knowledge of Linux.
A Little More About Me
Alongside my interests in Linux, networks, and FOSS, some of my other interests and hobbies are:
- Food!
- Cooking & Grilling & BBQ
- Outdoors
- Fly Fishing