Ellie Huxtable: "What’s your worst ops blunder?…"

Ellie Huxtable @ellie@hachyderm.io

What’s your worst ops blunder?

Mine is the time I took down an entire AWS account by misspelling an environment variable

Orrrr the time I accidentally deleted all the code from self hosted GitLab right before end of day

#infrastructure #devops

Nov 14, 2022, 05:02 PM · Metatext

6 boosts · 15 favorites

Nov 14, 2022

theblazehen @theblazehen

birdsite meta, mocking

Nov 14, 2022

Ellie Huxtable @ellie

birdsite meta, mocking

Nov 14, 2022

Monica Colangelo @monica

@ellie I have a couple of things that I did/saw/took part in, so much worse than those, that they're actually classified and I can't talk about them

Nov 14, 2022

Rachel @transitory

@ellie not mine, but a co-worker was going to delete a folder on a traffic node for a customer, and issued the command `rm -rf <foldername>*`

Except his copy/paste had grabbed a space at the end of the folder name.......
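
For anyone wondering why one stray space is so destructive, here is a minimal sketch of how a POSIX shell splits the two commands into arguments. The folder name `customer_data` is hypothetical (the original name is not given), and `shlex` only approximates shell word splitting; it does not expand the glob.

```python
import shlex


def rm_targets(command: str) -> list[str]:
    """Return the operands rm would receive, before any glob expansion."""
    args = shlex.split(command)
    # Skip the program name and option flags such as -rf.
    return [arg for arg in args[1:] if not arg.startswith("-")]


# Hypothetical folder name, used only for illustration.
intended = rm_targets("rm -rf customer_data*")
actual = rm_targets("rm -rf customer_data *")

print(intended)  # ['customer_data*']
print(actual)    # ['customer_data', '*']
```

With the space, `*` becomes its own argument, so the shell expands it to every non-hidden entry in the current directory.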

Nov 14, 2022

Ellie Huxtable @ellie

@transitory oh shit although I think I’ve done something similar

Nov 14, 2022

Nicd @nicd@fosstodon.org

@ellie That time I almost pushed a customer's code to #NPM. #Yarn didn't honor the publishConfig setting, so it didn't push to our own package repo but straight to npmjs, and it didn't display which repo it was pushing to... Saved by NPM not publishing packages under an organisation unless the organisation is created first.

Nov 14, 2022

Ellie Huxtable @ellie

@nicd ouchhhh close one!

Nov 14, 2022

... (FKA Gergely Nagy ) @algernon@trunk.mad-scientist.club

@ellie At one point, I was working as one of the sysadmins of a university. We had every device (computer, printer, you name it) on the internal network for Reasons.

I accidentally sent a 300-page print job to all of the printers, instead of just one. That would not have been so bad, because we could have cancelled them, if only sending a gazillion large print jobs all over the place had not killed the network while we were remote, an hour's drive from campus.

Nov 14, 2022

Ellie Huxtable @ellie

@algernon ahahaha oh my god so much paper! What were you printing?

Nov 14, 2022

... (FKA Gergely Nagy ) @algernon@trunk.mad-scientist.club

@ellie I do not remember anymore, unfortunately. It was some university stuff for a prof, I think. Something he definitely did not need hundreds of copies of. =)

Nov 14, 2022

Ellie Huxtable @ellie

@algernon I am fairly sure we call that "redundancy"

Nov 14, 2022

Sigvat @sigvat@tyt.maanebedotten.no

@ellie
I'm more dev than ops, but I did manage to freeze our production database for ~15 minutes by turning on the child safety feature of my DB client and not realising it had turned off auto-commit until someone pinged me asking why I had more than a thousand connections to the database.

Does that count?

Nov 14, 2022

Ellie Huxtable @ellie

@sigvat that definitely counts

Nov 14, 2022

Brad @reassuringurl

@ellie I once enabled *every* email account managed by Yahoo Small Business. We all have to learn about the importance of WHERE in SQL statements at some point. I'm lucky mine was pretty benign :)
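
One way to catch a missing WHERE before it sticks is to check how many rows a statement touched before committing. The sketch below uses Python's built-in `sqlite3`; the `accounts` table, the `guarded_update` helper and the `max_rows` limit are all hypothetical names made up for illustration, not anything from the story above.

```python
import sqlite3


def guarded_update(conn, sql, params=(), max_rows=1):
    """Run an UPDATE, but roll back if it touched more rows than expected."""
    cur = conn.execute(sql, params)
    if cur.rowcount > max_rows:
        conn.rollback()
        raise RuntimeError(
            f"refusing to commit: {cur.rowcount} rows affected, "
            f"expected at most {max_rows}"
        )
    conn.commit()
    return cur.rowcount


# Hypothetical schema: five accounts, all disabled to start with.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, enabled INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO accounts (id) VALUES (?)", [(i,) for i in range(1, 6)])
conn.commit()

# With a WHERE clause: exactly one row changes and the update is committed.
guarded_update(conn, "UPDATE accounts SET enabled = 1 WHERE id = ?", (3,))

# Without a WHERE clause: all five rows match, so the change is rolled back.
try:
    guarded_update(conn, "UPDATE accounts SET enabled = 1")
except RuntimeError as exc:
    print(exc)

enabled = conn.execute("SELECT COUNT(*) FROM accounts WHERE enabled = 1").fetchone()[0]
print(enabled)
```

The same idea works by hand in most SQL clients: open a transaction, run the statement, read the affected-row count, and only then commit.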

Nov 14, 2022

Ellie Huxtable @ellie

@reassuringurl I think we've all run a query that went a bit wrong at some point!

Nov 14, 2022

yuvipanda @yuvipanda

@ellie I accidentally pasted a fairly highly privileged root password into a public, logged IRC channel

Nov 14, 2022

Ellie Huxtable @ellie

@yuvipanda was it hunter2?

Nov 18, 2022

Marcus Zi @Marcus_Zi@digitalcourage.social

@ellie I removed the awsOrganizationAccessRole for one of our accounts by accident and blocked all users from accessing that account.

Nov 14, 2022

Rachel @rachel@transitory.social

@ellie@hachyderm.io not mine but a co-worker was going to delete a folder on traffic node for a customer, and issued the command `rm -rf <foldername>*`

Except his copy/paste had grabbed a space at the end of the folder name.......

Oct 29, 2023 *

Ellie Huxtable @ellie

@rachel oh god

I hope everything was backed up ok!

Edit, seeing the dates: …not entirely sure why mastodon just notified me of this

Oct 29, 2023

Bèr Kessels @berkes@mastodon.nl

@ellie For some reason, I thought I needed access to prod DB from my dev machine (basically me being lazy and not doing the debugging and reproduction properly)

So the DB_URL env var pointed to the live database with payment details of over a million users in it.

Guess what happened when an hour later, I ran the tests in that terminal? Yeah, those with the before() function that called db_reset().

(We had backups. And failover. We were down for less than a minute. Still...)
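
A cheap guard against this kind of accident is to make the reset helper refuse any database URL that does not point at a local host. The sketch below is an assumption about how such a check could look, not the code from the story: `SAFE_HOSTS`, `assert_safe_to_reset` and the example URLs are hypothetical, and the destructive part of `db_reset` is left as a placeholder.

```python
from urllib.parse import urlparse

# Hypothetical allow-list: hosts where wiping the database is acceptable.
SAFE_HOSTS = {"localhost", "127.0.0.1", "::1"}


def assert_safe_to_reset(db_url: str) -> None:
    """Raise unless the URL's host is explicitly allow-listed."""
    host = urlparse(db_url).hostname
    # URLs without a host (hostname is None) are rejected too, to stay conservative.
    if host not in SAFE_HOSTS:
        raise RuntimeError(f"refusing to reset database on host {host!r}")


def db_reset(db_url: str) -> str:
    """Placeholder reset: checks the target first, then would drop and recreate."""
    assert_safe_to_reset(db_url)
    # The real destructive statements would go here.
    return "reset"


print(db_reset("postgres://localhost:5432/app_test"))

try:
    db_reset("postgres://user:secret@db.prod.example.com:5432/payments")
except RuntimeError as exc:
    print(exc)
```

In a test suite, the `before()` hook would pass the value of `DB_URL` to this function, so a terminal that still has the production URL exported fails loudly instead of wiping data.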

Oct 29, 2023

Ellie Huxtable @ellie

@berkes oh ouchhhh I bet that one stung

Oct 29, 2023

Harry Keller @harryfk@mastodon.social

@ellie I once ran a data migration on an old codebase, updating the user model, not understanding that this update would trigger real emails to all users. I only realized after notifications to around 10k people had been sent out, notifying them about “upcoming” conferences that had happened years prior. Almost gave me a heart attack, super embarrassing.

Oct 29, 2023 *

Kfir Breger @kfirbreger@mastodon.social

@ellie honestly, I've never had anything worse than some git conflicts. I’ve had colleagues drop the db on prod at 2 separate companies, and I had to do the saving. Does that count?

Oct 29, 2023

wrd @wrd

@ellie I deleted a full prod database because someone hadn't enabled backups or named it, and I thought it was a leftover database.

That was a nice post mortem.

Oct 29, 2023

drmorr @drmorr

@ellie

The scene: AWS console, EC2 instances tab.
Me: clicks "select all", and then "terminate instances"
Me, a half second later: "oh SHIT it didn't save my filters, I just terminated every EC2 instance in the account."

Fortunately it was just our dev account and we could recreate everything with terraform but.... Still.

#devops #horrorstories