What’s your worst ops blunder?
Mine is the time I took down an entire AWS account by misspelling an environment variable
Orrrr the time I accidentally deleted all the code from our self-hosted GitLab right before end of day
@ellie I have a couple of things that I did/saw/took part in that are so much worse than those that they're actually classified, and I can't talk about them
@ellie not mine, but a co-worker was going to delete a folder on a traffic node for a customer, and issued the command `rm -rf <foldername>*`
Except his copy/paste had grabbed a space at the end of the folder name.......
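In other words, what the shell actually saw was something like this (folder name invented for illustration):

```sh
# What he meant to run: remove one customer folder and its siblings
rm -rf customer-uploads*

# What the paste with the trailing space actually handed to the shell:
rm -rf customer-uploads *
# "customer-uploads" is now one argument and "*" is a second one,
# which globs to everything else in the current directory.
```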
@transitory oh shit
@ellie That time I almost pushed a customer's code to #NPM. #Yarn didn't honor the publishConfig setting, so it didn't push to our own package repo but straight to npmjs, and it didn't display which registry it was pushing to... Saved by NPM not publishing packages under an organisation unless the organisation has been created first.
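For anyone wondering, the guard looks roughly like this (registry URL made up):

```sh
# package.json pointed at the internal registry, roughly:
#   "publishConfig": { "registry": "https://npm.internal.example.com" }
#
# npm publish respects publishConfig; passing the registry explicitly
# removes any doubt about where the package is going:
npm publish --registry https://npm.internal.example.com
```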
@nicd ouchhhh
@ellie At one point, I was working as one of the sysadmins of a university. We had every device (computer, printer, you name it) on the internal network for Reasons.
I accidentally sent a 300-page print job to all of the printers, instead of just one. That would not have been so bad, since we could have cancelled them, except that sending a gazillion large print jobs all over the place killed the network while we were remote, an hour's drive from campus.
@algernon ahahaha oh my god
@ellie I do not remember anymore, unfortunately. It was some university stuff for a prof, I think. Something he definitely did not need hundreds of copies of. =)
@algernon I am fairly sure we call that "redundancy"
@sigvat that definitely counts
@ellie Once enabled *every* email account managed by Yahoo Small Business. We all have to learn about the importance of the WHERE clause in our SQL statements at some point. I'm lucky mine was pretty benign :)
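The classic shape of it, with table and column names invented:

```sh
# Intended: enable a single account
mysql -e "UPDATE accounts SET enabled = 1 WHERE account_id = 12345;"

# What actually ran: no WHERE clause, so every row was updated
mysql -e "UPDATE accounts SET enabled = 1;"
```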
@reassuringurl I think we've all run a query that went a bit wrong at some point!
@ellie I accidentally pasted a fairly high-privileged root password into a publicly logged IRC channel
@yuvipanda was it hunter2?
@ellie I removed the awsOrganizationAccessRole for one of our accounts by accident and locked every user out of that account.
@ellie For some reason, I thought I needed access to the prod DB from my dev machine (basically me being lazy and not doing the debugging and reproduction properly)
So the DB_URL env var pointed to the live database, with payment details of over a million users in it.
Guess what happened when, an hour later, I ran the tests in that terminal? Yeah, the ones with the before() function that calls db_reset().
(We had backups. And failover. We were down for less than a minute. Still...)
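Roughly how the trap was set, assuming a JS test runner, with the URL made up:

```sh
# "Just for a minute", point the local env at the live database:
export DB_URL="postgres://app@prod-db.internal:5432/payments"

# An hour later, in the same terminal, out of habit:
npm test
# The suite's before() hook calls db_reset(), which wipes and re-seeds
# whatever database DB_URL points at. That day it was production.
```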
@berkes oh ouchhhh I bet that one stung
@ellie I once ran a data migration on an old codebase, updating the user model, not understanding that the update would trigger real emails to all users. I only realized after around 10k people had been notified about “upcoming” conferences that had happened years prior. Almost gave me a heart attack, super embarrassing.
@ellie honestly, I never had anything worse than some git conflicts. I’ve had colleagues do drop db on prod in 2 separate companies, and I had to do the saving. Does that count?
The scene: AWS console, EC2 instances tab.
Me: clicks "select all", and then "terminate instances"
Me, a half second later: "oh SHIT it didn't save my filters, I just terminated every EC2 instance in the account."
Fortunately it was just our dev account and we could recreate everything with terraform but.... Still.