Behind the Scenes: Never Trust User Input

This article is the first in a series of posts I'm writing about running various SaaS products and websites for the last 8 years. I'll be sharing some of the issues I've dealt with, lessons I've learned, mistakes I've made, and maybe a few things that went right. Let me know what you think!

Back in 2019 or 2020, I decided to rewrite the entire backend for Block Sender, a SaaS application that helps users create better email blocks, among other features. In the process, I added a few new features and upgraded to much more modern technologies. I ran the tests, deployed the code, manually tested everything in production, and other than a few random odds and ends, everything seemed to be working great. I wish this was the end of the story, but...

A few weeks later, I was notified by a customer (which is embarrassing in itself) that the service wasn't working and they were getting lots of should-be-blocked emails in their inbox, so I investigated. Many times this issue is due to Google removing the connection from our service to the user's account, which the system handles by notifying the user via email and asking them to reconnect, but this time it was something else.

It looked like the backend worker that checks emails against user blocks kept crashing every 5-10 minutes. The weirdest part - there were no errors in the logs, memory was fine, but the CPU would occasionally spike at seemingly random times. So for the next 24 hours (with a 3-hour break to sleep - sorry customers 😬), I had to manually restart the worker every time it crashed. For some reason, the Elastic Beanstalk service was waiting far too long to restart, which is why I had to do it manually.

Debugging issues in production is always a pain, especially since I couldn't reproduce the issue locally, let alone figure out what was causing it. So like any "good" developer, I just started logging everything and waited for the server to crash again. Since the CPU was spiking periodically, I figured it wasn't a macro issue (like when you run out of memory) and was probably being caused by a specific email or user. So I tried to narrow it down:

  • Was it crashing on a certain email ID or type?
  • Was it crashing for a given customer?
  • Was it crashing at some regular interval?
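
One way to approach those questions when a process dies without logging an error is to write a "start" entry before each email check and a "done" entry after it, then look for jobs that started but never finished. The following is a minimal sketch under that assumption; the log format and field names (`jobId`, `customerId`, `event`) are hypothetical and not taken from Block Sender.

```javascript
// Hypothetical log format: one entry per event, written before and after
// each email check. None of these field names come from Block Sender.
function findUnfinishedJobs(entries) {
  const inFlight = new Map();
  for (const entry of entries) {
    if (entry.event === 'start') {
      inFlight.set(entry.jobId, entry);
    } else if (entry.event === 'done') {
      inFlight.delete(entry.jobId);
    }
  }
  // Whatever is left started but never finished: the prime suspects.
  return [...inFlight.values()];
}

const sampleLog = [
  { jobId: 1, customerId: 'a', event: 'start' },
  { jobId: 1, customerId: 'a', event: 'done' },
  { jobId: 2, customerId: 'b', event: 'start' },
  { jobId: 3, customerId: 'a', event: 'start' },
  { jobId: 3, customerId: 'a', event: 'done' },
];

const suspects = findUnfinishedJobs(sampleLog);
console.log(suspects.map((entry) => entry.customerId)); // [ 'b' ]
```

If the same customer keeps showing up among the unfinished jobs across several crashes, that is a strong hint about where to look next.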

After hours of this, and staring at logs longer than I'd care to, I eventually did narrow it down to a specific customer. From there, the search space narrowed quite a bit - it was most likely a blocking rule or a specific email our server kept retrying on. Luckily for me, it was the former, which is a far easier problem to debug given that we're a very privacy-focused company and don't store or view any email data.

Before we get into the exact problem, let's first talk about one of Block Sender's features. At the time I had many customers asking for wildcard blocking, which would allow them to block certain types of email addresses that followed the same pattern. For example, if you wanted to block all emails from marketing email addresses, you could use the wildcard marketing@* and it would block all emails from any address that started with marketing@.

One thing I didn't think about is that not everyone understands how wildcards work. I assumed that most people would use them in the same way I do as a developer, using one * to represent any number of characters. Unfortunately, this particular user had assumed you needed to use one wildcard for each character you wanted to match. In their case, they wanted to block all emails from a certain domain (which is a native feature Block Sender has, but they must not have realized it, which is a whole problem in itself). So instead of using *@example.com, they used **********@example.com.

POV: Watching your users use your app...

To handle wildcards on our worker server, we're using the Node.js library matcher, which helps with glob matching by turning it into a regular expression. This library would then turn **********@example.com into something like the following regex:

/[\s\S]*[\s\S]*[\s\S]*[\s\S]*[\s\S]*[\s\S]*[\s\S]*[\s\S]*[\s\S]*[\s\S]*@example\.com/i
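
To make that conversion concrete, here is a rough sketch of how a wildcard pattern could be turned into a regex of this shape. It is an illustration only and not the matcher library's actual implementation; `escapeRegex` and `globToRegex` are names made up for this example.

```javascript
// Illustrative only: NOT the matcher library's actual implementation.
// Escape characters that have a special meaning inside a regex.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Every "*" becomes its own "[\s\S]*" group; everything else is literal.
function globToRegex(pattern) {
  const source = pattern.split('*').map(escapeRegex).join('[\\s\\S]*');
  return new RegExp(source, 'i');
}

console.log(globToRegex('marketing@*').source);
console.log(globToRegex('**********@example.com').source);
```

The second call produces ten consecutive `[\s\S]*` groups, one per asterisk, because nothing in the conversion merges adjacent wildcards.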

If you have any experience with regex, you know that they can get very complicated very quickly, especially on a computational level. Matching the above expression against any reasonable length of text becomes very computationally expensive, which ended up tying up the CPU on our worker server. This is why the server would crash every few minutes; it would get stuck trying to match a complex regular expression to an email address. So every time this user received an email, in addition to all of the retries we built in to handle temporary failures, it would crash our server.
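
For a rough sense of scale, consider how many ways a backtracking regex engine could divide a run of characters among consecutive wildcards before giving up on a match. The sketch below counts those splits with a binomial coefficient. It assumes a naive backtracking engine and is only an estimate; real engines apply optimizations, so actual timings will differ. The function name `waysToSplit` is made up for this example.

```javascript
// Number of ways to split n characters among k consecutive wildcards,
// i.e. the binomial coefficient C(n + k - 1, k - 1).
function waysToSplit(n, k) {
  let result = 1;
  for (let i = 1; i < k; i++) {
    result = (result * (n + i)) / i; // stays an integer at every step
  }
  return result;
}

for (const k of [1, 2, 5, 10]) {
  console.log(`${k} wildcard(s), 30 characters: ${waysToSplit(30, k)} splits`);
}
```

With a single wildcard there is only one split to consider, while ten wildcards over just 30 characters give roughly 212 million, which is why one innocent-looking block could keep a CPU busy.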

So how did I fix this? Obviously, the quick fix was to find all blocks with multiple wildcards in succession and correct them. But I also needed to do a better job of sanitizing user input. Any user could enter a pattern that turns into an expensive regex and take down the entire system with a ReDoS attack.

Handling this particular case was fairly simple - remove successive wildcard characters:

block = block.replace(/\*+/g, '*')
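
In practice that one-liner would probably live inside a small normalization step that runs before a block is saved. Here is a minimal sketch of what that might look like; the function name `normalizeBlock` and the `MAX_BLOCK_LENGTH` limit are assumptions made for illustration, not Block Sender's actual code.

```javascript
const MAX_BLOCK_LENGTH = 254; // assumed limit, not from the article

// Trim the input, collapse runs of "*" into one, and reject empty or
// oversized patterns before they ever reach the matching code.
function normalizeBlock(rawBlock) {
  const block = rawBlock.trim().replace(/\*+/g, '*');
  if (block.length === 0 || block.length > MAX_BLOCK_LENGTH) {
    throw new Error('Invalid block pattern');
  }
  return block;
}

console.log(normalizeBlock('**********@example.com')); // *@example.com
console.log(normalizeBlock('marketing@*'));            // marketing@*
```

Collapsing the wildcards does not change what the pattern matches, since one `*` already stands for any number of characters.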

But that still leaves the app open to other types of ReDoS attacks. Luckily there are a number of packages/libraries to help us with these types as well.
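
As a dependency-free illustration of the same idea (and not one of the packages referred to here), simple `*` wildcards can also be matched without building a regex at all. The sketch below uses a two-pointer scan that remembers the most recent wildcard, so its worst case grows with pattern length times text length rather than exponentially. Note that it matches the whole string, which is an assumption about the desired behavior, and `wildcardMatch` is a made-up name.

```javascript
// Regex-free wildcard matching: "*" matches any run of characters.
// Case-insensitive, and the pattern must cover the whole text.
function wildcardMatch(pattern, text) {
  const p = pattern.toLowerCase();
  const t = text.toLowerCase();
  let pi = 0;        // position in pattern
  let ti = 0;        // position in text
  let starIdx = -1;  // position of the most recent "*" in the pattern
  let matchIdx = 0;  // text position where that "*" started matching

  while (ti < t.length) {
    if (pi < p.length && p[pi] === '*') {
      starIdx = pi;
      matchIdx = ti;
      pi++;
    } else if (pi < p.length && p[pi] === t[ti]) {
      pi++;
      ti++;
    } else if (starIdx !== -1) {
      // Mismatch: let the last "*" absorb one more character and retry.
      pi = starIdx + 1;
      matchIdx++;
      ti = matchIdx;
    } else {
      return false;
    }
  }
  // Any trailing "*" can match the empty string.
  while (pi < p.length && p[pi] === '*') pi++;
  return pi === p.length;
}

console.log(wildcardMatch('*@example.com', 'Spam@Example.com')); // true
console.log(wildcardMatch('*@example.com', 'a@example.org'));    // false
```

Because only the most recent wildcard is ever revisited, a pattern like **********@example.com costs about the same as *@example.com.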

Using a combination of the solutions above, and other safeguards, I've been able to prevent this from happening again. But it was a good reminder that you can never trust user input, and you should always sanitize it before using it in your application. I wasn't even aware this was a potential issue until it happened to me, so hopefully, this helps someone else avoid the same problem.

Have any questions, comments, or want to share a story of your own? Reach out on Twitter!

Source Stack Abuse