The pitfalls of allowing file uploads on your website (detectify.com)
100 points by BogdanCalin on May 20, 2014 | 30 comments


The bottom line is this: if users can upload something to your site, and your site will show that thing to other users before you have a chance to figure out whether it's a problem, then your site will be exploited by bad actors.

For a long time, an out-of-the-box server installation would include anonymous FTP access. Of course, nothing is quite so attractive as a 'free' place to dump and retrieve stuff. It was kind of like setting up a warez/malware camera trap.


and then your site will show that thing to other users

I think this is worth emphasizing more than the article does. The problem is just as much with the after-the-fact direct access as with the upload. Given the wide variety of illegal things you will quickly end up hosting and the amount of traffic this will generate, cross site scripting attacks may not be your top concern.


Even if it does not show anything to other users, just having the wrong extension can already bite you badly.

Uploading php files instead of images has been used to gain access to machines. Anything that gets stored as a file on the filesystem of the destination machine is a huge risk. All it takes is one little misconfiguration somewhere else and you're wide open.


Not to mention that many Apache configurations will use mod_mime, which by default enables multiple extensions.

So if someone uploads a file called `image.php.jpg`, the file is executed by Apache as PHP code. And obviously verifying the MIME type or even the content of the file won't help you here, since you can just write a JPEG header and then throw in `<?php system("..."); ?>` after it.

Even when you think you're safe based on what you'd consider to be obvious assumptions ("the file extension is whatever comes up after the last period"), there are weird things like this that might bite you.
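To make the failure mode concrete, here's a small Python sketch (the helper names are hypothetical) contrasting the naive "last extension" check with one that scans every dot-separated segment the way mod_mime does:

```python
# Sketch: why "the extension is whatever follows the last dot" is a
# dangerous assumption. Apache's mod_mime can treat EVERY dot-separated
# segment as an extension, so image.php.jpg may still be handed to the
# PHP handler. (naive_check / is_upload_safe are hypothetical helpers.)

ALLOWED = {"jpg", "jpeg", "png", "gif"}
EXECUTABLE = {"php", "php3", "phtml", "cgi", "pl"}

def naive_check(filename: str) -> bool:
    # Only looks at the final extension -- this is the flawed check.
    return filename.rsplit(".", 1)[-1].lower() in ALLOWED

def is_upload_safe(filename: str) -> bool:
    # Reject the file if ANY segment matches a server-side handler,
    # mirroring how mod_mime scans all extensions.
    parts = filename.lower().split(".")[1:]
    return (bool(parts)
            and parts[-1] in ALLOWED
            and not any(p in EXECUTABLE for p in parts))

print(naive_check("image.php.jpg"))     # True  -- slips through
print(is_upload_safe("image.php.jpg"))  # False -- rejected
print(is_upload_safe("photo.jpg"))      # True
```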


This is only if you subsequently give them a link to what they uploaded, correct?

I have a site that allows uploads (students turning in Java files) but the files are just stored in a folder on the server that isn't in the web-served path. They can't see the file again once uploaded. I assume (and I think rightly) that there's no security risk in my case.


It depends on the kind of application, but for the most part you are right. If a file is saved to a path that is not part of the "web root", then it is unlikely that any vulnerabilities will be introduced.

Just make sure it is a hardcoded path, and not one that users can manipulate in any way (a filename of "../../../../file.java" for example). And if there is some other interface that reads files from that directory and outputs them to a page, that will also need to be secured against XSS.
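A minimal sketch of that sanitization in Python (the upload directory and helper name are illustrative):

```python
import os.path

UPLOAD_DIR = "/srv/uploads"  # hypothetical hardcoded dir, outside the web root

def safe_upload_path(user_filename: str) -> str:
    # Strip any directory components the client supplied, so names like
    # "../../../../file.java" collapse to just "file.java".
    name = os.path.basename(user_filename.replace("\\", "/"))
    if not name or name in (".", ".."):
        raise ValueError("invalid filename")
    path = os.path.join(UPLOAD_DIR, name)
    # Belt and braces: confirm the resolved path is still inside UPLOAD_DIR.
    if os.path.commonpath([os.path.abspath(path), UPLOAD_DIR]) != UPLOAD_DIR:
        raise ValueError("path escapes upload directory")
    return path

print(safe_upload_path("../../../../file.java"))  # /srv/uploads/file.java
```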


As long as you read the files before you execute them.

Otherwise some bad actor could write a virus / local exploit into their submission which will execute when you compile and run the file.


I never execute them. I just grade them by reading the code. Running them takes FAR longer than reading.


Are these exceedingly simple 10-line programs? Otherwise:

How do you know they compile?

How do you know they work?

How do you know they handle all the edge cases you can throw at them?

If you have a 100% accurate parser and compiler in your head, I am impressed.

Our teachers (and this was 15 years ago) had test-runners which would compile and run our programs to make sure they met the requirements of the homework THEN they looked at the code and marked it for style etc.

Sometimes they provided these test runners to us so we could check them ourselves, sometimes they didn't (this was, naturally, harder).
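A sketch of such a test-runner in Python (the compile step and real sandboxing are out of scope; `run_submission` and the commands are illustrative):

```python
import subprocess
import sys

def run_submission(cmd, stdin_text, timeout_s=5):
    """Run one test case against a student submission, killing it if it
    hangs. cmd is the command to execute (e.g. ["java", "Main"] after a
    successful compile step). For real grading you'd also want an
    OS-level sandbox (container, resource limits), not just a timeout."""
    try:
        proc = subprocess.run(
            cmd, input=stdin_text, capture_output=True,
            text=True, timeout=timeout_s,
        )
        return proc.returncode, proc.stdout
    except subprocess.TimeoutExpired:
        return None, ""  # treat a hang as a failed test case

# Example with a trivial stand-in "program":
code, out = run_submission([sys.executable, "-c", "print(input().upper())"],
                           "hello\n")
print(code, out.strip())  # 0 HELLO
```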


Obviously such a workflow, while fairer, requires a reliable sandbox of some kind. One might argue that in a university such things matter less, and that allowing some degree of hacking is educational and perhaps should even be tacitly encouraged; still, you'd want to make sure that when students break your system they can't go Bobby Tables on it or dump everyone's private data on the black market.


Should clarify: "The pitfalls of hosting user-uploaded files on your website"

Hosting user-uploaded files on a separate domain would probably solve this problem.


> Should clarify: "The pitfalls of hosting user-uploaded files on your website"

Indeed. Nothing about this applies to sites that accept uploads for internal use (e.g. parsed as input data).


How does simply using a different domain protect against malware?


As the article explains, the problem is that SWF files hosted on one domain can execute in the security context of that domain, even when embedded in a page on a completely different site. So allowing attacker-controlled uploads makes any credentials on that domain, such as session cookies and CSRF tokens, vulnerable. If the SWF is hosted on a domain with no sensitive credentials, this particular problem goes away.


Reminds me of this Google Docs phishing scam that uses a Google domain to look legit: http://www.symantec.com/connect/blogs/google-docs-users-targ...


Google Drive does let you host arbitrary content, but from googledrive.com, not from google.com.

https://support.google.com/drive/answer/2881970

This is basically the same as github.io.

The Symantec article is interesting, but it only says the fake page is hosted on "Google's servers", not "google.com". Users might still believe "googledrive.com" is trustworthy, though.


I don't understand. How does serving a flash applet from my domain get you access to anything other than that applet?

I'm not arguing that this attack doesn't exist. I just don't understand what the attacker gets access to.


Browser security relies on the "same origin policy", which says that certain operations are restricted to resources in the same domain as the active page. In particular, you can't read cookies from another domain, and you can't read the responses of authenticated HTTP requests. XSS and CSRF attacks all rely on circumventing this protection in various ways.

In this case, Flash considers the origin to be the location of the SWF file. This is different from normal JavaScript where all scripts in a page run under that page's origin, no matter where they're loaded from.


It depends on the attack you're trying to prevent.

The blog post in the OP solely discusses XSS vulnerabilities that are introduced by unrestricted file uploads. There are numerous other issues that can arise from arbitrary file uploads (malware hosting, arbitrary code execution if it's PHP, phishing), but to prevent user content from ever reaching sensitive data via XSS, placing all user data on a separate domain is pretty much your best bet.


Hold on, doesn't using a

    Content-Disposition: attachment; filename="image.jpg"

header mean you can no longer display the image in your service? Won't browsers treat it as a file download? Most services that allow image uploads do so because the images will get displayed on a page (that's what I do).
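For what it's worth, the distinction the question hinges on: `attachment` forces the download dialog, while `inline` lets the browser render the image in the page. A small illustrative sketch (the helper is hypothetical; `X-Content-Type-Options: nosniff` is a real header that stops browsers second-guessing the declared type):

```python
def image_response_headers(filename: str, download: bool = False) -> dict:
    """Headers for serving a user-uploaded image. 'inline' keeps the
    image displayable in <img> tags; 'attachment' forces a download,
    which is what the article's Content-Disposition advice is about."""
    disposition = "attachment" if download else "inline"
    return {
        "Content-Type": "image/jpeg",
        "Content-Disposition": f'{disposition}; filename="{filename}"',
        # Ask the browser not to second-guess the declared type:
        "X-Content-Type-Options": "nosniff",
    }

print(image_response_headers("image.jpg")["Content-Disposition"])
# inline; filename="image.jpg"
```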

Most services seem to be moving file uploads to S3 (or similar services) these days, so I'm not sure this advice is really helpful. To take that a step further, my preference now is to upload directly to S3 and bypass my app server altogether. At least in Rails, it's fairly easy to set up.


Problem is that many tend to use S3 but bind a subdomain to it. S3 does not validate the content of those files, so combined with a [wildcard].domain.com crossdomain.xml, you're still just as vulnerable as above.

Some also restrict certain file types on S3 so they are served as inline content, but that only saves you from XSS, not from the CSRF leakage. A crossdomain.xml restricted to [wildcard].domain.com is still surprisingly common.
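As an illustration (example.com is a placeholder), this is the kind of wildcard policy file that makes an S3-backed uploads subdomain dangerous; the safer alternative is to enumerate only the specific trusted hosts:

```xml
<?xml version="1.0"?>
<!-- Overly broad: any SWF served from ANY *.example.com subdomain,
     including an S3-backed uploads subdomain, may make credentialed
     requests against this domain. -->
<cross-domain-policy>
  <allow-access-from domain="*.example.com"/>
  <!-- Safer: list explicit hosts instead of a wildcard, e.g.
       <allow-access-from domain="www.example.com"/> -->
</cross-domain-policy>
```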


A nice way to achieve this with Rails is to upload straight to S3 and then use Paperclip to get, verify and process the file.

By uploading straight to S3 you also get a faster upload (than, say, Heroku) and server separation.


> So if you allow file uploads or printing arbitrary user data in your service, you should always verify the contents as well as sending a Content-Disposition header where applicable.

The idea that you can "verify the contents" is pretty much just wrong. You actually have to parse the files and write out your own known-safe version. It's a real pain in the butt to do that correctly and securely across a wide variety of file types.

Even parsing arbitrary user uploads with something like ImageMagick is probably exploitable, simply because those libraries weren't designed to handle hostile input.


This isn't too related to what the blog post was discussing, but just to give an example of how you're right:

If a PHP page allows file uploads and only verifies the content of the data, but nothing else, then no protection is offered against arbitrary code execution. It's easy to craft a JPEG header and then place `<?php ... ?>` right after it; you could even append it to a valid JPEG body, too.
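A sketch of exactly that trick, in Python for illustration (the payload and the naive check are hypothetical):

```python
# Sketch: a payload that starts with valid JPEG magic bytes, so any check
# that only sniffs the file header classifies it as image/jpeg, yet it
# still contains PHP code that a misconfigured server would execute.

JPEG_MAGIC = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"
payload = JPEG_MAGIC + b'<?php system($_GET["cmd"]); ?>'

def sniffs_as_jpeg(data: bytes) -> bool:
    # The kind of naive content check described above.
    return data.startswith(b"\xff\xd8\xff")

print(sniffs_as_jpeg(payload))  # True  -- "content verified" as an image
print(b"<?php" in payload)      # True  -- still executable as PHP
```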


Isn't it a reasonable workaround to scan the file for content-type and make the file available once it passes the criteria for the upload? Find a php file uploaded with an extension of .jpg? "Sorry, there was a problem with your file. Please try again."


And then you run into polyglot files which are valid for multiple types...


So out of curiosity, what would be the easiest way to securely accept file uploads? Taking into account all of the possible malicious attacks.


Host them on a separate domain like Google and GitHub do (e.g. Google uses googleusercontent.com and GitHub uses githubusercontent.com).


I'm pretty sure you can use Apache Tika to check the actual content type of a file too. Either way, I hate Flash.
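Content sniffing does catch the crudest mismatches, though not polyglots. A toy Python version of magic-byte detection, in the spirit of what Tika does (Tika itself is a Java library and vastly more thorough):

```python
# Toy content-type sniffing by magic bytes. Illustrative only.
MAGIC = {
    b"\xff\xd8\xff": "image/jpeg",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
    b"%PDF-": "application/pdf",
}

def sniff(data: bytes) -> str:
    for magic, mime in MAGIC.items():
        if data.startswith(magic):
            return mime
    return "application/octet-stream"

print(sniff(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))  # image/png
# Catches a .jpg that's really just PHP text...
print(sniff(b"<?php phpinfo(); ?>"))  # application/octet-stream
# ...but NOT a polyglot that leads with real JPEG bytes.
print(sniff(b"\xff\xd8\xff\xe0" + b"<?php ... ?>"))  # image/jpeg
```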


Nice post, just goes to show the value of properly validating uploads!



