Self-host your static assets (2019) (csswizardry.com)
112 points by kosasbest on March 5, 2022 | hide | past | favorite | 56 comments


> if lots and lots of sites link to the same CDN-hosted version of, say, jQuery, then surely users are likely to already have that exact file on their machine already

> [However] Safari has completely disabled this feature for fear of abuse where privacy is concerned

All major browsers have now disabled shared caches[0][1]. Using a CDN for resources is strictly worse, now, performance-wise.

[0] https://developers.google.com/web/updates/2020/10/http-cache...

[1] https://arstechnica.com/gadgets/2020/12/firefox-v85-will-imp...


> Using a CDN for resources is strictly worse, now, performance-wise.

It's still normally significantly better if only for the fact the CDN's servers are physically closer to the client.


Not necessarily even then. Connecting to a new domain has the overhead of a DNS lookup, TCP connection and TLS handshake before you can even send the request (and be stuck in slow start for the first few RTTs).


From my experience CDNs degrade performance. For most websites, they are the resources that are loaded the slowest, and are usually loaded synchronously, thus blocking page interaction.

I should note that I live in Israel, and geographical distribution by CDNs may be at fault here.

Regardless, as others have noted, since I have already connected to the website, third-party connections have a high chance of impacting performance for the worse.


"Normally," i.e. when the origin itself is not behind a CDN, right?


I really hate that you can't disable the cache partition. Every megabyte counts in a mobile first world.


It's for the best. Cache partitioning solves a browser history leak described in Timing Attacks on Web Privacy[0].

The permutations of different CDNs and asset versions meant the cache hit rate was low anyway.[1][2]

[0] https://www.cs.umd.edu/~jkatz/TEACHING/comp_sec_F04/download... (PDF)

[1] https://zoompf.com/blog/2010/01/should-you-use-javascript-li...

[2] https://web.archive.org/web/20111123170325/http://statichtml...


People should be able to select their own privacy/performance tradeoff.


> The permutations of different CDNs and asset versions meant the cache hit rate was low anyway.

A CDN with DCs in a number of locations and an appropriate DNS setup at least has the advantage of local delivery, which can have more than zero significance if you have an international audience and you really need low latency.

But if you are in that category then you probably won't (or shouldn't) be loading a bunch of stuff from a CDN or otherwise. And I wouldn't source script from a 3rd party anyway, I never saw the security tradeoff as worth it.


> And I wouldn't source script from a 3rd party anyway, I never saw the security tradeoff as worth it.

Doesn't subresource integrity [1], i.e. specifying the expected hash of the script, solve those security issues? Unless you need to support Internet Explorer, I suppose.

[1] https://developer.mozilla.org/en-US/docs/Web/Security/Subres...
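For context, an SRI value is just a base64-encoded cryptographic digest of the file, prefixed with the hash algorithm. A minimal sketch of computing one (the script body here is a placeholder, not a real asset):

```python
import base64
import hashlib

# Compute a Subresource Integrity value for a (placeholder) script body.
script = b'console.log("hello");'
digest = hashlib.sha384(script).digest()
sri = "sha384-" + base64.b64encode(digest).decode("ascii")
print(sri)
```

The resulting value goes in the `integrity` attribute of the `<script>` tag (typically alongside `crossorigin="anonymous"`), and the browser refuses to execute the file if its hash doesn't match.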


I wasn't aware that had become so well supported. Something to remember for later.

Though for personal projects I'll stick with self-hosting, and in DayJob we unfortunately still have to support IE11 to an extent (and beyond that, some of our clients strictly whitelist everything and will reject a 3rd party they can't fully audit, ruling out CDNs for those at least, even with SRI).


> Every megabyte counts in a mobile first world.

I manage a normal looking regional news site. First load is 691KB. Megabytes don't come into it.

(I cheat by not having Google Ads on the page, but lol, Google is the one who sucks. Our house ads are fine.)


If it saves 2 hits like that a day, that's over 30MB a month. Not extreme, but it's still 1% of monthly data saved, for something that likely would have been easy to make into a setting.


That estimation seems pretty suspect though.

Firstly, once you've visited the site personally, the assets will be cached for you anyway. Assuming the site updates its CDN'd assets once a month, the shared cache might possibly save you the first hit each month if you've visited another site that uses the same exact assets beforehand. (But given that the asset has just been updated, this seems unlikely.) So we're not talking about two hits a day.

Secondly, the vast majority of the site will not be on shared CDNs, because it's specific to that site. Images are usually the biggest offender in terms of page load, and those are very unlikely to end up in a shared cache. In a busy marketplace like that of news media, brand identity is very important, so I would guess that most of the CSS is either custom or customised, and therefore not in a shared cache. I don't know how much JS the site uses, but I'll be generous to your argument and assume they use jQuery, which is hosted on a shared CDN. Custom JS is obviously not used on other sites, so won't be in the shared cache.

And thirdly, there's a reasonably high chance, even if the jQuery stuff is hosted on a shared CDN and you've visited another page that uses jQuery before, that either the CDN host is different, or the version of jQuery being downloaded is different. It's unlikely that everyone updates immediately when a new version is released, which leaves you with a long tail of different versions being spread out over different hosts.

A better estimate probably looks like this:

* Assume only one hit per month; the rest would have been cached anyway after the first visit to the site.

* Assume only a minified version of jQuery is hosted via a shared CDN.

* Assume that half the time, the version of jQuery used on this particular site is so unique that it wouldn't have been found in the shared cache.

According to Bundlephobia, the gzipped, minified size of the latest jQuery package is about 30kB. Assuming we would have had to download it anyway on about 50% of requests, and given that every request after the first is cached anyway, you're saving around 15kB a month on average.

Assuming 3GB of data a month (based on your 1% figure), that's about 0.0005% of your data each month getting a successful cache hit that it would not otherwise have got if the shared cache was not there.
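The arithmetic above can be sketched as follows (every number here is one of this comment's stated assumptions, not a measurement):

```python
# Back-of-envelope savings from a shared cache, per the assumptions above.
jquery_kb = 30      # minified + gzipped jQuery, per Bundlephobia
hit_rate = 0.5      # chance the exact version/CDN combo was already cached
hits_per_month = 1  # later visits hit the site's own cache anyway

saved_kb = jquery_kb * hit_rate * hits_per_month  # ~15 kB/month
monthly_data_kb = 3 * 1000 * 1000                 # 3 GB monthly plan

print(f"{saved_kb / monthly_data_kb:.4%}")  # → 0.0005%
```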


https://developers.google.com/web/updates/2020/10/http-cache...

Google claims a 4% increase in bytes over the network.


So why not cut it from image cruft, videos, or bloated code and frameworks?

What kind of pages are we talking? Print-ready image archives?


Because for whatever reason, they seem to have no interest in turning down the quality slider on the JPG you will look at for 0.5 seconds by even one point.

If web designers don't care, at least the browser can try to clean up the mess.


Except that we live in a security "first" world.


Cache partitioning does very little for security; it's a matter of privacy. They're overlapping but somewhat distinct.


I will never src a script to a third party server. Same with fonts.

I think it's ironic that we thrash chipmakers for side channel attacks that realistically couldn't ever be exploited but the same people turn around and call 14 different scripts from 14 different domains to display a 900kb web page.

Who really knows who has access to or is tracking access to that code? And what happens when one of them goes down?

"The CDN is WAY more reliable than my server."

But your server is the least common denominator. Who cares if your scripts load if your server doesn't? For low volume, unless you have redundancy at every level of the stack, you can't get more reliable than one box.


How do you handle things like taking payments?


You can make an exception for payments (where you rely on the payment processor anyway, so the reliability argument goes away - if the processor is down, the payment will fail even if you self-host the payment script). Of course, embed the payment script only on the actual payment page, even if your processor tells you otherwise and quotes some fraud-related bullshit.


> even if your processor tells you otherwise and quotes some fraud-related bullshit.

You may be right 99.9% of the time. But if your site starts being used by someone automatically testing stolen credit cards to see if they work, you will be thankful for the automated script detection this gives. Every chargeback costs the vendor $20; now imagine 1k of those in a short period of time. Not pretty.

So far I have fortunately not been a victim of this but have read about people who have and it gives me enough fear to stick that script on all pages.


Can't you just enforce 3D-Secure and be done with it?


I may well be out of date (last looked at it about 18 months ago), but I don’t think all card issuers have implemented 3D Secure yet, although I think there may have been a deadline for them to.

Also friction in the checkout process is bad and can contribute to dropouts. You can decide to always enforce 3d secure or to have it only run when banks require/suggest it. Personally I set a cart total threshold above which it’s required and set it only to be applied when banks require it for lower value orders.


I haven't had to do that in a while but in the past I would just create external links to the payment processor who then links back when the payment is approved/declined.

Obviously you can't reasonably "roll your own" payment provider, but that doesn't mean I need to run their code with my name at the top of the page.


It used to be pretty common to self-host a checkout page, and the server contacts the payment processor. That comes with a security checklist compliance burden these days though. Probably better to send the browser to the processor and wait for them to come back, as you said.


Slight aside: most payment services with a Stripe-like self-hosted JS checkout now suggest you include their JavaScript on all pages of your site, not just the checkout. This is so that they can begin to build a profile of how the user/browser moves around the site in order to use ML to prevent fraud. They also require that you don't self-host the JS.

The main use of this is to detect automated scripts running cards on your site to test if they work.

Obviously it’s an endless battle as the fraudsters move to using tools such as puppeteer/playwright to simulate a human using a browser.


Beyond hosting static assets yourself, it’s best to have them on your main (sub)domain. It saves a DNS round trip when visiting the site for the first time. It also has the advantage, with HTTP/2, of being able to use that same initial connection for all assets on the page.

The old way of having your server-rendered pages on your main domain and static assets on static.domain.com really should end. It's less of an issue if you're doing Jamstack with no server-side rendering, or using something like Vercel.

This is also one of the advantages of using a combined WAF/CDN such as Cloudflare. It allows you to have everything on the same domain while also having the use of WAF and CDN capabilities. There is no need to have a subdomain holding your static assets if you want to place them behind a CDN service. The only potential problem is if your main hosting provider or tech stack makes having everything on one domain difficult.

I mostly work with Python on the server and have found “whitenoise” a great way to achieve this.

Obviously it’s still easier to host any “dynamic” assets or user-submitted assets on a subdomain. (There are good arguments that user-submitted assets should be on another domain completely, for belt-and-braces security.)
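For context, WhiteNoise lets a Django/WSGI app serve its own static files efficiently from the main domain. A minimal settings sketch along the lines of WhiteNoise's documented setup (treat this as an outline, not a drop-in config):

```python
# settings.py fragment: serve static assets from the app's own domain.
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    # WhiteNoise goes directly after SecurityMiddleware.
    "whitenoise.middleware.WhiteNoiseMiddleware",
    # ... the rest of your middleware ...
]

# Compress assets and add content hashes to filenames so they can be
# cached with far-future expiry headers.
STATICFILES_STORAGE = "whitenoise.storage.CompressedManifestStaticFilesStorage"
```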


One fear I have with this modern way of pulling in so many references from different sources is that any minor failure can mean real pain.


> Users may have the file pre-cached

This is no longer true. Due to security concerns browsers no longer re-use the cache between websites.

That said, hosting assets yourself is fantastic if you use a CDN. Accessing American websites without CDN from Europe can be quite rough. I remember one of my first remote freelancing gigs where it took about 5s to load the company’s website hosted on Heroku. Just because of availability zones.

At my current gig, we self-host our assets and it’s fine because we’re US-centric. We host in us-west-2 and I live in SF and everything loads super fast. But you can really tell the dev site loads slower when working from the east coast.

Network latency is no joke.


[flagged]


Self-hosting means downloading the source of a script (e.g., jQuery) and hosting it on the servers your site/app is served from. So I assume that GP's site is hosted on AWS, and this may make more sense?

Else yes, a bit confusing.


What? The whole infra runs on us-west-2 and there are no 3rd party resources loaded from elsewhere.

You expect us to run servers in our bedrooms to count as self hosting?


[flagged]


The article and the comment are talking about hosting the site and its resources in the same place/on the same domain, not about the kind of hosting used. This advice still holds even if your entire site is behind a CDN.


Unfortunately there's a significant cost to actually implementing and maintaining this. For a non-mission critical/hobbyist site, putting a couple links to bootstrap, jquery, d3, vue, google fonts etc. in your header is a lot easier (and cheaper) than setting up your own bundler, build process and CDN. I don't particularly care if my pages take an extra few milliseconds to load.


Why do you need to set up a bundler and build process? If you were sourcing content from third-party domains, they weren’t being bundled anyway. All you’d have to do is swap out the third-party URLs for your own.


Literally no significant extra effort to download the files and put them with your website. If you fetch them from a CDN with a fixed version you're not getting updates anyway.
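As a sketch of how mechanical the swap is (the CDN URL and local path here are illustrative):

```python
# Rewrite a third-party CDN reference to a self-hosted path.
html = '<script src="https://code.jquery.com/jquery-3.6.0.min.js"></script>'
self_hosted = html.replace("https://code.jquery.com/", "/static/vendor/")
print(self_hosted)
# → <script src="/static/vendor/jquery-3.6.0.min.js"></script>
```

After downloading the pinned file into that local directory, nothing else about the page changes.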


Putting them on your website means they are being served from your web server every time, which has a much worse performance penalty (and bandwidth cost) than everything the article mentions. The alternative is pushing it to your own CDN, which is the complexity I was referring to.


I wouldn’t exactly say using a bundler is a significant cost, nor that serving a few hundred kB is that much cheaper than relying on a shared CDN. If anything that’s just more lazy than anything else. Modern bundlers like vite aren’t exactly difficult to set up.


What is the cost to putting your frameworks on your site?


I've been a web developer for just over a decade now. Lately, I've been rather amused at how much more elementary my websites are today compared to, say, 5 years ago. I went through all the MVW frameworks, the gulp builds, the MEAN stack, 100% JavaScript SPAs, Typescript, etc, until eventually finding myself back where I started: static HTML page, a lite CSS framework, and almost no JavaScript.

My development is faster than ever, my web pages are faster than ever, and I can spend more time making sure the website delivers information effectively than worrying about gluing a million disparate frameworks together.


Same; it's almost a wholesale rejection, and works just fine for either getting the point across or driving millions in revenue.

In the web3 space, all the “calls to action” are direct payments, so the funnel is reduced to a single step, optimized by orders of magnitude over the long-winded funnels of web 2.0 services, for even higher margins and even faster sales cycles.

Most of the learnings from all these other frameworks were about optimizing user experiences to keep users engaged with a convoluted product, where the user session needs to be transferred between views and made longer and longer.

In a world where most sites and services never needed to do that, and the current web3 ones that monetize well definitely don't need to, none of this complicated frontend (or associated backend) matters any more

The only remaining relevant learnings from the past 10 years are responsive design for different screen sizes, CDNs for even more caching, and better auto deployments from version control updates


Imagine if every web page loaded this fast?


The article does a great job of describing self-hosting, but doesn't provide tips on how to self-host in an easy way.

1. Is it hosting on our own CDN endpoints? How easy is this to set up? Is there a zero-cost way of doing it for our experiments?

2. Alternatively, hosting them on our own GitHub Pages or repos? Here we still depend on GitHub's CDN.


He’s not talking about self-hosting as in CDN vs. self-hosting; it’s about not linking to third-party assets. You link to yoursite.com/static/bootstrap.js, not someoneelse.com/shared/bootstrap.js.

You can (should, as he says at the end) have this be through a CDN. Just use the same CDN you use for other stuff.


The article is not really talking about "self host" but rather "host in the same place as the rest of your website"


Self-hosting would not be putting them on GitHub Pages, unless you own GitHub.


I'm surprised no one is mentioning that a CDN totally does help to serve your assets across different geographies.

Even if HTTP/2 is faster at loading multiple assets over the same connection, I found it still better for users on other continents to serve my web app over the main domain from my server in US-East or Central and have a global CDN caching ALL my assets (not 5 CDNs for 5 different JS frameworks).


I think the headline is a little confusing. He's not advocating that you don't use a CDN; he's advocating not using a public CDN like code.jquery.com. You should absolutely still use a first-party CDN like Cloudflare, so long as it fronts your primary domain and all your content (including static assets) is served on that same domain.


Caching that you control in code, not by server headers, which never worked reliably even back in the earliest days. It's already addressed in browsers via window.caches and service workers.

It's not so much that you need jQuery version x cached from another origin, but that once it's loaded from your site, it never needs to be sent again. Heck, you can cache everything, or everything except index.html if you want to.


> If you have any render-blocking CSS or synchronous JS hosted on third party domains, go and bring it onto your own infrastructure right now.

Another aspect to this is if the user blocks Javascript from various third-party domains, your stuff doesn't execute.


Modern JS apps can easily use MBs of JS libs. Sometimes it would be rather hard to do without them. We also have stuff like icofont.

If I have 100kb of content that needs 500kb of JS... I'd burn a lot of bandwidth if I were to self-host.


> Modern JS can easily use MB of JS libs

1. It can, but - be kind to your site visitors and don't use such bloat.

2. The article said that if the bandwidth is really an issue for you, then you might as well put your entire site on a CDN and be done with it.


You should be serving that stuff through a CDN anyway and those are approximately free. Even if you’re paying, you can cover a lot of JS frameworks at a cent per terabyte.


I see this logic for javascript, css, and fonts, but is it still worthwhile to host image files on a CDN if your site has a lot of high-resolution jpegs?


Video player controls are managed by the video player API... is that considered a static asset?



