It's interesting that the author writes this whole rant about the stupidity of describing cookies as "small text files" and how that's a ridiculous and incorrect abstraction, without realizing that cookies used to be exactly that, for a very long time, and putting them in a sqlite database is a relatively recent idea, especially at the time that this was written.
I can imagine their rant upon finding a text that describes a telephone as a desktop device connected to the telephone network with a wire. "What kind of fool would describe a cellular data connection as a wire?"
Those that fail to learn from history are doomed to look really silly.
Additionally, at the time the “small text files” description appeared, your average computer user was much more familiar with “files” than “data”. Everyone was still mostly using desktop apps and managing their own files in folders and floppy disks, the cloud wasn’t a thing and the web wasn’t really an app platform yet. At the time, I think “data” (the article’s suggested change) would have read as more opaque and technical than “file”.
As an additional data point, this is the exact phrasing that ycombinator.com/legal uses:
> A “cookie” is a piece of information sent to your browser by a website you visit. Cookies can be stored on your computer for different periods of time.
> It's interesting that the author writes this whole rant about the stupidity of describing cookies as "small text files" and how that's a ridiculous and incorrect abstraction, without realizing that cookies used to be exactly that,
That's... not the point of the article? The point is that whether they're stored on a text file, in a sqlite database or on the blockchain is an implementation detail that tells the user nothing about what a cookie actually is and what it is for.
> without realizing that cookies used to be exactly that
I believe the author does realize that:
> at least, I'm guessing IE must implement cookies using separate small files, or must have done so at one point
Netscape used to store cookies similar to curl in a single file, one cookie per line. IE was the only browser (AFAIK) that stored cookies in individual text files, and somehow that became canon. So I don't think it's correct in general to say "cookies used to be text files".
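For anyone curious what "one cookie per line" looks like in practice, here is a rough sketch of a parser for the Netscape-style cookies.txt layout that curl still writes. It assumes the commonly documented seven tab-separated fields per line; the `Cookie` type and `parse_cookie_line` name are made up for illustration, not taken from any browser or from curl.

```python
from typing import NamedTuple, Optional


class Cookie(NamedTuple):
    """One line of a Netscape-style cookies.txt file (illustrative type)."""
    domain: str
    include_subdomains: bool
    path: str
    secure: bool
    expires: int  # Unix timestamp; 0 is commonly used for session cookies
    name: str
    value: str


def parse_cookie_line(line: str) -> Optional[Cookie]:
    """Parse one line; return None for blanks, comments, or malformed lines.

    Simplification: every line starting with '#' is treated as a comment,
    so curl's '#HttpOnly_' prefixed entries are skipped here.
    """
    line = line.rstrip("\r\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 7:
        return None
    domain, subdomains, path, secure, expires, name, value = fields
    return Cookie(domain, subdomains == "TRUE", path, secure == "TRUE",
                  int(expires), name, value)


sample = "example.com\tFALSE\t/\tTRUE\t2147483647\tsession\tabc123\n"
cookie = parse_cookie_line(sample)
```

So in that layout a cookie is a record inside one shared text file rather than a file of its own.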
The author also doesn't realize that they're still files! Putting the file in a database doesn't change its nature, it's just storing it in a relational database instead of a hierarchical one.
If you store a jpeg file in a database, is it no longer a jpeg file?
How are they "still" files? They never have been files. They are some value that is part of the HTTP header. No one would call the HSTS value a text file, even though it's a string of printable bytes in the HTTP header that is stored somewhere on the hard drive by the browser, just like a cookie.
What is your definition of "file" that precludes calling these text files, files?
The word "file" has a very broad definition. HTTP/0.9 itself could be accurately described as a protocol for sending a text file as a query and receiving one in response. A basic HTTP client can be implemented by reading and writing files into netcat.
Where would you digitally store a JPEG that's not a database? A file system is a database. I guess arguably it's a record if it's in a non-file-specific database?
True, and in a similar vein, we still have the floppy disk icon for "save to disk". In 20 years, most users will have never seen a floppy disk in their entire life. Still, the icon will remain.
Not really. The need to "save" is becoming less and less relevant. All changes are synced with the cloud and stored immediately. Even in some programming IDEs the need to save is a lot less common.
I hope not - for projects with any complexity, knowing I’ve got a checkpoint means I don’t need to think hard about doing risky things. I wonder if Microsoft keeps stats on how many Office users use auto save.
This mirrored my reaction as well. My first thought went something like, "that was a reasonably accurate explanation that most computer users would have understood... circa 1999."
Standard practices among browsers have changed, and your typical modern user of a computing device is no more savvy to techno-speak than a coconut can comprehend Etruscan.
The explanation is dated and needs to be retired; that's no reason to go after it with such vitriol.
Using text files as a mental model to describe cookies is useful. It might not be entirely correct, but it’s still useful.
Discarding a constructive mental model like this for the sake of pedantry — especially when it makes the idea described harder to understand — is just stupid.
And so, it follows that this article is just stupid.
Every time I look into the big new tech thing, or someone is explaining even some old tech, they get lost in technical jargon.
They just prioritize that over any of the details or useful terminology to actually explain what they did or how this thing relates or builds off something that already exists.
Everyone's so worried about getting as many new tech keywords thrown out as they can in order to gatekeep others. Even when communicating to other devs-- I realize some of these sound like they're aimed at non-tech people. These are not out of place statements on Stack Overflow or in YouTube tutorials.
"The Cloud is a revolutionary technology that you will be literally unable to comprehend unless you have these 20 certificates like me." But then it can just be summarized as saving and accessing a file on someone else's computer.
"Typescript is a literally brand new technology built from scratch that you cannot begin to comprehend until you put 10k hours into it and become a master like me." But then it's just JavaScript with extra steps. Just like every other main front end language. They're all JavaScript plus X.
"I built my own computer and then operating system from scratch. Why don't you try the same before suggesting your ideas." Well, you ordered pre-assembled parts from a list somebody else wrote and then you plugged them in with fewer steps than a Lego kit. And your OS from scratch was just a GUI checkbox installer that you clicked through that now has a thousand driver errors you neglected to mention.
"I passed my FAANG leetcode interview because I'm a better programmer than you. If you can't pass a leetcode interview, are you even trying?" You passed your leetcode interview because you spent hundreds of hours memorizing every question and answer in order to recite it from memory under pressure. You also got lucky by getting a question you happened to memorize. Memorizing the answers before a test doesn't make you better at that subject.
etc.
I honestly can't find a way to explain this that's not negative. If someone can rephrase my points, I would be curious to hear a respectful counter argument.
What exactly is the useful part, though? I mean, it's technically wrong. I don't see how this description improves layman understanding of cookies, much less to the point of offsetting the wrongness of it. In fact, you could argue "text files" is straight up confusing: you'd expect them to be human-readable, but they're not. They're machine-oriented data, often with several layers of encoding and serialization.
The article was right at the time of writing, and it's even righter today.
> What exactly is the useful part, though? I mean, it's technically wrong.
At least in Internet Explorer 3 and Netscape Navigator, cookies were stored as individual files in the file system. In that sense at some point in time it was somewhat correct.
However, the relevant part: once upon a time, computer users knew files and worked with them a lot. Operating systems like Windows 95 pushed files to the front (instead of first picking the program and then opening the thing from within it, they pushed Explorer to "explore the computer" and then find the right program for the file). There, a file was an understandable unit of data.
FWIW, I once explained cookies this way to my non-technical co-founder of that day. I could see the clouds part and he confessed it removed all the magic and mysticism surrounding these web cookie things. It was like “ah, just small text files. Got it.”
Does a text file stop being a text file if it is stored inside a zip file?
Cookies are chunks of text, calling that a file, even if it is stored as part of another file doesn't seem wrong, if perhaps it isn't the best way to describe it.
That doesn't seem right. Following that logic, almost anything is a text file, because all data can be encoded as text and be stored in a file.
A private key is a text file, a token is a text file, a host name is a text file, a password is a text file, a registry key is a text file ... the term loses all meaning.
The comments got me wondering about what a cookie can actually contain so I looked it up. A cookie can "include any US-ASCII character excluding a control character, Whitespace, double quotes, comma, semicolon, and backslash." Browsers typically restrict length to 4096 bytes. So "small text file" seems like a reasonable description. (The interesting distinction is text vs binary.)
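To make that character rule concrete, here is a small sketch of a checker. It assumes the rule as quoted above (printable US-ASCII minus whitespace, double quote, comma, semicolon, and backslash) and treats the 4096-byte figure as a typical browser limit on the `name=value` pair rather than a hard standard; both function names are made up for illustration.

```python
# Characters the quoted rule excludes, on top of control characters and
# anything outside printable US-ASCII (0x21..0x7E).
FORBIDDEN = set(' ",;\\')

# Assumption: a typical per-cookie limit, not a guarantee for every browser.
MAX_COOKIE_BYTES = 4096


def is_valid_cookie_value(value: str) -> bool:
    """True if every character is allowed by the rule quoted above."""
    return all(0x21 <= ord(ch) <= 0x7E and ch not in FORBIDDEN for ch in value)


def fits_typical_limit(name: str, value: str) -> bool:
    """True if 'name=value' stays within the assumed 4096-byte limit."""
    return len(f"{name}={value}".encode("utf-8")) <= MAX_COOKIE_BYTES


checks = {
    "abc123": is_valid_cookie_value("abc123"),
    "a b": is_valid_cookie_value("a b"),
    "a;b": is_valid_cookie_value("a;b"),
}
```

Anything that doesn't fit (spaces, non-ASCII, raw binary) has to be encoded first, for example with base64 or percent-encoding, which is a big part of why real cookies often aren't very readable.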
And, frankly, what’s wrong with using office to open/edit a text file? It works well enough, even if it is akin to using a 747 to get the current weather conditions at altitude.
(For the pilots, I know about pireps. That’s part of the analogy. ;)
The point the article is making is that the entire concept of "plain text" is unfamiliar to non-technical people, so explaining that cookies are "text files" doesn't help them understand cookies.
The problem is that what we're attempting to explain is what kind of information is represented by the cookie. Saying that it's "text" only helps if the listener understands that this means "plain text" and what is meant by that term.
If you give me a piece of paper folded into a hat, do you get the hat back from me? Not if what we mean by "a piece of paper" is "the information contained in the writing on a piece of paper where each number corresponds to a particular symbol in a table".
This part is also not the truth. The data on paper can be as simple as a serial number and valid signature. Take a real world example: concert ticket. Or digital example: jwt ticket.
The paper with "something you don't care but they know" is a proper description in my opinion.
There are a couple of text editors that add a BOM character at the start of files when you save.
I've spent so much time helping young programmers debug their code trying to read/parse those files. They get an error about an unexpected character at the start of the file but when they open it they only see normal text so they assume they're doing something wrong.
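A minimal Python reproduction of that situation, in case it helps someone: the bytes below stand in for a file saved by such an editor, and the JSON content is just an example payload. Decoding with `utf-8` keeps the invisible U+FEFF at the front, while `utf-8-sig` strips it.

```python
import codecs
import json

# Stand-in for a file saved by an editor that prepends a UTF-8 BOM.
raw = codecs.BOM_UTF8 + '{"ok": true}'.encode("utf-8")

with_bom = raw.decode("utf-8")         # BOM survives as an invisible U+FEFF
without_bom = raw.decode("utf-8-sig")  # BOM is stripped while decoding

try:
    json.loads(with_bom)
    parse_failed = False
except json.JSONDecodeError:
    # The parser trips over a character the user cannot see in their editor.
    parse_failed = True

parsed = json.loads(without_bom)
```

When reading from disk, the same fix is `open(path, encoding="utf-8-sig")`, which also works for files that have no BOM.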
All of these definitions miss a very important point about cookies:
A "cookie" is a piece of information sent to your browser by a website you visit, and your browser is required to send that information back to the website when you encounter that website's content (visible or hidden) throughout the web. Sending this information back to the website also informs the website of where you were on the web at that time.
As of 2009 it was difficult to disable 3rd party cookies.
How come we can't build SQLite-type functionality into filesystems?
So many things would get a lot easier if there was an optimization for storing many tiny files without extreme SSD wear from deleting and recreating, or space amplification.
A directory could be marked as database-like, and all your browser stuff could be regular old folders and files, syncable with SyncThing if they let you choose an accessible folder on mobile.
I did a little research recently, trying to figure out the origin of the Privacy Policy text a firm 'wrote' for our company website.
It was straight copy & paste from what you'll find on half of all websites, with the earliest match going back to < 2000s - over 20 years old. GDPR/DPA made waves of updates at some point but large portions remain the same.
To the contrary - when non-technical people hear the word "data" they think pure magic (even some technical people). Explaining that a cookie is really just a bit of text does seem like it is more likely to help explain the concept than the vague definition proffered by the author.
Typical post from a clueless author who doesn't understand the purpose of simplification. Ordinary people don't need to care about technical implementation details, and "small bits of text" gives a basic idea of the concept as far as it concerns them.
But that's the point of the rant. How does knowing that "cookies are small text files" help the user understand what a cookie is and does? That's just an irrelevant implementation detail.
'cookies.sqlite is most definitely not a "text file"'
I have a mildly differing opinion. Are not all files, by default, text files? It is just the way the text inside them is structured that determines their actual format?
edit: I might be opening a can of worms here, because I have a suspicion people will start pointing to extensions and whatnot.
Text files map to character encodings that represent human language - they're meant to be written and read directly by human beings. While all text files are binary, not all binary files are text.
A database file can contain text, but it isn't a text file in and of itself.
I think all files are, by default, binary files, and it’s the binary data inside them that determines their actual format.
Try opening a non-text file (e.g. an executable) in a text editor — the editor will probably try to interpret the file as Latin-1 text, but this will just look like garbage, and can’t really be said to be a sane representation of the contents of that file.
To add to this, a text file is only special because its binary contents are restricted(ish) to a set of numbers an editor knows how to interpret into human-readable glyphs.
Now then, if you want to discuss how most modern (non-executable) file formats are just zip files, then we can have more fun.
If "text" means binary data where the numbers represent characters according to some mapping (code page), then there can be binary files containing no text, binary files containing among other things some text or binary files consisting entirely of text.
For example, a file of raw audio samples doesn't contain any text. (You could interpret it as text in some code page if (unlike e.g. ISO 8859-1) all bytes are valid in that code page, but you'd end up with nonsense.)
An SQLite file definitely contains text, but also other things, so it's not "plain text". If "text file" means a file consisting entirely of text, and nothing else, it's not a text file.
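This is easy to check for yourself. The sketch below builds a throwaway database with Python's `sqlite3` module and looks at the raw bytes; the table layout, file name, and cookie value are invented for the demo and are not how any particular browser lays out its cookie store.

```python
import os
import sqlite3
import tempfile


def build_cookie_db_bytes() -> bytes:
    """Create a tiny SQLite file with one made-up cookie; return its raw bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cookies.sqlite")
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE cookies (name TEXT, value TEXT)")
        conn.execute("INSERT INTO cookies VALUES (?, ?)",
                     ("session", "chocolate-chip-12345"))
        conn.commit()
        conn.close()
        with open(path, "rb") as f:
            return f.read()


data = build_cookie_db_bytes()

header = data[:16]                                # fixed magic string
has_cookie_text = b"chocolate-chip-12345" in data  # stored text is visible
has_nul_bytes = b"\x00" in data                    # but so is non-text data
```

The cookie value sits in the file as readable bytes, surrounded by a binary header and page structure, so "contains text, but isn't a text file" seems like a fair summary.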
Thank you for your explanation. Admittedly, I was not thinking of plain text. I was thinking of the first option with the wrong interpretation of the word binary, which made me automatically jump to the conclusion that binaries are basically 1s and 0s[1] and clearly it is not that simple.
Thank you everyone. It feels like today I re-learned something I misunderstood a long time ago.
> which made me automatically jump to conclusion that binaries are basically 1s and 0s[1] and clearly it is not that simple.
It essentially _is_ that simple, it’s just how you interpret those 1s and 0s. Say you try to interpret a PNG file as text: you get some representation of the 1s and 0s in that file, just not the right one.
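A quick illustration of the "same bits, different interpretation" point, using only the eight-byte PNG signature rather than a whole image. Latin-1 assigns a character to every possible byte, so decoding always "works"; UTF-8 is stricter and rejects the first byte outright.

```python
# The fixed 8-byte signature that every PNG file starts with.
png_signature = b"\x89PNG\r\n\x1a\n"

# Latin-1 maps each byte 0x00..0xFF to a code point, so this never fails,
# even though most of the result is not meaningful text.
as_latin1 = png_signature.decode("latin-1")

try:
    png_signature.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    # 0x89 is a continuation byte and cannot start a UTF-8 sequence.
    utf8_ok = False

# The same bits viewed as plain numbers instead of characters.
as_numbers = list(png_signature)
```

Neither decoding is "wrong" at the bit level; only one interpretation (the PNG one) matches what the file was written to mean.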