
ASCII has a built-in markup language and a processing control protocol that most people aren't even aware of and most tools out there don't support. This is significant. Look at the parts that are used and parts that aren't. What is the difference between them?


I think the big reason the ASCII C0 characters never took off is that you can’t see or type them.[a] If I’m writing a spreadsheet by hand (like CSV/TSV), I have dedicated keys for the separators (the comma and tab keys). I don’t have those for the C0 ones. I don’t think there are even Alt-### codes for them.

[a]: Regarding “seeing” them, Notepad++ has a nifty feature where it’ll show the control characters’ names in a black box[0]

[0]: https://superuser.com/questions/942074/what-does-stx-soh-and...
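If your editor doesn't do this, a rough equivalent is to map each control character to its glyph in the Unicode "Control Pictures" block (U+2400 through U+241F, plus U+2421 for DEL). This is only a sketch of the idea, not what Notepad++ does internally, and `show_controls` is a name I made up:

```python
def show_controls(text: str) -> str:
    """Replace C0 control characters (and DEL) with visible Control Pictures."""
    out = []
    for ch in text:
        code = ord(ch)
        if code <= 0x1F:
            # U+2400..U+241F mirror U+0000..U+001F one-to-one.
            out.append(chr(0x2400 + code))
        elif code == 0x7F:
            # DEL has its own picture at U+2421.
            out.append("\u2421")
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # Two records of two fields each, using UNIT SEPARATOR and RECORD SEPARATOR.
    print(show_controls("alice\x1f30\x1ebob\x1f25"))
```

Note that this also replaces tab and newline, since those are C0 characters too; skip them in the first branch if you want normal line breaks.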


> Notepad++ has a nifty feature

Most physical terminals had the ability to show hex or control characters instead of/in addition to text.


Heh. I used those control characters to embed a full text editor within AutoCAD on MS-DOS. Back in the day. Mostly because someone bet me it couldn't be done.


I don’t know. Can you tell me? ;)


I assume the parent is referring to the various control characters like "START OF HEADING", "START OF TEXT", "RECORD SEPARATOR", etc. I haven't seen most of these used for their original control purpose, but they date back a long way:

https://ascii.cl/control-characters.htm


I've seen them in some vendor-specific data formats in the financial space.

They seem to be from an era when the formatting models were either fixed-width fields, or a serial set of variable-width fields delineated by FIELD SEPARATOR, GROUP SEPARATOR, etc.

What both models lacked was a good way to handle optional/sparse fields. If you have a data structure with 40 sub-fields, a JSON, XML or YAML notation can encode "just subfield 26" pretty efficiently, but the FIELD SEPARATOR model usually involves having dozens of empty fields to pad out to the one you want, and a lot of delicacy if the field layout changes.
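To make the sparse-field problem concrete, here is a minimal sketch. It assumes fields are split on UNIT SEPARATOR (0x1F) and records on RECORD SEPARATOR (0x1E); real vendor formats pick their own bytes and layouts, and the names `encode`, `decode` and `subfield_26` are invented for illustration:

```python
import json

US = "\x1f"  # UNIT SEPARATOR: between fields (assumed convention)
RS = "\x1e"  # RECORD SEPARATOR: between records (assumed convention)


def encode(records: list[list[str]]) -> str:
    """Join fields with US and records with RS; fields are purely positional."""
    return RS.join(US.join(fields) for fields in records)


def decode(blob: str) -> list[list[str]]:
    """Inverse of encode, assuming no field contains US or RS."""
    return [record.split(US) for record in blob.split(RS)]


# Commas, tabs and quotes need no escaping, which is the appeal of the scheme.
rows = [["a,b", "c\td"], ['say "hi"', ""]]
assert decode(encode(rows)) == rows

# A record with 40 positional sub-fields where only number 26 is set:
sparse = [""] * 40
sparse[25] = "42.50"
blob = encode([sparse])

# The keyed notation names the one field it carries instead of padding.
keyed = json.dumps({"subfield_26": "42.50"})

if __name__ == "__main__":
    print(blob.count(US), "separators needed to position one value")
    print(keyed)
```

The positional version needs 39 separators to carry one value, and every reader has to agree that index 25 means subfield 26. Insert a new field before it and every consumer breaks, which is the "delicacy" part.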


The bits that aren't used don't correspond to printable characters :).


The “C0” block (U+0000 through U+001F) https://en.wikipedia.org/wiki/C0_and_C1_control_codes

They’re almost never used in practice, however.



