> Change the original source to something that doesn't need an archive (e.g., a source that was printed on paper), or for which a link to an archive is only a matter of convenience.
As if paper sources are any less susceptible to becoming increasingly unavailable over time...
You also need some extra bits at the top so that it doesn't overflow (e.g., on an adversarial input filled with copies of the largest finite value, followed by just as many copies of its negation, so that the whole thing sums to 0). The exact number will depend on the maximum input length, but for arrays stored in addressable memory it adds up to no more than 64 or so.
Thanks, you're right about the accumulated excess. I don't think you can actually need the full 64 extra bits, but sure, let's say 64 extra bits if you want a general-purpose in-memory algorithm. It's not cheap, but we shouldn't be surprised it can be done.
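Since we're counting bits: a concrete way to see that the headroom is affordable is to do the summation in exact fixed point. A sketch in TypeScript (BigInt stands in for the wide accumulator; this is an illustration, not a tuned superaccumulator). It relies on the fact that every finite double is an integer multiple of 2^-1074, so scaling by 2^1074 turns the whole sum into exact integer arithmetic:

```typescript
// Decode a finite double into an exact integer, scaled by 2^1074.
function toFixed1074(x: number): bigint {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x);
  const bits = view.getBigUint64(0);
  const sign = bits >> 63n ? -1n : 1n;
  const expField = (bits >> 52n) & 0x7ffn;
  const frac = bits & 0xfffffffffffffn;
  if (expField === 0x7ffn) throw new RangeError("not finite");
  // Subnormals have no implicit leading 1 and a fixed exponent of -1074.
  const mantissa = expField === 0n ? frac : frac | (1n << 52n);
  const shift = expField === 0n ? 0n : expField - 1n;
  return sign * (mantissa << shift);
}

function exactSum(xs: number[]): bigint {
  // BigInt plays the role of an accumulator with "extra bits at the top":
  // the adversarial input above (copies of the max finite value followed
  // by their negations) sums to exactly 0 here.
  return xs.reduce((acc, x) => acc + toFixed1074(x), 0n);
}
```

A real superaccumulator would split this into fixed-size limbs, which is where the per-limb overflow headroom we were discussing comes in.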
IME, on GitHub (or the other major public repo services), it's far more likely than not that I can pull up an old version of a project from 10 years ago if I want to experiment with it. (In case other old things used it as a dependency, in case I really want to reproduce an old result, etc.)
On the vast majority of other distribution platforms, it's at best 50/50 whether (a) the platform still exists with any of its contents, and (b) the authors haven't wiped all the old versions to clear up space or whatever. The former typically fails on academic personal websites (which generally get dumped within 5-15 years), and the latter typically fails on SourceForge-style sites.
That is to say, I am not a big fan of the popular alternatives to Git repos as a distribution method.
Would something be a proof in that sense even if it did use Mizar? As far as I can tell, Mizar has no complete reference for its language semantics, except for the single closed-source implementation. In general, information about the system itself (outside of the library) seems very scarce, or at least scarcely advertised.
Mizar's source was "available upon request" for maybe 30-40 years. It was fully open-sourced under the GPL some 3 years ago (maybe earlier, I'm not sure); see [1], and also [2] and [3] about an alternative implementation in Rust.
Mizar is indeed "scarcely advertised", but all the information is publicly available, who wants to know knows. As for Mizar semantics, see for example [4].
Thank you for that information, all I could find on the website was that "The source code of the Mizar verifier and accompanying tools is available to the members of SUM" [0], which of course does not reflect the newer status quo.
> Mizar is indeed "scarcely advertised", but all the information is publicly available, who wants to know knows.
Yes, there indeed seems to be a good bit of information available, especially about the library and its articles. But some parts seem to be scattered about, unless you already know where to look, or know someone who knows. Perhaps it's a matter of taste.
(For comparison, I've recently been dabbling a fair bit with Metamath: it's not really advertised outside of its small circle these days, but the website does a good job at introducing the system, while also offering a complete reference in the form of the Metamath book. From there, the primary challenges to a new user are the fiddly tooling, the cryptic labeling scheme, and the puzzling DV (disjoint variable) conditions.)
I'm a bit skeptical about taking the rates of reported soundness bugs between different systems and drawing conclusions about the underlying approaches. There's typically a class of bugs that users can stumble into by doing something unusual, and then another class of bugs that can only really be found by exploiting holes in the implementation. The first class depends on which features get exercised the most, the second class depends on how many eyes are on the source (and how accessible it is), and both classes heavily depend on the overall size of the userbase.
E.g., Metamath is designed to be as theoretically simple as possible, to the point that it's widely considered a toy in comparison to 'serious' proof systems: a verifier is mainly just responsible for pushing around symbols and strings. In spite of this simplicity, I was able to find soundness bugs in a couple major verifiers, simply because few people use the project to begin with, and even fewer take the time to pore over the implementations.
So I'd be hesitant to start saying that one approach is inherently more or less bug-prone than another, except to be slightly warier in general of larger or less accessible kernels.
LCF-style provers like Isabelle/HOL and HOL Light are among the oldest and most widely used interactive theorem provers. If they consistently show smaller error rates than other systems, that is an interesting empirical observation. To give but one recent example: Amazon recently announced a vast 260,000-line Isabelle/HOL-checked correctness proof of their new Nitro hypervisor for AWS EC2 Graviton5 instances.
LCF-style provers have a much smaller trusted computing base than Curry/Howard based provers like Coq, Agda and Lean.
One may wonder if there is a correlation between size of TCB and error rate in widely used provers?
> LCF-style provers have a much smaller trusted computing base than Curry/Howard based provers like Coq, Agda and Lean.
I'm not sure that this is correct. The TCB in a CH based prover is just the implementation of the actual kernel. In LCF, you also have to trust that any tactics are implemented in a programming language that doesn't allow you to perform unsound operations. That's a vast expansion in your TCB. (You can implement LCF-like "tactics" in a CH-based prover via so-called reflection that delegates the proof to a runtime computation, but you do have to prove that your computation yields a correct decision for the problem.)
No. The whole point of the LCF approach is that only kernel functions can generate theorems. Usually this is done by having a Thm module with opaque thm type (so its instances can only be generated by this module) and embedding the base rules of your logical calculus as shallow functions into that Thm module. Thus, all thm objects are correct by construction (w.r.t. your calculus), and you only need to check the Thm module to verify this.
Tactics generate thms by calling functions of the Thm module in some fashion.
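To make the "correct by construction" point concrete, here's a toy sketch in TypeScript: the private constructor plays the role of ML's opaque thm type, and the calculus is a made-up fragment, not any real prover's kernel.

```typescript
// A formula is either a propositional variable or an implication.
type Formula =
  | { kind: "var"; name: string }
  | { kind: "imp"; from: Formula; to: Formula };

class Thm {
  // Private constructor: the ONLY way to obtain a Thm is through the
  // kernel's inference rules below, so every Thm in existence was built
  // by a valid derivation.
  private constructor(readonly concl: Formula) {}

  // Axiom schema: |- A -> A
  static refl(a: Formula): Thm {
    return new Thm({ kind: "imp", from: a, to: a });
  }

  // Modus ponens: from |- A -> B and |- A, conclude |- B
  static mp(ab: Thm, a: Thm): Thm {
    const c = ab.concl;
    if (c.kind !== "imp" || JSON.stringify(c.from) !== JSON.stringify(a.concl))
      throw new Error("mp: antecedent mismatch");
    return new Thm(c.to);
  }
}

// A "tactic" lives entirely outside the kernel: it can only combine
// kernel calls, so it cannot manufacture an invalid Thm.
function tautology(name: string): Thm {
  const p: Formula = { kind: "var", name };
  return Thm.mp(Thm.refl({ kind: "imp", from: p, to: p }), Thm.refl(p));
}
```

A tactic like `tautology` can be arbitrarily buggy, but the worst it can do is throw or produce a theorem you didn't want; it cannot mint a Thm the kernel wouldn't endorse.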
At least in Isabelle/ML, it seems like in practice there are also untrusted "oracles" that a proof can invoke to generate "thm" objects [0], and it's not entirely trivial to automatically ensure that only trusted sources are used in a project [1], unless I am misunderstanding the linked thread.
Of course there's no free lunch, as you say, in making sure that high-level proofs are lowered into the trusted part correctly, but it's certainly a piece that should ideally be as simple as possible.
This does not resolve the issue of how to ensure that the "module" system is itself sound and has no unforeseen escape hatches. You have to assume that it is, or else include a complete programming language implementation as part of your TCB. If your system generates proof objects (possibly allowing for some kind of reflection, with the proof objects merely specifying some kind of computation) you can skip this and simply prove that the kernel itself is correct.
The question "does the ambient programming language do the right thing?" applies to other provers too. So if you assume the semantics of the implementation language is broken, or the compiler, or the computer executing the code, then those issues also need to be added to the TCB of Curry-Howard provers.
This is wrong, as the other replies also point out. Tactics in LCF-style provers are not part of the TCB. Here is an example of the TCB for an industrial-strength prover:
My understanding is that tactics output pieces of proof, but that these aren't trusted by the TCB of an LCF prover. This is very far from my area of expertise, but it seems to be confirmed by https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-39.html
Assuming you're referring to [0], that's a statistical estimate of valid chess positions (based on clever methods of uniform position sampling + fast validity testing), not valid chess games (based on estimating branching factors for very long games).
How I'd put it is that there are two sets of stopping points under FIDE rules:
- After threefold repetition or 50 moves, either player may claim a draw.
- After fivefold repetition or 75 moves, the game is automatically drawn.
Most modern counts of the longest possible chess game, or the total number of possible chess games, are based on fivefold repetition and the 75-move rule.
Meanwhile, threefold repetition and the 50-move rule are still relevant in endgame tablebases, since they rule out certain forced mate sequences.
Endgame tablebases don't take threefold repetition into account; if they did, you would basically have to be able to exclude any arbitrary position from the tree, which seems impossible. The 50-move rule is respected by the Syzygy tablebases, though with the concession that they do not generally give the fewest possible moves to mate (they would rather delay the mate than delay a pawn push or a capture).
Here's an example (adapted from the URL below): https://syzygy-tables.info/?fen=3R4/5R2/8/8/8/1K6/8/4k3_w_-_... — if you asked pretty much any player, even a child, how to win this, they'd show the staircase mate starting with Re7+ (mate in 4). If you asked a computer or the older Nalimov tablebases, it would say Kc2! (mate in 2). However, if you ask the Syzygy tablebases, they would argue that this is not optimal if we are extremely close to the 50-move rule, so the safest and thus best move is Rf2!! which forces Black to capture the rook on the next turn (they have no other legal moves), resetting the counter and giving a mate in 18.
There was a set of experimental DTM50 tablebases made at some point (though not made public); they store the shortest mate for all 100 possible values of the zeroing counter in any position. See https://galen.xyz/egtb50/ for some discussion.
My understanding is that the only still-lifes known not to have a glider synthesis are those containing the components listed at [0], which are 'self-forcing' and have no possible predecessors other than themselves. Intuitively, one would think there must be other cases of unsynthesizable still-lifes (given that a still-life can have arbitrary internal complexity, whereas gliders can only access the surface), but that's the only strategy we have to find them so far.
> Maybe it's time to try pushing the envelope on this: what's the biggest blobbiest most spacedustful period-4 c/2 orthogonal spaceship that current technology can come up with? Might there be some kind of extensible greyship-like thing that escorts a patch of active agar instead of a stable central region, that might allow an easier proof of non-glider-constructibility?
I always enjoy the absolutely incomprehensible GoL jargon
I do find it annoying how the Temporal API, just like nearly all other datetime APIs, has 0 support for querying leap-second information in any shape or form. Suggested workarounds like temporal-tai all require plugging in a leap-second file and keeping it updated, which is especially painful for client-side JS, where you can't just download a leap-second file from someone else's site thanks to the SOP. Meanwhile, browsers update on a cadence more than sufficient to keep an up-to-date copy, but the datetime APIs refuse to expose leap-second info because they're too committed to "only UTC is in-scope for this project".
(The context is that I want to write some JS tools for astronomical calculations, but UTC conversions need leap-second info, so this trend makes it impossible to write something that Just Works™.)
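For illustration, the kind of lookup that needs this data: converting a Unix timestamp to a TAI-UTC offset. The table below is a hand-typed excerpt (first and last few entries only); a real implementation needs the full, perpetually updated list, which is exactly the maintenance burden I'd rather the platform absorb.

```typescript
// Hypothetical helper sketch. The table is truncated; a complete one has
// ~28 entries and must be updated whenever the IERS announces a leap second.
const LEAP_TABLE: { unixSeconds: number; taiMinusUtc: number }[] = [
  { unixSeconds: 63072000, taiMinusUtc: 10 },   // 1972-01-01
  { unixSeconds: 78796800, taiMinusUtc: 11 },   // 1972-07-01
  // (intervening entries omitted for brevity)
  { unixSeconds: 1435708800, taiMinusUtc: 36 }, // 2015-07-01
  { unixSeconds: 1483228800, taiMinusUtc: 37 }, // 2017-01-01
];

function taiMinusUtcAt(unixSeconds: number): number {
  // Before 1972 the UTC-TAI relation isn't an integer; 0 is a placeholder.
  let offset = 0;
  for (const entry of LEAP_TABLE) {
    if (entry.unixSeconds <= unixSeconds) offset = entry.taiMinusUtc;
    else break;
  }
  return offset;
}
```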
>only UTC is in-scope for this project
>tools for astronomical calculations
Pity, since UTC is objectively the wrong time scale for astronomical calculations. Among other problems, UTC runs slightly slower or faster depending on how far the Earth is from the Sun. UTC does not run uniformly (outside of Earth at sea level); instead, the length of 1 second will slightly grow or shrink depending on the current configuration of the Solar system.
As you allude to, the correct time scale for this purpose would be TDB (aka Barycentric Dynamical Time), which applies relativistic corrections to act like an atomic clock fixed at the barycentre of the Solar system. This is the only clock that actually runs "smoothly" for the purposes of astronomical calculations.
> Among other problems, UTC runs slightly slower or faster depending on how far the Earth is from the Sun. UTC does not run uniformly (apart from Earth-at-sealevel), instead the length of 1 second will slightly grow or shrink depending on the current configuration of the Solar system.
That is completely wrong. UTC seconds are exactly SI seconds, which are all the same uniform length (defined by quorum of atomic clocks).
At sea level on Earth, UTC seconds are all exactly the same, yes. That's the definition of UTC.
The trick is, when you're working with things on the scale of the Solar system and larger, it no longer makes sense to assume your frame is "at sea level on Earth." Your relativistic reference frame has shifted, so (thanks Einstein!) time literally changes underneath your feet.
The primary mechanism (but not the only one) is that a clock on Earth will tick slower when Earth is closer to the Sun, due to the effects of gravitational time dilation.[0]
So yes, a clock on Earth always runs at a uniform rate in its own frame. But because the universe is fundamentally Einsteinian, that still means that, e.g., if you're working with the orbit of Jupiter or a millisecond pulsar, you will see small timing errors introduced if you try to use UTC (or even UT1, which is UTC without leap seconds) instead of TDB.
But it's all relative: all reference frames are different from each other, and there is no one reference frame that is somehow special. TDB runs unevenly relative to UTC just as much as UTC runs unevenly relative to TDB.
> where you can't just download a leap-second file from someone else's site thanks to the SOP
WDYM by this? Why does the SOP prevent a website from hosting a leap-seconds file? All they need to do is set Access-Control-Allow-Origin to allow websites to access it. Or provide it as a JS file, in which case no headers are necessary at all. All the SOP prevents is you hotlinking someone else's leap-seconds file and using their bandwidth without their opt-in.
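Concretely, if any trustworthy host set that header, consuming the file client-side would be a few lines. A sketch (the URL would be hypothetical; the parser assumes the standard leap-seconds.list layout of NTP-epoch seconds plus TAI-UTC offset):

```typescript
interface LeapEntry {
  unixSeconds: number; // moment the new offset takes effect (Unix time)
  taiMinusUtc: number; // TAI - UTC in seconds from that moment on
}

function parseLeapSeconds(text: string): LeapEntry[] {
  // leap-seconds.list data lines: <seconds since 1900-01-01> <TAI-UTC>
  const NTP_TO_UNIX = 2_208_988_800; // 1900-01-01 to 1970-01-01, in seconds
  return text
    .split("\n")
    .filter((line) => line.trim() && !line.startsWith("#"))
    .map((line) => {
      const [ntp, offset] = line.trim().split(/\s+/).map(Number);
      return { unixSeconds: ntp - NTP_TO_UNIX, taiMinusUtc: offset };
    });
}

async function fetchLeapSeconds(url: string): Promise<LeapEntry[]> {
  // This is the only step the SOP touches: it fails with a CORS error
  // unless the host opts in via Access-Control-Allow-Origin.
  const resp = await fetch(url);
  return parseLeapSeconds(await resp.text());
}
```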
> Meanwhile, browsers update on a cadence more than sufficient to keep an up-to-date copy
Is this true? I don't know of any browser right now that ships with a copy of a leap-seconds data file. Adding such a data file and keeping it up to date would probably be a pretty non-trivial task for browser developers, just for something the browser will never end up using itself. It's not like the ICU/CLDR files where browsers are going to need them anyway for rendering their own user-interface components.
> All they need to do is set Access-Control-Allow-Origin to allow websites to access it. All the SOP prevents is you hotlinking someone else's leap-seconds file and using their bandwidth without their opt-in.
They can, but the major providers (read: the ones I would trust to update it) don't. The IERS doesn't [0], the USNO doesn't [1], IANA doesn't [2], and NIST uses FTP [3]. Keep in mind that these files are constantly being downloaded by various clients for NTP and whatnot, it's not like these providers want to restrict public access, they just don't bother to set the header that would allow JS requests.
> Is this true? I don't know any browser right now that ships with a copy of a leapseconds data file.
From ECMA-262:
> It is required for time zone aware implementations (and recommended for all others) to use the time zone information of the IANA Time Zone Database https://www.iana.org/time-zones/.
Any browser that ships with a copy of tzdb, or knows where to find a copy from the OS, should have access to its leapseconds file. Unless you mean that all of them go solely through ICU and its data files? Which I suppose could be an obstacle unless ICU were to start exposing them.
> but the datetime APIs refuse to expose leap-second info because they're too committed to "only UTC is in-scope for this project".
This doesn't make sense on at least two different levels.
First, pedantically, the definition of UTC as a time scale is that it includes leap seconds. So if you're committed to UTC, then you're supporting leap seconds.
Second, and to more broadly address your point, you should say, "they're too committed to 'only the POSIX time scale is in-scope for this project.'" That more accurately captures the status quo and also intimates the problem: aside from specialty applications, basically everything is built on POSIX time, which specifically ignores the existence of leap seconds.
Sure, but my gripe isn't even that we ought to change the "POSIX time by default" status quo (that ship sailed long ago; everyone counts durations in 'calendar seconds'). It's that the underlying libraries don't even provide enough information for "specialty applications" to reliably correct for it, short of perpetually updating it themselves.
Why should they though? To me it seems like a niche of a niche. I think there is plenty of room for scientific datetime libraries to service this need in a way that is likely better than what a general purpose datetime library would provide. (And indeed, there are many of them.)
I say this as someone who had leap second support working in a pre-release version of Jiff[1] (including reading from leapsecond tzdb data) but ripped it out for reasons.[2]
It’s not “niche” if you do things synchronized to GPS timestamps (i.e., a significant portion of telecom, a bunch of electrical-grid stuff, etc.).
Anything using GPS as lock references to synchronize stuff that needs to be aligned to the millisecond absolutely cannot tolerate stuff like “the leap second smear”.
The vast majority of people using datetime libraries don’t need to represent dates before 1980 or after 2036. That doesn’t mean you just bury your head in the sand and ignore a pretty critical part of how time representation works.
That's part of it: If I were writing a standalone program that could extract info from tzdb or whatever, I'd happily jump through those hoops, and not bother anyone else's libraries. I don't really care about the ergonomics. But for JS scripts in particular, there is no information available that is not provided by either the browser APIs or someone's server. And such servers are not in great supply.
> I do find it annoying how the Temporal API, just like nearly all other datetime APIs, has 0 support for querying leap-second information in any shape or form.
That’s because human timekeeping doesn’t use leap seconds.
> like nearly all other datetime APIs, has 0 support for querying leap-second information
That's probably because you only need leap-second accuracy in niche use cases, like astronomy or GPS. In JavaScript specifically, that kind of accuracy isn't needed for 99% of client-side use cases. Most datetime libraries work with POSIX time, which assumes 86,400 seconds in each day.
I don't see how any amount of memory technology can overcome the physical realities of locality. The closer you want the data to be to your processor, the less space you'll have to fit it. So there will always be a hierarchy where a smaller amount of data can have less latency, and there will always be an advantage to cramming as much data as you can at the top of the hierarchy.
While that's true, CPUs already have automatically managed caches. It's not too much of a stretch to imagine a world in which RAM is automatically managed as well, and you don't have a distinction between RAM and persistent storage. In a spinning-rust world that never would have been possible, but with modern NVMe it's plausible.
CPUs manage it, but ensuring your data structures are friendly to how they manage caches is one of the keys to fast programs, which some of us care about.
Absolutely! And it certainly is true that for the most performance optimized codes, having manual cache management would be beneficial, but on the CPU side, at least, we've given up that power in favor of a simpler programming model.
Part of why we gave that up is that the correct approach changes too fast. Attempts to do this manually often got great results for a year, then made things worse on the next generation of CPU that did things differently. Anyone who needs manual control would thus have to target a specific CPU and be willing to spend hundreds of millions every year to adapt to the next one, and nobody is willing to spend that much. The few who would be are better served by putting the important thing into an FPGA, which will be faster still for similar cost.
«Memory technology» as in «a single tech» that blends RAM and disk into just «memory» and obviates the need for the disk as a distinct concept.
One can imagine RAM that has become exabytes large and does not lose data after a system shutdown. In such a unified memory model everything is local, promptly available to the CPU, and directly addressable by it.
Please do note that multi-level CPU caches still have their place in this scenario.
In fact, this has been successfully done in the AS/400 (later the iSeries), which I have mentioned elsewhere in the thread. It works well and is highly performant.
> «Memory technology» as in «a single tech» that blends RAM and disk into just «memory» and obviates the need for the disk as a distinct concept.
That already exists: swap, mmap, disk paging, and so on.
Virtual memory is mostly fine for what it is, and it has been used in practice for decades. The problem that comes up is latency. Access time is limited by the speed of light [1]. And for that reason, CPU manufacturers continue to increase the capacities of the faster, closer memories (specifically registers and L1 cache).
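The numbers are striking even at back-of-envelope precision (a sketch; 4 GHz is just a representative clock rate, not any particular chip):

```typescript
// How far can a signal travel in one CPU clock cycle, at best?
const C_M_PER_S = 299_792_458; // speed of light in vacuum
const CLOCK_HZ = 4e9;          // a representative ~4 GHz core
const mmPerCycle = (C_M_PER_S / CLOCK_HZ) * 1000;
// Roughly 75 mm per cycle in vacuum. A round trip, plus on-chip signals
// propagating well below c, leaves only millimetres of budget, which is
// why the fastest memory must be both small and physically adjacent.
console.log(mmPerCycle.toFixed(1)); // "74.9"
```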