These are great, but Loom discussions almost always mix details with issues of t...

twic · on March 16, 2023

The way i think about the situation is:

1. Platform threads + blocking APIs = comfortable to use, does not scale to many connections

2. Platform threads + non-blocking APIs = painful to use, scales to many connections

3. Virtual threads + blocking APIs = comfortable to use, scales to many connections

So, at the moment, if you want to write software which scales to many connections (which not everybody needs to do, but some do), you have to suffer for it. With virtual threads, it will be a lot more pleasant.

As a concrete example, with non-blocking architecture, there is no way to have an InputStream or OutputStream which streams an arbitrarily large amount of data over the network without buffering it all. Because the contracts of those classes say that they block when there is no data or space available! If you look at the APIs of web servers based on non-blocking APIs (eg [1], [2]), they want you to read or write the whole body in one go, or maybe in chunks, which will be pushed to you if you're reading. You can only build a stream on top of that by buffering everything, or using another thread, which destroys the advantage of non-blocking architecture. There is loads of useful I/O machinery that is built on top of streams, like JSON parsers and generators, so your choice is either not using that, or accepting buffering or extra threads.

With virtual threads, you just use blocking I/O, servers and clients can expose streams for request and response bodies, and it's trivial to use any I/O machinery you like. It's also far easier to write your own.

[1] https://undertow.io/javadoc/1.3.x/io/undertow/io/Receiver.ht...

[2] https://undertow.io/javadoc/1.3.x/io/undertow/io/Sender.html

fulafel · on March 18, 2023

In these discussions it's good to quantify what you mean by "many connections".

Eg web is full of people posting Java OutOfMemory stack traces when they haven't increased the OS resource limits from the default and are imagining that the limit is 10k threads instead of 1M threads on their hw, falsely concluding that Java uses uses a lot of physical memory per thread stack.

jasmer · on March 16, 2023

So why can't I just use an OS regular thread to use a non-blocking API?

What is the advantage of a virtual thread in that case? You are saying that it's painful? How exactly? If you use an Executor that's setup with a proper Thread Pool it's mostly painless?

I mean, virtural threads are nicer but I see that as almost a syntax issue. A bit cleaner.

I guess I'm asking what you mean by 'painful to use?'.

twic · on March 16, 2023

I literally include a concrete example in the comment you are replying to.

xxs · on March 16, 2023

>2. Platform threads + non-blocking APIs = painful to use, scales to many connections

Wow, I have written a substantial amount of non-blocking IO since around java 1.4.2 (when it actually became stable). It was not much harder to use at all (compared to io streams). The issues w/ buffering/scalability and internal scheduling will be exactly the same with green threads.

twic · on March 16, 2023

Me too, and that's not correct.

Firstly, the idea that NIO is "not much harder to use at all (compared to io streams)" is wild on its own. People built Netty (and XNIO) precisely because of how awkward NIO is to use.

But secondly, there are common, useful patterns that are trivial with streams and extremely awkward with asynchronous I/O. For example, here's some code which queries a database and streams the results to the client as JSON:

    OutputStream responseBody; // assume you get this from somewhere
    try (JsonGenerator json = Json.createGenerator(responseBody)) {
        json.writeStartArray();
        try (Connection connection = openDatabaseConnection()) {
            ResultSet results = connection.prepareStatement("select first_name, last_name from users").executeQuery();
            while (results.next()) {
                json.writeStartObject();
                json.write("first_name", results.getString("first_name"));
                json.write("last_name", results.getString("last_name"));
                json.writeEnd();
            }
        }
        json.writeEnd();
    }

It does not load all the data into memory, it does not buffer all the JSON in memory, it automatically handles backpressure (if the socket blocks on a write, it will pause reading results from the database), it releases its resources cleanly if an exception occurs, and it's fourteen lines of very simple code.

You cannot do anything like this with an asynchronous API.

xxs · on March 17, 2023

>It does not load all the data into memory, it does not buffer all the JSON in memory, it automatically handles backpressure (if the socket blocks on a write, it will pause reading results from the database),

This particular feels a lot better with streams as jdbc is blocking by nature and it has no other options.

However, the simple code has its own issues. It closes the output stream on finishing the read, unless the output is really large it wastes quite a lot (the tcp/tlc handshakes, authentication/authorization/etc.). If the output is large enough (implied by not loading the json, and not having where clause in the sql) it exposes the database server to all the denial of service when the clients don't read fast enough, i.e. the database has to support as many connections as the frontend java.

fnordsensei · on March 16, 2023

Additionally, Loom is about structured concurrency, built on top of virtual threads.

Two different projects, albeit that one is making use of the output of the other.