The Trouble with mod_proxy
February 13th, 2006
I’m hacking on the mod_proxy module in Apache 2.2. I’ve got a pushlet application that, for reasons of consistency, must be proxied through Apache. Unfortunately, mod_proxy does not work correctly when the chunked transfer encoding is used in a pushlet fashion.
Apache wants to buffer the upstream request, at least enough to fill an 8k packet. When it gets a chunk it writes it to an 8k buffer. If the buffer is not filled, it will wait until the next chunk arrives before flushing the buffer.
When a pushlet application sends small chunks, in my case a couple dozen bytes each, the client, which expects these chunks as they are generated, receives them lumped together as one much bigger chunk. My application generates its chunks intermittently, so the client will not see any data except, oh, every other day.
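A toy model makes the arithmetic concrete. This is only a sketch, not mod_proxy code: the `CoalescingBuffer` class is hypothetical, and it assumes an 8192-byte buffer that is written out only once it is full. It counts how many 24-byte chunks the server must emit before the client sees anything.

```python
class CoalescingBuffer:
    """Toy model (hypothetical) of an output buffer that flushes only when full."""

    def __init__(self, capacity=8192):
        self.capacity = capacity
        self.pending = bytearray()
        self.flushed = []  # each entry is one write that reaches the client

    def write(self, chunk: bytes) -> None:
        self.pending.extend(chunk)
        # Nothing leaves the buffer until a full `capacity` bytes are queued.
        while len(self.pending) >= self.capacity:
            self.flushed.append(bytes(self.pending[:self.capacity]))
            del self.pending[:self.capacity]


buf = CoalescingBuffer()
event = b"x" * 24  # "a couple dozen bytes"
writes_before_first_flush = 0
while not buf.flushed:
    buf.write(event)
    writes_before_first_flush += 1

print(writes_before_first_flush)  # 342
```

At one event every few minutes, 342 events is a long wait for the first byte.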
There’s a fix already in place. It doesn’t fix my application. According to the Bugzilla entry, it reads from the upstream with a non-blocking read, and if the read returns EAGAIN, indicating that the read would block, the 8k buffer is flushed.
It’s broken in two ways. First, somewhere in the code path, from the proxy call, to some logging calls, to the raw socket read, a condition swallows the EAGAIN and returns success.
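The intended behavior, flush whatever is pending as soon as a read would block, can be sketched outside Apache. This is a minimal illustration under stated assumptions, not the actual patch: `relay_available` is a hypothetical helper, a local socket pair stands in for the backend connection, and Python reports EAGAIN on a non-blocking socket as `BlockingIOError`.

```python
import select
import socket


def relay_available(upstream, buffer_size=8192):
    """Read what is available now; return (flushes, eof).

    `flushes` lists the writes that would go to the client. A partial
    buffer is flushed as soon as the read would block.
    """
    flushes = []
    pending = bytearray()
    while True:
        try:
            data = upstream.recv(4096)
        except BlockingIOError:
            # EAGAIN / EWOULDBLOCK: no more data right now, so do not
            # sit on a partly filled buffer.
            if pending:
                flushes.append(bytes(pending))
            return flushes, False
        if not data:
            # Upstream closed the connection; flush the remainder.
            if pending:
                flushes.append(bytes(pending))
            return flushes, True
        pending.extend(data)
        while len(pending) >= buffer_size:
            flushes.append(bytes(pending[:buffer_size]))
            del pending[:buffer_size]


backend, proxy_side = socket.socketpair()
proxy_side.setblocking(False)
backend.sendall(b"18\r\n" + b"x" * 24 + b"\r\n")  # one 24-byte chunk
select.select([proxy_side], [], [], 1.0)          # wait until it is readable
flushes, eof = relay_available(proxy_side)
print(len(b"".join(flushes)), eof)
backend.close()
proxy_side.close()
```

The whole scheme depends on the would-block condition reaching the code that owns the buffer. If any layer in between turns it into a success, the flush never happens.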
More importantly, the headers, when read, are always read with a blocking read. The non-blocking mode is only used for the body. Since my application sends full chunks, it is the read of the headers that will block, and the mod_proxy code waits and waits.
I’ve gone and added a kludge to send packets as is. The 8k buffering is bad news for my application: if I want to send a larger chunk, I’ll do that from my pushlet server code. I’ve already considered the chunk size when the chunk was formed. I don’t want mod_proxy to muck with it.
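For reference, the framing of a single chunk in the HTTP/1.1 chunked transfer coding is small: the payload length in hexadecimal, CRLF, the payload, CRLF, with a zero-length chunk ending the body. The sketch below shows what the server has already decided by the time a chunk reaches the proxy; `encode_chunk` and `LAST_CHUNK` are illustrative names, and chunk extensions and trailers are left out.

```python
LAST_CHUNK = b"0\r\n\r\n"  # zero-length chunk, no trailers: end of body


def encode_chunk(payload: bytes) -> bytes:
    """Frame one payload as an HTTP/1.1 chunk (hypothetical helper)."""
    if not payload:
        # A zero-length chunk would end the response body.
        raise ValueError("empty payload; send LAST_CHUNK to end the body")
    size_line = format(len(payload), "x").encode("ascii") + b"\r\n"
    return size_line + payload + b"\r\n"


print(encode_chunk(b"hello"))  # b'5\r\nhello\r\n'
```

A proxy that sends packets as is would forward each framed chunk in its own write, so the boundaries chosen by the server are the ones the client sees.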
Perhaps I should read the HTTP 1.1 protocol to determine if I can make such petulant demands.

Comments 

It’s odd to me that this is just now seeing the light of day. I’ve just run into this problem myself, but it seems push/chunked encoding apps have been around for years.
It would be nice if the mod_proxy maintainers were to add a patch that allowed you to specify a particular URI as an as-is URI. No buffering whatsoever. Failing that…
I’ve created a cut-down version of mod_proxy called mod_chunnel. Chunnel is an amalgam of “chunked” and “tunnel”.
It begins with mod_proxy. Then it removes anything that is not http or https.
I’m going to follow up by removing any fancy matching or buffering, and any reference to HTTP 0.9 and 1.0.
This assumes that the module is used by applications, servlets and applets for example, and not by human users, so the complexity of URI pattern matching is useless overhead. It is for applications that use chunked transfer to create push applications, not for general chunked transfer proxying.
Changes to a pushlet-based API are going to be disseminated to application developers, who can place any logic in their applications. There will be no need to redirect web browsers or maintain convoluted legacy URLs, which I’m sure is a primary application of mod_proxy.
Sound like a good idea? I want to have a much smaller module so it is easier to maintain.