History log of /haiku/src/kits/network/libnetapi/HttpRequest.cpp
Revision Date Author Comments
# 1322d507 27-Jan-2019 Adrien Destugues <pulkomandy@gmail.com>

HttpRequest: write whole request to socket

Better performance by using a single write, and some servers may not be
happy about getting so many TCP fragments for the HTTP header.

Change-Id: If7139e2a7748ea423d470676e70bd523a89031b2
Reviewed-on: https://review.haiku-os.org/c/909
Reviewed-by: waddlesplash <waddlesplash@gmail.com>


# 44cff45d 20-Aug-2018 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: chunk lengths are in hex

Thanks to mmlr for spotting this. The wrong format specifier was used,
which would lead the server to get the wrong size and do strange things.

Chunked uploads should now work a lot better.

While I was at it, put the line termination in the printf to save a
write to the socket (these are unbuffered and each of them costs us a
syscall, and in some cases this has been found to confuse webservers as
we end up sending super small TCP packets).
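
A minimal sketch of the framing described above, written in standard C++
rather than taken from the Haiku sources. It assumes the chunk size line
is built with a printf-style call; the helper name FormatChunkHeader is
hypothetical. The size is formatted in hexadecimal and the CRLF is part
of the same string, so the line can go out in a single write.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: builds the size line that precedes each chunk of
// an HTTP/1.1 chunked body. The size is hexadecimal, and the CRLF is
// included so the caller needs only one write for the whole line.
std::string
FormatChunkHeader(std::size_t chunkSize)
{
	char buffer[32];
	int length = std::snprintf(buffer, sizeof(buffer), "%zx\r\n", chunkSize);
	if (length < 0)
		return std::string();
	return std::string(buffer, static_cast<std::size_t>(length));
}
```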


# 3a50df1b 01-Aug-2018 Murai Takashi <tmurai01@gmail.com>

Network kit: Fix -Wformat-overflow

Increase array size, since gcc8 x86_64 warns 'sprintf' output
between 2 and 20 bytes into a destination of size 16
[-Werror=format-overflow=].

Change-Id: I641db97d963b64b0c3434cd498f29f4dcb61c373
Reviewed-on: https://review.haiku-os.org/472
Reviewed-by: waddlesplash <waddlesplash@gmail.com>


# ed8f28a4 26-Nov-2017 Adrien Destugues <pulkomandy@pulkomandy.tk>

Move HeadersReceived hook after parsing of cookies

I still don't get what's happening, but doing the cookie parsing at the
same time as the main thread is handling HeadersReceived seems to
trigger a memory corruption, and it will escape all my attempts to debug
it (adding printfs or any other slight change to the code will make it
go away). So just change the order we do things and hope that's enough to
always avoid it.

As a side effect, HeadersReceived can now rely on the cookies being
already stored in the cookie jar, which I think makes more sense.

I still plan to rewrite the HTTP request code as a proper state machine,
instead of one long Run() function. This would allow running it in
smaller steps, and thus grouping multiple requests in a single thread
(triggering them from poll, select, or similar).


# f9e1854f 29-Jan-2017 Adrien Destugues <pulkomandy@gmail.com>

libbnetapi: fix access to HTTP headers

The asynchronous listener had no reliable way to access HTTP result and
headers from the callbacks. As the callbacks are triggered
asynchronously, they can be run after the request has carried on and,
for example, followed an HTTP redirect, clearing its internal state.

The HeadersReceived callback now passes a reference to BUrlResult for
the request. There are two cases:
- Synchronous listener: passes a reference to the request's results
directly
- Asynchronous listener: archives a copy of the result into the
notification message, and passes a reference to the unarchived copy.

Unfortunately this comes with several ABI and API breakages:
- Change to the prototype of HeadersReceived()
- Change to the class hierarchy of BUrlResult (implements BArchivable)

All users of HTTP requests will need to be updated if they implemented
HeadersReceived or used BUrlResult.


# a9665fc6 31-Oct-2016 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: use data from the input buffer first

The HttpRequest protocol loop is designed using an input buffer storing
data from the socket. At each loop, we try to parse some of the data,
and then read more from the socket.

However, in some cases (in particular with chunks, which we parse only
one at a time in a loop iteration), we may not use all the data from the
buffer. Eventually, we will be left with an "empty" socket (nothing to
read from there) but the request not completed because there is still
data in the input buffer.

In that case, we would hang waiting for a read on the socket, instead of
processing data from the input buffer.

Change the code to read from the socket only if a loop iteration did not
manage to read anything from the input buffer. This means the input
buffer is too small for the next thing to process (it contains less than
one line of data, for example), and in that case we can safely read from
the socket without being blocked.
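
The loop described above could be sketched as follows. This is an
illustration in standard C++, not the actual protocol loop: FakeSocket,
TakeLine and ReadLines are hypothetical names, and the "socket" is just a
queue of strings. The point is the ordering: one item is parsed per
iteration, and the socket is read only when the buffer yielded nothing.

```cpp
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for a socket: each Read() returns the next
// pending packet, or an empty string when nothing more will arrive.
struct FakeSocket {
	std::deque<std::string> packets;
	int reads = 0;

	std::string Read()
	{
		reads++;
		if (packets.empty())
			return std::string();
		std::string data = packets.front();
		packets.pop_front();
		return data;
	}
};

// Extracts one CRLF-terminated line from the front of the buffer.
// Returns false when the buffer does not yet hold a complete line.
bool
TakeLine(std::string& buffer, std::string& line)
{
	std::string::size_type end = buffer.find("\r\n");
	if (end == std::string::npos)
		return false;
	line = buffer.substr(0, end);
	buffer.erase(0, end + 2);
	return true;
}

// Parses at most one line per iteration and touches the socket only
// when the buffer did not contain a complete line.
std::vector<std::string>
ReadLines(FakeSocket& socket)
{
	std::vector<std::string> lines;
	std::string buffer;
	while (true) {
		std::string line;
		if (TakeLine(buffer, line)) {
			lines.push_back(line);
			continue;
		}
		std::string data = socket.Read();
		if (data.empty())
			break;
		buffer += data;
	}
	return lines;
}
```

With this ordering, data already sitting in the buffer is always consumed
before the next (possibly blocking) read is attempted.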

This should fix several cases where the network code was stuck doing
nothing, including https://my.justenergy.com/ reported in #13010.


# 2ecff85c 13-Nov-2015 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: don't send an empty URL for GET request

* The new proxy in Thalys trains doesn't like that.


# c6149613 10-Nov-2015 Adrien Destugues <pulkomandy@pulkomandy.tk>

Implement CONNECT pass-through for HTTPS proxy

* When using a proxy, HTTPS connections must still go directly to the
target website. The proxy can then act as a TCP stream relay and just
transmit the raw SSL stream between the client and website.
* For this, we ask the proxy by sending an HTTP request with the CONNECT
method. If the proxy supports this, we can then send anything as the
payload and it will be forwarded.
* Untested, as the network here in Dusseldorf doesn't let me use a
proxy.

ticket : #10973
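
As an illustration of the CONNECT request mentioned above, here is a
sketch in standard C++. The helper name BuildConnectRequest is
hypothetical and this is not the code from the commit; it only shows the
shape of the request, whose target is the "host:port" of the website
rather than a path.

```cpp
#include <string>

// Hypothetical helper: builds the request sent to an HTTP proxy to ask
// for a raw TCP tunnel to the target website. Once the proxy accepts,
// the SSL stream can be sent through the same connection.
std::string
BuildConnectRequest(const std::string& host, int port)
{
	const std::string authority = host + ":" + std::to_string(port);
	return "CONNECT " + authority + " HTTP/1.1\r\n"
		"Host: " + authority + "\r\n\r\n";
}
```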


# 4849ab6c 09-Nov-2015 Adrien Destugues <pulkomandy@pulkomandy.tk>

BHttpRequest: add SSL certificate exception management.

When an HTTPS request uses an SSL certificate that OpenSSL considers
untrusted, and the user decides to continue anyway, add the certificate
to an exception list. Match certificates against this list and don't ask
the user again if they are already there.

Fixes #12004. Thanks to markh for the initial patch and peeking into the
WebKit code!


# caf2bf01 18-Feb-2015 Adrien Destugues <pulkomandy@gmail.com>

indentation fix.

Caught by Axel.


# 6f1d5d48 19-Feb-2015 Adrien Destugues <pulkomandy@gmail.com>

HttpRequest: implement POST>GET conversion on redirects

302 and 303 redirects must convert POST requests to GET (and remove the
POST data).
Fixes the following problems (at least):
* Login to github going to the "unicorn!" page
* Gmail failing to load and staying at the loaderbar page
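
The rule stated above can be sketched as a small function. This is
standard C++ for illustration only; MethodAfterRedirect is a hypothetical
name, not a method of BHttpRequest.

```cpp
#include <string>

// Hypothetical helper: returns the method to use when following a
// redirect. As described in the commit message, 302 and 303 turn a POST
// into a GET; the caller is then expected to drop the POST data.
std::string
MethodAfterRedirect(int statusCode, const std::string& method)
{
	if ((statusCode == 302 || statusCode == 303) && method == "POST")
		return "GET";
	return method;
}
```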


# c861dfdb 26-Jan-2015 Adrien Destugues <pulkomandy@gmail.com>

HttpRequest: fix HTTP to HTTPS redirects

When redirected from http to https, we did not switch to SSL and port
443 and kept using unencrypted http on port 80.


# 71c761d9 05-Jan-2015 Adrien Destugues <pulkomandy@gmail.com>

HttpRequest: rewind the input data before sending it

* This is needed for redirects to work as expected.
* It makes it harder to send data starting from the middle of a
BPositionIO (you now need a wrapper object), but that is an uncommon
feature so it is acceptable.

Fixes #11687.


# 5ee2151e 04-Nov-2014 Adrien Destugues <pulkomandy@gmail.com>

BHttpRequest: propagate SSL errors to listener

This way it's possible to handle them in applications.


# 4f978fe4 18-Oct-2014 Adrien Destugues <pulkomandy@gmail.com>

BNetBuffer: add some error checks.

The allocation of fImpl can fail, and some methods used it without
checking. Return an error code (or NULL or 0) instead of crashing in
these cases.

Also InitCheck the fInputBuffer in BHttpRequest before trying to use it.

Fixes #11350.


# c98378e5 15-Sep-2014 Adrien Destugues <pulkomandy@gmail.com>

Add HTTP proxy support.

* Move default context management to BUrlRequest since some code
(including the testsuite) bypass the BUrlProtocolRoster.
* Introduce proxy host and port in BUrlContext
* Have BHttpRequest use the proxy when making requests


# 89b4e98a 04-Aug-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Move signal hack to BNetworkRequest

* This is used to unlock sockets when a read is pending after a close
* It is not needed on requests that don't use a socket.


# 2f9b1874 04-Aug-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Factor out a BNetworkRequest

* Shares common behavior between the Gopher and HTTP request handlers.
* Most of this can be used when implementing other protocols.


# bcd6a67b 04-Aug-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Don't advertise deflate compression support.

There is some misunderstanding about what "deflate" is, and we can't
reliably decode it in all cases. So, don't advertise support for it and
let servers use gzip (or no compression) instead.

Fixes #11093


# a1cce970 28-Jul-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: more small fixes and cleanups

* Remove unneeded field fOutputHeaders and convert it to a local for the
only method that uses it,
* Don't return EOVERFLOW when flushing data from ZLib (the ZLib
decompressor returns this, but the zlib docs state that this is NOT an
error condition).
* Replace unneeded temporary BNetBuffer of fixed size with BStackOrHeapArray.


# 021ebc2f 28-Jul-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Add the port to the HTTP Host header when needed.

* When the port is not the default one, it must be added to the "Host"
header so the server knows what we're connecting to.

Fixes #11070.
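
A minimal sketch of the rule above, in standard C++. The helper name
HostHeaderValue is hypothetical; the default ports (80 for http, 443 for
https) are the standard ones.

```cpp
#include <string>

// Hypothetical helper: builds the value of the "Host" header. The port
// is appended only when it differs from the default for the scheme.
std::string
HostHeaderValue(const std::string& host, int port, bool useSSL)
{
	const int defaultPort = useSSL ? 443 : 80;
	if (port == defaultPort)
		return host;
	return host + ":" + std::to_string(port);
}
```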


# 6a13b12a 21-Jul-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Write all HTTP headers to the socket in one go.

We don't have support for TCP_CORK, which would let the kernel handle
this, so this resulted in lots of very small packets being sent over the
network. Besides the performance issues, this confused aliceadsl.fr HTTP
server and prevented logging in to their website.

Fixes #10556.
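
The idea can be sketched as follows, in standard C++ and with a
hypothetical helper name (BuildRequestHead): the request line and every
header are collected into one string, which the caller can then pass to
the socket in a single write instead of one write per header.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: serializes the request line and all headers into
// a single string, terminated by the empty line that ends the headers.
std::string
BuildRequestHead(const std::string& method, const std::string& path,
	const std::vector<std::pair<std::string, std::string>>& headers)
{
	std::string head = method + " " + path + " HTTP/1.1\r\n";
	for (const auto& header : headers)
		head += header.first + ": " + header.second + "\r\n";
	head += "\r\n";
	return head;
}
```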


# 9f7d29b0 21-Jul-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Fix two problems with chunked gzipped HTTP replies.

* receiveEnd is set in a different place in case of chunked transfers,
which would cause the decompressor to never be flushed.
* In the case of chunked transfers, we call Flush() without any input
data (to flush only whatever is remaining in the decompression buffer).
This causes ZLib to return Z_BUF_ERROR which is translated to
B_BUFFER_OVERFLOW. This is a non-fatal error and is expected behavior in
that case. Don't handle this as an error, and do use the extracted data.

Fixes various cases of missing the last chunk of a page (pastie.org,
Google search results, and more).


# 92dd9f73 16-Jul-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Style fixes, no functional changes.


# 3528905b 16-Jul-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Parse multiple HTTP headers at once

Instead of relying on the global protocol loop to call _ParseHeaders
once for each header, extract as much as possible from the current
buffer.

This saves memory, avoids useless operations on the socket and various
processing steps, and fixes #10245.

Also improve the handling of 0-size requests to make sure they terminate
properly.


# 2573655b 02-Jul-2014 Ingo Weinhold <ingo_weinhold@gmx.de>

Revert "Revert "HttpRequest: support gzip and deflate compression.""

This reverts commit 256080b112e417fc4fd2f3f9fcb23485e1b23b42.

With the following changes:
* Adjusted to the BZlibCompressionAlgorithm API.
* Add some error handling.


# 256080b1 18-Jun-2014 Ingo Weinhold <ingo_weinhold@gmx.de>

Revert "HttpRequest: support gzip and deflate compression."

This reverts commit c3d0dd7a5e6ca1d2d43b6ebfb4c6a67300c780f7.

Conflicts:
src/kits/network/libnetapi/HttpRequest.cpp
src/kits/network/libnetapi/Jamfile


# 895fa41e 11-Jun-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Make handling of Http Authentication thread safe

* Each BHttpAuthentication object is locked on all field accesses,
* They are owned by the BUrlContext and never deleted, so there is no
need for reference-counting them,
* The BUrlContext itself is now reference counted, and all BUrlRequests
hold a reference to it.

This makes sure using the BHttpAuthentication objects from requests is
thread-safe.


# 463ffbfd 11-Jun-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

First steps towards cookie jar thread-safety

* Change the semantics of the iterators' copy constructor and assignment
operator: they now return a new iterator for the same cookie jar (and
same url for the UrlIterator). They don't try to point to the same
position as the copied iterator. The only purpose of these is to write
code such as:

Iterator it = jar.GetIterator();

so having a full copy isn't that useful.

* The per-domain cookie lists are now protected with a read-write lock.
The iterators retain a read lock while they are handling cookies from
that list. They get a write lock when doing Remove. Adding a cookie to
the jar also gets the write lock for the matching list

* Fix a memory leak when adding a new domain-list to the jar failed

* Simplify the declaration of the PrivateHashMap type (it would be
even simpler if HashMap was a public API)

* The domain hashmap is now a SynchronizedHashMap. It is locked as long
as an Iterator or UrlIterator exists, which may be a problem as these
are public APIs. Writing safe iterators for a hashmap with concurrent
accesses is not easy, so the API could be modified to return a list of
domains and a list of cookies for a given domain or URL instead. This
would suit the intended uses just as well.

* The jar now stores const cookies, so there is no need to lock them for
access/modification. Updating a cookie is done by replacing it with
another one in the jar (with the same domain and value). There is still
the problem of deleting a cookie while other threads may still access
it; this will be fixed by making cookies BReferenceable.


# 4991d3fb 12-Apr-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Fix build.


# cfc4b623 12-Apr-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Network Kit: Prepare for HTTP range requests

* The DataReceived hook gets a position argument, making it possible for
listeners to handle out-of-order data (from two range requests at
different positions, for example)
* Adjust HaikuDepot (only user of the API in our sources)
* Add a copy constructor to HTTPRequest that copies the relevant
parameters from an existing request. Makes it easy to repeat a request
with a different range. Could be useful for restarting downloads, or
parallelizing them.
* Add SetRangeStart, SetRangeEnd calls to HTTPRequest, no implementation
yet. I'm putting all the API changes in this commit as it needs to be
synced with a matching haikuwebkit release.
* All archs must update to HaikuWebkit 1.3.0. Previous versions are
broken by this.


# a8d8e823 09-Apr-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: handle 302 and 307 redirects.

* Makes jamendo.com player work, as their soundfiles are behind a
temporary redirect for load balancing.


# 67d06c88 17-Feb-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: remove "chunked" from http accept-encoding.

This is useless, chunked support is mandatory in HTTP1.1, and it's not a
content-encoding, but a transfer-encoding, so accept-encoding wouldn't
help anyway.


# 3e2e0e63 12-Feb-2014 Stephan Aßmus <superstippi@gmx.de>

BHttpRequest: Improve cookie string building loop...

... to avoid some checks. Does it make the code more readable? Not
that it was hard to follow before.


# 1514eb37 13-Feb-2014 Stephan Aßmus <superstippi@gmx.de>

BHttpRequest: More elegant way to build cookieString


# 6aeaeade 13-Feb-2014 Stephan Aßmus <superstippi@gmx.de>

BHttpRequest: Small fixes

* Allow BString to find a semicolon more efficiently.
* Replace a dynamic stack allocated array with BStackOrHeapArray
in one more place.


# 3e358c1f 13-Feb-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: use BStackOrHeapArray

Sometimes we get enough bytes at once from the connection to trigger a
stack overflow. Allocate memory on the heap instead.


# c3d0dd7a 10-Feb-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: support gzip and deflate compression.

* Use the ZlibDecompressor to decompress the data
* Advertise support in accept-encoding

This should make web browsing feel even faster on websites that support
these compression schemes. It also fixes some websites (www.ru,
rainloop.net, ...) that serve gzipped resources even to browsers not
supporting it.


# 59246cf7 22-Jan-2014 Stephan Aßmus <superstippi@gmx.de>

HttpRequest: Apparently fContext can be NULL. CID 1162798


# ab390d3a 17-Jan-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Style fixes and allocation checks


# 547c1486 16-Jan-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Add some missing std::nothrow

... and allocation failure checks.


# 3d864cd8 10-Jan-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Remove B_PROT_* and related code

Use standard error codes instead.
This allows using error code returned by the underlying functions
directly, and makes it possible to use strerror for debugging. So, we
can also remove StatusString() from the various *Request classes.


# 5b53e2e5 02-Jan-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: close the connection on Stop()

When calling Stop(), we expect the request thread to exit as soon as
possible. Closing the connection unlocks it from any blocking read() or
write(), avoiding some lockup situations.


# 07d157db 11-Dec-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

BHttpRequest::Result() returns a BUrlResult&.

This overrides BUrlRequest::Result. The returned reference points to a
BHttpResult and can be cast by callers.


# dc6d2ef6 26-Nov-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: simplify and optimize receiving loop

* Do not start with a ridiculously small buffer for socket reads.
Sockets return data they have available, instead of trying to fill as
much of the buffer as possible. In some cases a single Ethernet frame
can hold a complete request.
* Remove some looping and try parsing all the request in sequence each
time we receive some bytes.
* Avoid reallocating a temporary buffer each time we read some data from
the socket. Instead, allocate it once, and grow it as needed. Since
servers usually send chunks of equal size, we should get away with one
reallocation on the first chunk.


# 754bbf48 26-Nov-2013 Jérôme Duval <jerome.duval@gmail.com>

libnetapi: second pass of style cleanup

* remarks from Axel


# 509755e1 26-Nov-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: remove fOutputBuffer

We can send the data directly to the output socket instead of copying it
into a BString first, at the cost of very slightly less information in
debug output.


# 564e2566 15-Nov-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

Various fixes to Services Kit

* Remove useless dummy protocol loop in UrlRequest
* Stop HTTP requests before deleting the socket and other things the
loop may still be using
* Deletion of items from the authentication map wasn't working
* Remove some debug traces


# c2c1ce1d 04-Nov-2013 John Scipione <jscipione@gmail.com>

Style fixes to HttpRequest


# 9ce2f7e3 28-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

Improve HTTP authentication support.

The authentication state is stored (in a hash map, using the domain+path
as a key) in the UrlContext class. It can then be reused for multiple
requests to the same place. We also lookup stored authentications for
parent directories and stop at the first we find.

Authentication state is not stored on disk (unlike cookies), and there
can only be one for each domain+path.
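
The lookup described above could look roughly like this. It is a sketch
in standard C++, not the UrlContext code: FindAuthentication is a
hypothetical name, the stored state is reduced to a string, and building
the key by concatenating domain and path is an assumption.

```cpp
#include <map>
#include <string>

// Hypothetical lookup: tries the exact domain+path key first, then each
// parent directory, and stops at the first stored entry. Returns nullptr
// when nothing is stored for the path or any of its parents.
const std::string*
FindAuthentication(const std::map<std::string, std::string>& store,
	const std::string& domain, std::string path)
{
	while (true) {
		auto found = store.find(domain + path);
		if (found != store.end())
			return &found->second;
		if (path.empty() || path == "/")
			return nullptr;
		// Drop a trailing slash, then cut back to the parent directory.
		if (path.back() == '/')
			path.pop_back();
		std::string::size_type slash = path.rfind('/');
		if (slash == std::string::npos)
			return nullptr;
		path.erase(slash + 1);
	}
}
```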


# f6782201 24-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

Move UrlResult to HttpResult

* Remove the fRawData field, as handling it is too complicated (it's
not easy to have proper copy semantics on a BDataIO) and it's not used
anyway, as the listener DataReceived call is enough to get the data and
handle it.
* All the remaining fields are HTTP-only, so rename the class to
HttpResult and attach it to HttpRequest instead of UrlRequest.


# b3d13a00 19-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

Network Kit: Coverity scan review and fixes

CID 1108353, 1108335: memory leak.
CID 610473: unused variable.
CID 1108446, 1108433, 1108432, 1108419, 1108400, 991710, 991713, 991712,
610098, 610097, 610096, 610095: uninitialized field
CID 1108421: unused field

Change the ownership of the result for Url/HttpRequests. The request now
owns its result and you either access it by reference while the request
is live, or copy it to keep it after the request destruction. To help
with that, get BUrlResult copy constructor and assignment operator to
work.

Performance issue: copying the BUrlResult also copies the underlying
BMallocIO data. This should be shared between the BUrlResult objects to
make the copy lighter. The case of BUrlSynchronousRequest is now
particularly inefficient, with at least 2 copies needed to get at the
result.


# 25b034e9 17-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: docs and memory management fixes

* Now takes ownership of headers, form data and input data
* Split Set* and Adopt* methods to help with proper use of this (Set
does a copy)
* Write documentation.


# ced0e0be 16-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

BUrl: use a regex to parse URLs

* The RFC provides a regular expression for URI parsing, so just use it.
* Allows parsing URIs with missing components (no scheme or authority)
* This allows parsing relative URLs as expected
* Can also handle things such as data: or mailto:
* Also more fixes to handling of incomplete URIs, some flags weren't
always set to the right values.

This gets Windows Live Mail (or is it called Outlook?) working, with
some other fixes on WebKit side.


# 7696f7dd 15-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: allow custom http methods

* The W3C XmlHttpRequest testsuite likes to use "CHICKEN" as a method.
* Also add constants for all specified methods in HTTP 1.1.


# c9d31eee 14-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

More cookie fixes

* Add some error handling in NetworkCookie and don't add broken cookies
(or should I say crumbs?) to the cookie jar
* More control on the path and domain, as well as the expiration time

We now pass Opera cookie testsuite functionality tests, as well as some
of the negative tests (we even do better than curl). Not going further
right now as this works well enough for positive cases and most
security/privacy issues are fixed (cross domain and cross path cookie
setting or spying).


# b7d85d66 11-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

fix build.


# f9d987ae 11-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: put cookies in a single header entry

* Http spec says headers can be split when they are comma separated
* However, cookies are semicolon separated, so it is not acceptable to
split them.
* We will want to implement some way to limit the cookie header entry
size, as servers have a limit on what they can accept (usually around 4K
characters). The RFC also says we don't need to remember more than 20
cookies per domain.
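
A minimal sketch of the resulting header value, in standard C++. The
helper name BuildCookieHeaderValue is hypothetical, and cookies are
reduced to ready-made "name=value" strings; the size limit mentioned
above is not handled here.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: joins "name=value" pairs into the value of one
// Cookie header, separated by "; " instead of one header per cookie.
std::string
BuildCookieHeaderValue(const std::vector<std::string>& cookies)
{
	std::string value;
	for (const std::string& cookie : cookies) {
		if (!value.empty())
			value += "; ";
		value += cookie;
	}
	return value;
}
```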


# 185471c8 10-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: follow 302 redirects by default.


# 8ca6eeb7 09-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: missing fields initializations

* Some fields weren't initialized, leading to random crashes later on
* Remove the enum that was used for protocol options
* Use a single field to track the request state, instead of separate
booleans.


# a5826aaf 08-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

Don't send a chunked transfer terminator for non-chunked transfers.

* Fixes oversight from previous change.
* Thanks hamishm for watching!


# afd547b3 08-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

Refactor UrlRequest/UrlProtocol in the Service Kit

* Remove the BUrlRequest class, which was only delegating work to
BUrlProtocol and subclasses
* Rename BUrlProtocol to BUrlRequest, and BUrlRequestHttp to BHttpRequest
* Creating a request is now done through the BUrlProtocolRoster. For
now there is just a static MakeRequest method, this will be completed
when we get to actually allowing add-ons to provide different request
handlers.

This allows cleanup of the API for requests:
* Remove the universal SetOption method with constants, and have
dedicated setters for each protocol option.
* Setters can now have multiple parameters, for example you can give
BHttpRequest a BDataIO and a known size
* In this case, the BHttpRequest will not use HTTP chunked transfers,
which were always used before and made most servers unhappy (tested and
failed with lighttpd, google accounts and github).


# caf2bf0181d1216576f9c06b6c6886194241c325 18-Feb-2015 Adrien Destugues <pulkomandy@gmail.com>

indentation fix.

Catched by Axel.


# 6f1d5d480b3d6c96e3324f8f3792dd71071224bb 19-Feb-2015 Adrien Destugues <pulkomandy@gmail.com>

HttpRequest: implement POST>GET conversion on redirects

302 and 303 redirects must convert POST requests to GET (and remove the
POST data).
Fixes the following problems (at least):
* Login to github going to the "unicorn!" page
* Gmail failing to load and staying at the loaderbar page


# c861dfdb7e58022b877c94a6f4ff3144854a240b 26-Jan-2015 Adrien Destugues <pulkomandy@gmail.com>

HttpRequest: fix HTTP to HTTPS redirects

When redirected from http to https, we did not switch to SSL and port
443 and kept using unencrypted http on port 80.


# 71c761d99bdc590d853f7a246d14247493008a0e 05-Jan-2015 Adrien Destugues <pulkomandy@gmail.com>

HttpRequest: rewind the input data before sending it

* This is needed for redirects to work as expected.
* It makes it harder to send data starting from the middle of a
BPositionIO (you now need a wrapper object), but that is an uncommon
feature so it is acceptable.

Fixes #11687.


# 5ee2151e2cfd6b8cff463f4b94a8dd166ff4c5f6 04-Nov-2014 Adrien Destugues <pulkomandy@gmail.com>

BHttpRequest: propagate SSL errors to listener

This way it's possible to handle them in applications.


# 4f978fe4dbc53cdae952b9df5d618cf8d9023dcb 18-Oct-2014 Adrien Destugues <pulkomandy@gmail.com>

BNetBuffer: add some error checks.

The allocation of fImpl can fail, and some methods used it without
checking. Return an error code (or NULL or 0) instead of crashing in
these cases.

Also InitCheck the fInputBuffer in BHttpRequest before trying to use it.

Fixes #11350.


# c98378e51ae02d8ad6e833c7eb8223a10cbd46f5 15-Sep-2014 Adrien Destugues <pulkomandy@gmail.com>

Add HTTP proxy support.

* Move default context management to BUrlRequest since some code
(including the testsuite) bypass the BUrlProtocolRoster.
* Introduce proxy host and port in BUrlContext
* Have BHttpRequest use the proxy when making requests


# 89b4e98a8fbef003d06732181a384bda84c969c8 04-Aug-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Move signal hack to BNetworkRequest

* This is used to unlock sockets when a read is pending after a close
* It is not needed on requests that don't use a socket.


# 2f9b1874977669807fe200c7d3595d86a1984454 04-Aug-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Factor out a BNetworkRequest

* Shares common behavior between the Gopher and HTTP request handlers.
* Most of this can be used when implemeting other protocols.


# bcd6a67bc063e0ec4fd4506f7d573d24cac8fb06 04-Aug-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Don't advertise deflate compression support.

There is some misunderstanding on what the "deflate" is, and we can't
reliably decode it in all cases. So, don't advertise support for it and
let servers use gzip (or no compression) instead.

Fixes #11093


# a1cce97050323eca3dc37d2d44eaf89b3e0be323 28-Jul-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: more small fixes and cleanups

* Remove unneeded field fOutputHeaders and convert it to a local for the
only method that uses it,
* Don't return EOVERFLOW when flushing data from ZLib (the ZLib
decompressor returns this, but zlib docs states that this is NOT an
error condition).
* Replace unneeded temporary BNetBuffer of fixed size with BStackOrHeapArray.


# 021ebc2f8c1c004defbb817a7cdebf80b6eaf346 28-Jul-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Add the port to the HTTP Host header when needed.

* When the port is not the default one, it must be added to the "Host"
header so the server knows what we're connecting to.

Fixes #11070.


# 6a13b12a9bc79961dda05acd35f99e9dc9c3b04d 21-Jul-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Write all HTTP headers to the socket in one go.

We don't have support for TCP_CORK, which would let the kernel handle
this, so this resulted in lots of very small packets being sent over the
network. Besides the performance issues, this confused aliceadsl.fr HTTP
server and prevented logging in to their website.

Fixes #10556.


# 9f7d29b05e4322045bb91438fa951fd454df43e0 21-Jul-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Fix two problems with chunked gzipped HTTP replies.

* receiveEnd is set in a different place in case of chunked transfers,
which would cause the decompressor to never be flushed.
* In the case of chunked transfers, we call Flush() without any input
data (to flush only whatever is remaining in the decompression buffer).
This causes ZLib to return Z_BUF_ERROR which is translated to
B_BUFFER_OVERFLOW. This is a non-fatal error and is expected behavior in
that case. Don't handle this as an error, and do use the extracted data.

Fixes various cases of missing the last chunk of a page (pastie.org,
Google search results, and more).


# 92dd9f7360a9f664088b00c2bc0f34527b798730 16-Jul-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Style fixes, no functional changes.


# 3528905be60b3752b3a99f5c1ce32bb7f74e6be8 16-Jul-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Parse multiple HTTP at once

Instead of relying on the global protocol loop to call _ParseHeaders
once for each header, extract as much as possible from the current
buffer.

This saves memory, avoids useless operations on the socket and various
processing steps, and fixes #10245.

Also improve the handling of 0-size requests to make sure they terminate
properly.


# 2573655b7962928c847ecc4690d73f0f5b6afb19 02-Jul-2014 Ingo Weinhold <ingo_weinhold@gmx.de>

Revert "Revert "HttpRequest: support gzip and deflate compression.""

This reverts commit 256080b112e417fc4fd2f3f9fcb23485e1b23b42.

With the following changes:
* Adjusted to the BZlibCompressionAlgorithm API.
* Add some error handling.


# 256080b112e417fc4fd2f3f9fcb23485e1b23b42 18-Jun-2014 Ingo Weinhold <ingo_weinhold@gmx.de>

Revert "HttpRequest: support gzip and deflate compression."

This reverts commit c3d0dd7a5e6ca1d2d43b6ebfb4c6a67300c780f7.

Conflicts:
src/kits/network/libnetapi/HttpRequest.cpp
src/kits/network/libnetapi/Jamfile


# 895fa41e0b50a1117eb2a20a539f394aaa26de51 11-Jun-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Make handling of Http Authentication thread safe

* Each BHttpAuthentication object is locked on all field accesses,
* They are owned by the BUrlContext and never deleted, so there is no
need for reference-counting them,
* The BUrlContext itself is now reference counted, and all BUrlRequests
hold a reference to it.

This makes sure using the BHttpAuthentication objects from requests is
thread-safe.


# 463ffbfde42e3486c5b2366b7903a3aa1ddf77e1 11-Jun-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

First steps towards cookie jar thread-safety

* Change the semantics of the iterators copy constructor and assignment
operator: they now return a new iterator for the same cookie jar (and
same url for the UrlIterator). They don't try to point to the same
position as the copied iterator. The only purpose of these is to write
code such as:

Iterator it = jar.GetIterator();

so having a full copy isn't that useful.

* The per-domain cookie lists are now protected with a read-write lock.
The iterators retain a read lock while they are handling cookies from
that list. They get a write lock when doing Remove. Adding a cookie to
the jar also gets the write lock for the matching list

* Fix a memory leak when adding a new domain-list to the jar failed

* Simplify the declaration of the PrivateHashMap type (it would be
even simpler if HashMap was a public API)

* The domain hashmap is now a SynchronizedHashMap. It is locked as long
as an Iterator or UrlIterator exists, which may be a problem as these
are public APIs. Writing safe iterators for a hashmap with concurrent
accesses is not easy, so the API could be modified to return a list of
domains and a list of cookies for a given domain or URL instead. This
would suit the intended uses just as well.

* The jar now stores const cookies, so there is no need to lock them for
access/modification. Updating a cookie is done by replacing it with
another one in the jar (with the same domain and value). There is still
the problem of deleting a cookie while other threads may still access
it, this will be fixed by making cookies BReferenceable.


# 4991d3fb528cbf34a3103dfc124c9f29a6d499c4 12-Apr-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Fix build.


# cfc4b62367c09ad0c26ba233d3804d815948463a 12-Apr-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Network Kit: Prepare for HTTP range requests

* The DataReceived hook gets a position argument, making it possible for
listeners to handle out-of-order data (from two range requests at
different positions, for example)
* Adjust HaikuDepot (only user of the API in our sources)
* Add a copy constructor to HTTPRequest that copies the relevant
parameters from an existing request. Makes it easy to repeat a request
with a different range. Could be useful for restarting downloads, or
parallelizing them.
* Add SetRangeStart, SetRangeEnd calls to HTTPRequest, no implementation
yet. I'm putting all the API changes in this commit as it needs to be
synced with a matching haikuwebkit release.
* All archs must update to HaikuWebkit 1.3.0. Previous versions are
broken by this.
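
A minimal sketch of why a position argument helps a listener, assuming
a hypothetical PositionedSink class (it is not the Network Kit listener
API, and its DataReceived signature is invented for illustration). Data
from two ranges can arrive in any order and still land at the right
offset:

```cpp
#include <cstddef>
#include <string>

// Hypothetical listener-like class, for illustration only.
class PositionedSink {
public:
	// Store `size` bytes of `data` at offset `position`, growing the
	// buffer when the write reaches past its current end.
	void DataReceived(const char* data, std::size_t position,
		std::size_t size)
	{
		if (fBuffer.size() < position + size)
			fBuffer.resize(position + size, '\0');
		fBuffer.replace(position, size, data, size);
	}

	const std::string& Buffer() const { return fBuffer; }

private:
	std::string fBuffer;
};
```

Without the position, the sink could only append, so a second range
request running in parallel would corrupt the output.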


# a8d8e823ea0b5ad0716d028ee43381a0c4691664 09-Apr-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: handle 302 and 307 redirects.

* Makes jamendo.com player work, as their soundfiles are behind a
temporary redirect for load balancing.


# 67d06c88020cf9a6dac2718b8ca18544325e4397 17-Feb-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: remove "chunked" from http accept-encoding.

This is useless, chunked support is mandatory in HTTP1.1, and it's not a
content-encoding, but a transfer-encoding, so accept-encoding wouldn't
help anyway.


# 3e2e0e63cdb9945167228e7a5ea43d96988a57a5 12-Feb-2014 Stephan Aßmus <superstippi@gmx.de>

BHttpRequest: Improve cookie string building loop...

... to avoid some checks. Does it make the code more readable? Not
that it was hard to follow before.


# 1514eb37531fa415e4bbea76e3a9b250a8c62a42 13-Feb-2014 Stephan Aßmus <superstippi@gmx.de>

BHttpRequest: More elegant way to build cookieString


# 6aeaeade6b48a9161ee6b888f8ca6a9acad665d8 13-Feb-2014 Stephan Aßmus <superstippi@gmx.de>

BHttpRequest: Small fixes

* Allow BString to find a semicolon more efficiently.
* Replace a dynamic stack allocated array with BStackOrHeapArray
in one more place.


# 3e358c1fcab3196ff9e1805a3fc484fd9884cac7 13-Feb-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: use BStackOrHeapArray

Sometimes we get enough bytes at once from the connection to trigger a
stack overflow. Allocate memory on the heap instead.


# c3d0dd7a5e6ca1d2d43b6ebfb4c6a67300c780f7 10-Feb-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: support gzip and deflate compression.

* Use the ZlibDecompressor to decompress the data
* Advertise support in accept-encoding

This should make web browsing feel even faster on websites that support
these compression schemes. It also fixes some websites (www.ru,
rainloop.net, ...) that serve gzipped resources even to browsers not
supporting it.


# 59246cf737f7866f93aa8a704ee820a5cd730704 22-Jan-2014 Stephan Aßmus <superstippi@gmx.de>

HttpRequest: Apparently fContext can be NULL. CID 1162798


# ab390d3af3a0167584041758de25f383660c5332 17-Jan-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Style fixes and allocation checks


# 547c1486ff31415b89ffc4e87e3d06e933850b96 16-Jan-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Add some missing std::nothrow

... and allocation failure checks.


# 3d864cd870209032d0aa58c39b40b9cf15c79d1e 10-Jan-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

Remove B_PROT_* and related code

Use standard error codes instead.
This allows using error codes returned by the underlying functions
directly, and makes it possible to use strerror for debugging. So, we
can also remove StatusString() from the various *Request classes.


# 5b53e2e516b47db8f6b932c43a7b8389abb18587 02-Jan-2014 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: close the connection on Stop()

When calling Stop(), we expect the request thread to exit as soon as
possible. Closing the connection unlocks it from any blocking read() or
write(), avoiding some lockup situations.


# 07d157db549c536dcf3450f73db36ba4f7c020ec 11-Dec-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

BHttpRequest::Result() returns a BUrlResult&.

This overrides BUrlRequest::Result. The returned reference points to a
BHttpResult and can be cast by callers.


# dc6d2ef6642035671a23b09b61f41a938aaeac61 26-Nov-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: simplify and optimize receiving loop

* Do not start with a ridiculously small buffer for socket reads.
Sockets return data they have available, instead of trying to fill as
much of the buffer as possible. In some cases a single Ethernet frame
can hold a complete request.
* Remove some looping and try parsing all the request in sequence each
time we receive some bytes.
* Avoid reallocating a temporary buffer each time we read some data from
the socket. Instead, allocate it once, and grow it as needed. Since
servers usually send chunks of equal size, we should get away with one
reallocation on the first chunk.


# 754bbf4866278ecd2da2c517560bc90c67a3a6f5 26-Nov-2013 Jérôme Duval <jerome.duval@gmail.com>

libnetapi: second pass of style cleanup

* remarks from Axel


# 509755e136168e2930dd7e1301d979f6e9244778 26-Nov-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: remove fOutputBuffer

We can send the data directly to the output socket instead of copying it
into a BString first, at the cost of very slightly less information in
debug output.


# 564e2566492c1b9cf9bf7fdaede7ea7683dab5dd 15-Nov-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

Various fixes to Services Kit

* Remove useless dummy protocol loop in UrlRequest
* Stop HTTP requests before deleting the socket and other things the
loop may still be using
* Deletion of items from the authentication map wasn't working
* Remove some debug traces


# c2c1ce1dc5b934c07e7a7a276501e18ce5f78471 04-Nov-2013 John Scipione <jscipione@gmail.com>

Style fixes to HttpRequest


# 9ce2f7e3863c7c69284eaad6466c6b0247f0037b 28-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

Improve HTTP authentication support.

The authentication state is stored (in a hash map, using the domain+path
as a key) in the UrlContext class. It can then be reused for multiple
requests to the same place. We also look up stored authentications for
parent directories and stop at the first we find.

Authentication state is not stored on disk (unlike cookies), and there
can only be one for each domain+path.
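
The parent-directory lookup described above can be sketched as follows.
This is an illustration under assumptions, not the UrlContext code:
AuthMap, FindAuthentication and the "domain + path" string key format
are hypothetical, and a plain string stands in for the stored
authentication state.

```cpp
#include <map>
#include <string>

// Hypothetical store: key is domain + path, value stands in for the
// saved authentication state.
typedef std::map<std::string, std::string> AuthMap;

// Try the exact path first, then each parent directory, and stop at the
// first stored entry. Returns nullptr when nothing matches.
const std::string*
FindAuthentication(const AuthMap& map, const std::string& domain,
	std::string path)
{
	for (;;) {
		AuthMap::const_iterator found = map.find(domain + path);
		if (found != map.end())
			return &found->second;
		if (path.size() <= 1)
			return nullptr;
		// Drop the last path component, keeping the trailing slash:
		// "/a/b/c" -> "/a/b/" -> "/a/" -> "/".
		if (path.back() == '/')
			path.pop_back();
		path.erase(path.rfind('/') + 1);
	}
}
```

With an entry stored for "example.org/docs/", a request for
"/docs/private/file.html" on that domain would reuse it.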


# f6782201f0dd84204dacf7b1f018f56d9594d972 24-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

Move UrlResult to HttpResult

* Remove the fRawData field, as handling it is too complicated (it's
not easy to have proper copy semantics on a BDataIO) and it's not used
anyway, as the listener DataReceived call is enough to get the data and
handle it.
* All the remaining fields are HTTP-only, so rename the class to
HttpResult and attach it to HttpRequest instead of UrlRequest.


# b3d13a000c6fd58d91bdf15fa3abdcc4d3546eff 19-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

Network Kit: Coverity scan review and fixes

CID 1108353, 1108335: memory leak.
CID 610473: unused variable.
CID 1108446, 1108433, 1108432, 1108419, 1108400, 991710, 991713, 991712,
610098, 610097, 610096, 610095: uninitialized field
CID 1108421: unused field

Change the ownership of the result for Url/HttpRequests. The request now
owns its result and you either access it by reference while the request
is live, or copy it to keep it after the request destruction. To help
with that, get BUrlResult copy constructor and assignment operator to
work.

Performance issue: copying the BUrlResult also copies the underlying
BMallocIO data. This should be shared between the BUrlResult objects to
make the copy lighter. The case of BUrlSynchronousRequest is now
particularly inefficient, with at least 2 copies needed to get at the
result.


# 25b034e99c56c4b297dcb23ea58262fd81660bac 17-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: docs and memory management fixes

* Now takes ownership of headers, form data and input data
* Split Set* and Adopt* methods to help with proper use of this (Set
does a copy)
* Write documentation.


# ced0e0be044a482503d4f9d37e9db1b0af77143f 16-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

BUrl: use a regex to parse URLs

* The RFC provides a regular expression for URI parsing, so just use it.
* Allows parsing URIs with missing components (no scheme or authority)
* This allows parsing relative URLs as expected
* Can also handle things such as data: or mailto:
* Also more fixes to handling of incomplete URIs, some flags weren't
always set to the right values.

This gets Windows Live Mail (or is it called Outlook?) working, with
some other fixes on WebKit side.


# 7696f7dd5492c8553f1c81a7ac27c4c4403dbcaa 15-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: allow custom http methods

* The W3C XmlHttpRequest testsuite likes to use "CHICKEN" as a method.
* Also add constants for all specified methods in HTTP 1.1.


# c9d31eeed6ffa94ae37ce66959242a9a9ed40a60 14-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

More cookie fixes

* Add some error handling in NetworkCookie and don't add broken cookies
(or should I say crumbs?) to the cookie jar
* More control on the path and domain, as well as the expiration time

We now pass Opera cookie testsuite functionality tests, as well as some
of the negative tests (we even do better than curl). Not going further
right now as this works well enough for positive cases and most
security/privacy issues are fixed (cross domain and cross path cookie
setting or spying).


# b7d85d666a5f56a33f856c13933f31bb4791e4c4 11-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

fix build.


# f9d987ae68a6355b123ce9f9b4f96f772fab2d7d 11-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: put cookies in a single header entry

* Http spec says headers can be split when they are comma separated
* However, cookies are semicolon separated, so it is not acceptable to
split them.
* We will want to implement some way to limit the cookie header entry
size, as servers have a limit on what they can accept (usually around 4K
characters). The RFC also says we don't need to remember more than 20
cookies per domain.
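
A minimal sketch of the single-entry approach, assuming a hypothetical
Cookie struct and BuildCookieHeader helper (neither is the Network Kit
API). All cookies are joined with "; " into one header line instead of
being emitted as separate entries:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a stored cookie.
struct Cookie {
	std::string name;
	std::string value;
};

// Build one "Cookie:" header line from all cookies, separated by "; ".
// Returns an empty string when there is nothing to send.
std::string
BuildCookieHeader(const std::vector<Cookie>& cookies)
{
	std::string list;
	for (const Cookie& cookie : cookies) {
		if (!list.empty())
			list += "; ";
		list += cookie.name + "=" + cookie.value;
	}
	if (list.empty())
		return list;
	return "Cookie: " + list;
}
```

The sketch does not enforce any size limit; a real implementation would
still need the cap mentioned above.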


# 185471c84479a23ff1cd2eed751002c986287524 10-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: follow 302 redirects by default.


# 8ca6eeb77ca6e2715f5a4c4806211b503472d5ae 09-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

HttpRequest: missing fields initializations

* Some fields weren't initialized, leading to random crashes later on
* Remove the enum that was used for protocol options
* Use a single field to track the request state, instead of separate
booleans.


# a5826aafb070e7b186f9675a6bc8319b0dc2c5ca 08-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

Don't send a chunked transfer terminator for non-chunked transfers.

* Fixes oversight from previous change.
* Thanks hamishm for watching!


# afd547b368a43aa45c2be5dde6052242ea1eefce 08-Oct-2013 Adrien Destugues <pulkomandy@pulkomandy.tk>

Refactor UrlRequest/UrlProtocol in the Service Kit

* Remove the BUrlRequest class, which was only delegating work to
BUrlProtocol and subclasses
* Rename BUrlProtocol to BUrlRequest, and BUrlRequestHttp to BHttpRequest
* Creating a request is now done through the BUrlProtocolRoster. For
now there is just a static MakeRequest method, this will be completed
when we get to actually allowing add-ons to provide different request
handlers.

This allows cleanup of the API for requests:
* Remove the universal SetOption method with constants, and have
dedicated setters for each protocol option.
* Setters can now have multiple parameters, for example you can give
BHttpRequest a BDataIO and a known size
* In this case, the BHttpRequest will not use HTTP chunked transfers,
which were always used before and made most servers unhappy (tested and
failed with lighttpd, google accounts and github).