478 Commits

Author SHA1 Message Date
977142214c client: fix race between transport draining and new RPCs (#2919)
Before these fixes, it was possible to see errors on new RPCs after a
connection began draining, and before establishing a new connection.  There is
an inherent race between choosing a SubConn and attempting to creating a stream
on it.  We should be able to avoid application-visible RPC errors due to this
with transparent retry.  However, several bugs were preventing this from
working correctly:

1. Non-wait-for-ready RPCs were skipping transparent retry, though the retry
design calls for retrying them.

2. The transport closed itself (and would consequently error new RPCs) before
notifying the SubConn that it was draining.

3. The SubConn wasn't synchronously updating itself once it was notified about
the closing or draining state.

4. The SubConn would go into the TRANSIENT_FAILURE state instantaneously,
causing RPCs to fail instead of queue.
2019-07-22 16:07:55 -07:00
f7de2c8d62 balancer: filter out grpclb addresses if balancer is not grpclb (#2907) 2019-07-17 15:08:56 -07:00
59fd1f3d41 server: immediately close all connections created after GracefulStop (#2903)
Internal cleanup: replace quit/quitOnce/done/doneOnce with grpcsync.Events.
2019-07-12 13:14:19 -07:00
5caf962939 client: addrConn NewStream and health check cleanup (#2848) 2019-06-26 11:15:17 -07:00
7472edcc1e metadata: write original md before appended md (#2879) 2019-06-25 10:34:12 -07:00
70e8b38052 test: end2end test improvements separate server and client configs. (#2877)
- Seperated and documented the options for client and server sides.
- Better support for multiple grpc.Servers. This will be used in other
  improvements that I have in the works.
- Moved some common functionality from channelz_test.go to
  end2end_test.go.
- Added an option to use the default health service implementation, instead
 of each test creating a new health.Server and passing it in. The
 inidividual tests have not been changed in this PR. I will do that in a
 follow up PR to keep the changes to a reasonable size.
- Fixed one of the tests which had to be fixed because of the separation
  of client and server configs.
2019-06-24 14:53:45 -07:00
ecb921ddb9 test: end2end test cleanup http handler server (#2876)
* end2end test cleanup #1

- Removed some old code which has a TODO asking for it's removal once
  Go1.6 and Go1.7 support is gone.
- Cleaned up a couple of error messages along with it.
2019-06-20 15:35:55 -07:00
fc15416d24 test: fix GoAwayThenClose by wait for cc state change (#2855)
In the end of the test, 10 RPCs are made to make sure data is sent to
the second server. The first RPC of these 10 is made right after the
second server's listener receives a connection. But at this time, the
connectivity state on the client side is not set to READY yet (though
ac's state should be either connecting or ready, the race between ac
and balancer could cause cc to still be in transient failure). So the
first RPC fails due to transient failure, but the following 9 will
succeed.
2019-06-13 15:57:55 -07:00
c7831546a1 test: extend RPC timeout for TestHTTPHeaderFrameErrorHandlingHTTPMode (#2861)
This test sometimes fails with error creating stream due to
DeadlineExceeded. It's very hard to reproduce (failed twice in 100000
runs). Extend the RPC timeout in case it's too short.
2019-06-13 15:29:28 -07:00
684ef04609 Fix a typo in the comment. (#2866)
I was trying to run this test and I had copied the name of the function
from the comment, and it took a good while to figure out why
`go test -run` was returning `testing: warning: no tests to run`.
2019-06-12 15:56:39 -07:00
cd89eaf40e test: fix Test/GracefulStop by not removing activeStreams too aggresivelly (#2857)
Before this fix, stream is removed from activeStreams in finishStream,
which happens when the service handler returns status, without waiting
for the status to be sent by loopyWriter. If GracefulStop() is called in
between, it will close the connection (because activeStreams is empty),
which causes the RPC to fail with "transport is closing". This change
moves the activeStreams cleanup into loopyWriter, after sending status
on wire.
2019-06-12 10:26:18 -07:00
a5396fd45c Remove call to proto.Clone() in http2Server.WriteStatus. (#2842)
* Expose a method from the internal package to get to the raw
  StatusProto wrapped by the status error, and use it from
  http2Server.WriteStatus().
* Add a helper method in internal/testutils to compare two status errors
  and update test code to use that instead of reflect.DeepEqual()
2019-06-10 15:03:12 -07:00
d40a995895 balancer/resolver: add loadBalancingConfig and pre-parsing support (#2732) 2019-05-30 09:12:58 -07:00
f34abd9513 xds: add orca generated file, and move orca to xds folder (#2804) 2019-05-24 12:35:57 -07:00
b7325a3150 Update go.mod for golang/x/tools and staticcheck (#2832) 2019-05-24 11:13:46 -07:00
42baa8b199 channelz: wait for clean up before next test (#2797) 2019-05-02 14:47:50 -07:00
47e1ebe575 client: return helpful error message when wait-for-ready RPCs fail with timeout (#2777) 2019-04-29 12:42:19 -07:00
8260df7a61 grpc: implementation of PreparedMsg API
grpc: implementation of PreparedMsg API
2019-04-19 14:08:08 -07:00
955eb8a3c8 channelz: cleanup channel registration if Dial fails (#2733) 2019-04-02 15:42:35 -07:00
d389f9fac6 balancer: add server loads from RPC trailers to DoneInfo (#2641) 2019-04-02 11:15:36 -07:00
3910b873d3 bar: add ability to update resolver state atomically and pass directly to the balancer (#2693) 2019-03-22 10:48:55 -07:00
495133b619 internal: fix pickoptions in balancer_test (#2698)
The same test was changed by two PRs, merge didn't catch the conflict
2019-03-19 13:26:46 -07:00
ce45558927 balancer: make sure non-nil done returned by Pick is called (#2688)
Special case: when SubConn returned by Picker is not Ready, call done before
looping back to re-pick.
2019-03-19 10:47:09 -07:00
3c84def893 balancer: remove Header from PickOptions; it is also available through context (#2674) 2019-03-15 09:00:55 -07:00
ff28255d10 cleanup: fix typo in comment (#2657)
Although it is spelling mistakes, it might make an effect while reading.
2019-03-14 13:12:48 -07:00
77ce7bc228 minor: typo fix (#2680) 2019-03-11 15:06:47 -07:00
79c9bc6794 client: handle HTTP header parsing error correctly (#2599) 2019-03-06 10:59:01 -08:00
5878d965b2 transport: remove RequireHandshakeHybrid support (#2529)
This removes RequireHandshakeHybrid support and changes the default behavior
to RequireHandshakeOn. Dial calls will now block and wait for a successful
handshake before proceeding. Users relying on the old hybrid behavior (cmux
users) should consult https://github.com/soheilhy/cmux/issues/64.

Also, several tests have been updated to take this into consideration by
sending settings frames.
2019-02-27 11:04:46 -07:00
40cb5618f4 dialOption: export WithContextDialer() (#2629)
fixes #2627
2019-02-25 15:22:10 -08:00
2773c7bbcf Fix styling (#2647)
Fix styling
2019-02-21 16:37:37 -08:00
ed70822b12 keepalive: apply minimum ping time of 10s to client and 1s to server (#2642)
* keepalive: apply minimum ping time of 10s to client and 1s to server

* review fixes
2019-02-21 13:09:37 -08:00
32559e2175 internal: server deletes stream after receiving an RST_STREAM frame
* Fixes established streams leak in the loopy writer.

RSTStreamFrames used to be ignored by the server transport, if a trailer had already been put into the transport's control buffer. If loopy writer couldn't write anything into a stream because of an error on the client side, then this trailer would never be sent. At that point, server would receive an RSTStreamFrame from client. But this RSTStreamFrame would be ignored because a trailer was already put into the control buffer. This would keep the stream open and in memory on the server side.

With this change, a cleanupStream item is put into the transport's control buffer, whenever an RSTStreamFrame is received by the server, even after a trailer has been put into the buffer.

* When client sends a header to initiate a stream just after sending an RST_STREAM, server gets these frames in the correct order.
When server receives the RST_STREAM, it marks the stream as done and defers the deletion of the stream to the loopy writer by putting a cleanupStream item into control buffer.
Then the server receives the header to initiate a stream. It acts on the header immediately and attempts to create the stream. But because the old stream is not deleted, it hits the number of streams limit and fails.
This commit solves this problem by letting server handle the deletion immediately after receiving the RST_STREAM.

* Refactors deleteStream method.

* Moves consts declarations into test function's body.
2019-02-11 17:33:22 -08:00
c2f12b83a7 Fix error formatting based on best practices from Code Review Comments (#2615) 2019-02-07 10:01:40 -08:00
d14ffaeb5c client: deprecate CallCustomCodec and provide new version using encoding.Codec (#2556) 2019-02-01 10:21:31 -08:00
8e6533ee6e client: clean up v1 balancer wrapper error handling (#2511) 2019-01-30 10:56:23 -08:00
0e8a6f931c credentials: add TLS 1.3 cipher suites (#2596)
This lets the tests pass with Go1.12beta2.
2019-01-25 08:47:38 -08:00
9225666342 Modified binary search for the correct delay. (#2584)
* Binary search for the correct delay.

* Removes an unnecessary log line.

* Fixes.

* Switches back to linear search.

* Replaces cancel with a timeout.
2019-01-23 14:49:11 -08:00
59acad4c45 cleanup: more simplifications (#2574) 2019-01-16 13:07:56 -08:00
4e92c060da cleanup: replace unnecessary loops (#2573) 2019-01-16 13:06:58 -08:00
dfd7708d35 cleanup: use time.Until(t) instead of t.Sub(time.Now) (#2571) 2019-01-15 16:09:50 -08:00
6cc789b34b client: make handshake required 'on' by default, not 'hybrid' (#2565) 2019-01-15 09:19:32 -08:00
98a94b0cb0 test: disable leakcheck after the first failure (#2563) 2019-01-14 15:40:20 -08:00
9e7c146356 Return nil trailer metadata, if the RPC's status is context canceled. (#2554)
* Closes the client transport stream, if context is cancelled while recvBuffer is reading.

* Passes a function pointer to recvBufferReader, instead of a Stream and an http2Client.

* Adds more descriptive error messages.

* If waitOnHeader notices the context cancelation, shouldRetry no longer returns a ContextError. Instead, it returns the error from the last try.

* Makes sure that test gets both statuses at least 5 times.

* Makse cntPermDenied a lambda function.
2019-01-14 10:59:44 -08:00
8fd063a5aa channelz: Implement GetServer method for channelz (#2537) 2019-01-09 10:50:34 -08:00
0a391ff2b7 grpctest: add new package to manage tests and support per-test setup/teardown (#2523)
- Migrate `grpc` & `grpc/test` packages to use `Teardown` support to guarantee `leakcheck` is used
2019-01-07 14:24:56 -08:00
1838dedee3 channelz: respect max_results in listing methods (#2516) 2018-12-20 13:53:39 -08:00
5da252b6a6 health check test: prevent double close of hcEnterChan (#2441) 2018-12-13 16:44:36 -08:00
29a7ac4deb client: deprecates FailFast & replaces its use by WaitForReady. 2018-12-13 15:15:11 -08:00
ca62c6b92c channelz: fix GetSecurityValue function name. (#2450) 2018-11-30 06:01:10 +08:00
55ef601361 internal: cleanup remaining x/net/context (#2470) 2018-11-28 12:46:35 -08:00