83 Commits

Author SHA1 Message Date
c0909e91a5 resolver: move dns and passthrough to internal (#3116)
Nobody should directly need to reference these packages.

This is technically a breaking change. However:

- Package dns was exporting a NewBuilder method. This should never have been necessary to use, but if so, it can be replaced by importing the "grpc" package and then using resolver.Get("dns").

- Package passthrough was not exporting any symbols and there was never a need to even blank-import it.

After as much searching as possible, it appears nobody in the open source community is referencing either of these packages.
2019-10-22 13:01:54 -07:00
ed563a02ea resolver: add State fields to support error handling (#2951) 2019-10-04 12:59:43 -07:00
31911ed09e client: add WithConnectParams to configure connection backoff and timeout (#2960)
* Implement missing pieces for connection backoff.

Spec can be found here:
https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md

Summary of changes:
* Added a new type (marked experimental), ConnectParams, which contains
  the knobs defined in the spec (except for minConnectTimeout).
* Added a new API (marked experimental), WithConnectParams() to return a
  DialOption to dial with the provided parameters.
* Added new fields to the implementation of the exponential backoff in
  internal/backoff which mirror the ones in ConnectParams.
* Marked existing APIs WithBackoffMaxDelay() and WithBackoffConfig() as
  deprecated.
* Added a default exponential backoff implementation, for easy use of
  internal callers.

Added a new backoff package which defines the backoff configuration
options, and is used by both the grpc package and the internal/backoff
package. This allows us to have all backoff related options in a
separate package.
2019-10-03 16:47:13 -07:00
6d6c0413c7 test: fix race reading while incrementing (#2935)
Swap the `<-done` and `if numConns < 2` to make sure `numConns++` happens before read
2019-07-25 10:54:14 -07:00
24b2fb8959 client: remove option to send RPCs before HTTP/2 handshake is completed (#2904) 2019-07-12 13:37:27 -07:00
d40a995895 balancer/resolver: add loadBalancingConfig and pre-parsing support (#2732) 2019-05-30 09:12:58 -07:00
d37bd82db6 Fix DialContext when using a timeout (#2737)
Fixes #2736
2019-04-04 09:58:15 -07:00
ea5e6da287 service config: default service config (#2686) 2019-04-03 10:50:28 -07:00
3910b873d3 bar: add ability to update resolver state atomically and pass directly to the balancer (#2693) 2019-03-22 10:48:55 -07:00
d021e89b3f internal: fix Dial_OneBackoffPerRetryGroup (#2689)
* internal: fix Dial_OneBackoffPerRetryGroup

Instead of mutating global variables, switches getMinConnectDeadline to a
dial option.

Fixes #2687.

* rename getMinConnectTimeoutFunc to minConnectTimeout, ditto dial opt
2019-03-20 13:58:29 -06:00
5878d965b2 transport: remove RequireHandshakeHybrid support (#2529)
This removes RequireHandshakeHybrid support and changes the default behavior
to RequireHandshakeOn. Dial calls will now block and wait for a successful
handshake before proceeding. Users relying on the old hybrid behavior (cmux
users) should consult https://github.com/soheilhy/cmux/issues/64.

Also, several tests have been updated to take this into consideration by
sending settings frames.
2019-02-27 11:04:46 -07:00
ed70822b12 keepalive: apply minimum ping time of 10s to client and 1s to server (#2642)
* keepalive: apply minimum ping time of 10s to client and 1s to server

* review fixes
2019-02-21 13:09:37 -08:00
a402911c6f internal: resetTransport connect deadline is across addresses (#2540)
internal: resetTransport connect deadline is across addresses

Currently, the connect deadline is recalculated per-address. This PR amends
that behavior such that all addresses for a single connection attempt share
the same deadline.

Fixes #2462
2019-02-11 17:12:42 -07:00
ec9c18c8c6 internal: split StateRecordingBalancer in test to balancer and builder (#2578)
And instead of setting state notify channel in balancer, create a new notify
channel at Build.

fixes #2576
2019-01-18 10:21:46 -08:00
76cc50721c internal: rewrite TestDialWithMultipleBackendsNotSendingServerPreface (#2438)
- Remove the slice of servers approach, since there's specific
logic at server 2 that's different from server 1. This has the
advantage of making the test more readable without sacrificing
anything (given the previous point).

- Defer server close at initialization time instead of at the
end.

- Remove a time.Sleep(time.Second): use timeout + select around
serverDone instead.

- Use a goroutine to keep the connection reading, instead of
using a for loop in the server goroutine. This causes the
defer close(server2Done) to happen immediately after preface
is sent, which combined with the aforementioned time.Sleep
removal causes the test to go from 1.00s to ~0.05s.
2019-01-08 16:48:09 -08:00
0a391ff2b7 grpctest: add new package to manage tests and support per-test setup/teardown (#2523)
- Migrate `grpc` & `grpc/test` packages to use `Teardown` support to guarantee `leakcheck` is used
2019-01-07 14:24:56 -08:00
1b41b79fd1 internal: refactor transport to single retry mechanism (#2461)
Previously, the transport was able to reset via the retry loop,
or via the event closures calling resetTransport. This meant
a very large amount of synchronization was necessary: one
reset meant the other had to not reset; state had to be kept
at the addrconn; and very subtle interactions were hard to
reason about.

This change removes the ability for event closures to directly
reset the transport. Instead, they signal to to the retry
loop about the event, and the retry loop is always the single
place that retries occur.

This also allows us to refactor the address switching logic
into a much simpler for loop inside the retry loop instead of
using addrConn state to keep track of an index.
2018-12-17 13:10:13 -08:00
f3eb5bc06e client: add GRPC_GO_REQUIRE_HANDSHAKE options to control connection behavior (#2464)
Possible settings of this environment variable:

- "hybrid" (default; removed after the 1.17 release): do not wait for handshake before considering a connection ready, but wait before considering successful.

- "on" (default after the 1.17 release): wait for handshake before considering a connection ready/successful.

- "off": do not wait for handshake before considering a connection ready/successful.

This setting will be completely removed after the 1.18 release, and "on" will be the only supported behavior.
2018-11-26 15:06:46 -08:00
04ea82009c cleanup: replace "x/net/context" import with "context" (#2439) 2018-11-12 13:30:41 -08:00
04c0c4d299 internal: fix client send preface problems (#2380)
internal: fix client send preface problems

This CL fixes three problems:

- In clientconn_state_transitions_test.go, sometimes tests would flake because there's not enough buffer to send client side settings, causing the connection to unpredictably enter TRANSIENT FAILURE. Each time we set up a server to send SETTINGS, we should also set up the server to read. This allows the client to successfully send its SETTINGS, unflaking the test.

- In clientconn.go, we incorrectly transitioned into TRANSIENT FAILURE when creating an http2client returned an error. This should be handled in the outer resetTransport main reset loop. The reason this became a problem is that the outer resetTransport has very specific conditions around when to transition into TRANSIENT FAILURE that the egregious transition did not have. So, it could transition into TRANSIENT FAILURE after failing to dial, even if it was trying to connect to a non-final address in the list of addresses.

- In clientconn.go, we incorrectly stay in CONNECTING after `createTransport` when a server sends its connection preface but the client is not able to send its connection preface. This CL causes the addrconn to correctly enter TRANSIENT FAILURE when `createTransport` fails, even if a server preface was received. It does so by making ac.successfulHandshake to consider both server preface received as well as client preface sent.
2018-10-18 14:31:34 -07:00
1ca9df53a7 internal: clean up and unflake state transitions test (#2366)
internal: clean up and unflake state transitions test

Switches state transitions test to using a notification from a custom load
balancer, instead of relying on waiting for laggy balancer state updates.

Also generally adds more coverage around state transitions and a framework
for easily adding more of these kinds of tests.

Fixes #2348
2018-10-12 15:22:27 -07:00
cb11627444 clientconn: not panic when service config updated while closing (#2371)
Closing `ClientConn` sets `balancerWrapper` to nil.
If service config switches balancer, the new balancer will be notified of the existing addresses.

When these two happens together, there's a chance that a method will be called on the nil `balancerWrapper`. This change adds a check to make sure that never happens. 

fixes #2367
2018-10-11 11:43:16 -07:00
8d75951f9b Fix onclose state transition (#2346)
internal: fix onClose state transitions

When onClose occurs during WaitForHandshake, it should immediately
exit createTransport instead of leaving CONNECTING and entering READY.

Furthermore, when any onClose happens, the state should change to
TRANSIENT FAILURE.

Fixes #2340
Fixes #2341

Also fixes an unreported bug in which entering READY causes a
Dial call to end prematurely, instead of blocking until a READY
transport is found.
2018-10-08 16:58:54 -06:00
cc41663c52 client: fix panic caused by WithWaitForHandshake and related races (#2336) 2018-09-27 14:23:29 -07:00
82f263dc2f internal: fix race between cancel and shutdown
This fixes a race in ac.tearDown and ac.resetTransport. If ac.resetTransport
is in backoff when ac.tearDown occurs, there's a race between the state
changing to Shutdown and ac.resetTransport calling ac.createTransport.

This fixes it by returning when ac.resetTransport encounters an error
during ac.nextAddr (specifically ac.ctx.Error()). It also fixes it by
making sure that ac.tearDown changes state to Shutdown before canceling
the context.

Both fixes were implemented because they both seem to be valuable
standalone additions: the former makes ac.resetTransport more
understandable and less dependent on behavior happening elsewhere,
and the latter makes ac.tearDown more correct.

Finally, TestDialParseTargetUnknownScheme had its buffer removed; the
buffer was likely added a while ago to assuage this issue. It should
not be necessary anymore.
2018-09-24 10:46:24 -07:00
0baa06717e tests: fix goroutine leak (#2317)
tests: fix goroutine leak

If TestResetConnectBackoff fails, the resetTransport goroutine will be
stuck dialing and subsequently the goroutine will be leaked. This is
all despite the test including `defer cc.Close()`:

- defer cc.Close() will cause ac.cancel to be called
- ac.context will be appropriately cancelled
- ac.context is correctly the context that gets passed to the dialer
- However, the WithDialer throws away the context and only passes its
deadline, which is for `backoffForever{}` is math.MaxInt64. So, even
though teardown occurs, the resetTransport goroutine will still be
stuck dialing.

This CL adds a small amendment: before performing leakcheck, attempt
to take an item off the synchronous `dials` channel. Either the tests
passed and there is no item, or the tests failed and there is one.
2018-09-20 15:26:05 -07:00
d2aec4d7de client: Add ClientConn.ResetConnectBackoff to force reconnections on demand (#2273)
Fixes #1037
2018-08-27 13:21:48 -07:00
91c7ef84b5 client: fix FailOnNonTempDialError and add a test for it (#2276) 2018-08-27 10:28:41 -07:00
980d9e0348 ClientConn: add Target() returning target string (#2233) 2018-07-23 16:19:11 -07:00
cedd9131e8 Fix test: wait on server to signal successful accept. (#2183) 2018-06-28 11:19:52 -07:00
24f3cca1ff internal: move backoff to internal (#2141)
So other components such as grpclb can reuse the backoff implementation.
2018-06-13 16:07:37 -07:00
0e5a36b652 internal: move leakcheck to internal/ (#2129)
internal: move leakcheck to internal/
2018-06-07 16:57:56 -07:00
854695bef0 client: introduce WithDisableServiceConfig DialOption (#2010) 2018-05-08 10:28:26 -07:00
e09cbb70fe internal: better test names (#2043) 2018-05-03 13:17:45 -07:00
031ee13cfe Fix Test: Update the deadline since small deadlines are prone to flakes on Travis. (#1932) 2018-03-20 16:46:10 -07:00
207e2760fd Fix test race: Atomically access minConnecTimout in testing environment. (#1897) 2018-03-07 10:36:17 -08:00
7c5299d71e Fix flaky test: TestCloseConnectionWhenServerPrefaceNotReceived (#1870)
* Atomically update minConnectTimeout in test.

* Refactor the flaky test.

* Post review update

* mend
2018-03-02 13:39:55 -08:00
37346e3181 Revert "Add WithResolverUserOptions for custom resolver build options" (#1839)
This reverts commit ff1be3fcc57773d500921fad479d82ec171e2358.
2018-02-05 12:52:35 -08:00
65c901e458 Fix flakey test. (#1771) 2017-12-28 09:31:16 -08:00
6610f9a340 Fix test: Data race while resetting global var. (#1748) 2017-12-18 15:27:23 -08:00
b8cf13ea06 Make sure all goroutines have ended before restoring global vars. (#1732) 2017-12-14 13:23:05 -08:00
ff1be3fcc5 Add WithResolverUserOptions for custom resolver build options (#1711) 2017-12-06 15:54:01 -08:00
ddbb27e545 client: backoff before reconnecting if an HTTP2 server preface was not received (#1648) 2017-12-01 09:55:42 -08:00
10873b30bf Fix panics on balancer and resolver updates (#1684)
- NewAddress with empty list (addrConn with an empty address list)
 - NewServiceConfig to switch balancer before the first balancer is built
2017-11-22 13:59:20 -08:00
a353537ff5 Register and use default balancers and resolvers (#1551) 2017-10-19 11:32:06 -07:00
d46a3655c4 Add leak goroutine checking to grpc/balancer tests (#1497) 2017-09-07 14:30:05 -07:00
8233e124e4 Add new Resolver and Balancer APIs (gRFC L9) (#1408)
- Add package balancer and resolver.
 - Change ClientConn internals to new APIs and adds a wrapper for v1 balancer.
2017-08-31 10:59:09 -07:00
e60698345e Fix context warnings from govet. (#1486)
Pre-req work for #1484
2017-08-29 11:04:15 -07:00
e81b5698fd Add and use connectivity package for states (#1430)
* Add and use connectivity package
* Mark cc state APIs as experimental
2017-08-09 10:31:12 -07:00
4e56696c6c Fix a goroutine leak in DialContext (#1424)
A leak happens when DialContext times out before a balancer returns any
addresses or before a successful connection is established.

The loop in ClientConn.lbWatcher breaks and doneChan never gets closed.
2017-08-04 13:40:50 -07:00