Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: log DirectPath misconfiguration #2105

Merged
merged 9 commits into from
Oct 18, 2023
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,8 @@
import java.util.concurrent.Executor;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.annotation.Nullable;
import javax.net.ssl.KeyManagerFactory;
import org.threeten.bp.Duration;
Expand All @@ -82,6 +84,8 @@
*/
@InternalExtensionOnly
public final class InstantiatingGrpcChannelProvider implements TransportChannelProvider {
private static final Logger LOG =
Logger.getLogger(InstantiatingGrpcChannelProvider.class.getName());
private static final String DIRECT_PATH_ENV_DISABLE_DIRECT_PATH =
"GOOGLE_CLOUD_DISABLE_DIRECT_PATH";
private static final String DIRECT_PATH_ENV_ENABLE_XDS = "GOOGLE_CLOUD_ENABLE_DIRECT_PATH_XDS";
Expand Down Expand Up @@ -266,6 +270,43 @@ boolean isDirectPathXdsEnabled() {
return false;
}

private void logDirectPathMisconfig(String serviceAddress) {
boolean isDirectPathOptionSet = Boolean.TRUE.equals(attemptDirectPath);
boolean isDirectPathXdsEnvSet =
Boolean.parseBoolean(envProvider.getenv(DIRECT_PATH_ENV_ENABLE_XDS));
boolean isDirectPathXdsOptionSet = Boolean.TRUE.equals(attemptDirectPathXds);
if (isDirectPathOptionSet || isDirectPathXdsEnvSet || isDirectPathXdsOptionSet) {
// Case 1: use DirectPath with gRPCLB
if (isDirectPathOptionSet && !(isDirectPathXdsEnvSet || isDirectPathXdsOptionSet)) {
// TODO: Add the warning once we move traffic out from gRPCLB
}

// Case 2: just enable DirectPath xDS
if (!isDirectPathOptionSet && (isDirectPathXdsEnvSet || isDirectPathXdsOptionSet)) {
LOG.log(
Level.WARNING, "DirectPath is misconfigured. Please set the attemptDirectPath option.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@singhravidutt this PR is w.r.t R.1 for observability.

Would hadoop connector customers be able to see these warnings emitted when running the connector or would this require a change?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should be able to surface these logs.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add values of isDirectPathOptionSet, isDirectPathXdsEnvSet and isDirectPathXdsOptionSet as part of log statement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case user enables DirectPathXds either by env or option, but DirectPath itself is not enabled. I think the log is already clear that the user needs to set attemptDirectPath and there is no need to log the values for judging the case. WDYT?

Copy link
Member

@frankyn frankyn Oct 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend:

DirectPath is misconfigured. Please enable the attemptDirectPath option along with the attemptDirectPathXds option

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

// The following WARNING logs are GCS only.
if (serviceAddress.contains("storage.googleapis.com")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is absolutely not allowed. Gax must not be aware of anything related to a specific service. Looks like we need two logging strategies for opt-in and opt-out, can we make it a flag and Storage can pass the flag into gax?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this generally useful though? This log helps call out that attempting directpath wasn't configured correctly to developers.

Copy link
Member

@frankyn frankyn Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed your comment above, about adding this into Constructor which sounds like a good path for this; just calling out this is generally useful.

Copy link
Collaborator

@blakeli0 blakeli0 Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree this log is generally useful. But hardcoding the storage specific endpoint storage.googleapis.com in Gax is not allowed, sorry if I didn't make this clear in previous comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a new idea. Let’s only make the WARNING log if DirectPath xDS env/option is set. DirectPath xDS is opt-in for GCS/CBT/CS, so we do not need to have two strategies. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these warnings should be emitted for both cases; a user is attempting DirectPath and it would be helpful to know they're doing something wrong in configuring it. Otherwise, it's a blackbox and users are left to weeks of debugging what went wrong. The goal of this is to make it easier to understand a user did something wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Your concern is that what if a customer set attemptDirectPath but not attemptDirectPathXds option/env, there will be no warnings.

Let me provide some context here. attemptDirectPath is originally created for DirectPath gRPCLB (gRPCLB is the name resolver). Then DirectPath xDS is developed (TD is the name resolver) and we decide to migrate DirectPath gRPCLB to DirectPath xDS. To differentiate the two DirectPath infrastructures, attemptDirectPathXds option/env are created:

  • If attemptDirectPath is NOT set, CloudPath will be used;
  • If attemptDirectPath is set AND attemptDirectPathXds option/env is NOT set, DirectPath gRPCLB is used;
  • If attemptDirectPath is set AND attemptDirectPathXds option/env is set, DirectPath xDS is used.

Currently the migration is still ongoing as CBT still has DirectPath gRPCLB traffic (CBT's timeline is Q4). After that, I will cleanup DirectPath gRPCLB code in the gax library. For example, attemptDirectPath will be removed and we just need to keep attemptDirectPathXds option/env.

With that being said, I can change the DirectPath enabling logic to:

  • If attemptDirectPath is NOT set AND attemptDirectPathXds option/env is NOT set, CloudPath will be used;
  • If attemptDirectPath is set AND attemptDirectPathXds option/env is NOT set, DirectPath gRPCLB is used;
  • If attemptDirectPathXds option/env is set, DirectPath xDS is used.

In this way, GCS customers just need to set attemptDirectPathXds option/env, and they do not need to care about attemptDirectPath. Then the current log logic should be good. WDYT?

Copy link
Member

@frankyn frankyn Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the way the options work sounds like a reasonable thing to do after migration is complete.
In java-storage, the attemptDirectPath and attemptDirectPathXds issue has been addressed by enabling both, but these warnings would have helped save the team time. Moreover, we don't expose configuration of each flag attemptDirectPath and attemptDirectPathXds directly to users instead only expose 1 flag that enables both.

My original goal is to add warnings to let GCS users know they have misconfigured the client to use DirectPath. This also lends to supporting future warnings if new conditions are added.

The three main cases surfaced in this logic (ignoring gRPCLB case because it's not ready yet):

  1. attemptDirectPath vs. attemptDirectPathXds is one way;
  2. non GCE credentials; this is likely to bite a customer
  3. Not using GCE; not as much a concern, but we don't document DirectPath in public documentation so a user could do something silly and use the our clients outside GCE without understanding that DP is not available.

I think the warnings would be generally useful outside of GCS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover, we don't expose configuration of each flag attemptDirectPath and attemptDirectPathXds directly to users instead only expose 1 flag that enables both.

Sorry I am confused. What are the two cases in the previous comment I think these warnings should be emitted for both cases?

Copy link
Member

@frankyn frankyn Oct 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we are speaking past each other?

I have a new idea. Let’s only make the WARNING log if DirectPath xDS env/option is set. DirectPath xDS is opt-in for GCS/CBT/CS, so we do not need to have two strategies. WDYT?

The point i was driving at is that there's nothing telling a user they configured DirectPath incorrectly when they use attemptDirectPath or attemptDirectPathXds; and selecting the cases only when attemptDirectPathXds is set to emit these warnings misses this point.

&& isDirectPathXdsEnvSet
&& (isDirectPathOptionSet || isDirectPathXdsOptionSet)) {
// Case 3: credential is not correctly set
if (!isNonDefaultServiceAccountAllowed()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to have a ComputeEngineCredentials but not on ComputeEngine? I though auth would only return ComputeEngineCredentials if on ComputeEngine, so isOnComputeEngine() may not be needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is possible, this is the original fix for this problem, googleapis/gax-java#1250.

LOG.log(
Level.WARNING,
"DirectPath is misconfigured. Please make sure the credential is an instance of"
+ " ComputeEngineCredentials.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use fully qualified class name ComputeEngineCredentials.class.getName()

TIL you need GCE credentials for DirectPath; what is a customer is using Workload Identity would that work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

GKE workload identity is compatible with DirectPath.

}
// Case 4: not running on GCE
if (!isOnComputeEngine()) {
LOG.log(
Level.WARNING, "DirectPath is misconfigured. Please run in the GCE environment. ");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion:

DirectPath is misconfigured. DirectPath is only available in a GCE environment.

You also have whitespace at the end.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}
}
}
}

private boolean isNonDefaultServiceAccountAllowed() {
if (allowNonDefaultServiceAccount != null && allowNonDefaultServiceAccount) {
return true;
Expand Down Expand Up @@ -341,6 +382,7 @@ private ManagedChannel createSingleChannel() throws IOException {
builder.keepAliveTime(DIRECT_PATH_KEEP_ALIVE_TIME_SECONDS, TimeUnit.SECONDS);
builder.keepAliveTimeout(DIRECT_PATH_KEEP_ALIVE_TIMEOUT_SECONDS, TimeUnit.SECONDS);
} else {
logDirectPathMisconfig(serviceAddress);
ChannelCredentials channelCredentials;
try {
channelCredentials = createMtlsChannelCredentials();
Expand Down