Fix error message panic in command result parsing #502

Merged

merged 4 commits into master from fix/empty-error-on-mounting on Feb 11, 2021

Conversation

hfurubotten
Contributor

@hfurubotten hfurubotten commented Feb 11, 2021

When adding new mounts to Azure Data Lake in the Terraform configuration, the provider would sometimes crash Terraform, mostly because of missing permissions on the container or because the network blocked the traffic. The error seemed to come from an HTTP 403 returned to the commands running in Databricks, and the provider couldn't parse that error message when it came back to Terraform.

Related issue filed earlier with the error message: #448

I identified that the problem was in the error message matching on the response from Databricks, where this particular kind of result produced an empty error message from the regular expressions here:
https://github.com/databrickslabs/terraform-provider-databricks/blob/67c7713895d8f71a29958fe0e04901c7b5827608/compute/commands.go#L26
https://github.com/databrickslabs/terraform-provider-databricks/blob/67c7713895d8f71a29958fe0e04901c7b5827608/compute/commands.go#L105

The problem was fixed by changing the capture group from (.*) to (.+), so that it only allows one or more characters. When no error message is found, the code can then fall back to the default error output, which prevents the panic and the resulting Terraform crash.
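For illustration, a minimal sketch of the difference (the patterns and fallback text below are hypothetical, not the provider's actual code): with (.*) a match can still yield an empty capture group, while (.+) fails to match an empty message and lets the code fall through to a default summary.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical patterns for illustration only, not the ones in compute/commands.go.
var (
	matchAnything = regexp.MustCompile(`ErrorMessage=(.*)`) // group may capture ""
	matchNonEmpty = regexp.MustCompile(`ErrorMessage=(.+)`) // group needs 1+ characters
)

// summarize falls back to a generic message when no usable error text is found.
func summarize(out string) string {
	if m := matchNonEmpty.FindStringSubmatch(out); m != nil {
		return m[1]
	}
	return "command failed, see cluster logs for details"
}

func main() {
	out := "StatusCode=403\nStatusDescription=not authorized\nErrorMessage=\n"
	// (.*) still "matches", but the captured group is empty:
	fmt.Printf("%q\n", matchAnything.FindStringSubmatch(out))
	// (.+) does not match an empty message, so summarize falls through to the default:
	fmt.Println(summarize(out))
}
```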

However, the default summary didn't explain much, so I also added a new regular expression to capture the most informative error message from this kind of command error: it picks up the execution error together with the status code and description.

This results in the following error message being displayed:

databricks_azure_adls_gen2_mount.lake: Creating...
databricks_azure_adls_gen2_mount.lake: Still creating... [10s elapsed]

Error: An error occurred while calling o251.mount.
: HEAD https://datalake.dfs.core.windows.net/data?resource=filesystem&timeout=90
StatusCode=403
StatusDescription=This request is not authorized to perform this operation using this permission.
Example error from the command response
---------------------------------------------------------------------------
ExecutionError                            Traceback (most recent call last)
<command--1> in <module>
     14             print("Failed to unmount", e2)
     15         raise e
---> 16 mount_source = safe_mount("/mnt/data", "abfss://data@datalake.dfs.core.windows.net", {"fs.azure.account.auth.type":"OAuth","fs.azure.account.oauth.provider.type":"org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider","fs.azure.account.oauth2.client.endpoint":"https://login.microsoftonline.com/00000000-0000-0000-0000-000000000000/oauth2/token","fs.azure.account.oauth2.client.id":"00000000-0000-0000-0000-000000000000","fs.azure.account.oauth2.client.secret":dbutils.secrets.get("keyvault-managed", "service-principal-key"),"fs.azure.createRemoteFileSystemDuringInitialization":"true"})
     17 dbutils.notebook.exit(mount_source)

<command--1> in safe_mount(mount_point, mount_source, configs)
     13         except Exception as e2:
     14             print("Failed to unmount", e2)
---> 15         raise e
     16 mount_source = safe_mount("/mnt/data", "abfss://data@datalake.dfs.core.windows.net", {"fs.azure.account.auth.type":"OAuth","fs.azure.account.oauth.provider.type":"org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider","fs.azure.account.oauth2.client.endpoint":"https://login.microsoftonline.com/00000000-0000-0000-0000-000000000000/oauth2/token","fs.azure.account.oauth2.client.id":"00000000-0000-0000-0000-000000000000","fs.azure.account.oauth2.client.secret":dbutils.secrets.get("keyvault-managed", "service-principal-key"),"fs.azure.createRemoteFileSystemDuringInitialization":"true"})
     17 dbutils.notebook.exit(mount_source)

<command--1> in safe_mount(mount_point, mount_source, configs)
      4             return
      5     try:
----> 6         dbutils.fs.mount(mount_source, mount_point, extra_configs=configs)
      7         dbutils.fs.refreshMounts()
      8         dbutils.fs.ls(mount_point)

/local_disk0/tmp/1613045286512-0/dbutils.py in f_with_exception_handling(*args, **kwargs)
    312                     exc.__context__ = None
    313                     exc.__cause__ = None
--> 314                     raise exc
    315             return f_with_exception_handling
    316 

ExecutionError: An error occurred while calling o251.mount.
: HEAD https://datalake.dfs.core.windows.net/data?resource=filesystem&timeout=90
StatusCode=403
StatusDescription=This request is not authorized to perform this operation using this permission.
ErrorCode=
ErrorMessage=
	at shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:134)
	at shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.services.AbfsClient.getFilesystemProperties(AbfsClient.java:197)
	at shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.getFilesystemProperties(AzureBlobFileSystemStore.java:239)
	at shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.fileSystemExists(AzureBlobFileSystem.java:823)
	at shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:144)
	at com.databricks.backend.daemon.dbutils.DBUtilsCore.verifyAzureFileSystem(DBUtilsCore.scala:497)
	at com.databricks.backend.daemon.dbutils.DBUtilsCore.mount(DBUtilsCore.scala:446)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
	at py4j.Gateway.invoke(Gateway.java:295)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:251)
	at java.lang.Thread.run(Thread.java:748): timestamp=2021-02-11T13:11:15.036+0100
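As a rough sketch of how such a capture could work in Go against a raw response like the one above (the pattern and function name here are assumptions for illustration, not necessarily the expression that was merged):

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical pattern: captures the ExecutionError summary, an optional
// continuation line, and the StatusCode/StatusDescription lines.
var executionErrorRE = regexp.MustCompile(
	`(?m)^ExecutionError: (.+)\n(: .+\n)?StatusCode=(\d+)\nStatusDescription=(.+)$`)

// parseExecutionError returns a readable summary and whether one was found.
func parseExecutionError(results string) (string, bool) {
	m := executionErrorRE.FindStringSubmatch(results)
	if m == nil {
		return "", false
	}
	return fmt.Sprintf("%s\n%sStatusCode=%s\nStatusDescription=%s", m[1], m[2], m[3], m[4]), true
}

func main() {
	raw := "ExecutionError: An error occurred while calling o251.mount.\n" +
		": HEAD https://datalake.dfs.core.windows.net/data?resource=filesystem&timeout=90\n" +
		"StatusCode=403\n" +
		"StatusDescription=This request is not authorized to perform this operation using this permission.\n"
	if msg, ok := parseExecutionError(raw); ok {
		fmt.Println("Error:", msg)
	}
}
```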

Hope this is an acceptable way of catching this error and giving useful feedback to the user.

@alexott
Contributor

alexott commented Feb 11, 2021

can you run make fmt && make lint on the code?

@nfx
Contributor

nfx commented Feb 11, 2021

@hfurubotten this is great! But please update the branch and fix the lint errors (splitting the method into two will probably help).

@codecov

codecov bot commented Feb 11, 2021

Codecov Report

Merging #502 (d12fb62) into master (67c7713) will decrease coverage by 0.02%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #502      +/-   ##
==========================================
- Coverage   80.37%   80.35%   -0.03%     
==========================================
  Files          77       77              
  Lines        6502     6504       +2     
==========================================
  Hits         5226     5226              
- Misses        891      893       +2     
  Partials      385      385              
Impacted Files Coverage Δ
compute/commands.go 52.77% <0.00%> (-1.00%) ⬇️

@hfurubotten
Contributor Author

Thanks for the tip about how to fix the lint error! I split the error message parsing out into its own method, without naked returns. That way it can be extended with more search patterns later without hitting that lint rule again :)
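For reference, the shape described here might look roughly like this (hypothetical names and placement, not the merged code):

```go
package compute // hypothetical placement, for illustration only

import "regexp"

// findErrorMessage tries each pattern in order and returns the first non-empty
// capture. Explicit return values avoid the naked-return lint rule, and more
// patterns can simply be appended to the slice later.
func findErrorMessage(results string, patterns []*regexp.Regexp) string {
	for _, p := range patterns {
		if m := p.FindStringSubmatch(results); m != nil && m[1] != "" {
			return m[1]
		}
	}
	return "" // caller falls back to the default error summary
}
```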

@nfx nfx merged commit 0e4a7f8 into databricks:master Feb 11, 2021
@hfurubotten hfurubotten deleted the fix/empty-error-on-mounting branch February 11, 2021 19:07