Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add python_version for glue jobs #9409

Closed
wants to merge 4 commits into from

Conversation

mjuarez
Copy link
Contributor

@mjuarez mjuarez commented Jul 18, 2019

Problem

Unable to specify the version of Python to use with pythonshell glue jobs despite supported in API

Solution

Add in ability to specify the python version with the JobCommand as mentioned in AWS SDK

Signed-off-by: Mike Juarez mikejuarez@gmail.com

Community Note

  • Please vote on this pull request by adding a 👍 reaction to the original pull request comment to help the community and maintainers prioritize this request
  • Please do not leave "+1" comments, they generate extra noise for pull request followers and do not help prioritize the request

Relates #9524

Release note for CHANGELOG:

Added ability to specify python version for pythonshell in glue jobs

Output from acceptance testing:

$ make testacc TESTARGS='-run=TestAccXXX'
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./... -v -parallel 20 -run=TestAccXXX -timeout 120m
?   	github.com/terraform-providers/terraform-provider-aws	[no test files]
testing: warning: no tests to run
PASS
ok  	github.com/terraform-providers/terraform-provider-aws/aws	0.045s [no tests to run]

Signed-off-by: Mike Juarez <mikejuarez@gmail.com>
@mjuarez mjuarez requested a review from a team July 18, 2019 19:18
@ghost ghost added size/XS Managed by automation to categorize the size of a PR. service/glue Issues and PRs that pertain to the glue service. tests PRs: expanded test coverage. Issues: expanded coverage, enhancements to test infrastructure. documentation Introduces or discusses updates to documentation. labels Jul 18, 2019
Copy link
Contributor

@nywilken nywilken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick fix.

@nywilken
Copy link
Contributor

Sorry I saw the document change and approved without looking at the rest of the PR.

@@ -44,6 +44,10 @@ func resourceAwsGlueJob() *schema.Resource {
Type: schema.TypeString,
Required: true,
},
"python_version": {
Type: schema.TypeString,
Optional: true,
Copy link
Contributor

@nywilken nywilken Jul 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s add a validation check for this attribute.

ValidateFunc: validation.StringInSlice([]string{"2","3"}, true),

Copy link
Contributor

@nywilken nywilken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make testacc TESTARGS=“-run=TestAccAWSGlueJob_PythonShell“ is what you need to run the modified test case. See our CONTRIBUTING for more information on writing tests.

Copy link
Contributor

@nywilken nywilken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see my comments above.

Signed-off-by: Michael Juarez <mikejuarez@gmail.com>
@mjuarez
Copy link
Contributor Author

mjuarez commented Jul 22, 2019

@nywilken Thanks for the guidance. Here's the latest from acceptance tests.

$ make testacc TEST=./aws TESTARGS='-run=TestAccAWSGlueJob_PythonShell'
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./aws -v -parallel 20 -run=TestAccAWSGlueJob_PythonShell -timeout 120m
=== RUN   TestAccAWSGlueJob_PythonShell
=== PAUSE TestAccAWSGlueJob_PythonShell
=== CONT  TestAccAWSGlueJob_PythonShell
--- PASS: TestAccAWSGlueJob_PythonShell (28.53s)
PASS
ok  	github.com/terraform-providers/terraform-provider-aws/aws	28.564s

@mjuarez
Copy link
Contributor Author

mjuarez commented Jul 24, 2019

@nywilken Looks like Travis failed due to OutOfMemory errors

@mjuarez mjuarez closed this Jul 29, 2019
@mjuarez mjuarez reopened this Jul 29, 2019
@nywilken nywilken self-assigned this Jul 30, 2019
Copy link
Contributor

@nywilken nywilken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mjuarez this looks good. I did some additional testing and found some quirks with the support for the various Python Versions. I left some hefty comments mostly for background context with code suggestions to address the quirks. Please let me know if you have any questions.

@@ -44,6 +44,11 @@ func resourceAwsGlueJob() *schema.Resource {
Type: schema.TypeString,
Required: true,
},
"python_version": {
Type: schema.TypeString,
Optional: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there are some constraints around the python version. If using Glue Version 0.9 which is what is currently being defined within this resource the default Python Version will be 2, but when using Glue Version 1.0, which we currently don't support, the default value will be 3. So we should add Computed: true as there is no good default at this time. When we add support for "glue_version" #9524 we can revisit if need be.

@@ -331,6 +336,7 @@ func expandGlueJobCommand(l []interface{}) *glue.JobCommand {
jobCommand := &glue.JobCommand{
Name: aws.String(m["name"].(string)),
ScriptLocation: aws.String(m["script_location"].(string)),
PythonVersion: aws.String(m["python_version"].(string)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API will error if PythonVersion is not a valid version so we should not send a value if no "python_version" was defined in the Terraform configuration.

Suggested change
PythonVersion: aws.String(m["python_version"].(string)),
}
if v, ok := m["python_version"].(string); ok && v != "" {
jobCommand.PythonVersion = aws.String(v)
}

@@ -729,6 +730,7 @@ resource "aws_glue_job" "test" {
command {
name = "pythonshell"
script_location = "testscriptlocation"
python_version = "3"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove the version from the configuration to validate that setting no python_version works as expected.

Suggested change
python_version = "3"

@@ -397,6 +397,7 @@ func TestAccAWSGlueJob_PythonShell(t *testing.T) {
resource.TestCheckResourceAttr(resourceName, "max_capacity", "0.0625"),
resource.TestCheckResourceAttr(resourceName, "command.#", "1"),
resource.TestCheckResourceAttr(resourceName, "command.0.script_location", "testscriptlocation"),
resource.TestCheckResourceAttr(resourceName, "command.0.python_version", "3"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When no python_version is specified the default response from the API will be version 2. Once support is added for glue version 1.0 the default value will be 3.

Suggested change
resource.TestCheckResourceAttr(resourceName, "command.0.python_version", "3"),
resource.TestCheckResourceAttr(resourceName, "command.0.python_version", "2"),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's update this test to account for the three PythonVersion use cases (i.e Not Set, Set to 2, Set to 3).

Start by adding the following test configuration under the existing testAccAWSGlueJobConfig_PythonShell function

func testAccAWSGlueJobConfig_PythonShellWithVersion(rName, pythonVersion string) string {                                                                                                                                                                                                            
    return fmt.Sprintf(`                                                                                                                                                                                                                                                                              
  %s                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                      
  resource "aws_glue_job" "test" {                                                                                                                                                                                                                                                                    
    name         = %q                                                                                                                                                                                                                                                                                 
    role_arn     = "${aws_iam_role.test.arn}"                                                                                                                                                                                                                                                         
    max_capacity = 0.0625                                                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                                                                                      
    command {                                                                                                                                                                                                                                                                                         
      name            = "pythonshell"                                                                                                                                                                                                                                                                 
      script_location = "testscriptlocation"                                                                                                                                                                                                                                                          
      python_version =  %q                                                                                                                                                                                                                                                                            
    }                                                                                                                                                                                                                                                                                                 
                                                                                                                                                                                                                                                                                                      
    depends_on = ["aws_iam_role_policy_attachment.test"]                                                                                                                                                                                                                                              
  }                                                                                                                                                                                                                                                                                                   
  `, testAccAWSGlueJobConfig_Base(rName), rName, pythonVersion)                                                                                                                                                                                                                                       
  }

Then update the test to look like the following:

  func TestAccAWSGlueJob_PythonShell(t *testing.T) {                                                                                                                                                                                                                                                  
    var job glue.Job                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                      
    rName := fmt.Sprintf("tf-acc-test-%s", acctest.RandString(5))                                                                                                                                                                                                                                     
    resourceName := "aws_glue_job.test"                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                                                                                      
    resource.ParallelTest(t, resource.TestCase{                                                                                                                                                                                                                                                       
      PreCheck:     func() { testAccPreCheck(t) },                                                                                                                                                                                                                                                    
      Providers:    testAccProviders,                                                                                                                                                                                                                                                                 
      CheckDestroy: testAccCheckAWSGlueJobDestroy,                                                                                                                                                                                                                                                    
      Steps: []resource.TestStep{                                                                                                                                                                                                                                                                     
        {                                                                                                                                                                                                                                                                                             
          Config: testAccAWSGlueJobConfig_PythonShell(rName),                                                                                                                                                                                                                                         
          Check: resource.ComposeTestCheckFunc(                                                                                                                                                                                                                                                       
            testAccCheckAWSGlueJobExists(resourceName, &job),                                                                                                                                                                                                                                         
            resource.TestCheckResourceAttr(resourceName, "max_capacity", "0.0625"),                                                                                                                                                                                                                   
            resource.TestCheckResourceAttr(resourceName, "command.#", "1"),                                                                                                                                                                                                                           
            resource.TestCheckResourceAttr(resourceName, "command.0.script_location", "testscriptlocation"),                                                                                                                                                                                          
            resource.TestCheckResourceAttr(resourceName, "command.0.python_version", "2"),                                                                                                                                                                                                            
            resource.TestCheckResourceAttr(resourceName, "command.0.name", "pythonshell"),                                                                                                                                                                                                            
          ),                                                                                                                                                                                                                                                                                          
        },                                                                                                                                                                                                                                                                                            
        {                                                                                                                                                                                                                                                                                             
          ResourceName:      resourceName,                                                                                                                                                                                                                                                            
          ImportState:       true,                                                                                                                                                                                                                                                                    
          ImportStateVerify: true,                                                                                                                                                                                                                                                                    
        },                                                                                                                                                                                                                                                                                            
        {                                                                                                                                                                                                                                                                                             
          Config: testAccAWSGlueJobConfig_PythonShellWithVersion(rName, "2"),                                                                                                                                                                                                                         
          Check: resource.ComposeTestCheckFunc(                                                                                                                                                                                                                                                       
            testAccCheckAWSGlueJobExists(resourceName, &job),                                                                                                                                                                                                                                         
            resource.TestCheckResourceAttr(resourceName, "max_capacity", "0.0625"),                                                                                                                                                                                                                   
            resource.TestCheckResourceAttr(resourceName, "command.#", "1"),                                                                                                                                                                                                                           
            resource.TestCheckResourceAttr(resourceName, "command.0.script_location", "testscriptlocation"),                                                                                                                                                                                          
            resource.TestCheckResourceAttr(resourceName, "command.0.python_version", "2"),                                                                                                                                                                                                            
            resource.TestCheckResourceAttr(resourceName, "command.0.name", "pythonshell"),                                                                                                                                                                                                            
          ),                                                                                                                                                                                                                                                                                          
        },                                                                                                                                                                                                                                                                                            
        {                                                                                                                                                                                                                                                                                             
          ResourceName:      resourceName,                                                                                                                                                                                                                                                            
          ImportState:       true,                                                                                                                                                                                                                                                                    
          ImportStateVerify: true,                                                                                                                                                                                                                                                                    
        },                                                                                                                                                                                                                                                                                            
        {                                                                                                                                                                                                                                                                                             
          Config: testAccAWSGlueJobConfig_PythonShellWithVersion(rName, "3"),                                                                                                                                                                                                                         
          Check: resource.ComposeTestCheckFunc(                                                                                                                                                                                                                                                       
            testAccCheckAWSGlueJobExists(resourceName, &job),                                                                                                                                                                                                                                         
            resource.TestCheckResourceAttr(resourceName, "max_capacity", "0.0625"),                                                                                                                                                                                                                   
            resource.TestCheckResourceAttr(resourceName, "command.#", "1"),                                                                                                                                                                                                                           
            resource.TestCheckResourceAttr(resourceName, "command.0.script_location", "testscriptlocation"),                                                                                                                                                                                          
            resource.TestCheckResourceAttr(resourceName, "command.0.python_version", "3"),                                                                                                                                                                                                            
            resource.TestCheckResourceAttr(resourceName, "command.0.name", "pythonshell"),                                                                                                                                                                                                            
          ),                                                                                                                                                                                                                                                                                          
        },                                                                                                                                                                                                                                                                                            
        {                                                                                                                                                                                                                                                                                             
          ResourceName:      resourceName,                                                                                                                                                                                                                                                            
          ImportState:       true,                                                                                                                                                                                                                                                                    
          ImportStateVerify: true,                                                                                                                                                                                                                                                                    
        },                                                                                                                                                                                                                                                                                            
      },                                                                                                                                                                                                                                                                                              
    })                                                                                                                                                                                                                                                                                                
  }```

@nywilken nywilken added the enhancement Requests to existing resources that expand the functionality or scope. label Aug 1, 2019
@nywilken
Copy link
Contributor

nywilken commented Aug 1, 2019

This pull request appears to be related to #9524, so I have edited the pull request description to denote the issue reference.

Signed-off-by: Michael Juarez <mikejuarez@gmail.com>
@ghost ghost added size/S Managed by automation to categorize the size of a PR. and removed size/XS Managed by automation to categorize the size of a PR. labels Aug 6, 2019
@mjuarez
Copy link
Contributor Author

mjuarez commented Aug 6, 2019

@nywilken Thanks for the guidance. I've updated with your suggestions. Not sure how to get this to fit with #9524

==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./aws -v -parallel 20 -run=TestAccAWSGlueJob_PythonShell -timeout 120m
=== RUN   TestAccAWSGlueJob_PythonShell
=== PAUSE TestAccAWSGlueJob_PythonShell
=== CONT  TestAccAWSGlueJob_PythonShell
--- PASS: TestAccAWSGlueJob_PythonShell (68.30s)
PASS
ok  	github.com/terraform-providers/terraform-provider-aws/aws	(cached)

@stephencsmall
Copy link

Is there anything blocking this from being merged? Would love to have this for my current project, willing to help get it done if I can.

@nywilken nywilken added this to the v2.29.0 milestone Sep 18, 2019
Copy link
Contributor

@nywilken nywilken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mjuarez apologies for the delays here. I rebased your changes onto master and all is working as expected. Apologies again. Thank you for the contribution and time.

--- PASS: TestAccAWSGlueJob_PythonShell (51.98s)

nywilken added a commit that referenced this pull request Sep 18, 2019
@nywilken
Copy link
Contributor

The enhancement to the aws_glue_job resource has been merged and will be released with version 2.29.0 of the Terraform AWS provider.

@nywilken nywilken closed this Sep 18, 2019
@nywilken nywilken reopened this Sep 18, 2019
nywilken added a commit that referenced this pull request Sep 18, 2019
@nywilken nywilken closed this Sep 18, 2019
@kabeersvohra
Copy link

@nywilken any idea when 2.29.0 is due to be released? cheers

@nywilken
Copy link
Contributor

nywilken commented Sep 20, 2019

@kabeersvohra we generally release every Thursday, but we ran into a slight issue in testing one of our recently modified resources (unrelated to this resource). We are looking into the issue and should have a release out by today. If things should change I'll follow up on this thread. Thanks for your patience with us in getting the release out.

@ghost
Copy link

ghost commented Sep 20, 2019

This has been released in version 2.29.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!

@kabeersvohra
Copy link

I have found a slight bug in setting the python version in this PR. When I have a glue job which is set to Spark 2.2, Python 2 (Glue version 0.9) in the console and I set the python_version to 3 in terraform, after applying, the delta is gone but the job glue version is blank in the console. This issue does not occur when the job is set to Spark 2.4, Python 2 (Glue version 1.0) in the console, I am guessing it is to do with a mismatch of the python version and glue version which is not set in this PR. Thanks

@mjuarez
Copy link
Contributor Author

mjuarez commented Sep 24, 2019

@kabeersvohra I'm wondering if this is related to #9524

@kabeersvohra
Copy link

@mjuarez yes, and a PR to solve this has just been created: #10237

@ghost
Copy link

ghost commented Nov 1, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Nov 1, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
documentation Introduces or discusses updates to documentation. enhancement Requests to existing resources that expand the functionality or scope. service/glue Issues and PRs that pertain to the glue service. size/S Managed by automation to categorize the size of a PR. tests PRs: expanded test coverage. Issues: expanded coverage, enhancements to test infrastructure.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants