Enable EMR cluster configuration through the InstanceGroup param #1071

Merged · 6 commits · Aug 24, 2017

Conversation

@apetresc (Contributor) commented Jul 6, 2017

This fixes #869 and #389.

EMR has three ways to configure cluster resources: the coreInstanceCount/coreInstanceType params, the InstanceGroup structure, and the InstanceFleet structure (which only exists in very new EMR releases). At the moment, Terraform only supports the first of those (coreInstanceCount). This is extremely limiting; among other things, it doesn't allow the user to set EBS options or use spot prices.

This commit adds support for InstanceGroup to aws_emr_cluster resources, and makes the masterInstanceCount optional since there is now an alternative.

I have not added any tests yet, but manual testing indicates all the new features work as expected. Existing unit tests continue to pass.
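
For illustration, here is a sketch of the kind of configuration this enables (the instance_group attribute names are the ones added by this PR; the instance types, counts, and bid price are placeholder values):

  resource "aws_emr_cluster" "example" {
    # ... other cluster arguments omitted ...

    instance_group {
      instance_role  = "MASTER"
      instance_type  = "m3.xlarge"
      instance_count = 1
    }

    instance_group {
      instance_role  = "CORE"
      instance_type  = "m3.xlarge"
      instance_count = 2
      bid_price      = "0.05"    # spot pricing, previously not expressible

      # EBS options, previously not expressible
      ebs_config {
        size = 100
        type = "gp2"
      }
    }
  }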

apetresc added 2 commits July 6, 2017 11:25
apetresc added a commit to apetresc/terraform that referenced this pull request Jul 10, 2017
@radeksimko radeksimko added the enhancement Requests to existing resources that expand the functionality or scope. label Jul 20, 2017
@jennyfountain

+2!!

@synhershko

+1

@fosskers (Contributor)

If anything needs a merge, it's this PR.

@fosskers (Contributor)

I've just tested and confirmed that this works. Instead of, say:

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"
  core_instance_count  = 2               

We can use:

  instance_group {
    bid_price = "0.05"
    instance_count = 1
    instance_role = "MASTER"
    instance_type = "m3.xlarge"
  }

  instance_group {
    bid_price = "0.05"
    instance_count = 2
    instance_role = "CORE"
    instance_type = "m3.xlarge"
  }

Results on EMR: (screenshot)

Note: the aws_emr_cluster resource documentation will have to be updated as well.

@jennyfountain

What about disk size?

@fosskers (Contributor) commented Jul 26, 2017

There seem to be more options than what I used here; check the PR's diff for everything that's available.

Otherwise, disk size of the instances is controlled by the instance_type, no?

@fosskers (Contributor)

Also, how do these changes relate to the existing aws_instance_group resource?

@apetresc (Contributor, Author)

@jennyfountain: Disk size is controlled through the ebs_config block. So something like:

  instance_group {
    bid_price      = "0.05"
    instance_count = 2
    instance_role  = "CORE"
    instance_type  = "m3.xlarge"

    ebs_config {
      size = 200
      type = "gp2"
    }
  }

@grubernaut (Contributor)

Hey @apetresc, thank you a million for this PR! Awesome contribution.
However, would you be able to add an additional acceptance test to the aws_emr_cluster tests, using this new configuration? Feel free to ping me afterwards, and I'll prioritize a review.

@apetresc (Contributor, Author)

@grubernaut Yes, absolutely :) I'm on vacation until the end of next week, but I'll add those tests as soon as I'm back.

@apetresc (Contributor, Author)

Hey @grubernaut, sorry for the delay! As requested, I've added an acceptance test for the instance_group field. The test passes successfully on my account. A few notes:

  • I opted for a single instance_group test with all of the different moving parts enabled at once (bid_price, ebs_config, etc.), instead of a separate test exercising each capability. I figure that since these tests cost literal money to run, this is the more prudent approach. Let me know if you want them broken up a bit.
  • I'm not doing many actual attribute checks in the test, because I can't figure out how I'm supposed to determine the index of a particular entry in a TypeSet. I tried looking for examples in other resource tests, but they all either hardcode it (which seems super-fragile to me) or avoid it altogether and only test TypeList fields. Any advice on how to do this properly? Do I just need to write a helper function that iterates over the resource?

Thanks!

@apetresc (Contributor, Author)

Oh, one more note:

  • I didn't run the full acceptance test suite to make sure I'm not introducing a regression, since I can't really justify the 💰 cost to my company. I assume your CI servers will run it and report any failures, right? :)

@grubernaut (Contributor)

@apetresc, yup, not a problem! Pulling down your branch now to run through the test suite :)

@grubernaut (Contributor)

$ make testacc TEST=./aws TESTARGS="-run=TestAccAWSEMRCluster"                                
==> Checking that code complies with gofmt requirements...                                    
TF_ACC=1 go test ./aws -v -run=TestAccAWSEMRCluster -timeout 120m                             
=== RUN   TestAccAWSEMRCluster_basic           
--- PASS: TestAccAWSEMRCluster_basic (597.51s) 
=== RUN   TestAccAWSEMRCluster_instance_group  
--- PASS: TestAccAWSEMRCluster_instance_group (643.15s)                                       
=== RUN   TestAccAWSEMRCluster_security_config 
--- PASS: TestAccAWSEMRCluster_security_config (617.01s)                                      
=== RUN   TestAccAWSEMRCluster_bootstrap_ordering                                             
--- PASS: TestAccAWSEMRCluster_bootstrap_ordering (612.38s)                                   
=== RUN   TestAccAWSEMRCluster_terminationProtected                                           
--- PASS: TestAccAWSEMRCluster_terminationProtected (600.88s)                                 
=== RUN   TestAccAWSEMRCluster_visibleToAllUsers                                              
--- PASS: TestAccAWSEMRCluster_visibleToAllUsers (528.26s)                                    
=== RUN   TestAccAWSEMRCluster_s3Logging       
--- PASS: TestAccAWSEMRCluster_s3Logging (1174.00s)                                           
=== RUN   TestAccAWSEMRCluster_tags            
--- PASS: TestAccAWSEMRCluster_tags (645.54s)  
PASS                                           
ok      github.com/terraform-providers/terraform-provider-aws/aws       5418.745s

@grubernaut (Contributor) left a review comment

Awesome work here! Just needs documentation and it's ready to go!

@apetresc (Contributor, Author)

Great. I guess this just means documenting the new fields in website/docs/r/emr_cluster.html.md, right? I can put that together very quickly.

@grubernaut (Contributor)

@apetresc yup exactly!

@apetresc (Contributor, Author)

@grubernaut Done!

@grubernaut (Contributor) left a review comment

LGTM, thanks!

@grubernaut grubernaut merged commit 930a682 into hashicorp:master Aug 24, 2017
@grubernaut (Contributor)

@apetresc should be released with the next release of the AWS provider, which is v1.0.0, thanks!

@apetresc apetresc deleted the emr-improvements branch August 24, 2017 17:30
@dthauvin commented Sep 7, 2017

Hi @apetresc
I looked at the source code of this PR and I did not find anything about AutoScalingPolicy for the CORE or TASK instance groups.
Does your PR support an instance group AutoScalingPolicy?
Implementing a scaling policy for EMR instance groups could fix #1147 and #713.

What about the aws_emr_instance_group Terraform resource? Is it now deprecated?

Thanks for your contribution!

nbaztec pushed a commit to nbaztec/terraform-provider-aws that referenced this pull request Sep 26, 2017
Enable EMR cluster configuration through the InstanceGroup param
@vishnuravi3186 commented Sep 26, 2017

Is there a way we can specify the root volume size for both master and core instances? Thanks for all the contributions; ebs_config is working perfectly.

@apetresc (Contributor, Author)

@vishnuravi3186 Yes, just create multiple instance_group entries, one with instance_role = "CORE" and one with instance_role = "MASTER", each with its own ebs_config block. See @fosskers's comment above for an example.
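
Roughly like this (a sketch only, reusing the attribute names from the examples above; the sizes and instance types are placeholders):

  instance_group {
    instance_role  = "MASTER"
    instance_type  = "m3.xlarge"
    instance_count = 1

    ebs_config {
      size = 100
      type = "gp2"
    }
  }

  instance_group {
    instance_role  = "CORE"
    instance_type  = "m3.xlarge"
    instance_count = 2

    ebs_config {
      size = 200
      type = "gp2"
    }
  }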

@vishnuravi3186 commented Sep 26, 2017

@apetresc Thanks for the quick reply. Yep, I implemented it the same way, and the EBS volume attached to each node has the size I specified. My question is about the root volume, which defaults to 10 GB. Is there a way we can increase it? In the Amazon UI we can set a root volume size of up to 100 GB, and I saw somewhere that a feature has been added to implement an EBS root volume size.

Example:

ebs_root_volume_size = 100

Let me know if this can be enabled.
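
If the provider does pick this up, it would presumably be a top-level argument on aws_emr_cluster rather than part of ebs_config; a purely hypothetical sketch:

  resource "aws_emr_cluster" "example" {
    # ... other cluster configuration ...

    # hypothetical top-level argument for the root EBS volume size, in GiB
    ebs_root_volume_size = 100
  }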

@apetresc (Contributor, Author)

Ooh sorry, you're right, I misunderstood.

Let me take a look at the API docs and see how easy this would be to implement...

@vishnuravi3186

@apetresc For your reference e6166eb

@vishnuravi3186

@apetresc Any updates on this feature?

@ghost commented Apr 11, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Apr 11, 2020