13-1-vMotion-VCH-Appliance nightly test fails to install VCH #6099

Closed
rajanashok opened this issue Aug 23, 2017 · 4 comments
Labels: area/appliance, component/test, priority/p0, team/foundation

Comments

@rajanashok
Contributor

Aug 23 2017 18:08:14.129Z DEBUG [BEGIN] [github.com/vmware/vic/cmd/vic-machine/create.(*Create).processParams:336]
Aug 23 2017 18:08:14.129Z WARN  Using administrative user for VCH operation - use --ops-user to improve security (see -x for advanced help)
Aug 23 2017 18:08:14.130Z DEBUG client network: IP {<nil> <nil>} gateway <nil> dest: []
Aug 23 2017 18:08:14.130Z DEBUG public network: IP {<nil> <nil>} gateway <nil> dest: []
Aug 23 2017 18:08:14.130Z DEBUG management network: IP {<nil> <nil>} gateway <nil> dest: []
Aug 23 2017 18:08:14.130Z DEBUG VCH DNS servers: []
Aug 23 2017 18:08:14.130Z DEBUG [BEGIN] [github.com/vmware/vic/cmd/vic-machine/common.(*CertFactory).loadCertificates:208]
Aug 23 2017 18:08:14.130Z DEBUG Unable to locate existing server certificate in cert path
Aug 23 2017 18:08:14.130Z DEBUG [ END ] [github.com/vmware/vic/cmd/vic-machine/common.(*CertFactory).loadCertificates:208] [114.626µs] 
Aug 23 2017 18:08:14.130Z DEBUG [BEGIN] [github.com/vmware/vic/cmd/vic-machine/common.(*CertFactory).generateCertificates:326]
    [ Message content over the limit has been removed. ]
...g 23 2017 18:10:33.783Z DEBUG op=434.3 (delta:9.04µs): [NewOperation] op=434.3 (delta:3.824µs) [github.com/vmware/vic/pkg/vsphere/tasks.WaitForResult:68]
Aug 23 2017 18:10:33.789Z DEBUG op=434.3 (delta:6.64165ms): unexpected error on task retry: Post https://10.160.153.100/sdk: EOF
Aug 23 2017 18:10:33.980Z DEBUG [ END ] [github.com/vmware/vic/lib/install/management.(*Dispatcher).startAppliance:105] [197.770937ms] 
Aug 23 2017 18:10:33.981Z DEBUG [ END ] [github.com/vmware/vic/lib/install/management.(*Dispatcher).CreateVCH:43] [2m1.405214185s] VCH-0-6318
Aug 23 2017 18:10:33.981Z DEBUG [BEGIN] [github.com/vmware/vic/lib/install/management.(*Dispatcher).CollectDiagnosticLogs:226]
Aug 23 2017 18:10:33.981Z INFO  Collecting 52ab8601-ecd6-4680-b856-fb2a0620d05f vpxd.log
Aug 23 2017 18:10:34.045Z DEBUG [ END ] [github.com/vmware/vic/lib/install/management.(*Dispatcher).CollectDiagnosticLogs:226] [64.825276ms] 
Aug 23 2017 18:10:34.045Z ERROR Failed to power on appliance Post https://10.160.153.100/sdk: EOF. Exiting...
Aug 23 2017 18:10:34.098Z ERROR --------------------
Aug 23 2017 18:10:34.099Z ERROR vic-machine-linux create failed: Failed to power on appliance Post https://10.160.153.100/sdk: EOF. Exiting...
' does not contain 'Installer completed successfully'
[13-1-vMotion-VCH-Appliance.zip](https://github.com/vmware/vic/files/1246954/13-1-vMotion-VCH-Appliance.zip)
@rajanashok
Contributor Author

@rajanashok reopened this Aug 23, 2017
@cgtexmex added the component/test label Aug 23, 2017
@chengwang86
Contributor

There are two failures in this test.

In the first sub-test (steps 6-9), the PL log shows:

Aug 23 2017 18:07:06.353Z DEBUG op=290.48 (delta:3.651µs): [NewOperation] op=290.48 (delta:1.609µs) [github.com/vmware/vic/pkg/vsphere/tasks.WaitForResult:68]
Aug 23 2017 18:07:06.464Z ERROR op=290.48 (delta:110.572867ms): unexpected fault on task retry: &types.InvalidPowerState{InvalidState:types.InvalidState{VimFault:types.VimFault{MethodFault:types.MethodFault{FaultCause:(*types.LocalizedMethodFault)(nil), FaultMessage:[]types.LocalizableMessage{types.LocalizableMessage{DynamicData:types.DynamicData{}, Key:"vpxd.vm.poweroff.unexpectedfailure", Arg:[]types.KeyAnyValue{types.KeyAnyValue{DynamicData:types.DynamicData{}, Key:"1", Value:"clever_payne-6280ca442717"}}, Message:"An error was received from the ESX host while powering off VM clever_payne-6280ca442717."}}}}}, RequestedState:"poweredOn", ExistingState:"poweredOff"}

The error message in log.html shows:

'CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
6280ca442717        busybox             "/bin/top"          35 seconds ago      Stopped                                 clever_payne' does not contain 'Exited'

This is the same failure as #5803; it happened in the regression test after vMotioning the VCH. This run used build 13338, whereas @sflxn's fix landed in build 13356.

I think we can hold off on this failure and see whether it occurs again once we pick up a later build that includes Loc's code changes.
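
The build comparison above can be sketched in Go. This is only an illustration: the tag layout is assumed from the ISO file name in the log (`V1.1.0-RC3-13338-085C61F`), `buildNumber` and `hasFix` are hypothetical helpers, and it assumes build numbers increase monotonically.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// fixBuild is the build number this thread cites for the fix.
const fixBuild = 13356

// buildRe matches the numeric build field in a tag such as
// "V1.1.0-RC3-13338-085C61F". The layout is an assumption taken
// from the ISO file name in the log, not a documented format.
var buildRe = regexp.MustCompile(`-(\d+)-[0-9A-Fa-f]+$`)

// buildNumber extracts the build number from a version tag.
func buildNumber(tag string) (int, error) {
	m := buildRe.FindStringSubmatch(tag)
	if m == nil {
		return 0, fmt.Errorf("no build number in %q", tag)
	}
	return strconv.Atoi(m[1])
}

// hasFix reports whether the tagged build is at or past fixBuild.
func hasFix(tag string) (bool, error) {
	n, err := buildNumber(tag)
	if err != nil {
		return false, err
	}
	return n >= fixBuild, nil
}

func main() {
	ok, err := hasFix("V1.1.0-RC3-13338-085C61F")
	fmt.Println(ok, err) // prints: false <nil>
}
```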

@chengwang86
Contributor

The second failure occurred in the second sub-test (steps 10-13), during VCH installation. From vic-machine.log:

Aug 23 2017 18:09:45.898Z INFO  Uploading images for container
Aug 23 2017 18:09:45.898Z INFO  	"bin/bootstrap.iso"
Aug 23 2017 18:09:45.898Z DEBUG [BEGIN] [github.com/vmware/vic/pkg/retry.DoWithConfig:73]
Aug 23 2017 18:09:45.900Z INFO  	"bin/appliance.iso"
Aug 23 2017 18:09:45.900Z DEBUG [BEGIN] [github.com/vmware/vic/pkg/retry.DoWithConfig:73]
Aug 23 2017 18:09:46.244Z DEBUG target delete path = d4c49d59-43f9-49b1-ade7-02001acfa4b9/V1.1.0-RC3-13338-085C61F-bootstrap.iso
Aug 23 2017 18:09:46.531Z DEBUG Failed to delete image (bin/bootstrap.iso) with error (File [vsanDatastore] d4c49d59-43f9-49b1-ade7-02001acfa4b9/V1.1.0-RC3-13338-085C61F-bootstrap.iso was not found)
Aug 23 2017 18:09:46.531Z WARN  failed an attempt to upload isos with err (File [vsanDatastore] d4c49d59-43f9-49b1-ade7-02001acfa4b9/V1.1.0-RC3-13338-085C61F-bootstrap.iso was not found), 4 retries remain
Aug 23 2017 18:09:46.531Z WARN  Will try again in 11.04660288s. Operation failed with detected error
Aug 23 2017 18:10:28.513Z DEBUG [ END ] [github.com/vmware/vic/pkg/retry.DoWithConfig:73] [42.613675214s] 
Aug 23 2017 18:10:33.783Z DEBUG [ END ] [github.com/vmware/vic/pkg/retry.DoWithConfig:73] [47.884080756s] 
Aug 23 2017 18:10:33.783Z DEBUG [ END ] [github.com/vmware/vic/lib/install/management.(*Dispatcher).uploadImages:120] [47.884504563s] 
Aug 23 2017 18:10:33.783Z DEBUG [BEGIN] [github.com/vmware/vic/lib/install/management.(*Dispatcher).startAppliance:105]
Aug 23 2017 18:10:33.783Z DEBUG op=434.3 (delta:9.04µs): [NewOperation] op=434.3 (delta:3.824µs) [github.com/vmware/vic/pkg/vsphere/tasks.WaitForResult:68]
Aug 23 2017 18:10:33.789Z DEBUG op=434.3 (delta:6.64165ms): unexpected error on task retry: Post https://10.160.153.100/sdk: EOF
Aug 23 2017 18:10:33.980Z DEBUG [ END ] [github.com/vmware/vic/lib/install/management.(*Dispatcher).startAppliance:105] [197.770937ms] 
Aug 23 2017 18:10:33.981Z DEBUG [ END ] [github.com/vmware/vic/lib/install/management.(*Dispatcher).CreateVCH:43] [2m1.405214185s] VCH-0-6318
Aug 23 2017 18:10:33.981Z DEBUG [BEGIN] [github.com/vmware/vic/lib/install/management.(*Dispatcher).CollectDiagnosticLogs:226]
Aug 23 2017 18:10:33.981Z INFO  Collecting 52ab8601-ecd6-4680-b856-fb2a0620d05f vpxd.log
Aug 23 2017 18:10:34.045Z DEBUG [ END ] [github.com/vmware/vic/lib/install/management.(*Dispatcher).CollectDiagnosticLogs:226] [64.825276ms] 
Aug 23 2017 18:10:34.045Z ERROR Failed to power on appliance Post https://10.160.153.100/sdk: EOF. Exiting...

So something went wrong when we tried to upload the ISOs to the vsanDatastore. I'm not sure yet whether this is a new issue. @matthewavery Have you seen this before? Any thoughts?
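
When triaging an excerpt like the one above, it can help to filter it down to WARN and ERROR lines so the upload warning and the power-on error are seen side by side. A minimal Go sketch, assuming the line layout seen in these excerpts (three date fields, a timestamp, then the level); `warnOrError` is a hypothetical helper, not part of vic:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// warnOrError returns the lines whose level field is WARN or ERROR.
// It assumes the layout "Mon DD YYYY HH:MM:SS.mmmZ LEVEL message",
// so the level is the fifth whitespace-separated field.
func warnOrError(log string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := sc.Text()
		f := strings.Fields(line)
		if len(f) < 5 {
			continue // blank or continuation line
		}
		if f[4] == "WARN" || f[4] == "ERROR" {
			out = append(out, line)
		}
	}
	return out
}

func main() {
	// Sample lines copied from the excerpt above.
	sample := `Aug 23 2017 18:09:46.531Z WARN  Will try again in 11.04660288s. Operation failed with detected error
Aug 23 2017 18:10:33.783Z DEBUG [BEGIN] [github.com/vmware/vic/lib/install/management.(*Dispatcher).startAppliance:105]
Aug 23 2017 18:10:34.045Z ERROR Failed to power on appliance Post https://10.160.153.100/sdk: EOF. Exiting...`
	for _, l := range warnOrError(sample) {
		fmt.Println(l)
	}
}
```

Note that `bufio.Scanner` stops at lines longer than its default 64 KiB token limit, so very long fault dumps may need `Scanner.Buffer` to raise it.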

@emlin modified the milestones: Sprint 16 Foundation, Sprint 15 Aug 30, 2017
@cgtexmex self-assigned this Aug 30, 2017
@cgtexmex
Contributor

This isn't an issue with vsan or the upload of ISOs -- we lost connectivity to nimbus during the powerOn. I would attribute this to a network hiccup during install. If this becomes a recurring issue we can gather additional logs, but for now I'll close this.
