Running in Google Cloud #9

Closed
Strydom opened this issue Apr 9, 2018 · 11 comments

Strydom commented Apr 9, 2018

I've been trying to get this running in Google Cloud and just wanted to check a few things that don't seem right...

First: there are 3 boot-node-setup pods and 2 monitor replicas. Is that normal?

$ kubectl get pods
NAME                                    READY     STATUS    RESTARTS   AGE
geth-boot-node-pod                      1/1       Running   0          42m
geth-boot-node-setup-pod-h5wq9          1/1       Running   0          42m
geth-boot-node-setup-pod-jjnqc          1/1       Running   0          42m
geth-boot-node-setup-pod-xh7cz          1/1       Running   0          42m
geth-miner-deployment-d97db8bc6-vb9ds   1/1       Running   0          42m
monitor-deployment-ffc657f7d-zjhst      2/2       Running   0          42m

Second: I've exposed 1 mining node and the Monitor with a LoadBalancer so that I can use it as a Provider for a local testing app and view the monitor. As mentioned in #8, I'm not able to see any allocated funds, only funds mined into an Etherbase; I'm not sure if this is related. More importantly, I can't see the second miner in the Monitor, although miner0 does have miner1 as a peer.

apiVersion: v1
kind: Service
metadata:
  name: miner0-svc
  labels:
    app: kuberneteth
    tier: backend
    name: miner0-svc
spec:
  selector:
    app: kuberneteth
    tier: backend
  type: NodePort
  ports:
    - name: miner0-jsonrpc
      protocol: TCP
      port: 8545
      targetPort: 8545
      nodePort: 30001
    - name: miner0-wsrpc
      protocol: TCP
      port: 8547
      targetPort: 8547
    - name: miner0-ipc-listen
      protocol: UDP
      port: 30301
      targetPort: 30301
    - name: miner0-ipc-discovery
      protocol: TCP
      port: 30303
      targetPort: 30303
      nodePort: 31001

---
apiVersion: v1
kind: Service
metadata:
  name: miner0-web-svc
  labels:
    app: kuberneteth
    tier: backend
    name: miner0-web-svc
spec:
  selector:
    app: kuberneteth
    tier: backend
  type: LoadBalancer
  ports:
    - name: miner0-jsonrpc
      protocol: TCP
      port: 8545
      targetPort: 8545
    - name: miner0-wsrpc
      protocol: TCP
      port: 8547
      targetPort: 8547

---
apiVersion: v1
kind: Service
metadata:
  name: miner1-svc
  labels:
    app: kuberneteth
    tier: backend
    name: miner1-svc
spec:
  selector:
    app: kuberneteth
    tier: backend
  type: NodePort
  ports:
    - name: miner1-jsonrpc
      protocol: TCP
      port: 8545
      targetPort: 8545
      nodePort: 30002
    - name: miner1-wsrpc
      protocol: TCP
      port: 8547
      targetPort: 8547
    - name: miner1-ipc-listen
      protocol: UDP
      port: 30301
      targetPort: 30301
    - name: miner1-ipc-discovery
      protocol: TCP
      port: 30303
      targetPort: 30303
      nodePort: 31002

Monitor-config:

[
      {
        "name"              : "miner0",
        "cwd"               : ".",
        "script"            : "app.js",
        "log_date_format"   : "YYYY-MM-DD HH:mm Z",
        "merge_logs"        : false,
        "watch"             : false,
        "exec_interpreter"  : "node",
        "exec_mode"         : "fork_mode",
        "env":
        {
          "NODE_ENV"        : "production",
          "RPC_HOST"        : "miner0-rpchost",
          "RPC_PORT"        : "8545",
          "LISTENING_PORT"  : "30303",
          "INSTANCE_NAME"   : "miner0",
          "CONTACT_DETAILS" : "",
          "WS_SERVER"       : "localhost:3001",
          "WS_SECRET"       : "connectme",
          "VERBOSITY"       : 0
        }
      },
      {
        "name"              : "miner1",
        "cwd"               : ".",
        "script"            : "app.js",
        "log_date_format"   : "YYYY-MM-DD HH:mm Z",
        "merge_logs"        : false,
        "watch"             : false,
        "exec_interpreter"  : "node",
        "exec_mode"         : "fork_mode",
        "env":
        {
          "NODE_ENV"        : "production",
          "RPC_HOST"        : "miner1-rpchost",
          "RPC_PORT"        : "8545",
          "LISTENING_PORT"  : "30303",
          "INSTANCE_NAME"   : "miner1",
          "CONTACT_DETAILS" : "",
          "WS_SERVER"       : "localhost:3001",
          "WS_SECRET"       : "connectme",
          "VERBOSITY"       : 0
        }
      }
]

Here are my services:

$ kubectl get services
NAME                TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                                                         AGE
geth-bootnode-svc   ClusterIP      10.23.242.147   <none>          30303/UDP                                                       43m
kubernetes          ClusterIP      10.23.240.1     <none>          443/TCP                                                         11h
miner0-svc          NodePort       10.23.251.110   <none>          8545:30001/TCP,8547:30947/TCP,30301:31290/UDP,30303:31001/TCP   43m
miner0-web-svc      LoadBalancer   10.23.249.206   xx.xxx.xxx.xx   8545:30369/TCP,8547:30542/TCP                                   43m
miner1-svc          NodePort       10.23.244.106   <none>          8545:30002/TCP,8547:31385/TCP,30301:32750/UDP,30303:31002/TCP   41m
monitor-svc         LoadBalancer   10.23.254.254   xx.xxx.xx.xxx   80:30623/TCP                                                    43m

Hope you can shed some light on this,
Many thanks.

MaximilianMeister (Owner) commented:

First: there are 3 boot-node-setup pods and 2 monitor replicas. Is that normal?

The boot-node-setup pods run on all Kubernetes nodes because they are part of a DaemonSet: we need the boot node's enode address present on every node so the geth config can be rewritten with the bootnode address inserted. I guess you have 3 Kubernetes nodes? That would explain the 3 setup pods.
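You can double check this by comparing your node count with the DaemonSet status, e.g.:

$ kubectl get nodes
$ kubectl get daemonset

If the DESIRED count of the setup DaemonSet matches the number of nodes, that's expected.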

The monitor deployment consists of 2 containers, eth-net-intelligence-api and eth-netstats, so these are not replicas; they're different containers (backend/frontend).

Regarding the second issue, I've never used a LoadBalancer in this context, so I have no real experience with what could be wrong there.

Your monitor config looks fine to me, but in case a node doesn't appear in the monitor, can you try scaling the monitor replicas down to 0 and then back to 1? This should rewrite the config and start the monitor up again; maybe it was a race condition... not sure if that helps.
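Something like this should do the scaling (assuming the deployment is named monitor-deployment, which the pod name above suggests):

$ kubectl scale deployment monitor-deployment --replicas=0
$ kubectl scale deployment monitor-deployment --replicas=1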


Strydom commented Apr 10, 2018

Okay, great, cheers, that makes sense. I only started digging into the different clusters yesterday when trying to figure out how the monitoring was working.

I started from scratch, as I delete my cluster once I'm done testing for the day to avoid charges, and both nodes are now showing up in the monitor!

2 new problems:

  1. The two nodes haven't discovered each other as peers. Another race condition? Any way to quickly fix this instead of rebooting?
  2. It seems that the monitor has the node names mixed up? miner0 started mining first, but it's labeled as miner1.

P.S. I may open a PR back to here with my updated yaml.erb template, which allows you to specify whether you want to expose a miner or the monitor 👍 I may also add code to auto-generate multiple nodes of the same type with incremental NodePorts.

MaximilianMeister (Owner) commented:

The two nodes haven't discovered each other as peers. Another race condition? Any way to quickly fix this instead of rebooting?

Might be, it's hard to say from here. You could also check on the geth console whether it has the peer. If both are connected to the same bootnode and have the same genesis block, they should show up as peers.
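If you want to check from inside the pod, something along these lines should work (the IPC path is an assumption based on the /etc/testnet/miner datadir this setup uses):

$ kubectl exec -it geth-miner-deployment-<pod-id> -- geth attach /etc/testnet/miner/geth.ipc
> net.peerCount    // just the number of connected peers
> admin.peers      // full details of each connected peer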

It seems that the monitor has the node names mixed up? miner0 started mining first, but it's labeled as miner1.

The kubelet probably started the other miner pod quicker depending on the node it ran on, which might have had more resources available at that time. In short, Kubernetes has no notion of sequencing; if you want to set something up sequentially, you'd need to do that within one single unit (container).

I may also add code to auto-generate multiple nodes of the same type with incremental NodePorts

Have a look at https://github.com/MaximilianMeister/kuberneteth/blob/master/scripts/generate_nodes.sh#L24 - you'd just need to add nodePort_rpc: 3000$i after that line, for instance.
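Roughly, the idea would be something like this (a hypothetical sketch of the loop, not the script's actual code; the file name and keys are made up):

for i in 0 1 2; do
  cat >> nodes.yaml <<EOF
  miner${i}:
    nodePort_rpc: 3000${i}
    nodePort_discovery: 3100${i}
EOF
done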

@Strydom
Copy link
Author

Strydom commented Apr 11, 2018

Ah okay, I see. It looks like, because I had autoscaling on, the normal flow of the deploy was disrupted by having to wait for more VM nodes to spin up. This probably means you need to know roughly the minimum number of nodes you'll need in your cluster before running the deploy, to prevent race conditions while waiting for VM nodes to boot up.

For anyone reading this later: I found that you need a minimum machine type of n1-standard-1 (1 vCPU, 3.75 GB memory), and at least 3 of them. Anything less ran out of resources and required more machines. When your miners try to generate the DAG and start mining, they can get stuck on smaller machines, as I guess they run out of memory.
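For reference, creating a cluster along those lines looks roughly like this (cluster name and zone are just placeholders):

$ gcloud container clusters create kuberneteth-test \
    --machine-type=n1-standard-1 \
    --num-nodes=3 \
    --zone=europe-west1-b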

Oh, I didn't realise that's what that script was for. I was going to implement it in the YAML so you didn't have to run another script. Will look into it though 👍

One more question... do you know of any good testing tools I can run alongside this to push transactions through, etc.? I know there are Ethereum test tools, but I was hoping to test along these lines:

  1. general transaction speeds
    a. how these are affected by the number of transactions -> growing ledger
  2. ledger resource requirements (space, RAM, CPU, network usage)
  3. how different network configurations (how many and which nodes connect to which neighbours) would affect the speed of the network
  4. how having slow nodes / slow communication between nodes affects the network

I know for point 2 I can use Stackdriver; I'm just unsure how to find the disk usage of the ledger only.

MaximilianMeister (Owner) commented:

I was going to implement it in the YAML so you didn't have to run another script. Will look into it though

Good idea, feel free to submit something at any point you think it's usable 👍

Do you know of any good testing tools

Maybe https://github.com/ethereum/go-ethereum/wiki/Metrics-and-Monitoring#querying-metrics? It looks promising, but you'd need to implement your own program to get down to the specifics.

or https://github.com/ethereum/wiki/wiki/Benchmarks
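For the geth metrics route, a rough sketch of how you could query them from a pod (assuming geth is started with --metrics, and the /etc/testnet/miner datadir):

$ kubectl exec -it geth-miner-deployment-<pod-id> -- geth attach /etc/testnet/miner/geth.ipc
> debug.metrics(true)    // dumps the raw metric counters described on that wiki page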

I'd recommend asking on Reddit or some broader channel than this repo; you will likely get a better hint from someone else in the developer community.


Strydom commented Apr 12, 2018

Great, thank you, I was just checking to see if maybe you had spun up something alongside this setup before 😄

I've got the YAML working now, generating multiple nodes. Will do a PR soon 👍

However, I'm still sometimes encountering the problem of the nodes not discovering each other as peers, and when I reboot and they connect, they flicker between 1 and 0 peers 😕 I'm running 3 miners, all with the same genesis and boot node... shouldn't they all have 2 peers, and not keep changing?


MaximilianMeister commented Apr 13, 2018

However, I'm still sometimes encountering the problem of the nodes not discovering each other as peers, and when I reboot and they connect, they flicker between 1 and 0 peers. I'm running 3 miners, all with the same genesis and boot node... shouldn't they all have 2 peers, and not keep changing?

This is likely a polling/connection issue in eth-netstats, which gets its data through eth-net-intelligence-api. If the geth console shows it as a peer, it should be all good!

EDIT: check the logs of the 2 monitor containers, maybe there are some hints
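For example (the container names are an assumption based on the two images mentioned earlier):

$ kubectl logs monitor-deployment-<pod-id> -c eth-net-intelligence-api
$ kubectl logs monitor-deployment-<pod-id> -c eth-netstats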


Strydom commented Apr 13, 2018

Sorry, I forgot to mention that I'm seeing the same thing when calling get peers from web3.js.
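For reference, the equivalent check over plain JSON-RPC (against the LoadBalancer IP and port 8545 exposed above) looks roughly like this:

$ curl -s -X POST http://<miner0-external-ip>:8545 \
    -H 'Content-Type: application/json' \
    --data '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}'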

I've just checked the geth console directly and can see that miner0 and miner1 are connected, but miner2 is not connected and is on a completely different block number than the others.

I checked that the system clock is correct for all of them, which it is, and I'm pretty sure they are all using the same genesis, as I haven't changed any of the genesis settings...
They all have the same bootnode enode address in etc/testnet/bootnode.

Looks like it was just due to race conditions again! Very annoying, oh well, after rebooting they are now all connected as expected 👍

P.S. I've submitted my PR (#10) with the changes mentioned in this thread


Strydom commented May 1, 2018

Hi Maximilian,
I haven't had much time to make the changes to the PR; I should have time next week.

I've just encountered another problem, though. All of a sudden the init container can't start because it's getting permission denied:

christopherstormstrydom@dlt-testing:~/kuberneteth$ kubectl logs geth-miner-deployment-d97db8bc6-gcxdp -c miner-genesis-init-container
+ [ ! -f /etc/testnet/miner/genesis_created ]
+ /usr/local/bin/geth --datadir /etc/testnet/miner init /etc/geth/genesis/Genesis-geth.json
INFO [05-01|03:50:28] Maximum peer count                       ETH=25 LES=0 total=25
Fatal: Failed to create the protocol stack: mkdir /etc/testnet/miner/keystore: permission denied
Fatal: Failed to create the protocol stack: mkdir /etc/testnet/miner/keystore: permission denied
+ touch /etc/testnet/miner/genesis_created
touch: /etc/testnet/miner/genesis_created: Permission denied

Any ideas? This is running on master without changes. I was encountering it on my branch as well, but it seems to be a general problem.

MaximilianMeister (Owner) commented:

@Strydom it seems to be the same issue as in #11.

My guess is that something in Kubernetes changed regarding the security model and read-only mounts, but I haven't had much time to investigate it. It seems to fail for (I guess) all Kubernetes versions > 1.9.something.
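One thing that might be worth trying, purely as a guess I haven't verified (and whether fsGroup is honoured depends on the volume type), is giving the miner pod an fsGroup so the mounted datadir becomes group-writable:

$ kubectl patch deployment geth-miner-deployment -p '{"spec":{"template":{"spec":{"securityContext":{"fsGroup":1000}}}}}'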


Strydom commented May 9, 2023

stale

Strydom closed this as completed May 9, 2023