Issues with OSA multiple IP support #203

rgschmi · 2019-04-29T15:21:53Z

Environment is Windows 7, Hercules with multiple IP support and CTCI-WIN 3.7

I've added a second OSA and a same-subnet VIPA to my config and I can ping all three addresses. However when I stop the VIPAOWNER OSA, I'm getting mixed results. I can still ping the stopped OSA and sometimes the VIPA responds to ping, but a TN3270 session to those two addresses fails (the ping responses turn to timeouts on those two addresses when TN3270 is trying to connect).

The VIPA TAKEOVER function worked according to the z/OS console, but I did not see the address of the VIPA being deregistered on the stopped OSA and then registered on the running OSA on the Hercules console.

What should happen is the addresses owned by the failing OSA should be taken over by a surviving OSA, and all addresses should continue to respond to ping (or TN3270).

I don't think it matters, but I am using INTERFACE definitions instead of DEVICE, LINK and HOME.

0:3001 OSA: tun0: using drive MAC address 96:7a:59:e5:d2:bf
0:3001 OSA: tun0: using drive IP address fe80::967a:59ff:fee5:d2bf
0:3001 OSA: tun0: Register guest IP address 192.168.20.12   <== first OSA
0:3001 OSA: tun0: not using MAC address 02:00:5e:a3:be:84
0:3001 OSA: tun0: not using IP address 192.168.20.12
0:3001 OSA: tun0: not using MTU 1500
0:3001 OSA: tun0: using MAC address 02:00:5e:a3:be:84
0:3001 OSA: tun0: using IP address 192.168.20.12
0:3001 OSA: tun0: using MTU 1500
0:3001 OSA: tun0: using drive MAC address 96:7a:59:e5:d2:bf
0:3001 OSA: tun0: using drive IP address fe80::967a:59ff:fee5:d2bf
0:3001 OSA: tun0: Register guest IP address 192.168.20.12
0:3005 OSA: Interface tun1, type TUN opened                 <==  adding second OSA
0:3005 OSA: tun1: using MAC address 02:00:5e:a3:be:84
0:3005 OSA: tun1: using IP address 192.168.20.13               
0:3005 OSA: tun1: using MTU 1500
0:3005 OSA: tun1: using drive MAC address 96:7a:59:e5:d2:bf
0:3005 OSA: tun1: using drive IP address fe80::967a:59ff:fee5:d2bf
0:3005 OSA: tun1: Register guest IP address 192.168.20.13   <== registering second OSA
0:3001 OSA: tun0: Register guest IP address 192.168.20.10   <== registering VIPA
0:3001 OSA: tun0: not using MAC address 02:00:5e:a3:be:84
0:3001 OSA: tun0: not using IP address 192.168.20.12        <== deregistering first OSA? 
0:3001 OSA: tun0: not using MTU 1500
                                                            <== nothing about
                                                                De-registering VIPA address

                                                            <== nothing about
                                                                Re-registering VIPA and OSA addresses
                                                                to surviving OSA

NOTE: This issue is closely related to Issue #204.
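
One way to check a log like the one above for the missing takeover is to tally, per interface, which guest IP addresses were registered. The following is a minimal Python sketch, assuming the console line layout shown above (`<dev> OSA: <tunN>: Register guest IP address <addr>`). The function name `registrations_by_interface` and the `sample` list are illustrative only and are not part of Hercules.

```python
import re
from collections import defaultdict

# Matches console lines such as:
#   0:3001 OSA: tun0: Register guest IP address 192.168.20.12
# (layout assumed from the log excerpt in this issue)
REGISTER_RE = re.compile(
    r"^\s*(?P<dev>\S+)\s+OSA:\s+(?P<ifc>\w+):\s+"
    r"Register guest IP address\s+(?P<ip>\S+)"
)

def registrations_by_interface(lines):
    """Return {interface: [ip, ...]}, each IP listed once, in first-seen order."""
    seen = defaultdict(list)
    for line in lines:
        m = REGISTER_RE.match(line)
        if m and m.group("ip") not in seen[m.group("ifc")]:
            seen[m.group("ifc")].append(m.group("ip"))
    return dict(seen)

# A few lines copied from the log above (trailing annotations are ignored).
sample = [
    "0:3001 OSA: tun0: Register guest IP address 192.168.20.12   <== first OSA",
    "0:3005 OSA: tun1: Register guest IP address 192.168.20.13   <== registering second OSA",
    "0:3001 OSA: tun0: Register guest IP address 192.168.20.10   <== registering VIPA",
    "0:3001 OSA: tun0: not using IP address 192.168.20.12",
]

result = registrations_by_interface(sample)
# Was the VIPA (192.168.20.10) ever registered on the surviving interface?
vipa_on_survivor = "192.168.20.10" in result.get("tun1", [])
print(result)
print("VIPA registered on tun1:", vipa_on_survivor)
```

With these sample lines the VIPA only ever appears under tun0, which is the symptom described above.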


Fish-Git · 2019-05-02T17:05:37Z

However when I stop the VIPAOWNER OSA ...

Forgive me Bob, but I am not very familiar with OSA devices nor how they're supposed to work. When you say "stop", what do you mean? How do you stop an OSA? For that matter, how does one define a VIPA?

When I implemented my changes (to both CTCI-WIN as well as to Hercules), I tested it by simply defining an OSA device like normal, and then used the TCPIP OBEYFILE command to, I think, add a VIPA (and then verify it could be pinged) and then deleting it (and verifying the IP address now no longer responded to pings again):

ADCD.Z21S.TCPPARMS(FISHADDV):

    INTERFACE VLINKX DEFINE
       VIRTUAL
       IPADDR 192.168.20.4

ADCD.Z21S.TCPPARMS(FISHDELV):

    INTERFACE VLINKX DELETE

VARY TCPIP,,OBEYFILE,ADCD.Z21S.TCPPARMS(FISHADDV)
VARY TCPIP,,OBEYFILE,ADCD.Z21S.TCPPARMS(FISHDELV)

Basically, I created the two members you see above: one called FISHADDV (containing the statements to, I think, define the VIPA) and one called FISHDELV to, I think, delete the VIPA. I then used the two VARY TCPIP... commands that you see to add the VIPA to the OSA and the other to delete it. That's basically the only testing I did. (And the IP address I chose for the VIPA was in the same subnet as the OSA device itself.)

What should happen is the addresses owned by the failing OSA should be taken over by a surviving OSA, and all addresses should continue to respond to ping (or TN3270).

Which brings me back to my original question: How does one "stop" an OSA? What is the command you use?

And perhaps more importantly, who is responsible for managing the IP addresses assigned to an OSA?

z/OS? Or the OSA device itself?

That is to say, when you "stop" an OSA (however the heck you do that), or the OSA otherwise "fails" (i.e. stops responding), how does the other OSA magically know that it needs to "take over responsibility for" all of the IP addresses that the failing OSA was originally responsible for? How does it magically know that?

I would have thought that z/OS would have issued the appropriate "Add IP Address" commands to the surviving OSA, which it obviously didn't (since we do not see any Hercules HHC03805I IP Address Registration messages which always occur when z/OS registers a given IP address to an OSA).

So who's at fault here? Where is the bug? Why isn't z/OS registering the IP addresses that it knows were assigned to the OSA that failed to the surviving OSA? If it would do that, then I believe things would work just fine!

If z/OS is not responsible for doing that, then I again have to ask, how the frick does the other OSA somehow magically know to take over responsibility for the failing OSA's IP Addresses?!

Thanks in advance for any enlightenment you or anyone else can provide!

rgschmi · 2019-05-02T18:20:23Z

Which brings me back to my original question: How does one "stop" an OSA? What is the command you use?

To start or stop an OSA (or LCS for that matter), issue:

    V TCPIP,,STOP,osalink

If you are using INTERFACE statements for the OSA, like you are for the VIPA, use the name of the link in the start or stop statement. Adding and deleting the VIPA or OSA works, too!

Here are my definitions:

    INTERFACE VLINK10 DEFINE VIRTUAL
       IPADDR 192.168.&ip..10

    INTERFACE LNK3000 DEFINE IPAQENET
       PORTNAME OSA3000            ; MUST MATCH TRLE PORT NAME
       IPADDR 192.168.&IP..12/24   ; INTERFACE IP ADDRESS
       SOURCEVIPAINT VLINK10

    INTERFACE LNK3004 DEFINE IPAQENET
       PORTNAME OSA3004            ; MUST MATCH TRLE PORT NAME
       IPADDR 192.168.&IP..13/24   ; INTERFACE IP ADDRESS
       SOURCEVIPAINT VLINK10

I'm using a z/OS system symbol for part of the IP address, so I can use the same profile on all my z/OS instances.

My OSA delete looks like this, which also works for a VIPA:

    INTERFACE OSA3004 DELETE

My OSA add looks like this, which has the same syntax as in the TCPIP profile:

    INTERFACE LNK3004 DEFINE IPAQENET
       PORTNAME OSA3004            ; MUST MATCH TRLE PORT NAME
       IPADDR 192.168.&IP..13/24   ; INTERFACE IP ADDRESS
       SOURCEVIPAINT VLINK10

And perhaps more importantly, who is responsible for managing the IP addresses assigned to an OSA? z/OS? Or the OSA device itself?

The IP addresses assigned to the OSA are loaded into the OSA when the OSA is started in the IP stack, and should be removed when the OSA is stopped or deleted. z/OS should send the commands to the OSA to perform those functions, so z/OS initiates the actions.

That is to say, when you "stop" an OSA (however the heck you do that), or the OSA otherwise "fails" (i.e. stops responding), how does the other OSA magically know that it needs to "take over responsibility for" all of the IP addresses that the failing OSA was originally responsible for? How does it magically know that?

I would have thought that z/OS would have issued the appropriate "Add IP Address" commands to the surviving OSA, which it obviously didn't (since we do not see any Hercules HHC03805I IP Address Registration messages which always occur when z/OS registers a given IP address to an OSA).

So who's at fault here? Where is the bug? Why isn't z/OS registering the IP addresses that it knows were assigned to the OSA that failed to the surviving OSA? If it would do that, then I believe things would work just fine!

I've looked at this a bit, but need to do more research. What I recall is that when an OSA starts (LCS does the same thing, but all the LCS code is in the IP stack, not in the LCS adapter), a local ARP (one that will not propagate to other subnets) is sent out. If another adapter in the stack 'sees' the ARP, it knows that it can back up the newly started OSA (or LCS) if it fails. If an adapter that has a backup (on the same LAN or VLAN) fails, the backup will take over ARP responsibility for the failing adapter and send a gratuitous ARP to let the gateway switches know the MAC address for the failing adapter's IP addresses has changed.

I know the ARP sent at adapter start time is working because both my OSA adapters are in the same LAN group (which means they can back each other up). Here are the last few lines from a D TCPIP,,N,DEV command:

    LANGROUP: 00001
      NAME      STATUS       ARPOWNER  VIPAOWNER
      ----      ------       --------  ---------
      LNK3000   ACTIVE       LNK3000   YES
      LNK3004   ACTIVE       LNK3004   NO

As you can see, both links are active and LNK3000 has ARP responsibility for any VIPA in the stack, so you are supporting the ARP correctly. If the OSAs were on separate LANs or the ARP wasn't working properly, they would be in separate LAN groups.

If I stop LNK3000, I get the following on the z/OS console:

    V TCPIP,TCPIP,STOP,LNK3000
    EZZ0060I PROCESSING COMMAND: VARY TCPIP,TCPIP,STOP,LNK3000
    EZZ0053I COMMAND VARY STOP COMPLETED SUCCESSFULLY
    EZD0040I INTERFACE LNK3004 HAS TAKEN OVER ARP RESPONSIBILITY FOR INACTIVE INTERFACE LNK3000
    EZZ4341I DEACTIVATION COMPLETE FOR INTERFACE LNK3000

And the D TCPIP DEV looks like this:

    IPV4 LAN GROUP SUMMARY
    LANGROUP: 00001
      NAME      STATUS       ARPOWNER  VIPAOWNER
      ----      ------       --------  ---------
      LNK3004   ACTIVE       LNK3004   YES
      LNK3000   NOT ACTIVE   LNK3004   NO

This is just how it should look, so I think z/OS is doing its job, but the IP addresses are not being deleted from the stopped OSA and registered in the surviving OSA. I suspect the commands are being sent to the OSA but not being acted upon.
Hope that helps a little.
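
The bookkeeping shown in the two LAN group displays above can be modelled in a few lines. This is a toy Python sketch of what the displays report, not of how z/OS implements it. The `LanGroup` class, its method names, and the rule that the first surviving interface becomes the backup are all assumptions made for illustration.

```python
class LanGroup:
    """Toy model of a LAN group: tracks status, ARP owner and VIPA owner."""

    def __init__(self, names):
        self.active = {name: True for name in names}
        # Initially each interface answers ARP for itself.
        self.arp_owner = {name: name for name in names}
        # Assumption: the first interface listed starts as the VIPA owner.
        self.vipa_owner = names[0]

    def stop(self, name):
        """Mark `name` inactive and hand its ARP duties to a survivor."""
        self.active[name] = False
        survivors = [n for n, up in self.active.items() if up]
        if not survivors:
            return None  # nobody left to take over
        backup = survivors[0]  # assumption: first active interface wins
        for ifc, owner in list(self.arp_owner.items()):
            if owner == name:
                self.arp_owner[ifc] = backup
        if self.vipa_owner == name:
            self.vipa_owner = backup
        return backup

group = LanGroup(["LNK3000", "LNK3004"])
backup = group.stop("LNK3000")
print("backup:", backup)
print("ARP owners:", group.arp_owner)
print("VIPA owner:", group.vipa_owner)
```

After `stop("LNK3000")` the model ends up in the same state as the second display: LNK3004 is the ARP owner for both links and is the VIPA owner.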


Fish-Git · 2019-05-02T20:26:18Z

On May 2, 2019 at 12:05 PM Fish-Git @.***> wrote:

Bob?

FYI:

I'd appreciate it very much if you would not respond/reply to GitHub Issues via email.

I'd greatly prefer that you instead respond/reply directly via the GitHub Issues web page:

https://github.com/SDL-Hercules-390/hyperion/issues/203

When you reply directly via their web page, I can make minor edits to your reply so it is more readable (prettier) by editing the fonts being used, etc. (just like I did with your original post).

When you reply via email however, I cannot edit your reply, so oftentimes it is much harder (more difficult) to read.

It's up to you whether or not you want to take the time to reply via their web page or continue to reply via email, but I'd rather that you reply directly via their web page.

Thanks!

rgschmi · 2019-05-02T20:47:42Z

No problem responding via GitHub. I'm still learning the proper protocol here.

BTW, I see in the Hercules log, the IP address is being deregistered on the OSA I stopped, but it is not being registered on the surviving OSA, so things are mostly working.

Fish-Git · 2019-05-02T21:25:45Z

To start or stop an OSA (or LCS for that matter), issue: V TCPIP,,STOP,osalink.

Thanks.

Here are my definitions:

My OSA delete looks like this, which also works for a VIPA:

My OSA add looks like this, which has the same syntax as in the TCPIP profile:

D TCPIP,,N,DEV

Thank you for all of that too.

I'll try to update my TCPIP PROFILE to match yours so I can hopefully (maybe) reproduce your same set of tests.

If another adapter in the stack 'sees' the ARP, it knows that it can back up the newly started OSA (or LCS) if it fails.

I doubt that very much! Just because an ARP (whether normal or gratuitous or even reverse) is sent out, doesn't give any other device on the network permission to "take over" for that IP! (Besides, how could it reliably know when to "take over"?!)

No, I rather suspect that instead, some type of control command (perhaps one we're not supporting but I don't know that yet) is sent to the OSA by z/OS to specifically ask it to "take over" for a given IP.

I suspect the commands are being sent to the OSA but are not being acted upon.

You're probably right.

I would have expected the commands to be "delete ip" and "add ip" (which we do currently support, obviously), but apparently the commands are some different type (or new type) of command that, you are correct, we are either currently not supporting or not supporting properly.

I will need @mcisho Ian Shorter's help with doing some QETH driver packet tracing to see if we can maybe discover what those new control commands are (or where we're going wrong with our current set of commands).

Bottom line: there is obviously still much work to be done in order to support multiple IP addresses, both in my CTCI-WIN product as well as in (and probably especially in!) Hercules too.

Fish-Git · 2019-05-02T21:27:53Z

but it is not being registered on the surviving OSA

Right. I think that's the root of our problem. z/OS is either sending a new type of control command that we're not supporting, or we're not handling an existing control command properly.

More testing/tracing needs to be done.

rgschmi · 2019-05-02T23:00:59Z

I'm not sure I stated the purpose of the ARP at OSA startup correctly, but as you can see from the D TCPIP,,N,DEV command, TCPIP keeps track of the interfaces (OSA or LCS) that are on the same LAN or VLAN. If one of the interfaces fails, TCPIP obviously knows, and another adapter in the LAN group will take over ARP responsibility. The non-routable (I think) ARP is used by TCPIP to assign adapters to LAN groups. Interestingly (to me anyway), Linux fails this test: adapters on the same LAN do not appear in the same LAN group.

I'd be happy to do any tracing or other troubleshooting you need.

For your reading enjoyment, I'll attach my full TCPIP profile. There is other fun stuff in there, too. It happens to be for LCS because all the functions in the profile work under Linux. Note port 3023 has one function that I'd like to see work under Windows. Port 3023's IP address is dynamically created when TN3270A is started, and it can be started on multiple LPARs at the same time in a sysplex environment. The sysplex will determine which LPAR creates the IP address, and if that LPAR fails, another LPAR will take over the address. TN3270 sessions are then distributed (round robin in my profile) to the LPARs listed in the VIPADIST operand. That's a pretty good test of CTCE and LCS or OSA and CTCI-WIN functionality.

ARPAGE 5                                                                      
                                                                              
DATASETPREFIX TCPIP                                                           
;                                                                             
; -----------------------------------------------------------------------     
; AUTOLOG the following servers.                                              
;                                                                             
                                                                              
AUTOLOG 5                                                                     
    FTPD JOBNAME FTPD1   ; FTP Server                                         
    PORTMAP              ; Portmap Server                                     
    OMPROUTE             ; ROUTED SERVER                                      
;   SMTP                 ; SMTP Server                                        
;   OSNMPD               ; SNMP Agent Server                                  
;   SNMPQE               ; SNMP Client                                        
ENDAUTOLOG                                                                    
;                                                                             
; -----------------------------------------------------------------------     
;                                                                             
; Reserve ports for the following servers.                   
;                                                            
PORT                                                         
     7 UDP MISCSERV            ; Miscellaneous Server        
     7 TCP MISCSERV                                          
     9 UDP MISCSERV                                          
     9 TCP MISCSERV                                          
    19 UDP MISCSERV                                          
    19 TCP MISCSERV                                          
    20 TCP OMVS      NOAUTOLOG ; FTP Server                  
    21 TCP OMVS                ; FTP Server                  
    23 TCP TN3270    NOAUTOLOG ; TELNET SERVER               
    25 TCP SMTP                ; SMTP Server                 
    53 TCP NAMESRV             ; Domain Name Server          
    53 UDP NAMESRV             ; Domain Name Server          
    69 UDP OMVS                ; OE TFTP SERVER              
    80 TCP OMVS                ; OE WEB SERVER               
   111 TCP PORTMAP             ; Portmap Server              
   111 UDP PORTMAP             ; Portmap Server              
   135 UDP LLBD                ; NCS Location Broker         
   161 UDP OSNMPD              ; SNMP Agent                  
   162 UDP SNMPQE              ; SNMP Query Engine         
   433 TCP OMVS                ; OE WEB Server             
   443 TCP OMVS                ; Secure Server             
   512 TCP RXSERVE             ; Remote Execution Server   
   513 UDP OMVS                ; OE RLOGIN SERVER          
   514 UDP OMVS                ; OE syslog server          
   514 TCP RXSERVE             ; Remote Execution Server   
   515 TCP LPSERVE             ; LPD Server                
   520 UDP OMPROUTE            ; ROUTED SERVER             
   521 UDP OMPROUTE            ; ROUTED SERVER             
   580 UDP NCPROUT             ; NCPROUTE Server           
   750 TCP MVSKERB             ; Kerberos                  
   750 UDP MVSKERB             ; Kerberos                  
   751 TCP ADM@SRV             ; Kerberos Admin Server     
   751 UDP ADM@SRV             ; Kerberos Admin Server     
   992 TCP TN3270   NOAUTOLOG  ; SSL/AT-TLS port           
   1023 TCP OMVS                ; OE TELNET SERVER         
   1023 UDP OMVS                ; OE TELNET SERVER         
   1024 TCP OMVS                ; OE SERVICES              
   1415 TCP CSQ1CHIN            ; CSQ1 MQ TCP Listener     
   2000 TCP OMVS                ; CINET INADDRANY          
   2000 UDP OMVS                ; CINET INADDRANY                        
   3000 TCP CICSTCP             ; CICS Socket                            
   3023 TCP TN3270A SHAREPORTWLM  BIND 10.0.0.23   ; BIND PORT           
   12000 UDP VTAM               ; ENTERPRISE EXTENDER                    
   12001 UDP VTAM               ; ENTERPRISE EXTENDER                    
   12002 UDP VTAM               ; ENTERPRISE EXTENDER                    
   12003 UDP VTAM               ; ENTERPRISE EXTENDER                    
   12004 UDP VTAM               ; ENTERPRISE EXTENDER                    
   16311 TCP PAGENT NOAUTOLOG   ; CS Policy Agent ServicesConnection     
                                                                         
;PORTRANGE                                                               
;    63000 100 UDP OMVS                                                  
;    63000 100 TCP OMVS                                                  
;                                                                        
;                                                                        
VIPADYNAMIC                                                              
  VIPADEFINE 255.255.255.0   10.0.0.23                                   
; VIPADIST DEFINE DISTM SERVERWLM 10.0.0.23 PORT 3023                    
  VIPADIST DEFINE DISTM ROUNDROBIN 10.0.0.23 PORT 3023                   
      DESTIP 10.0.0.1 10.0.0.4                                           
ENDVIPADYNAMIC                                                           
                                                          
SACONFIG DISABLED                                         
                                                          
ITRACE OFF                                                
                                                          
GLOBALCONFIG SYSPLEXMONITOR RECOVERY TIMERSECS 10         
       SYSPLEXWLMPOLL 10                                  
                                                          
IPCONFIG NODATAGRAMFWD                                    
         SOURCEVIPA                                       
         SYSPLEXROUTING                                   
         MULTIPATH                                        
         PATHMTUDISCOVERY                                 
         DYNAMICXCF 172.16.0.&IP. 255.255.255.255 2       
                                                          
UDPCONFIG RESTRICTLOWPORTS                                
          UDPSENDBFRSIZE 65535                            
          UDPRCVBUFRSIZE 65535                            
                                                          
TCPCONFIG RESTRICTLOWPORTS                                
          TCPSENDBFRSIZE 65535                            
          TCPRCVBUFRSIZE 65535      
                                    
INCLUDE NUI1.USER.TCPPARMS(DEVLCS)

rgschmi · 2019-05-04T14:38:24Z

I have disabled IP routing on Windows, which changed my ping "TTL expired in transit" issue to timeouts. At least the transit problem is fixed!

Before disabling:

Pinging 10.0.0.2 with 32 bytes of data:
Reply from 192.168.20.1: TTL expired in transit.
Reply from 192.168.20.1: TTL expired in transit.
Reply from 192.168.20.1: TTL expired in transit.
Reply from 192.168.20.1: TTL expired in transit.

After:

Pinging 10.0.0.2 with 32 bytes of data:
Request timed out.
Request timed out.
Request timed out.
Request timed out.

Fish-Git · 2019-05-04T19:23:08Z

After:

Pinging 10.0.0.2 with 32 bytes of data:
Request timed out.
Request timed out.
Request timed out.
Request timed out.

Please see my response in Issue #204 (comment).

You need to add a route for 10.0.0.2 to your network router so that it knows to send packets for 10.0.0.2 to your Hercules z/OS guest instead (i.e. to either 192.168.20.12 or 192.168.20.13, whichever you prefer).

What's happening is, 10.0.0.2 is not within the same subnet as your Windows host (nor as your Hercules z/OS guest either for that matter), so the packet is being sent to the default gateway (i.e. to your network router).

Your router is then sending the packets to ............... who knows where!

But wherever it is sending them to, it's obviously not sending them to the same network segment that your Windows host or Hercules z/OS guest is on. That is to say, neither CTCI-WIN (which can only see packets that your Windows host sees) nor your Hercules z/OS guest is ever seeing any of the ping packets because they're being misrouted by your router to somewhere where you obviously don't want them (or need them) to go.

By adding a route to your network router telling it to send (forward?) all packets destined to 10.0.0.2 to one of your Hercules z/OS guest's IP addresses, the pings should then be seen and properly replied to.
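
The on-link versus via-gateway decision described above can be illustrated with Python's standard `ipaddress` module. This is only a sketch: the /24 prefix and the use of 192.168.20.1 as the gateway are assumptions taken from the addresses seen in this thread, and `next_hop` is a made-up helper, not anything Windows or CTCI-WIN actually calls.

```python
import ipaddress

# Assumed local subnet of the Windows host and the z/OS guest.
local_net = ipaddress.ip_network("192.168.20.0/24")
# Assumed default gateway (the address that sent the "TTL expired" replies).
gateway = "192.168.20.1"

def next_hop(dest, local_net, gateway):
    """Return 'direct' if dest is on the local subnet, else the gateway."""
    if ipaddress.ip_address(dest) in local_net:
        return "direct"
    return gateway

for dest in ("192.168.20.12", "192.168.20.10", "10.0.0.2"):
    print(dest, "->", next_hop(dest, local_net, gateway))
```

Under these assumptions the 192.168.20.x addresses are delivered directly, while 10.0.0.2 is handed to the gateway, which is why the router needs an explicit route for it.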

Try that and let me know whether it resolves this issue or not. I suspect it will.

Fish-Git · 2019-05-04T20:22:33Z

(Oops!) I have removed the "Invalid" and "Waiting to close" labels and have re-added the "Bug" and "Researching" labels because the original issue (problem) has not been resolved yet!

The original problem was that when the first OSA (which was the VIPA owner) was removed (deleted), the VIPA IP address was not automatically moved over to the surviving second OSA like Bob claims it automatically should be.

We know the cause but not the resolution yet. Thus the issue has not been resolved yet.

The cause is that when the first OSA ("tun0") is deleted, Hercules receives an IPA_CMD_STOPLAN command packet for that tun device and all IP addresses associated with it are suddenly no longer "owned" by anyone as a result and thus stop responding.

As I stated earlier, I would have expected z/OS to have followed the IPA_CMD_STOPLAN command packet for tun0 with an IPA_CMD_SETIP ("Add IP Address") command packet to tun1 for the VIPA address. If it had done that, then I think things would have worked just fine.

So we're missing some key part of the puzzle somewhere. Either there's a command packet that z/OS is sending us that we're not processing or else there's some additional information being passed in the IPA_CMD_STOPLAN command packet having to do with the VIPA that we're not handling.

(Or .... Something else entirely different (unknown) is going on.)

However, the bottom line is: IF a "Stop LAN" command packet is supposed to somehow automatically transfer the VIPA IP address over to the surviving OSA (which I'm not convinced of yet), then we need to figure out how to detect that and how to do it (as well as how to magically know which OSA to transfer the VIPA IP address to!).

More work (more research) obviously still needs to be done for this issue.

(@mcisho Ian Shorter? I might need your help with this one, buddy!)

rgschmi · 2019-05-06T00:01:31Z

Sorry about my wording. I did a git clone, which I called a download by mistake. I have not downloaded the zip file.

I just git cloned into a directory called hyperiongit. I used the same git clone command I use in Linux. I opened the hyperiongit directory and clicked on Hercules_VS2017.sln, which brought up Visual Studio. I did a Rebuild Solution.

Then I deleted all the files from my R4.10+ directory and copied all the files from msvc.AMD64.bin to that directory. I copied all my Hercules configuration files to that directory as well, and then ran Hercules.

I missed copying the message numbers. Here is a better copy from the Hercules log screen (don't know the official name).

HHC01603I version
HHC01413I Hercules version 4.2.0.0-SDL (4.2.0.0)
HHC01414I (C) Copyright 1999-2019 by Roger Bowler, Jan Jaeger, and
HHC01417I ** The SoftDevLabs version of Hercules **
HHC01415I Build date: May  5 2019 at 18:44:46
HHC01417I Built with: Microsoft Visual C (MSVC 191627026 1)
HHC01417I Build type: Windows MSVC AMD64 host architecture build
HHC01417I Modes: S/370 ESA/390 z/Arch
HHC01417I Built with: Microsoft Visual C (MSVC 191627026 1)
HHC01417I Build type: Windows MSVC AMD64 host architecture build
HHC01417I Modes: S/370 ESA/390 z/Arch

rgschmi · 2019-05-06T00:05:16Z

Here you go.

BuildLog-x64-Release.txt

Fish-Git · 2019-05-06T00:57:33Z

I have not downloaded the zip file.

Good.

I just git cloned into a directory called hyperiongit.

I presume you're using the command-line (Command Prompt), yes? What git client are you using? What was the command that you entered?

Did you also ensure that the directory did not already exist before entering your git clone command? (So that the git clone could then create that directory fresh, from scratch?)

I missed copying the message numbers. Here is a better copy from the Hercules log screen (don't know the official name).

The official name I guess is called the "panel" screen. It would be better however to copy messages from the Hercules logfile, not from the panel screen.

When you start Hercules, do you specify a logfile? That is to say, when you start Hercules, you first open a Command Prompt window, yes? And then from there, you enter the command to start Hercules. What does the command that you enter look like? Does it look like:

    hercules -f myconfigfile.cnf

Or does it look like:

    hercules -f my_herc_config_file.cnf > my_herc_logfile.txt

The second format (with the > logfile string) is what creates a Hercules logfile. Every message that Hercules issues to the panel screen is also written to the logfile.

This is important since some of the messages that Hercules issues can be much longer (wider) than your panel window. Thus what you see on the screen might not be the full text of the message, whereas the messages written to the logfile are always the full text of the message.

I hope you already knew that and I apologize if you did.

Fish-Git · 2019-05-06T01:08:42Z

Here you go.

BuildLog-x64-Release.txt

Okay, I see a problem right away! For some (as-yet-unknown) reason it appears the _dynamic_version batch file is failing!

To verify, please do the following and then post the results:

1. Open a brand new (fresh) Command Prompt window and navigate to your hyperiongit directory.
2. Enter the command: dir /b /ad. Post the results. (It should only be a couple of lines.)
3. Enter the command: _dynamic_version.cmd. Post the results. (It should only be a couple of lines.)

I need to see the output of both commands.

Thanks.

Fish-Git · 2019-05-06T01:15:27Z

3. Enter the command: _dynamic_version.cmd. Post the results. (It should only be a couple of lines.)

(Oops!) (I forgot the most important step!)

After step 3 (enter _dynamic_version command), you need to:

4. Enter the command: set ver.

That is the command that will only display a couple of lines. The _dynamic_version command itself does not display anything at all. It runs silently.

BUT... It defines (sets) the all-important VERSION environment variables that Hercules uses during its build. That's what the set ver command should display: all variables that start with ver. THAT is what I need to see!

Sorry about that.

rgschmi · 2019-05-06T01:20:44Z

Here is what I get. I suspect the set ver is wrong. I did forget to clear hyperiongit the second time I did a git clone, but the first time it was empty.

C:\hyperiongit>dir /b /ad
.git
.vs
autoconf
crypto
decNumber
html
m4
man
msvc.AMD64.bin
msvc.AMD64.map
msvc.AMD64.obj
msvc.AMD64.pdb
msvc.makefile.includes
scripts
SoftFloat
telnet
tests
util

C:\hyperiongit>_dynamic_version.cmd

C:\hyperiongit>
C:\hyperiongit>set ver
VERSION="4.2.0.0-SDL"
VERS_BLD=0
VERS_INT=2
VERS_MAJ=4
VERS_MIN=0

C:\hyperiongit>

No, I haven't been using the log file, but will change my batch file ASAP to do so.

I will also delete the hyperiongit directory and try again.

This is my git:

C:\Users\HP\Documents\GitHub> git clone https://github.com/SDL-Hercules-390/hyperion.git c:\hyperiongit
Cloning into 'c:\hyperiongit'...
remote: Enumerating objects: 140, done.
remote: Counting objects: 100% (140/140), done.
remote: Compressing objects: 100% (107/107), done.
remote: Total 50674 (delta 61), reused 78 (delta 33), pack-reused 50534
Receiving objects: 100% (50674/50674), 45.27 MiB | 4.41 MiB/s, done.
Resolving deltas: 100% (39686/39686), done.
Checking out files: 100% (1106/1106), done.
C:\Users\HP\Documents\GitHub>

rgschmi · 2019-05-06T01:33:49Z

Just tried it with new hyperiongit directory - same results.

Fish-Git · 2019-05-06T02:13:28Z

This is my git:

C:\Users\HP\Documents\GitHub> git clone https://github.com/SDL-Hercules-390/hyperion.git

Cloning into 'c:\hyperiongit'...
remote: Enumerating objects: 140, done.
remote: Counting objects: 100% (140/140), done.
remote: Compressing objects: 100% (107/107), done.
remote: Total 50674 (delta 61), reused 78 (delta 33), pack-reused 50534
Receiving objects: 100% (50674/50674), 45.27 MiB | 4.41 MiB/s, done.
Resolving deltas: 100% (39686/39686), done.
Checking out files: 100% (1106/1106), done.

C:\Users\HP\Documents\GitHub>

While the way you are doing it is fine (it will work), the way you're supposed to do a git clone is as follows:

1. Open a Command Prompt window.
2. Navigate to the parent directory of where you want your clone to be created. That is to say, if you want your clone to be created in the "C:\MyHercStuff\SDL-Hyperion" directory and want the clone directory to be called "clone42", then navigate to the "C:\MyHercStuff\SDL-Hyperion" directory and enter your git clone command from there (cd C:\MyHercStuff\SDL-Hyperion && git clone ...).
3. Use the command format: git clone <URL> clone-directory-name

For example:

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Fish> cd "C:\Users\Fish\Projects\Hercules"

C:\Users\Fish\Projects\Hercules> dir /b herc42*

herc42repo

C:\Users\Fish\Projects\Hercules> rmdir herc42repo

C:\Users\Fish\Projects\Hercules> dir /b herc42*

File Not Found

C:\Users\Fish\Projects\Hercules> git clone  https://github.com/SDL-Hercules-390/hyperion.git  herc42repo

Cloning into 'herc42repo'...
remote: Enumerating objects: 140, done.
remote: Counting objects: 100% (140/140), done.
remote: Compressing objects: 100% (107/107), done.
remote: Total 50674 (delta 61), reused 78 (delta 33), pack-reused 50534
Receiving objects: 100% (50674/50674), 45.28 MiB | 8.06 MiB/s, done.
Resolving deltas: 100% (39686/39686), done.
Checking out files: 100% (1106/1106), done.

C:\Users\Fish\Projects\Hercules> dir /b /ad herc42*

herc42repo

C:\Users\Fish\Projects\Hercules> cd herc42repo

C:\Users\Fish\Projects\Hercules\herc42repo> dir /b /ad

.git
autoconf
crypto
decNumber
html
m4
man
msvc.makefile.includes
scripts
SoftFloat
telnet
tests
util

That's the way it's normally done. But as I said, the way you're doing it should also work just fine. I'm just passing on some helpful information here that allows you to create a git clone wherever you want (instead of letting it default to some weird directory name in the root of your drive).
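
The same recipe (start from the parent directory, then run git clone <URL> clone-directory-name) can be sketched in Python. This is only an illustration under a few assumptions: the helper names clone_command and clone_into are made up here, git must already be reachable on the PATH, and the URL is the one used earlier in this thread.

```python
import subprocess
from pathlib import Path

# Repository URL taken from the clone commands shown in this thread.
REPO_URL = "https://github.com/SDL-Hercules-390/hyperion.git"


def clone_command(clone_name, url=REPO_URL):
    """Build the argument list for: git clone <URL> clone-directory-name."""
    return ["git", "clone", url, clone_name]


def clone_into(parent_dir, clone_name, url=REPO_URL):
    """Run git clone from parent_dir so the clone lands in parent_dir/clone_name.

    Hypothetical helper: it refuses to run if the target already exists,
    which avoids cloning on top of a leftover directory.
    """
    parent = Path(parent_dir)
    target = parent / clone_name
    if target.exists():
        raise FileExistsError(f"{target} already exists; remove it first")
    # cwd=parent plays the role of "cd <parent directory>" in the steps above.
    subprocess.run(clone_command(clone_name, url), cwd=parent, check=True)
    return target
```

Calling clone_into(r"C:\Users\Fish\Projects\Hercules", "herc42repo") would correspond to the example session above.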

Fish-Git · 2019-05-06T02:18:30Z

Do this:

Open a Command Prompt window and navigate to your c:\hyperiongit directory.
Enter the command: set traceon=1
Enter the command: set debug=1
Enter the command: _dynamic_version.cmd > dyn.txt 2>&1

Attach the dyn.txt file to your GitHub comment.
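
If it is easier to drive this from a script, the steps above (set two variables, run the command, send both normal and error output to one file) look roughly like this in Python. It is a sketch, not part of Hercules: run_with_debug_log is a made-up name, and the child process in the demo is a stand-in rather than the real _dynamic_version.cmd. One difference from typing the commands by hand: anything the script sets stays in the child process, so set ver in your own window will not show it.

```python
import os
import subprocess
import sys
import tempfile


def run_with_debug_log(command, logfile, extra_env=None):
    """Run command with extra environment variables and capture all output.

    stdout and stderr both go to logfile, which is the equivalent of
    "> dyn.txt 2>&1" in a Command Prompt. By default the two variables
    named in the steps above (traceon and debug) are set to 1.
    """
    env = dict(os.environ)
    env.update(extra_env or {"traceon": "1", "debug": "1"})
    with open(logfile, "w") as log:
        result = subprocess.run(
            command, stdout=log, stderr=subprocess.STDOUT, env=env
        )
    return result.returncode


if __name__ == "__main__":
    # Stand-in child (hypothetical) that writes to both output streams.
    child = [
        sys.executable, "-c",
        "import os, sys; print('debug=' + os.environ['debug']); "
        "print('written to stderr', file=sys.stderr)",
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "dyn.txt")
        run_with_debug_log(child, path)
        with open(path) as log:
            print(log.read())
```

On Windows the real call would be something like run_with_debug_log(["cmd", "/c", "_dynamic_version.cmd"], "dyn.txt"), run from the clone directory.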

Thanks.

rgschmi · 2019-05-06T02:28:57Z

dyn.txt

rgschmi · 2019-05-06T03:16:06Z

Now I'm even more confused. I started Hercules:

C:\Hercules\R4.10+>hercules -f z210osa.txt > herc_logfile.txt

IPLed the guest OS and here is a dir of the logfile:

C:\Hercules\R4.10+>dir herc_logfile.txt
 Volume in drive C is OS
 Volume Serial Number is CBB7-181C

 Directory of C:\Hercules\R4.10+

05/05/2019  10:05 PM                 0 herc_logfile.txt
               1 File(s)              0 bytes
               0 Dir(s)  1,645,714,042,880 bytes free

It is in fact empty. I did a dir >dir.txt just to prove I was doing it right and that worked fine.

Not too worried about it now, but puzzling.

Fish-Git · 2019-05-06T06:18:49Z

IPLed the guest OS and here is a dir of the logfile:

C:\Hercules\R4.10+> dir herc_logfile.txt
 Volume in drive C is OS
 Volume Serial Number is CBB7-181C

 Directory of C:\Hercules\R4.10+

05/05/2019  10:05 PM                 0 herc_logfile.txt
               1 File(s)              0 bytes
               0 Dir(s)  1,645,714,042,880 bytes free

It is in fact empty.

Did you do the dir after Hercules exited? Or did you do it while it was still up and running? Because if you did it while it was still up and running that would explain it. Hercules (at least on Windows) opens the logfile with exclusive access, so you can't edit it and a dir will always show 0 bytes.

Exit completely from Hercules first, and then do your dir and you will then see that it now actually has some data in it (and you will now be able to edit the file too, whereas before, while Hercules was still up and running, you couldn't).

Fish-Git · 2019-05-06T06:27:17Z

The dyn.txt you attached is no good. But it's not your fault. It's mine. I somehow left a rogue debugging statement in the script that shouldn't be there, so it's taking an early exit whenever DEBUG=1.

I have since committed a fix(*) for this, so please re-clone Hercules and try the above _dynamic_version.cmd test again (i.e. set traceon=1, set debug=1, _dynamic_version.cmd > dyn.txt 2>&1) and attach the file to your GitHub comment.

Thanks.

(*) The fix is for the early debug exit, not for whatever is causing it to fail for you. I still don't know why it is failing for you. That's why I need to see the dyn.txt debugging output with trace/debug enabled. It will hopefully tell me where your _dynamic_version script is taking its wrong turn.

rgschmi · 2019-05-06T14:04:21Z

I looked at the log file both during and after Hercules was running.

Just for kicks, I'm going to clone Hercules with Linux and copy the source to Windows, then build Hercules in both environments from the same source just to see if the version line is different between the two. I've always gotten the long character string under Linux.

OK that doesn't work. git must build folder names when cloning.

Fish-Git · 2019-05-06T14:33:52Z

Please re-clone and do the _dynamic_version test again!

rgschmi · 2019-05-06T23:42:07Z

Please re-clone and do the _dynamic_version test again!

dyn.txt

What are you using for git?

My Powershell git seems to be really old. Deprecated, in fact:

WARNING: posh-git support for PowerShell 2.0 is deprecated; you have version
2.0.
To download version 5.0, please visit
https://www.microsoft.com/en-us/download/details.aspx?id=50395
For more information and to discuss this, please visit
https://github.com/dahlbyk/posh-git/issues/163
To suppress this warning, change your profile to include 'Import-Module
posh-git -Args $true'.
WARNING: posh-git's profile.example.ps1 will be removed in a future version. To
 avoid a change in behavior, copy its contents into your
C:\Users\HP\Documents\WindowsPowerShell\Microsoft.PowerShell_profile.ps1.

Fish-Git · 2019-05-07T04:43:06Z

Please re-clone and do the _dynamic_version test again!

dyn.txt

Thank you.

Looking at the output it appears "git.exe" is nowhere to be found anywhere in your PATH. That is the cause for the default non-git Hercules VERSION string that you're getting.

What are you using for git?

I was about to ask you the same thing!

My Powershell git seems to be really old. Deprecated, in fact:

Powershell git? ((groan))

WARNING: posh-git support for PowerShell 2.0 is deprecated; you have version
2.0.

That's your problem: the only version of git that you have is a Powershell version of git, not a standard command-line version.

And not only that, it's nowhere in your Windows PATH either.

You need to install a quality git client. Do that, and I'm sure things will work much better for you.

I'm using both Git for Windows as well as TortoiseGit, but that's only because, as a Hercules developer, I use git a LOT.

It's up to you (optional) whether you want to also install TortoiseGit or not, but I would highly recommend installing at least Git for Windows (and getting rid of that weird posh-git that you have).

So try this:

Get rid of your Powershell posh-git.
Install Git for Windows instead.
(optional) Also install TortoiseGit.

For someone like you, I would in fact recommend not installing TortoiseGit. Doing so would be vast overkill and would probably only confuse you further. To keep things simple, I would recommend installing only Git for Windows.

Once you install Git for Windows your Hercules VERSION should then be correct.
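
A quick way to confirm that git can be found on the PATH, sketched in Python using the standard library's shutil.which. The function name find_tool is made up for illustration. This is the same kind of lookup a Command Prompt does when you type a command name, so if it reports nothing, a build script is unlikely to find git either.

```python
import shutil


def find_tool(name="git", path=None):
    """Return the full path of an executable found on PATH, or None.

    path defaults to the PATH environment variable; pass a string to
    search somewhere else. find_tool is a hypothetical helper name.
    """
    return shutil.which(name, path=path)


if __name__ == "__main__":
    location = find_tool("git")
    if location is None:
        print("git was not found on the PATH")
    else:
        print(f"git found at {location}")
```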

rgschmi · 2019-05-07T14:48:20Z

OK! I installed git for Windows and now have the proper version info!

Hercules version 4.2.0.9679-SDL-geb81b19a (4.2.0.9679)
(C) Copyright 1999-2019 by Roger Bowler, Jan Jaeger, and others
** The SoftDevLabs version of Hercules **
Build date: May  7 2019 at 09:39:32

Sorry about all the fuss. I didn't know the version of git would have an effect as long as everything (or so I thought!) downloaded correctly.
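
For comparing the two version strings seen in this thread ("4.2.0.0-SDL" before and 4.2.0.9679-SDL-geb81b19a after), here is a small Python sketch. The four numbers are mapped to the VERS_MAJ, VERS_INT, VERS_MIN and VERS_BLD variables from the set ver output earlier. Treating whatever follows "SDL-" as a git-derived suffix is my assumption based only on these two examples, and parse_version is a made-up name, not a Hercules function.

```python
import re

# Pattern inferred from the two strings in this thread only (assumption).
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)\.(\d+)-SDL(?:-(.+))?$")


def parse_version(text):
    """Split a Hercules VERSION string into its numeric parts and suffix.

    Surrounding double quotes (as printed by "set ver") are ignored.
    The suffix is None when nothing follows "-SDL".
    """
    match = VERSION_RE.match(text.strip().strip('"'))
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, inter, minor, build = (int(part) for part in match.groups()[:4])
    return {
        "maj": major,
        "int": inter,
        "min": minor,
        "bld": build,
        "suffix": match.group(5),
    }


if __name__ == "__main__":
    for sample in ('"4.2.0.0-SDL"', "4.2.0.9679-SDL-geb81b19a"):
        print(sample, "->", parse_version(sample))
```

A build number of 0 with no suffix matches the default string reported when git could not be found.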

Will check the cross-subnet ping later today.

Fish-Git · 2019-05-07T18:42:34Z

Will check the cross-subnet ping later today.

And when you do, please remember to post your comment to Issue #204.

mcisho · 2019-05-09T23:12:06Z

Ages ago you said:

z/OS is either sending a new type of control command that we're not supporting, or we're not handling an existing control command properly.

I very much doubt the former, but the latter is quite possible. However, as we don't know what commands are, or are not, being sent, it's a moot point.

Fish-Git · 2019-07-10T06:45:17Z

Just checking for a status update: Is this issue still a problem?

rgschmi · 2019-07-11T15:18:46Z

There are multiple problems associated with having more than one OSA defined. I think most, if not all, are related to the lack of ARP support in QETH and CTCI-WIN (I think) sending gratuitous ARPs for all registered addresses for all OSAs, even though those addresses are registered to every OSA.

I have no problem calling this a restriction until a possible future enhancement to QETH includes proper ARP support.

Fish-Git · 2019-07-11T20:57:34Z

I am going to close this GitHub Issue at this time because it has gotten way off track from the originally reported problem. Please create a brand new GitHub Issue describing your perceived unresolved problem in more specific detail so we can look into it (and pray that we can keep on track in that new issue!).

Thanks.

Fish-Git self-assigned this May 4, 2019

Fish-Git added the "Related" label (This issue is closely related to another issue. Consider this issue a "sub-issue" of the other.) May 4, 2019

rgschmi mentioned this issue May 4, 2019

Multiple OSA IP support Ping to other-subnet VIPA expires in transit #204

Closed

Fish-Git closed this as completed Jul 11, 2019

Fish-Git mentioned this issue Oct 24, 2019

Transactional-Execution Facility design / implementation #263

Closed

Fish-Git mentioned this issue Dec 31, 2019

dev->hnd->halt in CSCH functionality not called for CTCE devices #273

Closed

Fish-Git mentioned this issue Jan 1, 2021

Facility Enable command should not check for validity of the bit #353

Closed
