Release 2018.04 - RC2 #62

Closed
34 of 46 tasks
kaspar030 opened this issue Apr 23, 2018 · 46 comments

Comments

@kaspar030
Contributor Author

FYI, nightly build results are here.

The changes to -RC1 are quite minimal (arduino and gcoap related only). What do you think, can we re-use the tests that were done (mostly by @aabadie) in #61 for -RC1? How did we handle that for the last releases?

@cladmi
Contributor

cladmi commented Apr 24, 2018

I will try to run the simple tests on IoT-LAB. Both M3 and wsn430. I will see how it goes.

@aabadie
Contributor

aabadie commented Apr 24, 2018

@kaspar030, HIL reports are still pointing to a non existing page, see here for example.

@cladmi
Contributor

cladmi commented Apr 25, 2018

I am testing all tests that have TEST_ON_CI_WHITELIST.

The most important problem for the moment is that unittests fail for iotlab-m3 and iotlab-a8-m3: in tests-qDSA I get a hardfault in the keypair function.

wsn430 has errors too: some tests to fix and also other broken things. I will do a more detailed report.

@kaspar030
Contributor Author

unittests fail for iotlab-m3 and iotlab-a8-m3: in tests-qDSA I get a hardfault in the keypair function.

Can you try removing the LARGE_STACK_TESTS avr condition (so tests-qDSA is always "large stack")?

@cladmi
Contributor

cladmi commented Apr 26, 2018

I was testing them with all tests included, so tests-qDSA should already have been in LARGE_STACK_TESTS.
It is still broken with the following:

-ifneq (, $(filter $(AVR_BOARDS), $(BOARD)))
-  LARGE_STACK_TESTS += tests-qDSA
-endif
+LARGE_STACK_TESTS += tests-qDSA

I will do the test report, a few fix PRs and I can start investigating.

@cladmi
Contributor

cladmi commented Apr 26, 2018

I ran the tests in the tests/ directory that are supported, have enough memory, and have the 'TEST_ON_CI_WHITELIST' tag.
I ran them on wsn430-v1_4, iotlab-m3, and iotlab-a8-m3.
I was not yet using @kYc0o's PR that adds TEST_ON_CI_WHITELIST to other tests. That can be a next step to test more cases.

Tests fixed on wsn430:

  • sizeof_tcb: needed different size as 16bit
  • mutex_unlock_and_sleep: increase timeout (test takes 10 minutes...)
  • bloom_bytes: increase timeout
  • float: increase timeout

Tests broken on iotlab-m3 and iotlab-a8-m3

  • unittests in the qDSA test

Tests I have trouble running because the interactive test starts before the board is ready

TODO: make the start condition more robust

  • iotlab-a8-m3:
    • rng
    • ps_schedstatistics
    • shell
  • wsn430:
    • struct_tm_utility: re-running it another time worked
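
One way to make the start condition more robust is to wait for the boot banner before sending any input. The following is only a minimal sketch: `wait_for_banner` and the `read_line` callable are hypothetical names, not part of the existing test runner, and the banner text is taken from the "This is RIOT!" line that RIOT prints at startup.

```python
import time


def wait_for_banner(read_line, banner="This is RIOT!", timeout=10.0):
    """Return True once a line containing `banner` is read, False on timeout.

    `read_line` is a hypothetical callable that returns the next output line
    as a string, or an empty string when nothing is available yet.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        line = read_line()
        if banner in line:
            return True
        if not line:
            time.sleep(0.01)  # nothing to read yet, avoid busy-waiting
    return False


# Canned output standing in for a serial connection to the board.
lines = iter(["", "garbage from reset", "main(): This is RIOT! (Version: x)"])
ready = wait_for_banner(lambda: next(lines, ""), timeout=1.0)
print(ready)
```

A test would only start sending shell commands once this returns True, and could reset the board and retry when it returns False.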

Tests not working on board

I think I should simply blacklist the wsn430 boards here

Board gets stuck during test

  • wsn430
    • bitarithm_timings: stuck after first output of bitarithm_msb
    • rng: stuck at 'dump 10'
    • thread_flags_xtimer: never gets past "timeout 1"
    • all but one of the 'pthread_*' tests are broken and never get past the first output line (pthread_rwlock works)
      • pthread
      • pthread_cooperation
      • pthread_cleanup
      • pthread_condition_variable
      • pthread_barrier
      • pthread_tls

@kaspar030
Contributor Author

wsn430

That doesn't look very good. I guess msp430 needs quite some work overall...

@cladmi
Contributor

cladmi commented Apr 26, 2018

I put the PRs for wsn430 in the release milestone https://github.com/RIOT-OS/RIOT/milestone/22

@kaspar030
Contributor Author

I'm starting with the 05- tests on native now.

@kYc0o
Contributor

kYc0o commented May 2, 2018

I'm on 6.

@kYc0o
Contributor

kYc0o commented May 2, 2018

I'm on 7. For now the last three tasks (2, 3 and 4)

@kaspar030
Contributor Author

05- on native had some problems, see #63 and RIOT-OS/RIOT#9061.

@kYc0o
Contributor

kYc0o commented May 3, 2018

I'm struggling a lot with the RPL tests. I don't have a real clue what might be wrong, but then, should we stop the release until the problem is found?

@kYc0o
Contributor

kYc0o commented May 3, 2018

For info, it seems broken since the last release. I'm having good results with 2017.10 with the same set of nodes and the same tests. The RPL graph also seems to be the same, but I'm not 100% sure.

@kaspar030
Contributor Author

I'm struggling a lot with the RPL tests.

Can you elaborate what does not work? Maybe you're hitting a similar problem as in 05- on native. Try adding CFLAGS += -DGNRC_NETIF_IPV6_GROUPS_NUMOF=8 to the Makefile.

@kYc0o
Contributor

kYc0o commented May 3, 2018

Can you elaborate what does not work?

Single-hop routes work sometimes; with more than one hop it definitely doesn't work.

Try adding CFLAGS += -DGNRC_NETIF_IPV6_GROUPS_NUMOF=8 to the Makefile.

I'm almost sure that's not the problem but I'll try and report back.

@kYc0o
Contributor

kYc0o commented May 3, 2018

Try adding CFLAGS += -DGNRC_NETIF_IPV6_GROUPS_NUMOF=8 to the Makefile.

I'm almost sure that's not the problem but I'll try and report back.

Just tested, the (unsuccessful) result is the same.

@cladmi
Contributor

cladmi commented May 4, 2018

My tests failing only on iotlab-a8-m3 were an issue in my test runner (RIOT-OS/RIOT#9011 (comment)). They are now just failing on the same tests as the m3 nodes.

@miri64
Member

miri64 commented May 4, 2018

@kYc0o can you test with RIOT-OS/RIOT#9073? Also make sure both the NIB off-link store (GNRC_IPV6_NIB_OFFL_NUMOF, for route destinations) and the NIB on-link store (GNRC_IPV6_NIB_NUMOF, for route next-hops) are sufficiently large. For RPL to work properly GNRC_IPV6_NIB_OFFL_NUMOF should be at least #nodes+1.
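
As a rough illustration of that sizing rule, the helper below derives a Makefile line for a given network size. It is a sketch only: `nib_cflags` is a hypothetical helper, and it assumes both macros can be overridden through CFLAGS in the same way as the GNRC_NETIF_IPV6_GROUPS_NUMOF suggestion earlier in this thread. The on-link store size is left to the caller, since no concrete rule for it is given here.

```python
def nib_cflags(num_nodes, onlink_entries=None):
    """Build a Makefile CFLAGS line sizing the NIB for an RPL experiment.

    The off-link store is set to num_nodes + 1, the minimum stated above.
    `onlink_entries` is optional because no rule for it is given.
    """
    flags = [f"-DGNRC_IPV6_NIB_OFFL_NUMOF={num_nodes + 1}"]
    if onlink_entries is not None:
        flags.append(f"-DGNRC_IPV6_NIB_NUMOF={onlink_entries}")
    return "CFLAGS += " + " ".join(flags)


print(nib_cflags(10))
# CFLAGS += -DGNRC_IPV6_NIB_OFFL_NUMOF=11
```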

@aabadie
Contributor

aabadie commented May 4, 2018

Tasks 4.7 and 4.8 are marked failed but I re-ran them on iotlab and got:

  • 4.7: 1000 packets transmitted, 973 received, 3% packet loss, time 291.06369993 s
  • 4.8: 1000 packets transmitted, 992 received, 1% packet loss, time 441.06100079 s

So they pass. Who tested them?
I'm using the following command lines on the Arduino Zero node:

  • for 4.7:
> ifconfig 7 set chan 17   # also on the samr21-xpro
> ping6 1000 ff02::1 50 100
  • for 4.8:
> ifconfig 7 set chan 26   # also on the samr21-xpro
> ping6 1000 fe80::xx:xx:xx::1 100 100
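
To check such results without relying on the rounded percentage, the exact loss can be recomputed from the summary line. This is a minimal sketch: `packet_loss` is a hypothetical helper, and the `MAX_LOSS` threshold is an assumed value for illustration, not one quoted from the release spec.

```python
import re

SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) received")


def packet_loss(summary):
    """Return the packet loss in percent computed from a ping6 summary line."""
    match = SUMMARY_RE.search(summary)
    if match is None:
        raise ValueError("no ping summary found")
    sent, received = int(match.group(1)), int(match.group(2))
    return 100.0 * (sent - received) / sent


MAX_LOSS = 10.0  # assumed pass threshold in percent, for illustration only

line = "1000 packets transmitted, 973 received, 3% packet loss, time 291.06369993 s"
loss = packet_loss(line)
print(f"{loss:.1f}% loss, pass={loss <= MAX_LOSS}")
```

For the 4.7 run above this gives 2.7% loss, which the shell output rounds to 3%.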

@bergzand
Member

bergzand commented May 4, 2018

@miri64 I've tested a bit with 3 native instances (gnrc_networking) in a chain-like topology. Traffic between tap0 and tap2 is dropped to simulate a topology where node0 and node2 can communicate with node1, but node0 and node2 can't communicate.

Pinging between node0 and node2 doesn't work with either RPL or manual routing. What I observe when pinging from node0 to node2 is that node0 starts doing neighbour solicitations for the address of node2.

As soon as I remove the prefix from all nodes with nib prefix del 6 2001:db8::/64 ping6 works again with multiple hops.

@miri64
Member

miri64 commented May 4, 2018

@bergzand @cgundogan I analyzed this with a debugger. Problem is, that routers in Ethernet configuration (IMHO correctly because we usually don't have a weird "mesh wants to share a prefix multihop"-situation there) advertise prefixes they configured as on-link using the L-flag in the PIO (this happens here). This causes the node to look up the address in the neighbor cache (as described in RFC 4861) instead of the forwarding table. So I rather would say that either the scenario as described by you is invalid or we should have some extra state for RPL(-like) network interfaces, so that mesh-wide prefixing is allowed. But that is more a feature than a bug fix.
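
The effect can be modelled with a simplified next-hop determination in the spirit of RFC 4861. This is only a sketch, not the GNRC implementation: `next_hop` is a hypothetical function, and the node2 address and the fe80::2 next hop are made up for illustration. Only the 2001:db8::/64 prefix comes from the test setup described above.

```python
import ipaddress


def next_hop(dst, onlink_prefixes, routes):
    """Simplified next-hop determination.

    Returns ("neighbor", dst) if dst matches an on-link prefix, so the node
    resolves dst directly via neighbor solicitation. Otherwise returns
    ("route", gateway) for the longest matching forwarding entry, or
    ("unreachable", None) if nothing matches.
    """
    dst = ipaddress.IPv6Address(dst)
    for prefix in onlink_prefixes:
        if dst in ipaddress.IPv6Network(prefix):
            return ("neighbor", str(dst))
    best = None
    for prefix, gateway in routes.items():
        net = ipaddress.IPv6Network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    if best is not None:
        return ("route", best[1])
    return ("unreachable", None)


# node0 has a host route to node2 via node1 (addresses are hypothetical).
routes = {"2001:db8::3/128": "fe80::2"}

# With the prefix marked on-link, the route is never consulted.
print(next_hop("2001:db8::3", ["2001:db8::/64"], routes))
# After removing the on-link prefix, the forwarding entry is used.
print(next_hop("2001:db8::3", [], routes))
```

The first call yields a direct neighbor lookup for node2, which matches the observed neighbour solicitations; the second yields the route via node1, which matches ping6 working once the prefix is deleted.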

@miri64
Member

miri64 commented May 4, 2018

So won't fix, when it comes to that ;-).

@cgundogan
Member

So won't fix, when it comes to that ;-).

Then we should say that we do not support multi-hop in native anymore, be it with RPL or not? It at least worked in previous RIOT versions so we are dropping that feature (for now)?

@bergzand
Member

bergzand commented May 4, 2018

Thanks for clarifying! Indeed a wontfix then. I can replace the info on the wiki to build these topologies with something zep_mesh based as soon as I have a real keyboard again.

@miri64
Member

miri64 commented May 4, 2018

Then we should say that we do not support multi-hop in native anymore, be it with RPL or not? It at least worked in previous RIOT versions so we are dropping that feature (for now)?

We support multihop in native, just not multihop prefix delegation. And afaik that was the case before as well, and if not it was most definitely acting erroneously (so it was a bug, not a feature :P).

@kaspar030
Contributor Author

I've lost track. Can someone sum up what's the state with RPL and multihop regarding the release?

@kYc0o
Contributor

kYc0o commented May 7, 2018

The two RPL tests passed successfully. It was a matter of configuration, since the maximum number of NIB entries was too small for my experiment. Moreover, since the topology was a bit dense and maybe noisy, some of the DAOs were lost during initialisation, so I decreased the DAO send interval. Also Martine's #9073 helped to keep a coherent NIB for that small interval (10 secs).

@kYc0o
Contributor

kYc0o commented May 7, 2018

Tasks 4.7 and 4.8 are marked failed but I re-ran them on iotlab and got:

For some strange reason, in the tag 2018.04-RC2 I have the following

2018-05-07 12:26:16,104 - INFO #  main(): This is RIOT! (Version: 2018.04-snake.local-HEAD)
2018-05-07 12:26:16,108 - INFO # RIOT network stack example application
2018-05-07 12:26:16,110 - INFO # All up, running the shell now
ifconfig
2018-05-07 12:26:19,982 - INFO #  ifconfig
2018-05-07 12:26:20,031 - INFO # Iface  7  HWaddr: f8:54  Channel: 26  NID: 0x23
2018-05-07 12:26:20,033 - INFO #           Long HWaddr: 00:13:a2:00:40:a9:f8:d4 
2018-05-07 12:26:20,034 - INFO #           MTU:100  HL:64  RTR  
2018-05-07 12:26:20,036 - INFO #           Source address length: 8
2018-05-07 12:26:20,038 - INFO #           Link type: wireless
2018-05-07 12:26:20,043 - INFO #           inet6 addr: fe80::  scope: local  VAL
2018-05-07 12:26:20,046 - INFO #           inet6 group: ff02::2
2018-05-07 12:26:20,048 - INFO #           inet6 group: ff02::1
2018-05-07 12:26:20,052 - INFO #           inet6 group: ff02::301:ff00:0
2018-05-07 12:26:20,055 - INFO #           inet6 group: ff02::1a
2018-05-07 12:26:20,056 - INFO #           
2018-05-07 12:26:20,061 - INFO #            Protocol or device doesn't provide statistics.
2018-05-07 12:26:20,063 - INFO #           Statistics for IPv6
2018-05-07 12:26:20,071 - INFO #             RX packets 0  bytes 0
2018-05-07 12:26:20,076 - INFO #             TX packets 3 (Multicast: 3)  bytes 162
2018-05-07 12:26:20,081 - INFO #             TX succeeded 3 errors 0
2018-05-07 12:26:20,082 - INFO # 

And on branch 2018.04-devel

2018-05-07 12:24:20,237 - INFO #  main(): This is RIOT! (Version: 2018.07-devel-1000-g7ea2b-snake.local-HEAD)
2018-05-07 12:24:20,240 - INFO # RIOT network stack example application
2018-05-07 12:24:20,243 - INFO # All up, running the shell now
> ifconfig
2018-05-07 12:24:25,500 - INFO #  ifconfig
2018-05-07 12:24:25,541 - INFO # Iface  7  HWaddr: f8:54  Channel: 26  NID: 0x23
2018-05-07 12:24:25,544 - INFO #           Long HWaddr: 00:13:a2:00:40:a9:f8:d4 
2018-05-07 12:24:25,548 - INFO #           MTU:1280  HL:64  RTR  
2018-05-07 12:24:25,549 - INFO #           IPHC  
2018-05-07 12:24:25,553 - INFO #           Source address length: 8
2018-05-07 12:24:25,555 - INFO #           Link type: wireless
2018-05-07 12:24:25,564 - INFO #           inet6 addr: fe80::213:a200:40a9:f8d4  scope: local  VAL
2018-05-07 12:24:25,565 - INFO #           inet6 group: ff02::2
2018-05-07 12:24:25,567 - INFO #           inet6 group: ff02::1
2018-05-07 12:24:25,570 - INFO #           inet6 group: ff02::301:ffa9:f8d4
2018-05-07 12:24:25,574 - INFO #           inet6 group: ff02::1a
2018-05-07 12:24:25,577 - INFO #           
2018-05-07 12:24:25,589 - INFO #            Protocol or device doesn't provide statistics.
2018-05-07 12:24:25,590 - INFO #           Statistics for IPv6
2018-05-07 12:24:25,591 - INFO #             RX packets 0  bytes 0
2018-05-07 12:24:25,594 - INFO #             TX packets 3 (Multicast: 3)  bytes 178
2018-05-07 12:24:25,595 - INFO #             TX succeeded 3 errors 0
2018-05-07 12:24:25,595 - INFO # 

That's why my tests failed.
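
The symptom visible in the first log (a link-local address of just fe80:: with no interface identifier) could be detected automatically before running the ping tests. The following is a minimal sketch under that assumption; `link_local_without_iid` is a hypothetical helper, not part of any existing script.

```python
import ipaddress
import re

ADDR_RE = re.compile(r"inet6 addr: (\S+)")


def link_local_without_iid(ifconfig_output):
    """Return link-local addresses whose lower 64 bits (the IID) are all zero."""
    iid_mask = (1 << 64) - 1
    bad = []
    for match in ADDR_RE.finditer(ifconfig_output):
        # Drop an optional "/prefix_len" suffix before parsing.
        addr = ipaddress.IPv6Address(match.group(1).split("/")[0])
        if addr.is_link_local and (int(addr) & iid_mask) == 0:
            bad.append(str(addr))
    return bad


rc2 = "INFO #           inet6 addr: fe80::  scope: local  VAL"
devel = "INFO #           inet6 addr: fe80::213:a200:40a9:f8d4  scope: local  VAL"
print(link_local_without_iid(rc2))
print(link_local_without_iid(devel))
```

The first call reports the broken address, the second returns an empty list.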

@kaspar030
Contributor Author

So RPL basically works, but with the default settings, only with up to 8 nodes?

@kYc0o
Contributor

kYc0o commented May 7, 2018

So RPL basically works, but with the default settings, only with up to 8 nodes?

Exactly, the maximum number of neighbours is defined as:

sys/include/net/gnrc/ipv6/nib/conf.h:#define GNRC_IPV6_NIB_OFFL_NUMOF (8)

@kaspar030
Contributor Author

Ok, apart from that, we're missing:

  • 02.01 run all tests on native. CI runs only the simple ones.
  • 02.02 run some tests != 02.03 on iotlab-m3
  • 05.02
  • 07.01
  • 07.02
  • 08

gogogo! ;)

@kaspar030
Contributor Author

@kaspar030, HIL reports are still pointing to a non existing page, see here for example.

That's fixed now, see master or 2018.04-branch.

@cladmi
Contributor

cladmi commented May 8, 2018

For task 2, I added RIOT-OS/RIOT#9007 to enable more TEST_ON_CI_WHITELIST.

I re-ran all tests in the tests/ directory that are supported, have enough memory, and have the 'TEST_ON_CI_WHITELIST' tag (even 'native').
I ran them on wsn430-v1_4, iotlab-m3, iotlab-a8-m3, samr21-xpro, and arduino-zero using IoT-LAB.

For wsn430-v1_4 I have the same errors as before plus posix_semaphore which requires a different configuration for the CPU and should be fixed after RIOT-OS/RIOT#9081.

For others, I have the following errors:

@kaspar030
Contributor Author

For wsn430-v1_4 I have the same errors as before

Didn't we fix some of the tests?

@cladmi
Contributor

cladmi commented May 9, 2018

Yeah, I was not precise enough in my report; I wanted to give more details on the other ones, my bad.
Indeed, the ones noted before as "Tests fixed on wsn430", and pkg_tiny-asn1, were fixed and backported.
My report should rather have been: micro-ecc is not merged in master, and none of the "Board gets stuck during test" cases have been fixed.

@cladmi
Contributor

cladmi commented May 11, 2018

Task 5.2

The test description says "default ipv6 stack". I do not know if that meant using examples/default, but I could not use it as it does not have the nib shell commands, so I ran the test with gnrc_networking.

I had 100% ping delivery so for me it's a success.

This is what I understood I needed to run:

Node Receiver

> ifconfig
   # get long inet6 addr: fe80::1711:6b10:65f6:bb1a  scope: local  VAL
> ifconfig 7 add unicast beef::1/64
ifconfig 7 add unicast beef::1/64
success: added beef::1/64 to interface 7
> nib route add 7 :: fe80::1711:6b10:65fc:b406
nib route add 7 :: fe80::1711:6b10:65fc:b406
> nib route
nib route
beef::/64 dev #7
default* via fe80::1711:6b10:65fc:b406 dev #7

Node Transmitter

> ifconfig
   # get long inet6 addr: fe80::1711:6b10:65fc:b406  scope: local  VAL
> ifconfig 7 add unicast affe::1/120
ifconfig 7 add unicast affe::1/120
success: added affe::1/120 to interface 7
> nib route add 7 :: fe80::1711:6b10:65f6:bb1a
nib route add 7 :: fe80::1711:6b10:65f6:bb1a
> nib route
nib route
affe::/120 dev #7
default* via fe80::1711:6b10:65f6:bb1a dev #7
> ping6 100 beef::1 1024 10
...
--- beef::1 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 15.06791147 s
rtt min/avg/max = 131.097/145.808/160.756 ms

Is this what was expected, and should I add this output example to the release spec?

@cladmi
Contributor

cladmi commented May 11, 2018

The automated scripts for task 7 do not work out of the box. I am trying to find out what happens; it is something with an expect line.

@cladmi
Contributor

cladmi commented May 11, 2018

The problem is that when doing ifconfig, addresses are now displayed without the /prefix_len (I hacked the script to not expect it).

Also, the test is using fibroute in setFibRoutesInARow:
shell: command not found: fibroute.

I will see how to fix it.
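
Instead of dropping the expectation entirely, the script could accept both output formats by making the prefix length optional. This is a sketch only: `parse_addrs` is a hypothetical helper rather than code from IOTLABHelper.py, and the older "addr/prefix_len" line format in the example is an assumption based on the description above.

```python
import re

ADDR_RE = re.compile(
    r"inet6 addr: (?P<addr>[0-9a-fA-F:]+)"  # the address itself
    r"(?:/(?P<plen>\d+))?"                  # optional /prefix_len
    r"\s+scope: (?P<scope>\w+)"
)


def parse_addrs(output):
    """Return (address, prefix_len or None, scope) for each inet6 addr line."""
    return [
        (m["addr"], int(m["plen"]) if m["plen"] else None, m["scope"])
        for m in ADDR_RE.finditer(output)
    ]


new_style = "inet6 addr: fe80::213:a200:40a9:f8d4  scope: local  VAL"
old_style = "inet6 addr: fe80::213:a200:40a9:f8d4/64  scope: local"  # assumed format
print(parse_addrs(new_style))
print(parse_addrs(old_style))
```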

@cladmi
Contributor

cladmi commented May 11, 2018

Task 7.01

I adapted the IOTLABHelper.py script to the new commands and output to make the test work.

Successfully pinged with 3 hops
SUCCESS

@cladmi
Contributor

cladmi commented May 11, 2018

Task 7.02

Failed with first selected node but worked on the second with the same IOTLABHelper.py changes.

Sent successfully with packet loss of 0.0%
Successfully communicated with UDP over 3 hops
SUCCESS

@kYc0o
Contributor

kYc0o commented May 11, 2018

I'm on 8.2

@kYc0o
Contributor

kYc0o commented May 11, 2018

8.9 done successfully.

cladmi added a commit to cladmi/RIOT that referenced this issue May 11, 2018
Having `prev_now` initialized to 0 breaks tests on `arduino-zero` and
`arduino-mega2560` as `xtimer_now_usec` is way bigger (72k on `arduino-zero`).

Issue found in:
* RIOT-OS#9052 and proposed fix by ZetaR60
* RIOT-OS/Release-Specs#62 (comment)
@cladmi
Contributor

cladmi commented May 11, 2018

Thanks all for testing. Please remind me next week if I forget to upstream the changes for task 07.

jcarrano pushed a commit to jcarrano/RIOT that referenced this issue May 15, 2018
Having `prev_now` initialized to 0 breaks tests on `arduino-zero` and
`arduino-mega2560` as `xtimer_now_usec` is way bigger (72k on `arduino-zero`).

Issue found in:
* RIOT-OS#9052 and proposed fix by ZetaR60
* RIOT-OS/Release-Specs#62 (comment)
jia200x pushed a commit to jia200x/RIOT that referenced this issue Jun 11, 2018
Having `prev_now` initialized to 0 breaks tests on `arduino-zero` and
`arduino-mega2560` as `xtimer_now_usec` is way bigger (72k on `arduino-zero`).

Issue found in:
* RIOT-OS#9052 and proposed fix by ZetaR60
* RIOT-OS/Release-Specs#62 (comment)
@miri64
Member

miri64 commented Aug 1, 2018

(closing this one, since we are currently testing 2018.07 ;-))

@miri64 miri64 closed this as completed Aug 1, 2018

7 participants