
Spirent TestCenter: 5.19 - 400G Appliances go Out Of Sync in daisy chain setup when running RFC 2544 (Lost manchester lock message)

Symptoms

A customer has a testbed with an N11U chassis as master and four 400G Appliances connected as slaves. Users are trying to run an RFC 2544 test, which fails because the setup goes out of sync in the middle of the test, showing the following messages:

chassis <ip address> event: out-of-sync: Lost manchester lock
chassis <ip address> is out of sync.

 
Rebooting the chassis brings the setup back in sync for some time, but it then fails again during the RFC 2544 test.

There is no specific run time after which the issue occurs; it appears to be random. Before the meeting, the test had been running for 2 hours without any issue. The customer has noticed that the issue happens more often when running that specific test (RFC 2544); however, yesterday (June 5th) they were only connected to the chassis (not running anything) and the "Lost manchester lock" message still appeared.

When this issue happens, they have seen that eventually all the chassis in the chain (master and slaves) go out of sync, as per the GUI log messages.
 
Environment
 
  • STC 5.19 release
  • 400G appliances
  • RFC2544 TEST
  • Out-of-sync
  • out of sync
  • Lost Manchester Lock
Explanation/Resolution
 
  • 5.24 release has the fix for this issue
    • CR-01499446
    • CIPCD-17067
    • Date/Time Opened: 6/11/2021 9:16 AM
    • Date/Time Closed: 8/17/2021
    • Target Release: 5.24 (August 2021 release)
Root Cause

From CIPCD-17067:

As part of sync verification, the incoming timestamp and the local timestamp are periodically latched and checked to confirm that they match. If they do not match for two latches in a row, the chassis is declared out of sync and is recalibrated. On the chassis, mismatches were observed intermittently and appeared to be independent events; when two happened in a row, they caused an out-of-sync condition. The Eng team determined that the root cause was a register read timing issue. The current fix is to re-read the timestamp registers immediately after seeing a mismatch. For a true mismatch, the values would continue to be wrong, but if the mismatch was caused by a register read timing issue, the values would correct themselves, confirming that the chassis is still in sync. The Eng team is planning to fix the root cause in the register read logic in the future.
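
The verification logic described above can be illustrated with a small Python model. This is only a sketch, not the actual chassis firmware: the names SyncVerifier, make_reader, and reread_on_mismatch are hypothetical, the register reads are simulated from a list of (incoming, local) samples, and it assumes the "two mismatches in a row" rule still applies after the fix.

```python
from typing import Callable, Iterable, Tuple

Timestamps = Tuple[int, int]  # (incoming, local), hypothetical representation


def make_reader(samples: Iterable[Timestamps]) -> Callable[[], Timestamps]:
    """Simulate register reads by replaying a fixed sequence of samples."""
    it = iter(samples)
    return lambda: next(it)


class SyncVerifier:
    """Toy model of the periodic sync check (assumed behavior, not Spirent code)."""

    def __init__(self, read_timestamps: Callable[[], Timestamps],
                 reread_on_mismatch: bool = True) -> None:
        self.read_timestamps = read_timestamps
        self.reread_on_mismatch = reread_on_mismatch
        self.consecutive_mismatches = 0

    def check(self) -> bool:
        """Run one periodic check; return True while still considered in sync."""
        incoming, local = self.read_timestamps()
        if incoming != local and self.reread_on_mismatch:
            # The fix: read the registers again immediately. A read timing
            # glitch corrects itself; a true mismatch stays wrong.
            incoming, local = self.read_timestamps()
        if incoming == local:
            self.consecutive_mismatches = 0
            return True
        self.consecutive_mismatches += 1
        # Assumption: two mismatching checks in a row mean out of sync.
        return self.consecutive_mismatches < 2


# Two independent read glitches in consecutive checks, without the re-read:
before_fix = SyncVerifier(make_reader([(1, 2), (5, 6)]), reread_on_mismatch=False)
results_before = [before_fix.check(), before_fix.check()]

# Same glitches with the re-read: each second read returns matching values.
after_fix = SyncVerifier(make_reader([(1, 2), (1, 1), (5, 6), (5, 5)]))
results_after = [after_fix.check(), after_fix.check()]
```

In this model, two unrelated read glitches falling on consecutive checks are enough to declare out of sync before the fix, while the immediate re-read lets the same sequence pass. A persistent mismatch still fails on the second check in both cases.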
 

Product : Spirent TestCenter