Continuity Counter Errors

Post by josmi » Mon Mar 04, 2013 3:24 pm

I am suffering from continuity counter (cc) errors despite a seemingly fine signal on the affected tuner (SNR > 90%). I can reproduce one failure mode using the test script below. With only one tuner in use there are no cc errors; once the other three tuners are enabled I start getting errors, as can be seen in the log on pastebin (search for "continuity error"). The signal still looks fine on the affected tuner, with BER=0. The failure is reproducible when tuned to this specific mux, 10903 V on Thor 0.8W (http://www.lyngsat.com/packages/canaldigital.html).

What are the possible causes of these cc errors?

I am running Ubuntu Server 12.04 x64 with drivers v130127.

Test script:
Code:
uname -a
date
echo "Enable adapter 0"
szap-s2 -H -a 0 -n 1 -S 1 -r &
sleep 2
dvbsnoop -adapter 0 -s feinfo
dvbsnoop -adapter 0 -s signal -n 2
dvbsnoop -adapter 0 -s bandwidth -tsraw -n 2000
dvbsnoop -adapter 0 -s pidscan
dvbsnoop -adapter 0 -nohexdumpbuffer -pd 3 -s ts -tsraw | awk -v a=adapter0 '/continuity error/ { "date +%T" | getline d; print d, ": ", a, ": ", $0; close("date +%T") }' &
sleep 200
date
echo "Enable remaining adapters"
szap-s2 -a 1 -n 3 -r > /dev/null &
sleep 2
szap-s2 -a 2 -n 4 -r > /dev/null &
sleep 2
szap-s2 -a 3 -n 8 -r > /dev/null &
sleep 2
echo "Adapter 0 signal:"
dvbsnoop -adapter 0 -s signal -n 2
echo "Adapter 1 signal:"
dvbsnoop -adapter 1 -s signal -n 2
echo "Adapter 2 signal:"
dvbsnoop -adapter 2 -s signal -n 2
echo "Adapter 3 signal:"
dvbsnoop -adapter 3 -s signal -n 2
sleep 200
killall -PIPE dvbsnoop
killall szap-s2


The output log can be found on pastebin:
http://pastebin.com/e7V8CJ1L
josmi
 
Posts: 8
Joined: Mon Mar 04, 2013 4:19 am

Re: Continuity Counter Errors

Post by cody » Wed Mar 06, 2013 4:42 am

What you're describing sounds to me like insufficient data-transfer performance in your computer system: the size of the transferred data chunks that works fine in the general case does not work on your machine. In other words, if the problem occurs only when you use all 4 adapters at the same time, my best guess is that your motherboard's PCIe latency is higher than the current size of the data-transfer buffers in the driver can compensate for. So please try the following driver:

http://www.basicupload.com/jazljwuler1p

and let us know if it solves the issue for you or not.
cody
 
Posts: 627
Joined: Tue Apr 13, 2010 11:20 pm

Re: Continuity Counter Errors

Post by josmi » Wed Mar 06, 2013 8:36 am

I agree, it does sound like insufficient data transfer performance somewhere in the system and I've been examining IRQ timings and PCIe buffer transfers for hours today. Btw, my computer is a Dell PowerEdge 2950 III with dual Xeon L5420 @ 2.5GHz.

I tried your proposed driver but the problem remained.

I assume the significant change you wanted me to test was lowering the number of packets per DMA buffer from 348 to 120? With the original setting of 348 I saw interrupts about every 9 ms on the failing channel when monitoring with module param verbose=4. With your proposed driver change I saw about 3 ms between interrupts. These numbers match the bandwidth of the mux. I have not been able to see any strange timing fluctuations that could explain the issue.
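As a rough cross-check of those interrupt intervals, the sketch below estimates the bitrate implied by 348 packets every ~9 ms and the interval one would then expect with 120 packets per buffer. It assumes 188-byte TS packets and exactly one interrupt per filled DMA buffer; the function names are made up for illustration and are not part of the driver.

```shell
# Assumption: one interrupt per filled DMA buffer, 188-byte TS packets.
TS_PACKET_BITS=$((188 * 8))

# bitrate_kbps PACKETS_PER_BUFFER INTERVAL_US -> implied TS bitrate in kbit/s
# (bits per microsecond equals Mbit/s, so multiply by 1000 for kbit/s)
bitrate_kbps() {
    echo $(( $1 * $TS_PACKET_BITS * 1000 / $2 ))
}

# interval_us PACKETS_PER_BUFFER BITRATE_KBPS -> expected interrupt spacing in microseconds
interval_us() {
    echo $(( $1 * $TS_PACKET_BITS * 1000 / $2 ))
}

rate=$(bitrate_kbps 348 9000)
echo "implied bitrate: ${rate} kbit/s"
echo "expected interval at 120 packets: $(interval_us 120 "$rate") us"
```

With the rounded 9 ms figure this gives roughly 58 Mbit/s and about 3.1 ms, which is in line with the ~3 ms observed after the change.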

I believe I have confirmed today that it is the actual stream read from the PCIe HW that has the cc error. This was done by using the TS continuity check in the dvb-core module (dvb_demux_tscheck = 1). It also helps to apply this patch:
http://article.gmane.org/gmane.linux.dr ... ture/61440

I also believe I have confirmed that the Transport Error Indicator flag is not set when the error occurs (by the absence of prints from dvb-core).

I have noticed something that looks like a common pattern. It _seems_ like there's an extra (bad) packet in one PID stream and one packet missing in another. I don't know if this is significant and the errors definitely don't always follow this pattern but I thought it might be worth mentioning.

I've attached log snippets of a captured error that follows this pattern. It looks just as if packet [PID=0x203,cc=8] is lost and an extra packet has been received on PID 0x285 while waiting for cc=0.

Syslog (search "TS packet counter mismatch"):
http://pastebin.com/xF0zRKE2

Same error caught by dvbsnoop of PID 0x203 (search "continuity error!"):
http://pastebin.com/ykaRnpHs

Re: Continuity Counter Errors

Post by josmi » Thu Mar 07, 2013 5:37 am

I've now captured the failing stream on two tuners simultaneously for comparison. The two captured streams were identical except for a few bytes in one packet that caused the continuity counter errors.

The bytes that differed were at offsets 2-7. The bad packet started with 47 02 03 93 01 95 51 66. When I searched through a 1 GB dvbsnoop output, this sequence was found in only one other location, at the beginning of packet #00344025. The bad packet was #00346809. Interestingly, this should be exactly the packet that last used the same physical DMA memory buffer. The delta in packet numbers is 346809 - 344025 = 2784. There are 8 DMA buffers per FGPI channel and each buffer is configured to hold 348 packets: 8 * 348 = 2784. Surely this can't be a coincidence? The next question is, why is this happening?
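The buffer arithmetic can be written down as a small check. This is only a sketch: it assumes the 8 buffers are filled strictly round-robin, so two packets land in the same physical slot whenever their capture indices differ by a multiple of the ring size. The helper names are hypothetical.

```shell
# ring_packets NUM_BUFFERS PACKETS_PER_BUFFER -> packets held by the whole DMA ring
ring_packets() {
    echo $(( $1 * $2 ))
}

# same_slot PKT_A PKT_B NUM_BUFFERS PACKETS_PER_BUFFER -> prints "yes" or "no"
# Note: pass packet numbers without leading zeros, otherwise the shell
# treats them as octal (dvbsnoop prints them zero-padded).
same_slot() {
    ring=$(ring_packets "$3" "$4")
    if [ $(( ($2 - $1) % $ring )) -eq 0 ]; then
        echo yes
    else
        echo no
    fi
}

echo "ring size: $(ring_packets 8 348) packets"
echo "same slot: $(same_slot 344025 346809 8 348)"
```

For the two packet numbers above this prints a ring size of 2784 and "yes", i.e. the stale bytes are exactly one full trip around the ring old.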

Good packet:
Code:
------------------------------------------------------------
TS-Packet: 00272533   PID: (Unkown PID), Length: 188 (0x00bc)
from file: /temp/new2ac
------------------------------------------------------------
  0000:  47 02 00 de 32 72 24 d1  a5 cd 67 f4 c7 91 1d dd   G...2r$...g.....
  0010:  33 49 5a 39 0f 6e 34 3e  e0 e8 cb d7 6f f4 21 bd   3IZ9.n4>....o.!.
  0020:  16 49 c1 96 7b c7 18 9d  ef c0 f9 f7 84 b3 20 0d   .I..{......... .
  0030:  24 04 ee 2a 39 66 dc fe  a1 ed 2c 60 fe 6a 4b ef   $..*9f....,`.jK.
  0040:  c8 13 9d f9 7d 1b 34 0c  90 e3 a0 c7 47 61 88 c6   ....}.4.....Ga..
  0050:  e0 49 d1 8d 50 d5 33 83  4c 9e 08 77 eb 76 b8 af   .I..P.3.L..w.v..
  0060:  4e 11 98 c1 8d 84 56 04  38 71 be 14 55 40 28 9e   N.....V.8q..U@(.
  0070:  31 a0 23 de 8b b4 9c 50  8a 56 07 ad 19 62 b0 61   1.#....P.V...b.a
  0080:  ac 9b 3c c0 0d 43 d4 03  d3 30 46 38 1b 52 85 e3   ..<..C...0F8.R..
  0090:  19 f7 71 1a a1 7f 15 a4  df d5 84 ac 2f 86 6c 8d   ..q........./.l.
  00a0:  32 4b 2f 71 91 aa 4f 1c  f6 9b f7 50 77 bd 24 51   2K/q..O....Pw.$Q
  00b0:  69 43 59 21 db 4f 7a 96  7b 33 cc 70               iCY!.Oz.{3.p


Bad packet:
Code:
------------------------------------------------------------
TS-Packet: 00346809   PID: (Unkown PID), Length: 188 (0x00bc)
from file: /temp/new0ac
------------------------------------------------------------
  0000:  47 02 03 93 01 95 51 66  a5 cd 67 f4 c7 91 1d dd   G.....Qf..g.....
  0010:  33 49 5a 39 0f 6e 34 3e  e0 e8 cb d7 6f f4 21 bd   3IZ9.n4>....o.!.
  0020:  16 49 c1 96 7b c7 18 9d  ef c0 f9 f7 84 b3 20 0d   .I..{......... .
  0030:  24 04 ee 2a 39 66 dc fe  a1 ed 2c 60 fe 6a 4b ef   $..*9f....,`.jK.
  0040:  c8 13 9d f9 7d 1b 34 0c  90 e3 a0 c7 47 61 88 c6   ....}.4.....Ga..
  0050:  e0 49 d1 8d 50 d5 33 83  4c 9e 08 77 eb 76 b8 af   .I..P.3.L..w.v..
  0060:  4e 11 98 c1 8d 84 56 04  38 71 be 14 55 40 28 9e   N.....V.8q..U@(.
  0070:  31 a0 23 de 8b b4 9c 50  8a 56 07 ad 19 62 b0 61   1.#....P.V...b.a
  0080:  ac 9b 3c c0 0d 43 d4 03  d3 30 46 38 1b 52 85 e3   ..<..C...0F8.R..
  0090:  19 f7 71 1a a1 7f 15 a4  df d5 84 ac 2f 86 6c 8d   ..q........./.l.
  00a0:  32 4b 2f 71 91 aa 4f 1c  f6 9b f7 50 77 bd 24 51   2K/q..O....Pw.$Q
  00b0:  69 43 59 21 db 4f 7a 96  7b 33 cc 70               iCY!.Oz.{3.p


Packet with sequence matching bad bytes:
Code:
------------------------------------------------------------
TS-Packet: 00344025   PID: (Unkown PID), Length: 188 (0x00bc)
from file: /temp/new0ac
------------------------------------------------------------
  0000:  47 02 03 93 01 95 51 66  7b 34 dd 9a 7d 1f cc 6e   G.....Qf{4..}..n
  0010:  91 0a ac b3 e9 f3 7e 83  1b df 4c 36 57 95 48 14   ......~...L6W.H.
  0020:  5f 13 c4 09 ac 83 15 ed  3e 4f 24 24 da 2f d2 16   _.......>O$$./..
  0030:  60 91 8f ab 24 c5 92 b5  c6 72 13 c7 e8 b5 a2 9c   `...$....r......
  0040:  60 a1 99 fe e4 e9 1d cc  42 2a 86 24 ff c7 fc 67   `.......B*.$...g
  0050:  30 20 30 4e ee f7 70 fb  f5 7f 80 46 58 5e e1 94   0 0N..p....FX^..
  0060:  6e c9 91 c3 ff cf c9 f0  29 7d 8e c5 eb 20 d6 43   n.......)}... .C
  0070:  4e 42 ad ff 2b 6d 02 44  d2 9f b8 9b 77 61 a8 7b   NB..+m.D....wa.{
  0080:  c7 69 db a7 b0 4f e9 84  49 dc 0b ba 11 da 02 ff   .i...O..I.......
  0090:  4b 26 a1 5c fc ef ef 8e  63 f3 4f 92 d0 a0 11 ba   K&.\....c.O.....
  00a0:  ee 62 6e 18 c1 10 e1 40  09 d0 80 aa 15 a1 c5 41   .bn....@.......A
  00b0:  2d ae 78 82 6c 53 e5 7a  94 82 40 8c               -.x.lS.z..@.
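
For reference, a minimal sketch of how the first four bytes of these dumps decode, using the standard MPEG-TS header layout: byte 0 is the 0x47 sync byte, the 13-bit PID is the low 5 bits of byte 1 followed by byte 2, and the continuity counter is the low 4 bits of byte 3. The function names are hypothetical.

```shell
# ts_pid BYTE1 BYTE2 (hex digits, no 0x prefix) -> 13-bit PID
ts_pid() {
    printf '0x%x\n' $(( ((0x$1 & 0x1f) << 8) | 0x$2 ))
}

# ts_cc BYTE3 (hex digits, no 0x prefix) -> 4-bit continuity counter
ts_cc() {
    printf '0x%x\n' $(( 0x$1 & 0x0f ))
}

echo "good packet (47 02 00 de): PID=$(ts_pid 02 00) cc=$(ts_cc de)"
echo "bad packet  (47 02 03 93): PID=$(ts_pid 02 03) cc=$(ts_cc 93)"
```

Read this way, the good copy is a PID 0x200 packet with cc=0xe, while the bad copy carries the header of a PID 0x203 packet with cc=0x3 in front of the same payload.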

Re: Continuity Counter Errors

Post by josmi » Fri Mar 08, 2013 2:28 am

Btw, I forgot to mention that the bad packet seems to be located at an arbitrary position in the DMA buffer, not at a boundary. I had added a packet index counter to the syslog output, see below. Count = 204 means packet number 348 - 204 = 144 in the DMA buffer.

Code:
Mar  6 18:27:36 PE2950 kernel: [267060.428630] TS packet counter mismatch. PID=0x203 expected 0xc got 0x3  count = 204
Mar  6 18:27:36 PE2950 kernel: [267060.428633] TS packet counter mismatch. PID=0x200 expected 0xe got 0xf  count = 202
Mar  6 18:27:36 PE2950 kernel: [267060.428636] TS packet counter mismatch. PID=0x203 expected 0x4 got 0xc  count = 201
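
The three messages fit a single overwritten header. The sketch below is a simplified per-PID continuity check, not the dvb-core code: it assumes every packet carries payload and ignores duplicate packets and the discontinuity indicator. The input sequence is a hypothetical reconstruction in which a PID 0x200 packet with cc=0xe arrives with the header of a PID 0x203 packet with cc=0x3.

```shell
# check_cc: reads "PID CC" pairs (one per line, CC in decimal) on stdin and
# reports every packet whose counter is not the previous one plus 1, modulo 16.
check_cc() {
    awk '{
        pid = $1; cc = $2 + 0
        if (pid in last) {
            want = (last[pid] + 1) % 16
            if (cc != want) {
                printf "PID=%s expected 0x%x got 0x%x\n", pid, want, cc
            }
        }
        last[pid] = cc
    }'
}

# Hypothetical order: the third packet should have been "0x200 14".
printf '%s\n' '0x203 11' '0x200 13' '0x203 3' '0x200 15' '0x203 12' | check_cc
```

This prints the same three expected/got pairs as the syslog excerpt: one spurious packet on PID 0x203, one missing packet on PID 0x200, and a second mismatch on PID 0x203 when the real cc=0xc packet arrives.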

Re: Continuity Counter Errors

Post by josmi » Wed Mar 13, 2013 1:46 am

Please let me know if anyone is looking into this.

Re: Continuity Counter Errors

Post by cody » Fri Mar 15, 2013 6:05 pm

Unfortunately, there's nothing more that can be looked into. Your analysis is fine, but the main point is that no such DMA error occurs on any of our test computers, nor have other customers described anything like it. Also, changing the number of transfer packets apparently doesn't help. So I believe what's going on is at the hardware level: either some component on your 6984 board is out of specification and causes this arbitrary loss of data on the board itself, between the TS output of the demodulator and the TS input of the PCI-Express interface, or something causes an incompatibility between the PCIe bus in your system and the PCI-Express interface of the 6984. In either case I think the best next step is to contact us (or your seller) and get a 6985 for tests, as there are design changes between the 6984 and 6985 that will hopefully prevent this from happening in your system.

Re: Continuity Counter Errors

Post by Derrick » Sun Mar 17, 2013 1:09 am

excellent analysis :)

The bytes that differed were bytes at offset 2-7

Byte 0 is always 0x47, and byte 1 on this mux is the same for PID 0x200 and 0x203. Apparently, as you wrote, the 1st byte is swapped. Could you check this with Windows as well?
Derrick
 
Posts: 91
Joined: Sun Jul 04, 2010 6:49 pm

Re: Continuity Counter Errors

Post by josmi » Sun Mar 17, 2013 7:54 am

Derrick wrote: Could you check this with Windows as well?

Possibly, if Windows could be run from a CD or USB drive and I could get some suggestions on SW to use for tuning and capturing full TS.

Re: Continuity Counter Errors

Post by Derrick » Sun Mar 17, 2013 8:47 pm

I made a mistake. It should read that the first 8 bytes are swapped: in PID stream 0x0200 one ts_packet is missing, and in PID stream 0x0203 there is one ts_packet (with the wrong payload) too many.

Could be driver related (buffer management).
