Dataton Forum
Klangreich

Latency on running cues


We have got a strange problem. We have some auxiliary timelines with media and cues in them. Content is graphics, HAP and H.264. The system runs smoothly (play cues run without any delay) except for one problem, which occurs randomly after some time running through the show: we then get about 2 seconds of delay on all outputs when starting a timeline. This latency persists until we restart WATCHPOINT on the display computers. Before we start an auxiliary timeline, we preload it by pressing the stop button.

We thought it was a network issue, so we changed the whole setup from big switches to small dumb switches and replaced the fiber line with Cat5.

 

Display computer's specs:

E5-1630

Firepro W8100 (outputs synced with S400)

SSD Raid

10Gbit/s ethernet on PCIe

1Gbit/s ethernet onboard

(Same issue no matter which network connection I use)

 

Production PC:

Lenovo i7 notebook

 

I have two of these setups running in two separate networks. Both have the same issue from time to time.

Does anyone have an idea about that?


I usually put a half second at the top of my aux timelines to allow load times, and extend the preroll time to allow loading if a cue has a lot of processing in it. Always better to play into something than to start right at it. Try that and see how it goes.


You might want to check your power conservation settings, sounds like something is going to sleep.


Just double checking: is the WiFi adapter on the laptop off? No firewalls? Other clients had similar problems with switches that had IGMP snooping enabled, which delayed the multicast packets sent by WATCHOUT, but that mostly occurs while going online.

 

As jfk pointed out, make sure you are running the laptop in high performance power mode and not in balanced mode. Which version of WATCHOUT are you running? What do you mean by "after some time"? Are we talking minutes, hours, days or weeks?


I can provide some additional context on this show.

 

WiFi is off; firewalls are off. We are controlling the production PC with Christie Widget Designer via TCP/IP on the same . The content in each timeline starts 0.1 seconds in, and we are manually loading (with "halt") each aux timeline several seconds (usually 10 or more) before playing them (with "run").

 

We haven't narrowed down the exact cause, but something is happening that makes the displays suddenly respond after approximately 2 seconds' delay. All displays respond immediately to the TCP-based control (like scrubbing the timeline or jumping around cues), but on a 2-second delay to all UDP-based control (hitting the play button!). Closing WATCHPOINT on the displays and relaunching it gets us back to real-time performance.

 

The delay seems to happen most often after a crash of the production PC, which has happened several times after a lot of editing. But again, I've been unable to narrow down a specific cause, or to tell whether the production PC crash is the cause or another symptom.

 

I can provide support with dump files from the production PCs. There doesn't seem to be anything interesting in the logs. I've never seen anything like this before, but if anyone has any insight as to what the cause may be or what other steps we should take to troubleshoot the cause, I'd be most appreciative. We are doing another show with WATCHOUT next week, and it would be very helpful to be able to explain to production management what's really going on, and how we can avoid it on future productions.

 

Thanks!

 

w.


We are running WATCHOUT 6.1.5

On the managed switches I used first, IGMP snooping was switched off. Then I changed to completely dumb switches, which are not capable of this feature, but it did not change anything concerning this issue.


I'd try it on a Production PC other than a Lenovo. Better yet, try a desktop without 'bloatware' as the Production PC to see if the problem persists.

 

I have an i5 Lenovo, and in the initial weeks of ownership, something from it was interfering when I went online with Watchout. Could be that IGMP snooping that Miro mentioned. In any case, I found that my problem went away after un-installing some of the pre-installed apps that came with the laptop. Can't exactly remember which app(s).

 

Thomas


Is it an option to switch to TCP instead of UDP? In some cases, I've found TCP to be more reliable in "getting through" than UDP. WATCHOUT will do either one, so assuming that Widget Designer can output TCP, that's what I would try.

 

Mike



Thanks, Mike. We were already using TCP control of the production PC from Widget Designer. The production PC was responding immediately to Widget control; the displays were responding to control from the production PC on a delay. Once the issue occurs, the delay is consistent until WP.exe is restarted, and occurs whether the cue was fired indirectly from WD or directly from the production PC.

 

With respect to Klangreich's post from yesterday: he has a completely new show exhibiting the same issue, without WD in the mix yet. After an hour or two of programming and frequent updates, the latency issue pops up and the display PCs follow the production PC's cues on a delay.

 

w.


Interesting to read this. I had a delay of exactly that kind on a show about 10 days ago. It happened two or three times. I didn't follow it up because I had lots of other issues (human errors) with the show, and in the end it went OK. It was 6.1.5 on a rig that is used often. It was the first time I ever saw this kind of thing, and I didn't believe it at first. I think it went away after going offline and then online again.


One more addition: in the show I mentioned, I had a production PC and two display PCs with one output each. What struck me was that the two display PCs were in sync with each other, but the production PC was one or two seconds off. I don't remember which one was ahead.


I have experienced this as well and am working to discover the culprit. If anyone has any relevant info on this "bug", please chime in. This post is over a year old, so I hope someone has been able to shed more light on the problem. Likewise, I will share anything I find. Cheers!


At our office, we are officially referring to this bug as the "UDP latency bug", on the assumption that it is related to the UDP play commands WO uses to launch cue activity on a timeline. We are currently in a heavy test/R&D mode in an effort to create a reliably reproducible case to give to support, and in hopes of learning workarounds. We were able to invoke the bug back at the shop using a very recent show file, and we captured it in action on video. Here is a link to that:

https://www.dropbox.com/s/pybjd45zvinw9h4/IMG_1351.m4v?dl=0

Pardon my excessive rambling over the capture, and thus impeding on the reference to lost A/V sync (hindsight = 20/20).

It took a couple days to get it to invoke and we are still very unsure what exact steps of behavior led to the invocation. So, we are going to continue to test, collect data, etc... in the hope we will have more answers and feedback soon. In the meantime, I am happy to answer any questions anyone may have as to the video at the link above. I am holding off on providing too much information on the above test until I have gained a better understanding of the "why". All in effort to keep information on this issue as relevant as possible, and allow future inquiries for information to result in as efficient, streamlined, and factual of a process as possible.


So, we just discovered this today:

https://www.dropbox.com/s/fksxu98ls771lni/Test 1 Capture.mp4?dl=0

What you are looking at is a WO project that contains 2x 2D Displays (-y area of stage) and 16x virtual displays (+y area of stage). The VDs are named "VD A" - "VD P". There is a single mov file in the project that we use to measure latency. It is a 60fps file that displays the current frame. There is one cue of the file mapped directly to the rightmost 2D display (Display "2"). The other is mapped to "VD A". "VD A" then maps to "VD B", which maps to "VD C", and so on, until "VD P" finally maps to Display "1". As you can see, there are about 7-8 frames of latency between Displays "1" and "2". That would suggest that each VD is adding around 1/2 frame of latency in this particular arrangement.
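As a quick sanity check on the arithmetic above, here is a minimal Python sketch that spreads the observed total evenly across the 16 cascaded VDs. The even split is an assumption made only for illustration; the function name `per_vd_latency` is hypothetical, and nothing here reflects how WATCHOUT actually schedules rendering.

```python
FPS = 60            # frame rate of the test clip, as stated in the post
CHAIN_LENGTH = 16   # "VD A" through "VD P"


def per_vd_latency(total_frames, chain_length=CHAIN_LENGTH, fps=FPS):
    """Return (frames, milliseconds) of delay per VD, assuming an even split."""
    frames_each = total_frames / chain_length
    ms_each = frames_each / fps * 1000.0
    return frames_each, ms_each


if __name__ == "__main__":
    for observed in (7, 8):  # the 7-8 frames seen between Displays "1" and "2"
        frames_each, ms_each = per_vd_latency(observed)
        print(f"{observed} frames total: {frames_each:.4f} frames "
              f"({ms_each:.2f} ms) per VD")
```

At 60fps one frame lasts about 16.7 ms, so 7-8 frames across the chain is on the order of 120-130 ms end to end.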

What I find incredibly fascinating is that by simply changing the arrangement of 2D Displays and VDs on the stage, I get completely different results - both in the way Watchmaker behaves and the amount of latency between Displays "1" and "2" (2-3 frames), as seen in the 2nd test here:

https://www.dropbox.com/s/kfsimgefz76nil4/Test 2 Capture.mp4?dl=0

To me, that would suggest the order in which the stage data is sampled in the application code (eg. L>R, T>B) impacts the result of latency. But I am just grasping at straws here not knowing exactly how everything works "under the hood". Suffice to say, this discovery has me re-thinking our current strategies and usage of VDs. Until now, we had found tremendously powerful and unique uses for them. Usage that provides very dynamic and randomly accessible manners to cue a show ; usage that allowed us to build convenient "virtual multi viewers" for other client and crew resources ; usage that allowed us to be incredibly efficient and effective in creating mappings that could be used like "presets", which also mitigated risk of programmatic errors. It would be great to get a response from Dataton here, so we can gain an understanding of how cascading VD mappings can impact things like latency. Sort of a vague open-ended invitation I know, but if we understood more of what is happening "under the hood", that would help influence our thinking when inventing creative ways to use these tools.

Anyway, circling this back to the topic: the creation of the above 2 show files resulted in my inability to get the latency bug to rear its ugly head again over the last 2 days. The reason I started exploring VD mappings as a potential contributor is that I realized at one point that all the shows in which we have encountered the latency issue this topic is about were shows in which we were using VDs in a similar fashion. Don't get me wrong: we would never cascade a cue through 16 VDs before reaching its final display destination. However, we have had cases where we would map a cue through 2 VDs before its end point. And there is a common thread here: latency. So without knowing more "under the hood" information, I think it is entirely plausible that what is seen in the projects above is related to the latency bug this topic addresses.

My research will continue. I will post more findings as I come across them. For anyone interested in exploring the projects above, here is a link to the entire package (including the captures of the tests):

https://www.dropbox.com/sh/ktb3rk6f8q9029p/AABHjpQWh2UGz4D_TRwobIiWa?dl=0 

 


Although virtual displays were not mentioned in the original post of this thread, I am commenting on them here.

If you read the WATCHOUT manual, you will see that virtual displays were originally meant mainly for two usages:

  • Management of LED wall modules
  • Dynamic texturing of 3d objects

Using them for other things, especially when using virtual display media on other virtual displays, may not yield the result you are expecting.

When using a virtual display media on other virtual displays, you get one frame of delay on the result. This is due to how the virtual displays work internally. Think of virtual displays as cameras, with displays connected to them. Suppose you have two cameras: camera 1 films some moving object, and camera 2 films the display showing the image from camera 1. If you then view the images from the two cameras on two displays next to each other, you will see that the image from camera 2 is delayed compared to the image from camera 1. The attached image is a sketch of a top-down view of this scenario. If you keep adding a camera 3 filming display 2, a camera 4 filming display 3, and so on, each display will get a more and more delayed image. Virtual displays work the same way.
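The camera analogy can be modelled in a few lines of Python. This is only a toy sketch under the stated assumption that every hop in the chain costs exactly one frame; the function name `simulate_chain` and the list-based "displays" are illustrative and are not part of WATCHOUT.

```python
def simulate_chain(num_vds, num_frames):
    """Toy model: VD 0 shows the live media, VD i shows what VD i-1 showed
    one frame earlier (the 'camera filming a display' analogy)."""
    state = [None] * num_vds   # frame number visible on each VD (None = blank)
    history = []
    for frame in range(num_frames):
        previous = list(state)         # what every VD showed last frame
        state[0] = frame               # first VD films the moving object directly
        for i in range(1, num_vds):
            state[i] = previous[i - 1] # later VDs film the previous display
        history.append(list(state))
    return history


if __name__ == "__main__":
    final = simulate_chain(num_vds=4, num_frames=10)[-1]
    for index, shown in enumerate(final):
        print(f"VD {index}: showing frame {shown}, {9 - shown} frame(s) behind")
```

In this model, the delay grows by one frame per cascaded VD once the chain has filled up.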

I hope this explanation makes sense.

As a side note, virtual displays are quite heavy to render, each of them requiring a separate render loop. Therefore, having too many of them may have an impact on your system's performance.

vd.png


Wouldn't the virtual display rendering delay be dependent on the order of render target processing? I.e., whether a particular virtual display is rendered before or after another one? If virtual display A comes before virtual display B in the rendering sequence, A would have been updated before its content is potentially rendered to B, resulting in no delay between the two. However, if A comes after B in the rendering sequence, there will be 1 WO frame of delay between the two. Or am I missing something here?

Assuming things work as I think they do here, what's needed would be some way to control the rendering sequence order. Then this would be more predictable. Perhaps Justin's idea of sorting them top/bottom and left/right would be a good starting point. Then one could put virtual displays at negative Y coordinates, in the order one prefers them to render.
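To make the render-order idea concrete, here is a small Python sketch of that hypothesis. It assumes each VD samples its source's texture at the moment the VD itself is rendered, so a source rendered earlier in the same frame adds no delay and a source rendered later adds one frame. This is a model of the guess above, not of WATCHOUT's actual renderer, and `chain_delay` is a hypothetical name.

```python
def chain_delay(render_order, sources, num_frames=20):
    """Return how many frames each VD lags behind the live media.

    render_order: VD names in the order they are rendered within a frame.
    sources: maps each VD name to the VD it samples, or None for live media.
    """
    texture = {name: None for name in render_order}  # last rendered frame number
    for frame in range(num_frames):
        for name in render_order:
            src = sources[name]
            # Live media is always current; a VD source may still hold
            # last frame's image if it has not been rendered yet this frame.
            texture[name] = frame if src is None else texture[src]
    last = num_frames - 1
    return {name: last - texture[name] for name in render_order}


if __name__ == "__main__":
    sources = {"A": None, "B": "A", "C": "B"}       # media -> A -> B -> C
    print(chain_delay(["A", "B", "C"], sources))   # sources rendered first
    print(chain_delay(["C", "B", "A"], sources))   # sources rendered last
```

In this model a chain rendered in dependency order adds no delay at all, the reverse order adds one frame per hop, and a mixed order lands somewhere in between.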

My 2c anyway...

Mike - http://pixilab.se


Another thought - perhaps WO should ultimately add a feature that subscribes to the popular industry term "slicing"? That is essentially what I end up using VDs for quite a bit. I totally understand there was a particular usage Dataton had in mind for VDs, and that users like myself creating alternate usages for them may not be what was intended. Many media servers have a slicing feature where latency is negligible. I will go ahead and make an official request in the requests topic.


In case the use of virtual displays impacts the issue we're chasing, I'll describe our setup on the show that started this topic.

We used three virtual displays to enforce a compositing order with three different kinds of "Always on top" auxiliary timelines:

  1. Content
  2. Logos
  3. Masking for moving screens

Each kind played in their own dedicated space on stage. The media cues for the VDs were on the main timeline and output to projectors: two straight-up, two rotated with geometry correction and pixel density adjustment. The stacking order of the VD media cues composited content on the bottom, logos in the middle, and screen masking on top.

We had a fourth virtual display on that project, too. We had a performance group that projected the distortion-corrected output from an overhead projector seen through a live input fisheye camera onto the main screen. Because WO does not offer geometry correction on virtual displays, we used that virtual display on a 3D object — a simple subdivided plane with UVs tweaked to grid as seen through the fisheye — to rectify the distortion.

As a programmer, I knew I was not using virtual displays as designed, but I loved the elegance and flexibility of this setup. The compositing VDs gave me the flexibility to run any cue at any time without having to track or manage state, and the 3D object VD allowed me to keep WATCHOUT on the show instead of switching to a more complex server product that would allow geometry correction on input, not output.

w.


Ever thought of using the stacking order? Not many people probably use this but this could maybe eliminate the need for the three content VDs.

5 minutes ago, RBeddig said:

Ever thought of using the stacking order? Not many people probably use this but this could maybe eliminate the need for the three content VDs.

Yes, but then I couldn't run any cue at any time, because a lower cue couldn't overtake a higher cue. I needed to combine features of "Always on top" and "Task list order," so I used VDs to enforce the compositing order my show required.

w.


Thanks for the sample! Now I get what you were describing -- using the Z position to enforce a compositing order. I really liked being able to see under the screen masking with my compositing VD setup, but this is a good workflow for a lot of other shows!

And my apologies for going OT. I think that virtual displays are a flexible, important tool, but if my use of them contributed in any way to the cueing latency bug, I thought it might be good information for JJ and the developers to know.

w.


Hey Walter. Many thanks for the information on your set up. Certainly helps on our journey regarding unveiling answers to this bug. Our next efforts are going to be full show reconstruction that completely eliminates the use of VDs. Our thought is if we go a week of continual usage in a non-VD show-like environment and do not invoke the bug, it gives us great plausibility that VDs are a factor. And then we perhaps take it a step further - single mapping instances of VDs. And so on.... Hey - not much more we can do when you are dealing with a highly consequential, hard-to-reproduce type of bug without having the benefit of extensive knowledge of what is occurring under the hood.
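For a rough feel of how much a bug-free week would tell us, here is a back-of-the-envelope Python sketch. It assumes, purely for illustration, that the bug shows up at random as a Poisson process with a mean interval of about 2 days (loosely based on the "couple days" it took to invoke it at the shop). Both the model and the numbers are assumptions, and `prob_no_occurrence` is a hypothetical helper.

```python
import math


def prob_no_occurrence(days_observed, mean_days_between):
    """Chance of seeing zero occurrences in the observation window,
    assuming occurrences follow a Poisson process (an assumption)."""
    return math.exp(-days_observed / mean_days_between)


if __name__ == "__main__":
    # Assumed figures: 7 days of testing, bug every ~2 days on average.
    p = prob_no_occurrence(days_observed=7, mean_days_between=2)
    print(f"Chance of a clean week if the bug were still present: {p:.1%}")
```

Under those assumptions a clean week would happen by chance only about 3% of the time, so it would be reasonably strong (though not conclusive) evidence that removing VDs made a difference.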
