Need hardware/software configuration advice in light of test results

Claude.Rivet · September 29, 2014

Good day,

I just built two Watchout display machines using the knowledge found on these forums, so I have come back here to get your input on the fine tuning.

I ran some tests on both machines and some results are puzzling. The machines are:

-X99 chipset

-i7 5960x

-AMD Firepro 9100

-AMD S400 sync module

-Datapath VisionSDI2

-16GB DDR4 ram

-128GB Samsung EVO ssd for system

-512GB Samsung XP941 M2 ssd for shows

-Encrypted WD 250GB HD used solely by Acronis with b/u images

-Customized win7 64 installation (many features removed, services disabled and of course all settings configured according to the manual and past experience with building WO stations)

My goal was to be able to run 24 MPEG-2 streams at 40 Mbps concurrently. Done deal, but according to my maths I should be able to run even more, so I started Intel XTU and set the graph to show the last hour of results. I started the display and ran some content.

I will spare you the details, but essentially: when running only videos, I was able to mix 12 videos into 12 other videos with no hiccup. When including a live camera feed through the VisionSDI2, I was able to run the same content, but during the mix the live feed would stutter and lag badly. To run the live feed and videos and mix without any issue whatsoever, I could only mix 3 videos with 3 videos. I haven't tested many file formats and I am sure they were not optimal, but I had to figure out where my bottleneck was or what part of my system was not performing well.
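The "maths" mentioned above is not spelled out in the post. A minimal sketch of the raw media bitrate arithmetic might look like this, assuming constant-bitrate streams; the function name is hypothetical, and decode and compositing cost are not modelled at all.

```python
def aggregate_read_rate(streams, mbps_per_stream):
    """Return (total megabits/s, total megabytes/s) for constant-bitrate streams."""
    total_mbps = streams * mbps_per_stream
    total_mbytes = total_mbps / 8  # 8 bits per byte
    return total_mbps, total_mbytes


# Figures from the post: 24 MPEG-2 streams at 40 Mbps each
# (for example 12 videos mixing into 12 others).
print(aggregate_read_rate(24, 40))  # (960, 120.0)
```

On paper, 120 MB/s is a modest read rate for an SSD, which would be consistent with the bottleneck lying elsewhere, but this sketch cannot show where.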

According to Intel XTU my processor never got used by more than 30%, spent most of its time with only 4 out of 8 cores running, and the most RAM that got used was barely above 4GB... weird, no?

Other than that, I realized that my processor speed was all over the place, and that got me thinking: could SpeedStep and Turbo Boost actually be detrimental to WO? Since the processor keeps adjusting itself as the cues are sent, maybe the stutter and lag come from this adjustment?

I am not able to test right now, but in the meantime can you guys share your knowledge regarding hardware and software settings that could help WO make the most out of these machines?

thanks much and sorry for the long post

regards

Jonas Dannert · September 30, 2014

-X99 chipset

Motherboard?

Last BIOS installed?

-i7 5960x

-AMD Firepro 9100

-AMD S400 sync module

-Datapath VisionSDI2

-16GB DDR4 ram

It's a long shot, but I wonder whether 32GB of RAM would help, since the graphics card has 16GB itself?

-128GB Samsung EVO ssd for system

-512GB Samsung XP941 M2 ssd for shows

-Encrypted WD 250GB HD used solely by Acronis with b/u images

Sounds like a more than capable system, from the hardware specs point of view.

-Customized win7 64 installation (many features removed, services disabled and of course all settings configured according to the manual and past experience with building WO stations)

I would recommend to use the Tweaklist as the starting point:

http://dataton.com/forum/topic/288-mpeg-settings/page__view__findpost__p__2369

My goal was to be able to run 24 MPEG-2 streams at 40 Mbps concurrently. Done deal, but according to my maths I should be able to run even more, so I started Intel XTU and set the graph to show the last hour of results. I started the display and ran some content. I will spare you the details, but essentially: when running only videos, I was able to mix 12 videos into 12 other videos with no hiccup. When including a live camera feed through the VisionSDI2, I was able to run the same content, but during the mix the live feed would stutter and lag badly,

What is your source signal (or signals) here?

Progressive or interlaced?

Slot used on the motherboard?

Capture settings in WATCHOUT?

to be able to run the live feed and videos and mix without any issue whatsoever I could mix 3 videos with 3 videos.

I haven't tested many file formats and I am sure they were not optimal, but I had to figure out where my bottleneck was or what part of my system was not performing well.

According to Intel XTU my processor never got used by more than 30%, spent most of its time with only 4 out of 8 cores running, and the most RAM that got used was barely above 4GB... weird, no?

In theory you have a capable system, but in practice you'll need to find the bottleneck of the system.

4GB of RAM is not surprising: WATCHOUT is a 32-bit application, so it cannot use more than that for itself.

But the capture card and graphics card drivers (64-bit) can make use of more memory on a 64-bit OS.

Other than that, I realized that my processor speed was all over the place, and that got me thinking: could SpeedStep and Turbo Boost actually be detrimental to WO?

No.

Since the processor keeps adjusting itself as the cues are sent, maybe the stutter and lag come from this adjustment?

I don't think so, but if you do, maybe turn that off? If possible, that is.

geogen · September 30, 2014

I had the same issue. You need to keep the input sources open all the time when mixing the videos; what I mean is, don't start an input source within the time frame of the video mix.

Secondly, in my opinion 24 different videos is too much even for your system, because you have only 16 threads on 8 cores. That means that when you are mixing the videos together with a live input, the system runs 24 tasks across 16 threads, plus 1-2 tasks for the live input. So the live input takes at least one core, and you are left with 7 cores for mixing the videos.

I have WO with a 5930K CPU, which has 6 cores and 12 threads, and an MSI 7970 graphics card with 6 outputs, and I was able to mix 14 videos with 2 live SDI inputs without lag. The clips were encoded as WMV at 15 Mbps.
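The task-versus-thread reasoning in this reply can be put into numbers. The sketch below is only an illustration of that rule of thumb: it assumes one task per video stream and one per live input, which is a simplification (a real decoder may use several threads per stream), and the function name is hypothetical.

```python
def tasks_per_thread(video_streams, live_inputs, hw_threads):
    """Ratio of concurrent decode/capture tasks to hardware threads.

    Assumes one task per video and one per live input (a simplification).
    """
    return (video_streams + live_inputs) / hw_threads


# 5960X from the first post: 16 threads, 24 videos + 1 live feed.
print(round(tasks_per_thread(24, 1, 16), 2))  # 1.56
# 5930K from this reply: 12 threads, 14 videos + 2 live SDI inputs.
print(round(tasks_per_thread(14, 2, 12), 2))  # 1.33
```

By this crude measure the working 5930K setup is somewhat less oversubscribed than the stuttering 5960X setup, though codec and bitrate also differ between the two.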

Jonas Dannert · September 30, 2014

Also, keep in mind that looping a video clip in WATCHOUT means it plays twice, i.e. consumes more resources.

/jonas

jonasf · September 22, 2016

Hello, I know this might be a dead thread, but since the topic/issue is the same as mine, I have a few questions.

I've just finished a show where I had 26 video layers of different resolutions spread across 4 displays, using a composition in Watchout 6.1.3. I initially tried MPEG-2 and H264, but neither even allowed me to play the clips, as both the display and production machines would crash every time I played the 26-clip cue. I then used WMV at a 10 Mbps bitrate, which surprisingly let me play the chunk free running and looping with no problem.

I'm fine with that until I crossfade the 26-layer comp with another 26-layer comp. I get a few seconds of delay and unsynced video before they all catch up and play in sync again. The question is, am I pushing the software with my 26 layers? I tried playing 4-layer comps, but I still get the same syncing problem every time I crossfade from one comp to the next. Any advice on how to optimize my show?

PC Specs:

Windows 7 Pro

Watchout 6.1.3

6-core processor

amd firepro w7100

S400 sync card

128 ssd Samsung evo Raid-0

Ram 16GB

Tweak list and codec guidelines were strictly followed.

PS. I also have about 15 points of geometry correction on each display screen.

Any advice or harsh comments are very much appreciated.

Cheers

jfk · September 22, 2016

Hello, I know this might be a dead thread, but since the topic/issue is the same as mine, I have a few questions. I've just finished a show where I had 26 video layers of different resolutions

26 movie files at the same time is a lot of video clips for just 4 displays.

Are you downscaling the videos in WATCHOUT or are they just really, really small videos?

(Downscaling video in WATCHOUT is a HUGE resource waste).

spread across 4 displays, using a composition in Watchout 6.1.3. I initially tried MPEG-2 and H264, but neither even allowed me to play the clips, as both the display and production machines would crash every time I played the 26-clip cue. I then used WMV at a 10 Mbps bitrate, which surprisingly let me play the chunk free running and looping with no problem. I'm fine with that until I crossfade the 26-layer comp with another 26-layer comp.

I get a few seconds of delay and unsynced video before they all catch up and play in sync again. The question is, am I pushing the software with my 26 layers?

No, you are not pushing the software.

Yes, you are pushing the hardware.

System throughput is the most likely bottleneck.

Also, double check to make certain that B-frames were not used in your encoding.

Use of B-frames in movie encoding can reduce system efficiency by as much as 75%!

Furthermore, B-frames are known to cause crashes when the system is pushed too far.

If wmv improved performance,

you may see even greater performance benefit with standard HAP encoding.

I tried playing 4-layer comps, but I still get the same syncing problem every time I crossfade from one comp to the next. Any advice on how to optimize my show?

PC Specs:

Windows 7 Pro

Watchout 6.1.3

6-core processor

amd firepro w7100

S400 sync card

128 ssd Samsung evo Raid-0

Ram 16GB

Tweak list and codec guidelines were strictly followed.

Another key area not clear from your description is memory throughput.

Quad-channel memory, and the faster the better, is important for what you are attempting.

The more common dual-channel memory is unlikely to be up to this task.

PS. I also have about 15 points of geometry correction on each display screen.

Any advice or harsh comments are very much appreciated.

Cheers

Geometry correction has little impact on resources needed for movie playback.

Since it is a fixed transform executed in the GPU, it has no negative impact on

the critical areas for movie playback - disc drive throughput, memory throughput, and CPU capacity.

ken · September 23, 2016

Hello Jonas

Thank you for uploading the show for my examination. Nice show, good graphics, lots of work.

There are lots of videos. When you crossfade two compositions, you have 52 videos running at the same time! The WMVs are running well. You also have a large composition, 4826x2524, encompassing all 4 displays with borders. The pixel count is almost 50% more than if you use 4 compositions of 1920x1080 each.

Since the 4 displays are separated, I suggest you use 1 composition for each display; then you would have fewer videos in each composition and smaller compositions. This workload will be lighter for the hardware system.
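The "almost 50% more" figure can be checked from the two sizes given above. This is just the arithmetic, using a hypothetical helper name:

```python
def pixels(width, height):
    """Pixel count of a rectangular composition."""
    return width * height


big_comp = pixels(4826, 2524)      # the single large composition
four_hd = 4 * pixels(1920, 1080)   # four separate 1920x1080 compositions
extra = big_comp / four_hd - 1     # fractional overhead of the large one

print(big_comp, four_hd, f"{extra:.0%}")  # 12180824 8294400 47%
```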

Best

jonasf · September 26, 2016

Hey, thanks Ken. Understood on the crossfade, it makes sense. I actually made the crossfade as a workaround for the 2-second lags, because the clips were actually meant to be played continuously. But since I am getting sync issues with the 26 videos, I crossfade them so that the next 26 videos get a chance to pre-roll and sync before they fade in as the next cue.

As for the composition distribution, separating them into 4 comps, one for each display, crossed my mind, but I thought that since the 4 displays are coming from 1 graphics card/CPU it wouldn't matter. I should have tried that earlier.

With this, do I conclude that I should avoid running 26 videos next time? If so, with my maxed hardware specs, what would be the optimum number of videos to play with? I've also just ordered a WATCHPAX and will run the same show on it to see if there is any difference, just to eliminate some questions in my head.

Thanks again.

jonasf · September 26, 2016

Thanks for the kind reply. I rendered my clips pixel for pixel to optimize file size, since I knew I'd be dealing with multiple layers. Video sizes range from 216x216 pixels (smallest) up to 216x916 pixels (largest). All clips have different sizes. I had to use this method because of the setup we have.

B-frames: I am using Adobe Media Encoder straight from an After Effects export. I didn't come across any option to turn off or untick the use of B-frames. Can you point me to where or how to disable B-frames?

Memory: I am on a 32GB dual-channel setup. I will make it quad-channel ASAP.

Geometry: thanks for clarifying. Next time I shall use 1 video and just do the point-to-point geometry correction, as opposed to the 26 video files.

Finally, does using Windows 10 on my production PC to communicate with the Windows 7 display setup affect the syncing of video layers? I have had no issues controlling the show with the Windows 10 production PC so far, but if it could affect anything, I shall downgrade my control laptop to Win 7 as well.

Thanks so much! I hope this also helps some people having the same scenario.

Miro · September 26, 2016

Hi,

Most video tools, including Adobe Media Encoder, use codec settings for mainstream usage, which nowadays are optimized to deliver minimal size with the best possible quality. This is great when streaming over the internet, but it often makes the decoding process more hardware intensive. In WATCHOUT, the best decoding performance is more important than the smallest file size for a certain quality. To really optimize your videos you will need to use more hardcore tools like FFmpeg, which is the most flexible, fastest and free video tool available. It can convert to any format WATCHOUT uses, like HAP, H264, MPEG-2, ProRes, still image sequences etc.

As jfk mentioned, getting rid of B-frames is one good step on the way. B-frames are great for reducing size, but they require more processing, since each B-frame needs bi-directional references to neighbouring reference frames (I- or P-frames). Making the GOP (group of pictures, the distance between I-frames) smaller may also help to improve performance, and it will make seeking faster. It's common for some codecs to use GOP sizes around 300 frames, but in WATCHOUT I would recommend a size around 30, or even smaller if you need to do a lot of seeking/jumping. For more complex codecs like H264, it's also possible to disable all post-processing and in-loop (deblocking) filtering to gain some performance.
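To see why a smaller GOP helps seeking, here is a minimal sketch of the arithmetic. It assumes 30 fps (an assumption, not a figure from this post) and no B-frames, so that reaching a frame means decoding every frame from the preceding I-frame; the function names are hypothetical.

```python
def keyframe_interval_s(gop_size, fps):
    """Seconds between consecutive I-frames."""
    return gop_size / fps


def worst_case_frames_to_decode(gop_size):
    """Frames decoded to show the last frame of a GOP after a seek (no B-frames)."""
    return gop_size


FPS = 30  # assumed frame rate
for gop in (300, 30):
    print(gop, keyframe_interval_s(gop, FPS), worst_case_frames_to_decode(gop))
# 300 10.0 300
# 30 1.0 30
```

With a GOP of 300, a jump can land up to 10 seconds past the nearest I-frame; with a GOP of 30 the worst case is one second's worth of frames.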

You can download an FFmpeg build for Windows here. The easiest is to use a static build, because then you only need the "ffmpeg.exe" file. FFmpeg is a command-line tool, which means that you either have to use the terminal/cmd or write a bat file. A bat file is basically a text file with a ".bat" extension which will be executed when double-clicked. I created this one for another customer for conversion to H264: https://www.dropbox.com/s/phkf16ialcm4ynn/convert.zip?dl=0

You can use the same one, but modify the bat file in a text editor so it suits your needs, to something like this:

set bitrate=20M

ffmpeg.exe -i "input.any" -c:a copy -c:v mpeg2video -g 30 -bf 0 -b:v %bitrate% -minrate %bitrate% -maxrate %bitrate% -bufsize %bitrate% -y "output.mpg"

pause

Change the input to the path and name of your input file. Easiest is to place the input file in the same directory. Also set the output name to something useful. You will need to modify the bit-rate of 20 Mbit/sec to something that matches your resolution.

The -bf flag is set to zero, which means zero B-frames. The -g flag is the GOP size. Since MPEG-2 is a very simple codec, it doesn't have that many options. You can also set the pixel format if needed. The default is 4:2:0, but in some cases 4:2:2 is needed. This is done by inserting "-pix_fmt yuv422p" before the "-b:v" flag.

Or if you prefer HAP_Q, the process is very simple too.

ffmpeg.exe -i "input.any" -an -c:v hap -format hap_q -chunks 8 -y "output.mov"

pause

Here -chunks sets how many chunks each frame is split into, which is the number of threads that can be used in decoding, and -an disables audio.
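With 26 clips to convert, editing the bat file by hand for each one gets tedious. The sketch below is one possible way to batch the two commands above from Python. It uses only the flags already shown in this post; the function names, the "*.mov" input pattern and the folder layout are assumptions for illustration, and it expects ffmpeg to be on the PATH.

```python
import subprocess
from pathlib import Path


def mpeg2_args(src, dst, bitrate="20M", gop=30):
    """MPEG-2, constant bitrate, no B-frames (same flags as the bat file)."""
    return ["ffmpeg", "-i", str(src), "-c:a", "copy", "-c:v", "mpeg2video",
            "-g", str(gop), "-bf", "0", "-b:v", bitrate, "-minrate", bitrate,
            "-maxrate", bitrate, "-bufsize", bitrate, "-y", str(dst)]


def hap_q_args(src, dst, chunks=8):
    """HAP Q without audio (same flags as the HAP command)."""
    return ["ffmpeg", "-i", str(src), "-an", "-c:v", "hap", "-format", "hap_q",
            "-chunks", str(chunks), "-y", str(dst)]


def convert_folder(src_dir, out_dir, build_args, suffix, run=False):
    """Build one command per *.mov in src_dir; execute them only if run=True."""
    commands = []
    for src in sorted(Path(src_dir).glob("*.mov")):  # assumed input pattern
        dst = Path(out_dir) / (src.stem + suffix)
        cmd = build_args(src, dst)
        commands.append(cmd)
        if run:
            Path(out_dir).mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)  # raises if ffmpeg fails
    return commands


# Dry run: print the commands instead of executing them.
for cmd in convert_folder("clips", "converted", mpeg2_args, ".mpg"):
    print(" ".join(cmd))
```

Leaving run=False by default makes it easy to inspect the generated commands before committing to a long conversion.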

Regarding Windows tweaking, make sure you use the "High performance" power plan, and if you are using Intel RAID, install the latest Intel Rapid Storage drivers.

Best Regards,

Miro

jonasf · September 30, 2016

Hi,

After 3 days of intensive tests, I will conclude for now that 26 mp4 videos stress the hardware so much that WATCHOUT drops the connection, restarts randomly and occasionally freezes frames. 26 WMVs played well at 30 fps, 10,000 kbps CBR, at different resolutions.

My specs:

Intel Core i7 6800K @ 3.4GHz

Quad Channel DDR4 64GB Ram

AMD FirePro W7100

X99 Board

S400 Sync Card

128GB Raid 0 SSD Samsung Pro

Tweak List followed

Windows 7 Pro Installed on Display Servers

Windows 10 Installed on Production PC

Setup 1 (Built-up hardware)

1 to 1 file per display: both mp4 and wmv played fine, looping and free running, for a few hours with no jerks etc.

26 files mp4 - unstable, lost network connection, freezes

26 files wmv - played looping and free running. Jerks during transitions, but catches up, syncs and plays the loop with no problem

Setup 2 (Watchpax)

1 to 1 file per display (Same result as above)

26 files mp4 - plays the files, but some video layers freeze and need to be refreshed (CTRL D) to make all the layers pop up and play.

26 files wmv - (Same result as above)

Note: mp4s with and without B-frames were tested and gave the same results. Tested on WO 5.5.2, 6.1.3 and 6.1.4b2 (all the same results). 50 points of geometry correction on each display were no issue; I tested with and without geometry correction, with the same results.

For now, I will limit the video layers to fewer than 10 to ensure smooth playback. I still stand to be corrected.

Thanks for your inputs. Cheers.

Miro · September 30, 2016

Just to clarify: mp4 is a container that can contain different video streams encoded with codecs like H265, H264, H263/MPEG-4, etc. Decoding H264/AVC on a CPU is usually a bit heavier than decoding H263, since H264 is a more complex codec (better compression). What are the resolution and frame rate of the 26 video streams? Is the quality the same using wmv compared to H264?

Also, we don't supply any wmv codecs, so this depends on what's installed on your system.
