New GPU vs. new CPU



smitbret
09-24-2019, 01:31 AM
My AMD 8350 is getting a little long in the tooth and I was thinking of upgrading to a new Ryzen setup with a 3600 or better. That would mean a new CPU, motherboard, memory and an OS reinstallation, which seemed a little expensive and troublesome, so I am also considering just upgrading the GPU (currently a GT 730). I was leaning toward an nVidia GTX 1650 since it supports NVENC HEVC encoding and decoding... or maybe a GTX 1660 for the HEVC B-frame support.

I am basing my choices off of nVidia's Video Encode and Decode GPU Support Matrix:
https://developer.nvidia.com/video-encode-decode-gpu-support-matrix

Unfortunately, I don't know much about the inner workings of FFMPEG decode/encode so I am looking for clarification. Maybe HEVC B-frame support is a non-issue?

Also, my impression is that VP9 encoding would still be limited by the CPU so remote streaming would be a problem, correct? Currently, I am only getting about 9fps (1080p) with my AMD 8350 so it limits how much I stream out of home.

Any suggestions or tips would be helpful.

Peter
09-24-2019, 09:17 AM
Currently Mezzmo only supports hardware encoding for H.264, so hardware encoding of HEVC and VP9 is not supported. We may support these in the future, so having GPU encoding support for them would be a good idea, and it is also useful with Handbrake or other encoding software.

mrgenie
09-25-2019, 09:56 PM
It is always in the eye of the observer, of course, but I tested NVIDIA encoding and I'd argue the image quality isn't the best.

There is a noticeable DIFFERENCE in quality between the various hardware encoders!

Go on the internet and find out if it's acceptable to you.

I know 99.99% of users ignore quality and just care about "performance", and indeed one encoder or the other might be faster.

But you should really compare the quality of the image. Personally I think that's more important!

Especially with movies/images with low contrast, like dark scenes, the differences can be HUGE!

NVENC performs better when used for pre-transcoding, thanks to the multiple passes.

Quicksync has lower-latency encoding and its deblocking filter is IMHO slightly faster, so for streaming you'd rather use Quicksync from Intel.

For Handbrake you would mostly prefer NVIDIA.

But it's NOT a day-and-night difference, especially since the most recent Quicksync implementations are MUCH better than the older ones that are usually used for comparison.

So if you only care about "remote streaming" I'd rather go with Intel.

If you use Handbrake and want to make the most of it, I'd go with NVIDIA.

But don't take my word for it, this is really a "personal impression", so compare for yourself. Google is your friend here.

But remember, a lot of the older posts you find with Google compare older standards.

Filter out all results older than 12 months!

mrgenie
09-25-2019, 09:58 PM
Currently Mezzmo only supports encoding h264 using hardware encoding so HEVC and VP9 are not supported. We may support these in the future so having encoding support for these in the GPU would be a good idea or for use with Handbrake or other encoding software.

I don't get it. For streaming, Quicksync is definitely the faster option for various reasons.

Why would you have users use something else?

The question of VP9 or HEVC is not important for Mezzmo, as Mezzmo's task is STREAMING.

Why would you want a user to do that on an NVIDIA card? It doesn't make sense to me.

jbinkley60
09-26-2019, 10:54 AM
Here are some results from testing I did a while back with IQS, nVidia and CPU transcoding.

http://www.thebinks.com/jeff/transcoding_results.html

This testing was done using Mezzmo, various clients and different formats. For web the browser makes a difference since certain browsers support certain video file formats. I am in the process of upgrading one of my computers to a Ryzen 7 3700X. I might be able to do some testing with it in the future. I also have an nVidia 1060 card now in my Mezzmo server in case anyone wants a specific test.

smitbret
09-27-2019, 08:28 AM
I stay with AMD for my server because I can use ECC memory with Asus motherboards and not have to step up to server-grade motherboards like I would with Intel.

I know that NVENC encoding is inferior to CPU encoding. I remember experimenting with CUDA for transcode/encode duty and that was garbage. NVENC is good enough so additional PQ is way down the list of priorities since most of the time I will be streaming to a cell phone or tablet and good enough will be good enough. In fact, the only time I will probably use transcoding is when I am remote streaming and it seems like there's no good way to escape VP9. For now, I just need to play around and see if I can get Mezzmo/FFMPEG to use more than 30% of my CPU and GPU during transcoding. That's the most frustrating part.

smitbret
10-09-2019, 06:37 AM
So, I have been messing around with my current setup, in lieu of future hardware changes, and have a few questions about the transcoding setup. When I am transcoding with Mezzmo, my CPU usage rarely climbs above 30% (AMD FX-8350). It is an 8-core CPU (4/4 I know, I know) so that math says that it is using 3 or 4 cores at most. Is there something I can do to the FFMPEG file to get that usage up closer to 80% or so particularly towards VP9 transcoding? Is FFMPEG limiting how many cores to activate and can it be tweaked?
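For anyone wanting to check the "how many cores is that" math, here is a minimal sketch. It assumes the usage figure is the overall percentage across all logical cores (as Task Manager reports it); the function name is made up for illustration.

```python
def busy_core_equivalents(cpu_percent, logical_cores):
    """Convert an overall CPU usage percentage into 'fully busy core' equivalents.

    Assumption: cpu_percent is averaged over all logical cores.
    """
    return cpu_percent / 100.0 * logical_cores


# 30% overall usage on an 8-core FX-8350
print(f"{busy_core_equivalents(30, 8):.1f} cores' worth of work")
```

The scheduler may spread that work over more than that many cores, so this only says how much total work is being done, not which cores are doing it.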

Peter
10-09-2019, 10:00 AM
The ffmpeg command-line parameters are set to allow auto-configuration of the number of threads to use when transcoding, and this affects the number of cores used. Using extra cores does not necessarily mean faster transcoding because the bottleneck can be elsewhere, such as disk read/write speed. If the media files are stored on a NAS, this can also affect transcoding speed.

smitbret
10-09-2019, 01:12 PM
I can peg the CPU when it transcodes to h264 (plenty of fps) so it's not an HDD bottleneck, but if it is transcoding to VP9 for web streaming it just crawls at 33-34%.

smitbret
10-10-2019, 12:05 AM
I recently bought AMD E3 V2 and GeForce GTX 1050 Ti but the graphic card is not working on that machine. Is this card not compatible with that machine?

What is an AMD E3 V2?

smitbret
10-18-2019, 10:36 AM
Peter, I looked on Reddit for some help on my hardware roadmap and someone with a lot more knowledge on video encoding mentioned that libvpx has been updated to support 8 threads instead of just 4 when encoding to VP9. Is this something that can be updated by getting a newer version of FFMPEG?

Link to thread:
https://www.reddit.com/r/Amd/comments/dj909q/best_value_for_vp9_transcoding_thinking_3600_or/

Peter
10-18-2019, 02:05 PM
Yes, getting a newer version of ffmpeg could help, but this may also break transcoding in Mezzmo if the command-line parameters to ffmpeg have changed. You can get a newer ffmpeg version from https://ffmpeg.zeranoe.com/builds/ and then change the path in Transcoding settings to the path to the new ffmpeg to see if it works.

jbinkley60
11-10-2019, 10:05 PM
I've completed a couple of my AMD hardware upgrades to Ryzen 3000 CPUs. Here are some comparisons using FFMPEG doing 24 Mbps H.264 encoding.


27 fps - AMD FX-8350 1080P H.264
48 fps - Intel i7-4790K 1080P H.264
123 fps - Ryzen 3800X 1080P H.264
354 fps - Intel i7-4790K 1080P H.264 w/Nvidia 970 GPU

jbinkley60
12-06-2019, 08:20 AM
I've done one more upgrade, my video card, from a GTX 970 to an RTX 2070 Super. Here are the updated results:

27 fps - AMD FX-8350 1080P H.264
48 fps - Intel i7-4790K 1080P H.264
123 fps - Ryzen 3800X 1080P H.264
354 fps - Intel i7-4790K 1080P H.264 w/Nvidia 970 GPU
535 fps - Intel i7-4790K 1080P H.264 w/Nvidia 2070 Super GPU

The bottleneck becomes how fast the GPU can decode the source file. The decode process is running the GPU at 100%. The source is a Blu-ray ISO. If I move down to DVDs I can get rates over 1300 fps.
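To put the numbers above in perspective, here is a quick sketch that turns the fps figures into speedups relative to the FX-8350. The fps values are the ones listed in this post; the dictionary labels and variable names are just for illustration.

```python
# 1080P H.264 encode rates (fps) as listed in the results above
results_fps = {
    "AMD FX-8350": 27,
    "Intel i7-4790K": 48,
    "Ryzen 3800X": 123,
    "Intel i7-4790K + Nvidia 970": 354,
    "Intel i7-4790K + Nvidia 2070 Super": 535,
}

# Speedup of each configuration relative to the FX-8350 baseline
baseline = results_fps["AMD FX-8350"]
speedups = {name: fps / baseline for name, fps in results_fps.items()}

for name, ratio in speedups.items():
    print(f"{name}: {results_fps[name]} fps, {ratio:.1f}x the FX-8350")
```

These ratios only compare the listed runs against each other; a different source file or bitrate would give different numbers.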

jbinkley60
01-04-2020, 06:59 AM
I've completed a hardware upgrade of my main Mezzmo server to a Ryzen 3700X CPU and completed some additional software only transcoding testing using the new hardware. The results can be found at:

Transcoding test results (http://www.thebinks.com/jeff/transcoding_results.html)

I've decided that the software only results are good enough that I am not putting an nVidia card back in the server. One odd thing I noticed was that the VP8 results had low CPU utilization due to ffmpeg only leveraging one core on the CPU. I checked the FFmpegAdditional.xml file and the VP8 line is:

<ffmpegadditional id="vpx+encoding">-threads 4 -cpu-used 4 -crf 22 -qmin 10 -qmax 42</ffmpegadditional>

which I think should leverage 4 threads instead of 1. I'll need to research this further and if I can get ffmpeg to leverage more threads for VP8 I will republish the results.

smitbret
01-05-2020, 06:36 AM
Yep, you are getting the same thing with VP8 that I am getting. I am not an encoding genius but I would be really interested in any way to get the thread count up.

jbinkley60
01-05-2020, 11:35 AM
I'll let Peter or Paul weigh in here. I had worked with Peter before on this and adjusting the threads and CPU parameters in the vpx encoding line worked previously on my FX-8350 CPU. If you want to try it yourself find the FFmpegAdditional.xml file located in your device profiles directory.

The default line is: <ffmpegadditional id="vpx+encoding">-threads 4 -cpu-used 4 -crf 22 -qmin 10 -qmax 42</ffmpegadditional>

I had changed mine to: <ffmpegadditional id="vpx+encoding">-threads 4 -cpu-used 8 -crf 22 -qmin 10 -qmax 42</ffmpegadditional>

and was seeing higher CPU utilization and VP8 encoding rates. It doesn't seem to be working with my new Ryzen CPU. One interesting thing is that the ASUS motherboards I am using have CPU virtualization disabled by default. I don't think ffmpeg leverages CPU virtualization. This may have something to do with the version of ffmpeg in Mezzmo vs. how new the Ryzen Zen 2 based CPUs are. For me I can watch the videos fine (i.e. transcoding fps rate is high enough) but I would like to see better CPU utilization, like I did with my FX-8350.

Update:

I enabled CPU virtualization for my Ryzen 3700X and no difference in VP8 encoding using more threads. I didn't expect it would but wanted to rule that out.
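If you would rather script the change than hand-edit it, here is a minimal sketch. It works on the single vpx+encoding line quoted above as a string, since the layout of the rest of FFmpegAdditional.xml is not shown here; the helper name `set_ffmpeg_option` is made up for illustration. Back up the file before trying anything like this on the real thing.

```python
import re
import xml.etree.ElementTree as ET

# The default line quoted above, used here as a standalone XML fragment
DEFAULT_LINE = (
    '<ffmpegadditional id="vpx+encoding">'
    "-threads 4 -cpu-used 4 -crf 22 -qmin 10 -qmax 42"
    "</ffmpegadditional>"
)


def set_ffmpeg_option(element_xml, option, value):
    """Return the element with `option` set to `value` in its option string.

    Hypothetical helper: replaces an existing "option value" pair, or
    prepends the pair if the option is not present.
    """
    elem = ET.fromstring(element_xml)
    text = elem.text or ""
    # Match the option only when it starts a whitespace-separated token
    pattern = re.compile(r"(?<!\S)" + re.escape(option) + r"\s+\S+")
    new_text, count = pattern.subn(lambda m: f"{option} {value}", text)
    if count == 0:
        new_text = f"{option} {value} {text}".strip()
    elem.text = new_text
    return ET.tostring(elem, encoding="unicode")


# Reproduce the -cpu-used change described above
print(set_ffmpeg_option(DEFAULT_LINE, "-cpu-used", 8))
```

The same call with "-threads" lets you try other thread counts, although whether libvpx actually uses them is a separate question.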

smitbret
01-07-2020, 12:34 AM
Disappointing, since my new system is based on a Ryzen 3500X. I'm still gonna try it, though. Thanks for the tips.

Peter
01-15-2020, 10:15 AM
Looks like encoding is limited by libvpx based upon the resolution (from https://stackoverflow.com/questions/41372045/vp9-encoding-limited-to-4-threads):

Libvpx uses tile threading, which means you can at most have as many threads as the number of tiles. The -tile-columns option is in log2 format (so -tile-columns 6 means 64 tiles), but is also limited by the framesize. The exact details are here, it basically means that max_tiles = max(1, exp2(floor(log2(sb_cols)) - 2)), where sb_cols = ceil(width / 64.0). You can write a small script to calculate the number of tiles for a given horizontal resolution:

Width: 320 (sb_cols: 5), min tiles: 1, max tiles: 1
Width: 640 (sb_cols: 10), min tiles: 1, max tiles: 2
Width: 1280 (sb_cols: 20), min tiles: 1, max tiles: 4
Width: 1920 (sb_cols: 30), min tiles: 1, max tiles: 4
Width: 3840 (sb_cols: 60), min tiles: 1, max tiles: 8

So even for 1080p (1920 horizontal pixels), you only get 4 tiles max, so 4 threads max, i.e. a bitstream limitation. To get 8 tiles, you need at least a width of 1985 pixels (2048-64+1, which gives sb_cols=32). To get more threads than the max. number of tiles at a given resolution, you need frame-level multithreading, which libvpx doesn't implement. Other encoders, like x265/x264, do implement this.
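The "small script" mentioned in the quoted answer could look something like the sketch below. It implements only the max_tiles formula given above (max_tiles = max(1, exp2(floor(log2(sb_cols)) - 2)) with sb_cols = ceil(width / 64.0)); the function name is made up, and the min tiles column is not computed because its formula is not given here.

```python
import math


def vp9_max_tile_columns(width):
    """Return (sb_cols, max_tiles) for a frame width in pixels.

    sb_cols is the number of 64-pixel superblock columns; max_tiles follows
    the formula quoted above.
    """
    sb_cols = math.ceil(width / 64)
    # bit_length() - 1 is floor(log2(n)) for a positive integer, without float rounding
    floor_log2 = sb_cols.bit_length() - 1
    max_tiles = 2 ** max(0, floor_log2 - 2)
    return sb_cols, max_tiles


for width in (320, 640, 1280, 1920, 3840):
    sb_cols, max_tiles = vp9_max_tile_columns(width)
    print(f"Width: {width} (sb_cols: {sb_cols}), max tiles: {max_tiles}")
```

Running it for 1984 and 1985 shows the jump from 4 to 8 tile columns that the answer describes.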

[edit] as some people in comments and below have already commented, more recent versions of libvpx support -row-mt 1 to enable tile row multi-threading. This can increase the number of tiles by up to 4x in VP9 (since the max number of tile rows is 4, regardless of video height). To enable this, use -tile-rows N where N is the number of tile rows in log2 units (so -tile-rows 1 means 2 tile rows and -tile-rows 2 means 4 tile rows). The total number of active threads will then be equal to $tile_rows * $tile_columns.
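As a rough sketch of the thread arithmetic in the edit above, the helper below builds an ffmpeg option list for libvpx-vp9 from the two log2 tile settings. The flags (-tile-columns, -tile-rows, -row-mt, -threads) are existing ffmpeg libvpx-vp9 options, but the helper name is made up, this is not Mezzmo's actual command line, and whether the ffmpeg build bundled with Mezzmo accepts -row-mt is an assumption you would have to test.

```python
def vp9_threading_args(tile_columns_log2, tile_rows_log2, row_mt=True):
    """Build libvpx-vp9 threading options for ffmpeg as an argument list.

    Both tile settings are in log2 units, as described above, so the thread
    count requested is (2 ** tile_columns_log2) * (2 ** tile_rows_log2).
    """
    threads = (2 ** tile_columns_log2) * (2 ** tile_rows_log2)
    args = [
        "-c:v", "libvpx-vp9",
        "-tile-columns", str(tile_columns_log2),
        "-tile-rows", str(tile_rows_log2),
        "-threads", str(threads),
    ]
    if row_mt:
        # Only newer libvpx/ffmpeg builds understand this option
        args += ["-row-mt", "1"]
    return args


# 1080p: at most 4 tile columns (log2 = 2) and 4 tile rows (log2 = 2)
print(" ".join(vp9_threading_args(2, 2)))
```

For 1080p this asks for 4 x 4 = 16 threads, which lines up with the 16-thread ceiling mentioned for that resolution.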

It appears that 16 threads is the maximum and that this uses 4 cores: https://askubuntu.com/questions/691283/multi-core-encoding-with-vp9-ffmpeg

johnmond
11-17-2021, 05:58 PM
What is an AMD E3 V2?
Amd Is a good and fast motherboard for gaming and ales. if you check review visit site.
https://pccustombuilder.com/best-amd-cpu-for-gaming-and-streaming/

Nirandy
12-09-2021, 08:06 PM
These E3 standard instances are based on the AMD EPYC 7742 processor, with a base clock frequency of 2.25 GHz and max boost of up to 3.4 GHz. The bare metal E3 standard compute instance supports 128 OCPUs (128 cores, 256 threads) and 2 TB of RAM and has 100 Gbps of overall network bandwidth.