Saturday, August 5, 2017

First impressions of VEGA on LN2


This Friday I ran some very quick tests with VEGA FE on LN2. I don't have many screenshots but I do have some interesting information.

Does it work under LN2 at all?

Yes it does. All the way to -185C. No issues whatsoever. No cold slow, no cold bug or boot bug, and no black screens under load either. I only tested at stock voltage, and the black screen bug tends to show up at high core voltages, but for now it looks like smooth sailing even at full pot.

Does it scale with temperature?

Kinda. My card on stock volts on water does 1680/1100 at best for 3DMark Time Spy while bouncing off the power limit. Under LN2 at stock Vcore I was perfectly stable at 1800/1100. I was in a rush due to the card not being properly insulated so I didn't try 1850/1100, but 1900/1100 crashed. 2000/1100 would pass Time Spy but the score was lower than at stock clock, so it looks like VEGA can pull the same stunt as Pascal where it seems to run at very high clocks while underperforming horrifically. I'm sure extra Vcore will solve this as evidently I didn't have enough Vcore to run 1900 properly. Another interesting side effect of the LN2 is that VEGA's power draw fell off a cliff. Like 100W less than air cooled on +50% power limit. So the GPU core evidently is very happy with LN2 and could probably go well in excess of 2GHz with more voltage (I did all my testing at stock, which is 1.2V).

The HBM2 on the other hand seems to have some major issues. I could get GPU-Z's render test to run just fine at 1230MHz HBM2 clock, however if I tried to run any real workloads like 3DMark Time Spy it would crash even 1MHz above 1100MHz. I can think of a number of things that could be causing this and all need more testing. First of all the HBM might just need more voltage on either VDD or VPP to sustain load. There might be some kind of issue with the memory timings just being too tight for any clock above 1100MHz. It could also have been a thermal problem. Pulling the card down I heard something that sounded very much like the thermal paste failing, however GPU core temperatures were still below 0 and the LN2 pot side temperature probe was responding under load as it should. However the HBM2 stacks don't have any accessible temperature readings and aren't exactly one with the GPU core silicon, so the HBM stacks could have lost contact while the GPU core was doing just fine. In that case the HBM would stay cool at low loads but warm up and crash under full load. Either way this needs more testing.

The display outputs freeze over pretty quick as the only things between them and the GPU core are the VPP, VDDCI and display drive VRMs. None of which put out any significant amount of heat so the cold from the GPU core gets the display outputs pretty quick.
The back of the card. I used the mounting bracket from the Raijintek Morpheus II cooler to get a more secure mount for the Der8auer Raptor 4 LN2 pot.
The red wire is hooked up to the Vcore plane so I could check voltages with a DMM. I also added some 2.5V SMD polymer caps to both Vcore and the VDD rail. At ambient those did nothing for overclocking capabilities and I haven't done a before and after test on LN2 either. So I have no idea how much or if they are even helping.
This pic of the GPU frozen just looks cool. You can kinda see the infill around the HBM2 dies. I'm really glad that it's been added as it should make it much much more difficult to damage the HBM interposer when replacing cooling systems on the GPU core.




And here we can see something I found pretty interesting. For some reason a small piece of the thermal paste stayed on the GPU core while all the rest of the paste stayed on the LN2 pot.
With the card on LN2 the VRMs would all freeze over at idle as they don't produce any appreciable heat. Under load the ice on the Vcore VRM would very quickly melt. This might cause some major water problems for extended sessions as keeping the VRM from cycling between sub 0 and positive is pretty much impossible.

Overall I must say I'm excited to try running VEGA on LN2 seriously once we get some proper drivers for the cards and I get more LN2.


The only score I saved from the session: http://www.3dmark.com/3dm/21393617

Sunday, April 23, 2017

Notes on modding HD7990's BIOS

Sharing is caring so here's some info of questionable quality on HD 7990 BIOS modding. I haven't tested any of the modifications yet and even if I did, the usual applies to BIOS modding. If you kill your stuff due to a mod gone wrong it's not my fault, so mod responsibly.

The HD 7990 Buildzoid edition. Since this photo I've replaced all the thermal pads, changed the thermal paste and changed the mounting hardware. I still haven't done the cap mods. In its current state it did this score:

So it's already pretty fast. However pretty fast is just not enough. Ideally I would like to push the card beyond 1200MHz core, maybe all the way to 1300MHz. Now I haven't really tested the limits of this card. Really that 5.9K Heaven score was me taking it "easy" during some late night testing. I expect to hit the card's power limit pretty hard some time soon so I figured I would look into modding the BIOS for MOAR POWER. Since I started digging in the BIOS I figured that I might as well also look into how memory timings on the card work. Here are the notes on that:

HD 7990 reference
Power limit is set with 2 values:
Min power
Max power
These sit X% below and X% above the stock TDP, where X% is the Overdrive power limit slider range.

Example:
300W BIOS, 20% = 360W max and 240W min
300W BIOS, 50% = 450W max and 150W min
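
The min/max relationship in these notes can be sketched in a few lines of Python. The function name is made up for illustration, and the assumption that the window is symmetric around the stock TDP comes only from the two 300W examples, not from any AMD documentation.

```python
def power_limit_window(stock_tdp_w, slider_percent):
    """Return (min_w, max_w) for a stock TDP and an Overdrive slider range.

    Assumes the limits sit slider_percent below and above the stock TDP,
    as in the 300W examples from the notes.
    """
    delta = stock_tdp_w * slider_percent / 100
    return stock_tdp_w - delta, stock_tdp_w + delta


for percent in (20, 50):
    low, high = power_limit_window(300, percent)
    print(f"300W BIOS {percent}% = {high:.0f}W max and {low:.0f}W min")
```

Going the other way, the stock TDP is simply the midpoint of the two stored limits.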

Power limits are found in this:


D9 00 05 00 E8 03 58 00 00 80 07 00 10 00 00 02 0A 2C 00 00 69 00 DB 00 05 23 01 0A 00 32 01
42 01 3D 03 00 00 C6 16 00 00 58 01 6D 01 73 01 00 00 A1 01 00 00 B3 00 00 00 77 00 00 00 60
EA 00 00 88 01 20 03 00 00 14 00 40

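In this stock 7990 BIOS B3 (179W) is the Hi limit and 77 (119W) the Lo limit. I assume the current limit is also somewhere in that block. The values are stored least significant byte first, so anything greater than 255W spills into the following byte: 300W is 0x012C and gets written as 2C 01. With a 300W Hi limit and a 260W Lo limit the block would look like this: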

D9 00 05 00 E8 03 58 00 00 80 07 00 10 00 00 02 0A 2C 00 00 69 00 DB 00 05 23 01 0A 00 32 01
42 01 3D 03 00 00 C6 16 00 00 58 01 6D 01 73 01 00 00 A1 01 00 00 2C 01 00 00 04 01 00 00 60
EA 00 00 88 01 20 03 00 00 14 00 40

This gives you a 300W high limit and a 260W low limit, which I think will translate into 280W stock.
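
For anyone who doesn't want to do the byte swapping by hand, here is a rough Python sketch of the encoding. The helper names are hypothetical, and the 4-byte field width is an assumption based on the "B3 00 00 00" pattern in the stock block; it could just as well be a 2-byte value followed by padding.

```python
def encode_limit(watts):
    """Encode a power limit in watts as 4 little-endian bytes, shown as hex.

    The 4-byte width is an assumption taken from the 'B3 00 00 00'
    pattern in the stock block.
    """
    return watts.to_bytes(4, "little").hex(" ").upper()


def decode_limit(hex_bytes):
    """Decode space-separated little-endian hex bytes back into watts."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


print(encode_limit(300))            # high limit used in the modded block
print(encode_limit(260))            # low limit used in the modded block
print(decode_limit("B3 00 00 00"))  # stock high limit
```

Decoding the stock bytes first is a cheap sanity check that you are looking at the right offsets before changing anything.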
I find the original 179W maximum power limit of the reference HD 7990 kinda concerning, especially once you consider that the Vcore VRM for each of the 2 GPUs is only 4 phases. On stock settings the maximum current through them works out to only ~149A at 1.2V (179W / 1.2V). I think 200A is probably safe, however without a datasheet for the Volterra power stages I would recommend proceeding with extreme caution while carefully monitoring VRM temperatures (GPU-Z supports VRM temps for the ref HD 7990).
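
The current estimate is just power divided by voltage. A minimal sketch, assuming the whole power limit is delivered through one GPU's Vcore VRM (a simplification) and using a made-up function name:

```python
def vrm_current_a(power_w, vcore_v):
    """Approximate Vcore VRM output current as power divided by voltage.

    Assumes the entire power limit goes through the Vcore VRM, which is
    a simplification.
    """
    return power_w / vcore_v


stock = vrm_current_a(179, 1.2)
print(f"Stock 179W limit at 1.2V: {stock:.0f}A")

# Power that a 200A ceiling would correspond to at the same voltage.
print(f"200A at 1.2V: {200 * 1.2:.0f}W")
```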

MEMORY STUFF

Timing Straps
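A timing strap is composed of 48 bytes. Each strap below is listed under its 3-byte frequency value, which is stored least significant byte first. 0x02AB98 is 175000 in decimal, so the unit works out to 10kHz.
98 AB 02 = 02AB98 = 1750MHz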
77 71 33 20 00 00 00 00 31 62 7C 47 80 55 11 11 30 A7 1A 07 00 4C 06 01 22 22 9D 00 6C 0F 14 20 6A 89 00 A0 00 00 01 20 19 12 2F 36 48 28 31 15

C4 7A 02 = 027AC4 = 1625MHz
77 71 33 20 00 00 00 00 10 5A 7B 41 80 55 11 11 2E A5 99 06 00 4C 06 01 22 11 9D 00 6C 0F 14 20 6A 89 00 A0 00 00 01 20 17 11 2B 31 42 26 2F 15

F0 49 02 = 0249F0 = 1500MHz
77 71 33 20 00 00 00 00 CE 51 6A 3B 70 55 10 10 2B A2 18 06 00 4A E6 00 22 00 9D 00 64 0E 14 20 6A 89 00 A0 00 00 01 20 15 0F 27 2D 3C 23 2C 14

1C 19 02 = 02191C = 1375MHz
77 71 33 20 00 00 00 00 AD CD 69 37 70 55 0F 10 29 21 98 05 00 4A E5 00 22 EE 1C 00 64 0D 14 20 5A 89 00 A0 00 00 01 20 14 0E 24 2A 38 22 2A 14

48 E8 01 = 01E848 = 1250MHz
77 71 33 20 00 00 00 00 8C C5 58 31 60 55 0F 0F 25 1E 17 05 00 48 C4 00 22 CC 1C 00 5C 0B 14 20 4A 89 00 A0 00 00 01 20 12 0D 20 25 32 1F 26 13

74 B7 01 = 01B774 = 1125MHz
55 51 33 20 00 00 00 00 6B BD 57 2D 60 55 0D 0E 22 9C 96 04 00 28 C3 00 22 BB 1C 00 53 0A 14 20 BA 88 00 A0 00 00 01 20 10 0C 1E 22 2E 1D 23 12

A0 86 01 = 0186A0 = 1000MHz
55 51 33 20 00 00 00 00 29 B5 46 27 50 55 0C 0D 1E 99 05 04 00 26 A2 00 22 AA 1C 00 4B 08 14 20 AA 88 00 A0 00 00 01 20 0E 0A 1A 1E 28 1A 1F 11

90 5F 01 = 015F90 = 900MHz
55 51 33 20 00 00 00 00 29 31 46 24 50 55 0C 0D 1C 18 A5 03 00 26 A1 00 22 AA 1C 00 4B 07 14 20 9A 88 00 A0 00 00 01 20 0D 0A 18 1B 25 19 1D 11

80 38 01 = 013880 = 800MHz
55 51 33 20 00 00 00 00 E7 AC 35 20 50 55 0B 0D 1A 97 34 03 00 24 81 00 22 AA 1C 00 4B 06 14 20 9A 88 00 A0 00 00 01 20 0C 08 15 19 21 18 1B 11

40 9C 00 = 009C40 = 400MHz
33 31 33 20 00 00 00 00 84 94 22 10 F0 54 09 06 0F 0B A2 01 00 23 80 00 22 AA 1C 00 12 01 14 20 8A 88 00 A0 00 00 01 20 06 05 0B 0C 11 0C 10 0D
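
The frequency headers in the list above can be decoded with a short Python sketch. The function names are hypothetical, and the 10kHz unit is an inference from the listed values (0x02AB98 = 175000 for 1750MHz) rather than something taken from documentation.

```python
def strap_frequency_mhz(hex_bytes):
    """Decode a 3-byte little-endian strap frequency into MHz.

    Assumes the stored value is in units of 10kHz, which is what the
    listed values imply (0x02AB98 = 175000 -> 1750MHz).
    """
    raw = int.from_bytes(bytes.fromhex(hex_bytes), "little")
    return raw / 100


def encode_strap_frequency(mhz):
    """Encode a frequency in MHz back into the 3-byte little-endian form."""
    return round(mhz * 100).to_bytes(3, "little").hex(" ").upper()


for entry in ("98 AB 02", "C4 7A 02", "40 9C 00"):
    print(entry, "=", strap_frequency_mhz(entry), "MHz")
```

The encoder is handy for searching a BIOS dump for the strap belonging to a given memory clock.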

Well that's all for this. Hopefully you find this useful for your own BIOS modding. Though to be completely honest I mostly wanted to post this so I would have an easily accessible online backup of my BIOS modding notes because I keep forgetting how I did stuff the last time I modded a certain BIOS. I will probably be posting more notes like this for other cards.

Thursday, April 13, 2017

Some Ryzen power draw data

Ok so I finally have an operational Ryzen system and while the BCLK controls in the BIOS are being stupid I can still do other testing, so here is that other testing.

The goal of this data is to figure out how the power draw of Ryzen is split between the cores and everything else like the SOC. For my testing I'm using Asrock's X370 Taichi and a Ryzen 7 1700. I'm not 100% sure how the 12V of the single 8pin CPU power connector is distributed but for the most part it doesn't matter because the focus here is figuring out core power draw in order to be able to gauge Vcore VRM requirements for overclocking.

So here is the data! Do note it is rather rough as far as error margin goes because CPU temperature was not maintained across test runs and higher temperatures do lead to elevated power draw. I didn't bother with more accurate measurement methods because chip to chip variance will cause larger power draw discrepancies than my measurement methods for this data.

Clock     Voltage   Core Config   Power Draw
3.95GHz   1.45V     4+4           170W
3.95GHz   1.45V     3+3           135W
3.95GHz   1.45V     2+2           100W
3.95GHz   1.45V     1+1           60W
3.95GHz   1.45V     4+0           100W
3.95GHz   1.45V     3+0           75W
3.95GHz   1.45V     2+0           65W
All tests were done with SMT turned on.

From this we can see that 1 core with SMT at 3.95GHz 1.45V consumes roughly 17W. The other things hooked to the 8pin connector pull a constant 30W. I suspect that most of this 30W would be the SOC portion of a Ryzen CPU.
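
One way to sanity check the per-core and constant figures is a straight line fit over all seven measurements. This is only a sketch: it assumes power depends on the number of active cores alone (so 4+0 and 2+2 are treated the same) and the function name is made up. On this data the fit comes out at about 18W per core and about 25W for everything else, the same ballpark as the rough 17W and 30W figures above.

```python
# (active cores, measured watts) from the table above, all at 3.95GHz 1.45V.
measurements = [(8, 170), (6, 135), (4, 100), (2, 60), (4, 100), (3, 75), (2, 65)]


def fit_line(points):
    """Ordinary least squares fit of watts = per_core * cores + base."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)
    per_core = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    base = (sum_y - per_core * sum_x) / n
    return per_core, base


per_core_w, base_w = fit_line(measurements)
print(f"per core: {per_core_w:.1f}W, everything else: {base_w:.1f}W")
```

Given the error margin of the measurements, the difference between the fitted and the eyeballed numbers is not meaningful.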

This means that for an 8 core Ryzen chip Vcore current draw at 4GHz/1.45V is only about 96A ((170W - 30W) / 1.45V). 6 core CPUs would be about 72A and 4 cores only about 48A. That means the Vcore VRM current throughput required for your motherboard to not go up in flames is absolutely minuscule. Basically any AM4 motherboard should be capable of doing 4GHz or more on 6 core CPUs. Motherboards with good 4 phase VRM designs should also have no issue pushing 4GHz on 8 core CPUs.
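
The current figures can be reproduced with a small Python sketch. It assumes the 30W non-core share is constant and ignores VRM efficiency losses, and the function name is hypothetical. The results land within an amp of the 96A, 72A and 48A quoted above.

```python
def vcore_current_a(total_w, non_core_w, vcore_v):
    """Estimate Vcore current from measured power minus the non-core share."""
    return (total_w - non_core_w) / vcore_v


# 8 core figure: 170W measured, ~30W attributed to everything but the cores.
eight_core_a = vcore_current_a(170, 30, 1.45)
per_core_a = eight_core_a / 8

for cores in (8, 6, 4):
    print(f"{cores} cores: about {per_core_a * cores:.0f}A")
```

Since the power was measured on the 12V input side, real core current will be a bit lower once VRM losses are subtracted, which only adds to the headroom.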

Now of course this is only looking at current capability. Better VRMs also have better voltage regulation as well as current throughput, which means that they may clock a little better for any given voltage just because it won't fluctuate as much as it does on weaker VRMs. However it does mean that for daily OCs you really don't need the insane VRMs that come on boards like the Gigabyte Gaming K7, Asrock X370 Taichi/Professional Gaming or Asus Crosshair 6 Hero.

Also thanks to all the Patreons and shirt buyers for making this article possible!

Tuesday, February 21, 2017

ASUS RX 480 STRIX OC review

Thank you to ASUS for loaning the RX 480 STRIX OC

The backplate adds rigidity and is separate from the main heatsink so it's not necessary to remove both if you want to get to the front of the PCB. It does not help cool the PCB but the holes over the VRM section stop it from causing unnecessary heat build up in the VRM area.




Specs
2304 stream processors clocked at 1310MHz
8GB of GDDR5 clocked at 2000MHz on a 256bit bus
1 8pin power connector
Core clock throttling temperature: 85°C
2 HDMI ports, 2 DisplayPorts, 1 DVI

Physical Specs
length: 280mm
width: 124mm
height: 2 slots



Cooling


The noise measurements were taken 20cm away from the card on an open air test bench.
The GPU was kept under load by Graphics Test 1 from 3DMark Fire Strike running windowed using the regular preset at 1440x900 with 8xMSAA.
The OC test is 1370MHz core (higher core clocks simply were not stable at high temperatures) with +50% power limit and +100mV using Sapphire TriXX.
I test from the lowest fan speed to the highest. This means that I actually wait for temperatures to come down, which prevents me from getting low temperature measurements by taking measurements too early.

Notes:

The RX 480 Strix I have has an 86% ASIC quality and the core pulls about 20W more than the core of the RX 480 GTR I last reviewed. So even though this card's delta temperatures aren't stellar, that is mostly due to the power draw discrepancy between the cards.

In the OC test with stock thermal paste the card started throttling at around 86C. This is not due to the temperature limit, which is at 90C for this card like every other RX 480 I've tested, but due to the fact that higher temperatures cause higher power draw and at 86C core temperature the card was pulling well over the 165W power limit that an RX 480 with +50% power has.


An important thing to note is that due to the way the STRIX's heatsink dumps hot air you will want to put some airflow in front of the card's fans or they will end up re-using air that already went through the heatsink, which leads to sub optimal thermal performance. If you have good case airflow this should be a non-issue. If you're not sure, I recommend that you try playing with fan placements a little. If you don't want to mess with airflow you'll have to live with the noise of higher fan speed because as far as I know most of the custom 480s have the same problem and the reference card is really really loud if you want to get some real OCing done.






The STRIX heatsink is an obvious case of too many fins and too much airflow without enough heat transfer from the core, as increasing the fan speed doesn't lead to a very significant drop in temperatures. I suspect this may have something to do with the ASUS direct contact heatpipe design as only 3 heatpipes get proper contact. I went on to test this theory by attaching thermal probes to the heatpipes, and at maximum fan speed there is a 5°C delta between the heatpipe with the most die contact and the outermost heatpipe.

It's also worth noting the absolutely massive drop in temperatures going from the stock thermal paste (which, judging by the lack of damage to the warranty stickers on the screws, was not tampered with) to my current favorite paste, Thermal Grizzly Kryonaut. So I highly recommend that you replace the thermal paste on your RX 480 STRIX if the heatsink seems to be underperforming. An 8°C temperature drop on an RX 480 can make about a 10MHz difference in stable core clock.

AHOC RX 480 Heatsink ranking at a glance:
ASUS RX 480 STRIX
XFX RX 480 GTR
AMD Reference

VRM


Core
- 6 phases controlled by Digi+ ASP1300 (re-branded IR3567B)
- IR3555 PowIRstages
- Unknown switching frequency
- 360A, 1.5V @ 125°C assuming 304kHz switching frequency

VRAM
- 1 phase controlled by uP1541R
- 2 low side QM3056 MOSFETs
- 1 high side QM3054 MOSFET
- 40A, 1.5V @ 125°C assuming 500kHz switching frequency
- 67A, 1.5V @ 100°C assuming 500kHz switching frequency

AUX
- 1 phase controlled by uP1541R
- 2 low side QM3056 MOSFETs
- 1 high side QM3054 MOSFET
- 40A, 1.5V @ 125°C assuming 500kHz switching frequency
- 67A, 1.5V @ 100°C assuming 500kHz switching frequency

The RX 480 STRIX ties with the GTR for most powerful Vcore VRM. The GTR achieves its current rating by using IR DirectFETs, the STRIX with IR3555 PowIRstages. The end result is the same. The VRM on both is ridiculous overkill for an RX 480, which even when pushed into the 1.4V range will only draw around 200A. So the STRIX's Vcore VRM is perfectly good for any kind of overclocking endeavor, be that on the card's air cooler, a water cooler or LN2. The VRAM and AUX VRMs are both single phase designs, which is the same as you would find on any other RX 480 (at least as far as I'm aware). Both are capable of delivering 40A and as such are of no concern in any kind of overclocking scenario.
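
To put the "overkill" into numbers, here is a quick Python sketch of the Vcore headroom. It assumes the load is shared evenly across all 6 phases, and the constant and function names are made up for illustration.

```python
# Ratings quoted in the VRM section above; the 200A draw is the rough
# figure for an RX 480 pushed into the 1.4V range.
VCORE_RATING_A = 360
VCORE_PHASES = 6
ESTIMATED_DRAW_A = 200


def headroom(rating_a, draw_a):
    """Return how many times the rating exceeds the expected draw."""
    return rating_a / draw_a


print(f"per phase rating: {VCORE_RATING_A / VCORE_PHASES:.0f}A")
print(f"per phase load at {ESTIMATED_DRAW_A}A: {ESTIMATED_DRAW_A / VCORE_PHASES:.1f}A")
print(f"headroom: {headroom(VCORE_RATING_A, ESTIMATED_DRAW_A):.1f}x")
```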

Extras:

0RPM fans in idle 
RGB heatsink lighting
Backplate

Overclocking Tips
The STRIX is compatible with both the Elmor RX 480 Air BIOS and the Elmor RX 480 LN2 BIOS, which makes really pushing the card a lot easier as both those BIOSes are signed to work with the AMD drivers and come with a 1.4V and 2V Vcore limit respectively. That is a significant improvement over the 1.3-1.35V limits that regular RX 480 BIOSes come with. The LN2 BIOS does not have GPU core temperature enabled so do not use it unless the card is on LN2.

The STRIX PCB comes with soldering points to mod Vcore, Vmem and Vaux on the top edge of the PCB. Vcore control is well supported in software so I wouldn't bother with that one, however Vmem and Vaux both lack software support so for those the solder points are very useful. The card does not however have a solder point for the 0.95V VRM, so you will have to mod that one the usual way if you want to get rid of the black screens (the card is still running the benchmarks, just not outputting to your monitor) that RX 480s get when running benchmarks on LN2.

Conclusion



Once you replace the thermal paste the card has excellent cooling performance at low fan speeds and noise levels. The VRM quality is also far beyond what any gamer would ever need. So this is certainly a good choice if the card is to spend its life in your daily system.


The VRMs are far beyond what an RX 480 will ever need even on LN2 and as such the card is completely capable of smashing HW high scores sitting on a test bench with an LN2 pot where the heatsink would normally be. The volt mod points are an especially nice touch as they make modifying the voltages without software support much easier. The only downside this card has is that it lacks dual BIOS functionality, which is only really a problem if you intend to do a lot of BIOS modding, as that carries a high risk of flashing a BIOS that won't allow the card to boot.