Recurrent NVIDIA Driver Crashes and Erratic GPU Usage

Discussion in 'Technical & Support' started by ECAR_Tracks, May 9, 2016.

  1. Paul Jeffrey

    Paul Jeffrey Member Staff Member

    Joined:
    May 21, 2016
    Messages:
    10
    Likes Received:
    50
    I've been having these driver crashes for a while now, but for some reason only on certain tracks (Spa and Le Mans).

    I've done pretty much everything I can think of to rectify the problem, without success.
    I use a GTX 980 Ti card with triple screens. Setting everything to low or to high makes no difference; I get the same error.

    Specs: AMD FX-9370 8 Core - 32GB DDR3 - Win 7 64bit - Nvidia GTX 980ti - Fanatec CSWv2 - CSPv2 - GT Omega Supreme - Asus 27" x3

    Things I've done:

    The crashes happen at random, no part of the track in particular and no length of time.
    •Sometimes it just happens; sometimes I get a slight stutter just before the crash
    •Fresh install of rF2 both Steam and non Steam
    •No additional content
    •Happens with and without plugins
    •Happens with "copy & paste" and new controller file
    •With and without vSync
    •With and without locked to 60fps
    •On triples and on a single screen with all settings at lowest (I got loooooads of FPS when I did that!!)
    •Taken out a stick of RAM (on recommendation from @Miroslav Davidovic )
    •Did the Max Headlights, Garage Detail & Rearview_Back_Clip adjustments
    •Replays on and off (i always run off)
    •All different graphic settings (i run it very low just to be FPS safe in the race)
    •HDR on
    •Tried disabling onboard graphics card
    •Power management set to Max
    •Literally dozens of drivers :))) installed properly using DDU
    •fullscreen and windowed mode
    •using multiview. Happened also without it enabled
    •deleted shaders
    •lowered res
    •uninstalled anything from my pc i dont need (in case of conflicts)
    •took out the card and put it back in (in case it was seated wrong)

    I think that's about it off the top of my head :)
     
  2. #55

    #55 Registered

    Joined:
    May 23, 2016
    Messages:
    7
    Likes Received:
    3
    I've seen these black screen and TDR Timeouts too - a sim centre I frequent and help out at has 15 simulators, all running NVIDIA GTX 980 and 980Tis. Having 15 PCs means we can run them with AI driving to see what happens.

    We ran a 24 hour race at Estoril, and the week before was a nightmare trying to get the pods stable, in particular going into dusk/night time. At night the failure rate shot up: there were no crashes during daylight with the sims running for 12 hours, but as soon as night started to fall we saw 3 pods crash within seconds of each other. This was the NVIDIA driver crashing: nvlddmkm stopped responding, Event 4101.
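
For anyone wanting to check their own machine for the same event, here is a minimal sketch that builds a query for Event ID 4101 in the Windows System log using the built-in `wevtutil` tool. The function names and the default of 20 events are assumptions for illustration, and the filter matches on the event ID only, so entries from other providers with the same ID would also show up.

```python
import subprocess

def build_tdr_query(max_events=20):
    """Build a wevtutil command that lists recent Event ID 4101 entries
    from the System log (the ID quoted above for nvlddmkm timeouts)."""
    xpath = "*[System[(EventID=4101)]]"
    return [
        "wevtutil", "qe", "System",
        "/q:" + xpath,            # XPath filter on the event ID
        "/c:" + str(max_events),  # maximum number of events to return
        "/rd:true",               # newest events first
        "/f:text",                # plain-text output
    ]

def recent_tdr_events(max_events=20):
    """Run the query and return the raw text. Windows only."""
    result = subprocess.run(
        build_tdr_query(max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Example (on Windows): print(recent_tdr_events(5))
```

Comparing the timestamps from several pods would show whether the crashes really do cluster within seconds of each other.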


    We spent a lot of time working with the simulators, isolating as much as possible. Some of the changes we made:

    - Windows 7
    - Disable GeForce Experience
    - Disable NVidia Streaming
    - Run card in factory mode
    - 2 different models GTX 980's + 980Tis.
    - TDR Timeout Fix
    - Disable HDR
    - Disable Windows Aero
    - Disable all plugins
    - Disable replays
    - Fresh install of rFactor 2 lite with just the car and the track
    - Update to every driver we could find that was reported stable.
    - Enable max performance and power modes on the GPUs
    - Disable any power/economy modes across the PC.
    - Max Pre-rendered frames to 1
    - Frame rate locks
    - Full screen and Windowed modes
    - Disable all graphical options
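
The list above does not say what the "TDR Timeout Fix" consisted of. A common version, assumed here, is raising the `TdrDelay` registry value under `GraphicsDrivers`, which controls how many seconds Windows waits before resetting an unresponsive GPU (the default is 2). The sketch below only generates the text of a `.reg` file; the function name, the 8 second value in the example, and the 1-60 sanity bound are illustrative choices, not something taken from this thread.

```python
def tdr_delay_reg(seconds):
    """Return the text of a .reg file that sets TdrDelay to `seconds`.

    The 1-60 range is an arbitrary sanity bound for this sketch,
    not a limit imposed by Windows.
    """
    if not 1 <= seconds <= 60:
        raise ValueError("seconds should be between 1 and 60")
    key = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        "[" + key + "]\n"
        '"TdrDelay"=dword:' + format(seconds, "08x") + "\n"
    )

# Example: write the file, then import it as administrator and reboot.
# with open("tdr_delay.reg", "w") as f:
#     f.write(tdr_delay_reg(8))
```

A longer delay only gives the driver more time to recover; as reported above, it did not stop the crashes on these machines.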

    The Bugatti track was tested the other day with the URD BMW Z4 and we saw the same issues relatively quickly - but haven't on many tracks since.

    I'd love to get to the bottom of this. Our next 24 hour event is in November, and we'd hate to have to run 24 hours of daylight again.
     
  3. Christopher Elliott

    Christopher Elliott Administrator Staff Member

    Joined:
    Jul 31, 2014
    Messages:
    4,404
    Likes Received:
    7,092
    Thanks again for the detailed feedback that's being added.
     
  4. hexagramme

    hexagramme Member

    Joined:
    May 25, 2013
    Messages:
    4,242
    Likes Received:
    194
    I haven't had GPU crashes during night time racing like #55 describes, but sometimes the performance drops a lot for me at night.
    In some conditions (particular car/track combos run at high settings) I would call the GPU performance a bit erratic.
     
  5. Christopher Elliott

    Christopher Elliott Administrator Staff Member

    Joined:
    Jul 31, 2014
    Messages:
    4,404
    Likes Received:
    7,092
    For those having this issue, does it tend to happen more in Multi-player or in Single player sessions?
     
  6. Ho3n3r

    Ho3n3r Registered

    Joined:
    Feb 18, 2012
    Messages:
    527
    Likes Received:
    96
    I had this issue before (GTX 970), and it turned out it was MSI Afterburner causing it. Read somewhere back then that any GPU monitoring software caused it. Uninstalled Afterburner and it went fine - for a while.

    Then I started getting it again (even on rFactor 1), then I noticed a trend: night tracks - Bahrain, Singapore and Abu Dhabi, nowhere else. Disabled some settings, and somehow turning off soft particles and special effects worked. For rFactor 1 I disabled special effects, and it has worked. But others with the same problems weren't able to solve it, however, so it feels a bit random still - but worth a shot for anyone probably.
     
  7. #55

    #55 Registered

    Joined:
    May 23, 2016
    Messages:
    7
    Likes Received:
    3
    Most of my testing was in multiplayer sessions, and we would normally see one PC crash, followed by at least another one pretty quickly.

    I did some testing at home in single player just with AI on the same track, and after 45 mins or so I would see an identical crash especially at night time. I also have a 980 Ti.
     
  8. WhiteShadow

    WhiteShadow Registered

    Joined:
    Feb 16, 2015
    Messages:
    681
    Likes Received:
    3
    Christopher, from the OT's post: max GPU power usage is 3.3, which is the maximum power usage for a GTX 980 Ti and as much as any stress test draws on that card. rFactor 2 drives power usage and GPU load as high as stress tests do, and that is why some people are experiencing problems with it.
     
    Last edited by a moderator: May 24, 2016
  9. stonec

    stonec Member

    Joined:
    Jun 19, 2012
    Messages:
    3,238
    Likes Received:
    1,365
    Are you saying that software which utilizes this GPU to the maximum will cause it to crash? That sounds odd to me; shouldn't GPUs be designed to stay stable at max stress level? Otherwise they would probably crash every time you run a heavy benchmark tool like 3DMark.

    I still think it's something with particular tracks. The Le Mans crash happened since day one on GTX 900 series cards and some other tracks have later appeared with the same problem.
     
  10. WhiteShadow

    WhiteShadow Registered

    Joined:
    Feb 16, 2015
    Messages:
    681
    Likes Received:
    3
    The GTX 980 Ti has a maximum power usage of 3.30W. rFactor 2 GPU usage is 90-99%, and GPU power usage is at 3.30W all the time. 3DMark GPU usage is 70-99%, with an average power usage of 2.24W. I suspect that load spikes are causing rFactor 2 crashes on some systems when power usage is at maximum all the time.
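
To make this kind of comparison between applications repeatable, one option is to log load and power with `nvidia-smi` while each one runs and summarize the log afterwards. The sketch below is an assumption about how that could be done, not the method used in this post: it expects a log produced with `nvidia-smi --query-gpu=timestamp,utilization.gpu,power.draw,temperature.gpu --format=csv,noheader,nounits -l 1`, and skips any row where a field is not numeric (some cards report `[Not Supported]`). The function name and the sample rows in the comment are made up.

```python
import csv
import io

# Column order matching the --query-gpu list in the lead-in.
FIELDS = ("timestamp", "utilization.gpu", "power.draw", "temperature.gpu")

def summarize_gpu_log(log_text):
    """Summarize an nvidia-smi CSV log (noheader, nounits).

    Returns an empty dict if no row could be parsed.
    """
    rows = []
    for rec in csv.reader(io.StringIO(log_text), skipinitialspace=True):
        if len(rec) != len(FIELDS):
            continue  # blank or malformed line
        try:
            rows.append((float(rec[1]), float(rec[2]), float(rec[3])))
        except ValueError:
            continue  # e.g. "[Not Supported]" in a numeric column
    if not rows:
        return {}
    util, power, temp = zip(*rows)
    return {
        "samples": len(rows),
        "max_util_pct": max(util),
        "mean_power_w": sum(power) / len(power),
        "max_power_w": max(power),
        "max_temp_c": max(temp),
    }

# Example with invented rows:
# summarize_gpu_log("2016/05/24 20:00:01.000, 97, 180.0, 78\n"
#                   "2016/05/24 20:00:02.000, 99, 200.0, 80\n")
```

Running the same logging during rFactor 2 and during a benchmark would give like-for-like mean and peak figures rather than readings taken by eye.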
     
  11. Nuno Lourenço

    Nuno Lourenço Registered

    Joined:
    Oct 19, 2010
    Messages:
    588
    Likes Received:
    57
    I don't know why it happens, but what he says makes some sense. As I said in another thread, my old PSU caused my system to reboot completely, but only when running certain games, and rF2 was one of them. I always have an fps limit of 60, so my GPU is never at 100% load, while some other games that took GPU usage to 99% never crashed... I can't explain why, but it's a fact that a GPU may need more energy running at 80% usage than at 99%...

    Edit: I was using a 780 Ti with a LC Power 850W and now, with a Corsair CS750M I'm running a 980Ti without any problems.

    Edit 2 (Important feedback): I remember that some months ago, when I was trying to find out why the computer rebooted by itself, I was using a Nordschleife conversion in DevMode and sometimes, after 10 seconds, it caused the PC to reboot. Initially I suspected that some kind of object was causing it, so I started removing them all in groups. After a lot of time, I found out that the reboots stopped after I removed all the reflection settings at the end of the SCN file. So, without reflections on roads and cars, I could play it without any problem. Now, with the new PSU, the exact same track plays smoothly for hours with a bunch of cars on it and reflections working well... Even if something was not right with that LC Power, something in those reflections made the problem a lot worse...
     
    Last edited by a moderator: May 24, 2016
  12. Natureboy

    Natureboy Registered

    Joined:
    Jan 13, 2013
    Messages:
    117
    Likes Received:
    0
    I don't know if this could help, but both of my computers (one with 2 770s, the other with a single 660) will always crash in rF2 if I do not run a custom fan profile. I set the fan to reach max speed at 60 deg in Afterburner with no other changes and start that profile before running rF2. Then temps stay in the low 60s and GPU usage stays at 100 all day long. If I do not do this, it's like the fans don't spin up until it's too late: the 660 will crash after a few minutes, and the 770s will crash at some random point, even after a few hours. For the 660 it is very apparent that once the GPU starts throttling back on power due to high temp it becomes very unstable. The 770 problem is a bit more difficult to find because it works fine for a long time and then crashes so quickly that I could not log data.
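
The shape of a fan profile like this can be sketched as a simple linear curve. Only the "100% at 60 deg" point comes from the description above; the 30% floor, the 30 deg starting point, and the function name are hypothetical, and the code only computes the curve (the profile itself is set in Afterburner's GUI).

```python
def fan_speed_pct(temp_c, min_pct=30.0, start_c=30.0, max_c=60.0):
    """Fan speed (%) for a GPU temperature, ramping linearly to 100% at max_c.

    min_pct and start_c are illustrative defaults, not values from the post.
    """
    if temp_c <= start_c:
        return min_pct
    if temp_c >= max_c:
        return 100.0
    frac = (temp_c - start_c) / (max_c - start_c)
    return min_pct + frac * (100.0 - min_pct)

# Example: print the curve in 5 degree steps.
# for t in range(25, 70, 5):
#     print(t, fan_speed_pct(t))
```

The point of hitting 100% this early is that the fans are already at full speed before the card gets near the temperature where it starts throttling.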

    The 660 error always shows up as an nvidia driver failure and I may even lose the monitor and have to go through that annoyance. The 770 crashes can shut the whole computer off and look like a PSU problem. It took some time to figure this out especially since I was using only the 2 770s at first. When I started using the 660 the temps and crash were very easy to see and the same fan fix worked for that computer also.
     
  13. kermit

    kermit Registered

    Joined:
    Jan 11, 2012
    Messages:
    19
    Likes Received:
    0
    Just to chime in on this thread.

    I have yet to experience any crashes in rF2 on any track combo so far. No TDRs. I originally noticed a lot of stuttering but found that using a frame lock of 62fps in the PLR (no Vsync) seems to cure my stutters.

    I do wonder if anyone is running in windowed mode or using any fps monitors such as MSI Afterburner and so forth (mentioned above), as in the past I have noticed this software can cause problems.

    I am currently a few versions behind on the Nvidia drivers, as I know some were causing issues in general; however, I will upgrade for testing purposes.

    The only time I notice a spike in performance is when 30 cars come into view with trackside cameras (LOD issue?)

    If anyone can recommend a test scenario for me to try to induce a crash, I am more than happy to try.

    My specs are as below

    GPU: GTX970 G1 @stock
    CPU: I7-4770k @4ghz
    MOBO:Z87x-ud3h
    PSU:850 gold rated.
    OS: WIN10

    Cheers.
     
  14. WiZPER

    WiZPER Member

    Joined:
    Oct 5, 2010
    Messages:
    1,521
    Likes Received:
    186
    I'm sure of it. Many tracks have absurd LOD multipliers for TV cams; VLM's Le Mans uses up to 8x. Say the LODA of a car is rendered at 0-50 meters: that suddenly becomes 400 meters instead. Now add up how many cars you can potentially have within this distance, all being rendered as HIGH poly objects. Not to mention that all max LOD values are also multiplied: say LODC is 500 meters, and 8x that...
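
The arithmetic in this post can be written out as a small sketch. It assumes the camera multiplier simply scales each LOD's maximum distance, as described above; the function names and the car distances in the example are invented, and the engine's real LOD selection may differ.

```python
def effective_lod_distance(base_m, cam_multiplier):
    """Maximum distance (m) at which an LOD is used once the TV cam
    multiplier is applied."""
    return base_m * cam_multiplier

def cars_at_high_detail(car_distances_m, loda_max_m, cam_multiplier):
    """Count cars close enough to the camera to be drawn with LODA."""
    limit = effective_lod_distance(loda_max_m, cam_multiplier)
    return sum(1 for d in car_distances_m if d <= limit)

# The two cases from the post:
#   effective_lod_distance(50, 8)  -> LODA out to 400 m
#   effective_lod_distance(500, 8) -> LODC out to 4000 m
# With invented car distances of 10, 60, 120, 390 and 450 m:
#   cars_at_high_detail([10, 60, 120, 390, 450], 50, 1) counts 1 car
#   cars_at_high_detail([10, 60, 120, 390, 450], 50, 8) counts 4 cars
```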

    Changing these does NOT fix these 900-series issues though, we've done extensive testing at VEC, so far there is absolutely no solid pattern in the crashes.
     
  15. WhiteShadow

    WhiteShadow Registered

    Joined:
    Feb 16, 2015
    Messages:
    681
    Likes Received:
    3
    Crashes also happen with ISI tracks, not only with 3pa tracks. I have two rigs: a Z87 MBO with an i7-4970k, which never crashes with a GTX 980Ti, and an X99 MBO with an i7-5960X, which crashes on every track, ISI or 3pa, with a GTX 980Ti. Both use the same driver and Win10 64 bit. To me it looks like the crashes are hardware related, and it seems to be a GPU power issue, not a PSU issue.
     
  16. WiZPER

    WiZPER Member

    Joined:
    Oct 5, 2010
    Messages:
    1,521
    Likes Received:
    186
    Also, not only RF2 users are affected by this, 9xx is FUBAR, sad to hear reports about 1080 carrying over the issue - very happy Zotac 780ti AMP user !! But was hoping for an upgrade...
     
  17. kermit

    kermit Registered

    Joined:
    Jan 11, 2012
    Messages:
    19
    Likes Received:
    0
    Yeah i can see how this could cripple performance on any card really.

    I do admit the 900 series is probably the worst series in regards to general issues I have seen in a while. Drivers have been a mess recently, not to mention the 3.5GB farce that Nvidia threw down our throats with the 970...

    Surely there has to be some common pattern in this.

    Just did a quick test to check GPU load. The most it hits is 98%, but it fluctuates down to 80% and lower when frames are more steady, with 3.1GB usage. The main difference I have compared to the OP is that my GPU power usage is somewhere in the region of 65-70%; I guess this could be due to my FPS cap.
     
  18. Tuttle

    Tuttle Technical Art Director - Env Lead Staff Member

    Joined:
    Feb 14, 2012
    Messages:
    2,480
    Likes Received:
    773
    It is clear there's a pattern for the 900 series. Nvidia had loads of RMAs in the early RTM phase, because of this problem (timeouts, gpu reboots, driver crash), which happens with very different situations and software. Just google "GTX 980 TDR" and you'll find hundreds of threads about any games, any software, any situation, getting 900 series (especially 980Ti/980) GPU reboots/TDR. Not saying rF2 does not demand more or less than other applications, but this should not really end in a GPU crash. I personally work the entire day, 365 days per year, with the rF2 engine at max settings, and I didn't get a single TDR with my GTX780 stock, nor with my old HD6870 in the low end test machine, nor with the R9M200X on a test laptop.

    I have the impression that people who are "solving" the problem by removing specific assets from a track are just keeping their GPUs out of that "stress" area, which seems to be handled in some erratic way. The stuff I'm reading here tends to confirm my theory.

    That said, we are still trying to work out whether there is some pattern of variables that replicates the problem, but to me it really looks like something happening at a low level, hardware and/or driver level, which is for sure not easy to sample.
     
  19. #55

    #55 Registered

    Joined:
    May 23, 2016
    Messages:
    7
    Likes Received:
    3
    Hi Tuttle,

    Are there any specific tests or suggestions that we as a community can try, to help you collate data and see whether this can be resolved with a workaround? I've got access to a variety of machines running GTX 970s, GTX 980s and 980 Tis. I'm sure others would be happy to spend a few moments of their time to help alleviate this issue.
     
  20. lamck

    lamck Registered

    Joined:
    Jun 9, 2013
    Messages:
    45
    Likes Received:
    2
    That triggered me to reduce AI to ~15 + no gpu overclock, no crash running ten+ laps at The Bugatti Circuit Update v0.95, will try again tomorrow.

    When using ~33 AI,
    Crash in 3-4 laps at The Bugatti Circuit.

    Crash in 5-10 laps at Le Mans 1991-1996, even frame rate limited to 75.

    980Ti SLI, triple mon.

    Edit: Tried 11 AI at Le Mans, still crashed in ~5 laps.
     
    Last edited by a moderator: May 28, 2016
