Gentoo Logo
Gentoo Spaceship

Get Started
Gentoo Handbook
Installation Docs
Downloads

News
Security Announcements
Monthly Newsletter

Documentation
Gentoo Handbook
Documentation List
IBM dW/Intel article archive
Developer's Manual

Get Gentoo
Downloads
Mirrors

Community
Discussion Forums
IRC Channels
Mailing Lists
Report Issues
Planet (Blogs)
Online Package Database
Contact Us
Supporting Vendors
Sponsors

Get Involved
Report Issues
Help Wanted
Discussion Forums
IRC Channels
Mailing Lists
Offer Resources
Enhancement Proposals (GLEPs)
Source Repositories
Developer's Manual

Other
Developer List
Developer Map
Gentoo Stores
Projects

About
About Gentoo
Philosophy
Social Contract
Name and Logo Guidelines
Logos and themes
Screenshots



AMD's official position: this is not a CPU bug
Posted on 23 Jan 2002 by Daniel Robbins

tux

Yesterday, Rik van Riel, William Lee Irwin and myself were able to discuss this issue of Athlon/AGP instability with AMD. This converation was very enlightening, to say the least.

Let's start with some facts. The issue that is being discussed here is not related to the so-called "INVPLG bug". So, what is the issue, and who's at fault? Yesterday, AMD offered us an explanation of what may be happening to cause AGP instability on Linux systems. This explanation appears to be the exact scenario that bit Windows 2000 in September 2000. AMD characterizes this issue as a cache coherency problem resulting from how other components on the motherboard (the GART in particular) interact with a particular performance-enhancing feature of the Athlon Processor called speculative writes. The GART and the CPU have two different views of memory -- the GART's view does not "see" the contents of the Athlon's internal caches, and herein lies the problem. In certain situations the CPU and the GART can step on each other's toes, resulting in memory corruption. For more information, see my Linux kernel mailing list post that contains an explanation of this particular problem in AMD's own words.

So, is the Linux kernel affected by this problem? It appears so. The GART and the CPU see two different views of memory, and it's the kernel's responsibility to map memory in such a way as to prevent bad interactions. Currently, that isn't happening. Third-party drivers such as the NVIDIA driver for Linux may also have some problems in this area. Now that the Linux kernel development community is aware of the issue, they have started the process of devising a good approach to avoid this memory corruption problem. Disabling extended paging via the mem=nopentium boot option is really not a good long-term solution since it can impact performance, so a better solution is needed.

I guess the big question on everyone's mind is this -- who's to blame for this very real problem? Strangely enough, no one party appears to be at fault here. But now that the problem is out in the open, the solution is clear. The Linux kernel's approach to memory management must become more sophisticated in order to address potential conflicts between the highly-speculative nature of Athlon processors and the non-cache-coherent AGP GART.

I'll be following the resolution of this issue closely and post information here as I get it.




Updated 23 Jan 2002

Donate to support our development efforts.

Support OSL
Gentoo Centric Hosting: vr.org
Tek Alchemy
SevenL.net
Global Netoptex Inc.
Bytemark
Online Kredit Index
Copyright 2001-2009 Gentoo Foundation, Inc. Questions, Comments? Contact us.