
21,040 Posts
Discussion Starter #1
Jason Perlow

The new year has indeed started out with a bang for the computer industry.

Two highly publicized security flaws in the Intel x86 chip architecture have now emerged. They appear to affect other microprocessors made by AMD and designs licensed by ARM.

And they may be some of the worst computer bugs in history -- if not the worst -- because they exist in hardware, not software, and in systems that number in the billions.

These flaws, known as Meltdown and Spectre, are real doozies. They are so serious and far-reaching that the only potential fix in the immediate future is a software workaround that, when implemented, may slow down certain types of workloads as much as 30 percent.

In fact, the potential compromise to the affected systems is so widespread that the flaws are exhibited in the fundamental systems architecture of the chips themselves, and they may have been around in some form since 1995.

That's going back to when Friends was the hottest show on TV. And I still had hair and was freshly married. Oh, and most of us were using Windows 3.1.

The bloodline has to die out entirely

Without going into detail about exactly how these flaws present themselves -- because the explanation is highly technical in nature and you need to be a chip weenie to really grok it -- let's just say that they exploit certain basic functions used by modern microprocessors to optimize performance.

It's very much analogous to DNA. DNA provides the blueprint and firmware programming for how an organism functions at a very basic level.

If you have a flaw in your DNA, you have a genetic disease. You can try to mitigate it with various treatments and medications, but you can't really cure it. Well, you have stuff like CRISPR, but there's no hardware equivalent to that.

Essentially, the only cure -- at least today -- is for the organism to die and for another one to take its place. The bloodline has to die out entirely.

The organism with the genetic disease, in this case, is Intel's x86 chip architecture, which is the predominant systems architecture in personal computers, datacenter servers, and embedded systems.

Ten years ago, I proposed that we wipe the slate clean with the Intel x86 architecture. My reasoning had much to do with the notion that, at the time, Linux was gaining in popularity and the need for continuing compatibility with Windows-based workloads in the datacenter and on the desktop (ha!) was becoming less and less of a hard requirement.

What has transpired in 10 years? Linux (and other related FOSS tech that forms the overall stack) is now a mainstream operating system that forms the basis of public cloud infrastructure and the foundational software technology in mobile and Internet of Things (IoT).

Virtualization is now widespread and has become standard business practice for large-scale enterprise systems' design and scalability.

Containerization is now looking to augment and eventually replace virtualization for further growth and improved security in a multi-tenant, highly micro-segmented network future driven by DevOps and large-scale systems' automation.

Since 2008, Microsoft has embraced open source and is successfully pivoting from being the Windows company to being the Azure/Office 365 cloud company that writes cloud-exploiting application software for not just Windows, but also Linux, iOS, and Android.

All these advances are not necessarily tied to compatibility with x86. If anything, they potentially free us from writing this type of dependent code because of the levels of abstraction and portability that we now have at our disposal.

Despite these advances, our dedication to this aging but beloved pet -- the x86 systems architecture -- has not waned. We have been giving it all sorts of medical treatment over the years, now going on four decades, to keep it alive.

The question is not so much whether we should put Old Yeller Inside to sleep. It's what breed of puppy we replace him with. Another purebred prone to additional genetic defects? Or something else?

We need to stop thinking about microprocessor systems' architectures as these licensed things that are developed in secrecy by mega-companies like Intel or AMD or even ARM.

Sun had the right idea

In 2008, when I wrote the precursor to this article, the now-defunct Sun Microsystems -- whose intellectual property assets are owned today by Oracle -- decided to open-source a chip architecture, the OpenSPARC T2.

The concept at the time did not exactly fly and didn't get any real takers. What has since happened to Sun in its absorption by Oracle has been less than pleasant for all the parties involved, and given the extremely litigious nature of the company, it is understandable why nobody has latched onto OpenSPARC.

However, despite the history, I think Sun had the right idea at the time. We need to develop a modern equivalent of an OpenSPARC that any processor foundry can build upon without licensing of IP, in order to drive down the costs of building microprocessors at immense scale for the cloud, for mobile and the IoT.

It makes the $200 smartphone as well as hyperscale datacenter lifecycle management that much more viable and cost-effective.

Just as Linux and open source transformed how we view operating systems and application software, we need the equivalent for microprocessors in order to move out of the private datacenter rife with these legacy issues and into the green field of the cloud.

This would have more benefits than just providing a systems architecture that can be molded and adapted as we see fit in the evolving cloud. The fact that we have these software technologies that now enable us to easily abstract from the chip hardware enables us to correct and improve the chips through community efforts as needs arise.

We need to create something new

Indeed, there are some risks, such as forking, which has been known to plague open-source systems -- but, more often than not, it creates an ecosystem of competition between the well-run communities and the bad ones.

And, more often than not, the good ones emerge as the standards that get embraced.

I cannot say definitively what architecture this new chip family needs to be based on. However, I don't see ARM donating its IP to this effort, and I think OpenSPARC may not be it either.

Perhaps IBM OpenPOWER? It would certainly be a nice gesture of Big Blue to open their specification up further without any additional licensing, and it would help to maintain and establish the company's relevancy in the cloud going forward.

RISC-V, which is being developed by UC Berkeley, is completely Open Source.

The reality is that we now need to create something new, free from any legacy entities and baggage that have been driving the industry and dragging it down for the past 40 years. Just as was done with Linux.

Do we need a new open-source microprocessor architecture for the cloud-centric future? Talk Back and Let me Know.

116 Posts
Meh... speculative execution has been a popular performance optimization, and that combined with caching caused these two vulnerabilities. It's not clear to me that an open source chip design would have prevented the vulnerability.

Nor is it clear that an open source chip design would have sufficient funding to compete with Intel/AMD/etc. Squeezing more performance out of existing chip technology has proven challenging even for companies with huge budgets. ARM comes closest to the open source approach, since they make money by licensing their chip designs (at millions of dollars a pop).

I'm not against the idea of an open-source chipset, but I don't see that it would have prevented this particular set of issues or how it would work long term in practice.

Also, I disagree that "you have to be a chip weenie to grok" the issue. I'm a programmer, and it's pretty easy to comprehend.

21,040 Posts
Discussion Starter #3

Meltdown and Spectre Expose the Dark Side of Superfast Computers

Larry Greenemeier
Hundreds of gadget makers and software companies at this week’s annual Consumer Electronics Show (CES) in Las Vegas are staking the success of their newest products on the latest and greatest processors from Intel, AMD, Arm and others. But those bets are looking shaky even by Sin City’s standards, after last week’s bombshell that many of those processors are plagued by serious security vulnerabilities known as Meltdown and Spectre.

Processors lend a degree of intelligence to just about any electronic device—including the thousands of automobiles, home appliances and gaming systems displayed at the exhibition. It is now clear that the insatiable need for faster processors has had a dark side, as chipmakers cut corners on security, exposing potentially billions of personal computers, mobile devices and other electronics to a new crop of digital attacks for years to come.

Every computer relies on a piece of software known as a kernel to, among other things, manage the interactions between end-user applications—spreadsheets, Web browsers, etcetera—and the underlying central processing unit and memory. The kernel starts and stops the other programs, enforces security settings and restricts access to a device’s memory and data resources. Not surprisingly, the kernel’s speed determines how fast the computer performs as a whole. Chipmakers protect the kernel by isolating it from other programs running on the computer, unless those programs are given specific permission—or “privilege”—to access the kernel.

Meltdown dissolves that isolation, potentially letting an attacker’s malicious software breach the kernel and steal whatever information it finds there, including personal data and passwords. Spectre impairs the kernel’s ability to stop a malicious program from stealing data from other software that uses the processor. Researchers working independently at Google Project Zero, security vendor Cyberus Technology and Graz University of Technology in Austria coordinated their announcement of Meltdown last week. Spectre’s existence surfaced around the same time, courtesy of investigations carried out separately by multiple educational institutions, cybersecurity research firms and noted cryptographer Paul Kocher.

Kocher, best known for his work helping revise a set of cryptographic protocols used to secure computer networks, spoke with Scientific American about how shortcuts for increasing processor speeds led to both vulnerabilities, why it took decades to find these bugs and how to protect computers from attack.

[An edited transcript of the conversation follows.]

How did chipmakers compromise computer security to create faster processors?

The underlying issue is that processor clock speeds have largely maxed out. If you want to make a processor faster, you have to get more work done per clock cycle. The other thing that isn’t changing much is the speed of memory. Optimizations, then, become the key to speed increases when designing a computer processor. For example, if your processor comes to a spot where it’s waiting for information from memory, you don't want to have the processor sit idle until the data come back from memory. Instead, the processor can speculate on the information it will receive and begin working ahead rather than waiting. When the processor guesses right, it gets to keep this extra work; the processor’s speculative execution gives a significant performance boost. Under normal circumstances, the percentage of work that is lost because the processor makes the wrong guess and has to backtrack is in the single digits. This optimization has been part of the standard playbook of how to make a fast processor for many years.
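The trade-off Kocher describes can be put into rough numbers. The following is only a toy cost model, not a description of any real processor: the function name `average_cycles` and every figure in it (a 200-cycle memory wait, a 95 percent guess rate, a 20-cycle rollback cost) are assumptions chosen for illustration.

```python
def average_cycles(stall_cycles, guess_accuracy, rollback_penalty):
    """Expected cycles spent per memory-dependent step in a toy model.

    Without speculation the core always waits `stall_cycles`.
    With speculation a correct guess hides the wait entirely, while a
    wrong guess pays the full wait plus a rollback penalty.
    """
    without_speculation = stall_cycles
    with_speculation = (1 - guess_accuracy) * (stall_cycles + rollback_penalty)
    return without_speculation, with_speculation


# All three numbers are illustrative assumptions, not measurements.
baseline, speculative = average_cycles(
    stall_cycles=200, guess_accuracy=0.95, rollback_penalty=20
)
print(f"always wait: {baseline} cycles, speculate: {speculative:.1f} cycles")
```

Under these assumed numbers the speculating core spends far fewer cycles waiting on average, which is why the optimization is so hard to give up.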

What vulnerabilities did speculative execution create?

Meltdown and Spectre both involve this shortcut, but they work in different ways and have very different implications. Meltdown leverages an issue where ordinary unprivileged code can read memory with kernel permissions. This lets an attacker who can run software on the computer read the entire contents of the physical memory—which, for example, is a big problem in cloud servers where multiple clients share the same server.

Spectre, in contrast, doesn’t involve any privilege escalation issues. Instead it takes advantage of permissions that the code being attacked already has, but tricks the user’s system to do something speculatively that the program would never have done legitimately, and doing so leaks memory contents.
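The idea can be illustrated with a deliberately simplified simulation. This is a toy model and not an exploit: there is no real cache, timing or branch predictor here, the class `ToyCPU` and the function `leak_byte` are hypothetical names, and the `mispredict` flag simply stands in for the attacker having tricked the processor into guessing that a bounds check will pass.

```python
class ToyCPU:
    """Toy model only: a set of 'cached' lines replaces real cache timing."""

    def __init__(self, memory, array_len):
        self.memory = memory        # public array followed directly by a secret
        self.array_len = array_len  # how much of memory the victim may read
        self.cached_lines = set()   # probe lines left "hot" by earlier loads

    def victim_read(self, index, mispredict=False):
        """Victim logic: if index < array_len, touch probe[memory[index]]."""
        in_bounds = index < self.array_len
        if in_bounds or mispredict:
            # The load runs legitimately or speculatively. Its cache
            # footprint remains even when the result is thrown away.
            self.cached_lines.add(self.memory[index])
        # Architecturally, an out-of-bounds request returns nothing.
        return self.memory[index] if in_bounds else None

    def flush(self):
        self.cached_lines.clear()

    def is_cached(self, line):
        """Stands in for the attacker timing an access to one probe line."""
        return line in self.cached_lines


def leak_byte(cpu, index):
    """Recover one byte by checking which probe line became cached."""
    cpu.flush()
    cpu.victim_read(index, mispredict=True)
    hot_lines = [line for line in range(256) if cpu.is_cached(line)]
    return hot_lines[0] if len(hot_lines) == 1 else None


public = list(b"public!!")
secret = list(b"secret")
cpu = ToyCPU(memory=public + secret, array_len=len(public))
recovered = bytes(leak_byte(cpu, len(public) + i) for i in range(len(secret)))
print(recovered)
```

In this model the victim never returns an out-of-bounds value, yet the attacker still reconstructs the secret from the side effect alone, which is the essence of what Kocher describes.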

What prompted your research that ultimately uncovered Spectre?

The timing was that I had sold my company (Cryptography Research) and had time to get my hands dirty doing research. What originally got me working on this was the question: Where have we made tradeoffs between performance and security, and added complexity, in ways where security was not the top priority? I was completely unaware of Google Project Zero’s work on this until after I reported the issue and fully implemented the exploits in my [research] paper. The timing is a coincidence, although it’s actually more surprising these vulnerabilities weren’t found a long time ago.

The fact that Meltdown was discovered by several groups of researchers working independently makes more sense, partly because there was a post online by [security researcher] Anders Fogh, in which he had investigated the issue but did not find the problem. He then published a description of his work, which got other people thinking about the problem. There was also work on a patch set named KAISER (short for Kernel Address Isolation to have Side-channels Efficiently Removed) to address attacks against KASLR (kernel address space layout randomization, a security technique to make exploits harder by placing various objects at random, rather than fixed, addresses). That turns out to be the fix for Meltdown, so it also drew attention to the issue.

Why didn’t Intel or the other chipmakers discover the problem first?

That’s a fair question. The mess they’re dealing with right now is vastly more painful than what they would have gone through if they had found and fixed the problem immediately. They would have been in a much better position than external researchers to find these vulnerabilities, since they know in intimate detail how all of the technology works—whereas I was poking around without any inside knowledge. For Spectre in particular, it’s also worth asking why it wasn’t found by Arm or more generally computer scientists who were teaching speculative execution in microprocessor academic courses. One answer might be that people tend to look at the things they know to look at, and Spectre involves a problem that cuts across different disciplines, where people working on one aspect of a technology don’t know as much about other aspects.

Still, if a few security people had looked closely at speculative execution and considered different ways it could go wrong, I think a significant number of them would have realized that it was a dangerous idea. I suspect that a lot more unpleasant things are lurking under the covers that we’re not seeing because questions about the security implications are not being asked.

What does this mean for the future of chip design?

I think there’s a bit of a fog of war right now that makes it hard to come up with a clear answer. There are different, very high-level things that chipmakers can do. One of them is to leave it to software developers to deal with complex countermeasures, but I think this will largely fail since developers are not equipped to do this. Another is to build chips that combine cores with different security properties, with faster execution units and safer execution units on the same die. If you’re playing a game on your mobile device, the big processor core is running. If your phone receives a data packet while it’s asleep, the smaller core handles that.

Most of the really critical security applications don’t need a lot of performance. If you’re doing a wire transfer, for example, it doesn’t take a lot of [computing] power to get the user’s confirmation, to do the cryptography and to transmit the result. If you’re playing a video game, you really don’t need that kind of security but it does need the best performance. Still, it’s too early to say what things will look like in 10 or even five years from now.

How do the patches for these vulnerabilities slow a processor’s performance?

The fix for Meltdown changes the way kernel memory is mapped. The way things worked before, the switch between normal user code and the kernel was a very lightweight transition. With the patches, more work has to be done on these transitions. The impact on performance will depend on the type of workload. If almost all of a program’s [number crunching] is done in user mode, with very few switches to kernel mode, you won’t see any significant impact. In contrast, if you have a piece of code that spends a lot of its time quickly switching back and forth between user and kernel modes, such as reading very small bits of information from files stored on a very fast disk, the patch ends up adding a lot of overhead.
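A rough way to see why the workload matters is to count how many trips into the kernel the same job takes when it is done in tiny pieces versus one large piece. The sketch below is an illustration under stated assumptions, not a benchmark: `read_file` and `make_sample_file` are hypothetical helper names, the 64 KB file size is arbitrary, and the timings it prints will vary by machine and say nothing definitive about any particular patch.

```python
import os
import tempfile
import time


def read_file(path, chunk_size):
    """Read a whole file with os.read and count the read() calls made."""
    fd = os.open(path, os.O_RDONLY)
    total_bytes = 0
    syscalls = 0
    try:
        while True:
            chunk = os.read(fd, chunk_size)  # one user-to-kernel transition
            syscalls += 1
            if not chunk:                    # empty result means end of file
                break
            total_bytes += len(chunk)
    finally:
        os.close(fd)
    return total_bytes, syscalls


def make_sample_file(size):
    """Create a temporary file of `size` bytes and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(b"x" * size)
    return path


if __name__ == "__main__":
    sample = make_sample_file(64 * 1024)
    try:
        for chunk_size in (1, 64 * 1024):
            start = time.perf_counter()
            total, calls = read_file(sample, chunk_size)
            elapsed = time.perf_counter() - start
            print(f"chunk={chunk_size} bytes, read() calls={calls}, {elapsed:.4f}s")
    finally:
        os.remove(sample)
```

Both runs read the same data, but the one-byte version crosses into the kernel tens of thousands of times, so any extra cost added to each transition is multiplied accordingly.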

How does the Spectre fix work?

Spectre can only be mitigated—not fixed—at this time, because the flaw impacts a lot of different software including operating systems, drivers, web servers and databases. Some of this software—such as drivers—is rarely updated. A lot of the software is also very complex, so developing fixes is a Herculean issue. For some CPUs there are also some microcode patches that help mitigate one of the aspects of Spectre, but there is a staggering amount of work required to get all this to work correctly. Even worse, Spectre involves things that are very hard to detect in the processor, making testing of countermeasures really, really difficult. Some chips will just have to be replaced because they can’t be updated. As a result, this is an issue that’s going to be with us for a long time.

How can computer users protect themselves?

On the Meltdown side, there are a number of fixes that can be implemented at the operating-system level—for example in MacOS, Windows, Linux, and iOS operating system updates. If you look at the trajectory of Meltdown, there are certainly attackers who can and will attack unpatched computers, so it’s urgent to get that patch applied. Once the patch is installed, so far as we know, Meltdown ceases to be a security threat.
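On Linux specifically, patched kernels report their own view of these issues through files under `/sys/devices/system/cpu/vulnerabilities`. That detail is not from the interview, and the helper below (`mitigation_status` is a hypothetical name) is only a minimal sketch: on other operating systems or older kernels the files will not exist, and it simply reports that the status is unknown.

```python
from pathlib import Path

# Directory where recent Linux kernels report vulnerability status.
SYSFS_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def mitigation_status(names=("meltdown", "spectre_v1", "spectre_v2"), base=SYSFS_DIR):
    """Return the kernel's reported status for each named issue."""
    status = {}
    for name in names:
        entry = Path(base) / name
        try:
            status[name] = entry.read_text().strip()
        except OSError:
            # Missing file: not Linux, or a kernel that predates these reports.
            status[name] = "unknown (file not available)"
    return status


if __name__ == "__main__":
    for issue, report in mitigation_status().items():
        print(f"{issue}: {report}")
```

The output is whatever the running kernel reports, so it reflects the operating system's own assessment rather than an independent test of the hardware.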

With Spectre, on the other hand, there are partial mitigations. Some chips can’t be updated, others could be updated but the company that made the product with the chip won’t bother to pass the updates along. Often the updates won’t address all of the problems. Still, it’s important to keep the risk in perspective. There are plenty of other security threats that we live with every day. Spectre isn’t necessarily more dangerous than other vulnerabilities already causing security pros enormous headaches. With Spectre you’re tricking the processor into making wrong predictions at a particular spot in the software running on the victim’s computer. The attackers need to have awareness of the software code that the victim is using, and set up the right conditions for the attack to work. As a result, there isn’t a single attack program that will work across many computers.

What is the most important take-away message from the discoveries of Meltdown and Spectre?

The bigger picture relates to how we’re approaching security, and our inability to make systems secure even in the places where security is most needed. These bugs are a symptom of that problem. When you optimize for objectives, such as speed, that interfere with security, you can reasonably expect that you’re going to end up with problems. Spectre is a very clean example of a security/performance trade-off, where speed optimizations led directly to security problems. The fact that these security vulnerabilities affect all of the major microprocessor manufacturers really indicates that there has been a failure of thought and attention, rather than a specific error that an individual or even a single company has made.

Larry Greenemeier
Larry Greenemeier is the associate editor of technology for Scientific American, covering a variety of tech-related topics, including biotech, computers, military tech, nanotech and robots.