So I thought I would add some information about why these changes to Windows and Linux will cause a large performance hit.
TLDR: The fix involves flushing a hardware cache on the CPU whenever the OS kernel is invoked, and CPU caches are much faster than accessing your RAM, so each time the OS kernel is invoked it is going to take significantly more time than before these patches.
To explain this I first need to explain virtual memory. Basically, virtual memory is a way of letting every running application in a system pretend it has the entire addressable memory space to itself. So say Program A has data stored in physical memory addresses 1113, 1114, and 1115, while Program B has data stored in physical memory addresses 4322, 4323, and 4324. With virtual memory each program might "see" those addresses as addresses 10, 11, and 12. Thus there is no need for them to worry about where stuff is actually located in physical memory; they only need to know where things are located in their own virtual address space.

On the CPU there is a module called the Translation Lookaside Buffer. Its job is to act as a cache of mappings between virtual addresses and real physical addresses. The operating system maintains its own map of virtual addresses to physical addresses, but the TLB is there so that an expensive memory lookup doesn't need to take place every time the CPU needs to work out which physical address a virtual address actually refers to.
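To make the TLB idea concrete, here is a toy sketch in C of a translation cache sitting in front of a slow page-table walk. The sizes, the "page table", and the mapping scheme here are completely made up for illustration; real hardware is far more sophisticated.

```c
/* Toy model of a TLB: a tiny cache of virtual->physical page mappings
 * consulted before falling back to a (much slower) page-table walk.
 * Purely illustrative; nothing here matches real hardware. */
#include <stdio.h>
#include <stdint.h>

#define TLB_ENTRIES 4

struct tlb_entry {
    uint64_t virt_page;   /* virtual page number */
    uint64_t phys_page;   /* physical page number it maps to */
    int      valid;
};

static struct tlb_entry tlb[TLB_ENTRIES];

/* Stand-in for a page-table walk: in a real system this means
 * several extra memory accesses. */
static uint64_t page_table_walk(uint64_t virt_page)
{
    printf("  TLB miss: walking page table for virtual page %llu\n",
           (unsigned long long)virt_page);
    return virt_page + 1000;   /* made-up physical page for the demo */
}

static uint64_t translate(uint64_t virt_page)
{
    int slot = (int)(virt_page % TLB_ENTRIES);   /* trivially direct-mapped */
    if (tlb[slot].valid && tlb[slot].virt_page == virt_page)
        return tlb[slot].phys_page;              /* TLB hit: fast path */

    uint64_t phys_page = page_table_walk(virt_page);
    tlb[slot] = (struct tlb_entry){ virt_page, phys_page, 1 };
    return phys_page;
}

int main(void)
{
    translate(10);   /* miss: fills the TLB */
    translate(10);   /* hit: no page-table walk this time */
    return 0;
}
```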
Next, to oversimplify, you need to understand that the CPU can be in one of two modes: user mode or kernel mode. User mode is what your applications run in, while things like the OS kernel and your device drivers run in kernel mode. While in user mode, certain CPU functions and memory addresses cannot be accessed. In order to access the functionality of the kernel, the application makes system calls: special functions exposed by the kernel that switch the CPU into kernel mode, perform kernel operations like sending data across the network or accessing the disk, then switch back to user mode and return to the user process.
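For example, on Linux you can invoke system calls directly through the raw syscall() wrapper; each call below is one round trip into kernel mode and back. (Linux-specific sketch.)

```c
/* Each of these calls traps into the kernel (user mode -> kernel mode),
 * the kernel does the work, then execution returns to this process in
 * user mode. */
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
    /* getpid() is about the cheapest system call there is: the kernel
     * just reads the current process id and returns it. */
    long pid = syscall(SYS_getpid);
    printf("kernel says our pid is %ld\n", pid);

    /* write() is also a system call: the kernel copies the buffer and
     * hands it off to the terminal driver. */
    const char msg[] = "hello from a system call\n";
    syscall(SYS_write, 1, msg, sizeof(msg) - 1);
    return 0;
}
```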
Now, what happens in Windows and Linux is that the kernel memory space is mapped into every user process's virtual memory. Because of the kernel mode/user mode access restrictions, a user mode process can't read the portions of virtual memory occupied by the kernel, but they are there. The main benefit of this is that whenever a user->kernel or kernel->user switch occurs, the virtual memory mappings don't need to be swapped out, and even more importantly the kernel page mappings are usually already present in the CPU's Translation Lookaside Buffer. CPUs are fast and main memory access is comparatively quite slow, so you want CPU caches like the TLB to be used as much as possible.
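You can see the access-restriction half of this from user space: touching an address in the kernel's half of the address space from an ordinary program just faults. A small Linux/x86-64 sketch follows; the specific kernel-space address used is only a plausible example for illustration, not anything meaningful.

```c
/* Demonstrates the user/kernel access restriction: the kernel occupies
 * high virtual addresses in our address space, but reading one of those
 * addresses from user mode faults. */
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
#include <stdint.h>

static sigjmp_buf env;

static void on_segv(int sig)
{
    (void)sig;
    siglongjmp(env, 1);   /* jump back out of the faulting read */
}

int main(void)
{
    signal(SIGSEGV, on_segv);

    /* A typical x86-64 kernel-space address, chosen arbitrarily. */
    volatile uint8_t *kernel_addr = (volatile uint8_t *)0xffffffff81000000ULL;

    if (sigsetjmp(env, 1) == 0) {
        uint8_t value = *kernel_addr;   /* user mode read of kernel space */
        printf("read %u (should not happen)\n", value);
    } else {
        printf("faulted: user mode cannot read kernel addresses\n");
    }
    return 0;
}
```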
Now to the Intel CPU issue. I don't know all the details, but it seems to involve one particular piece of CPU functionality: speculative execution. Your CPU doesn't execute instruction A, wait until it is done, then execute instruction B, and so on. Processing each instruction takes time, so your CPU pipelines the instructions. If each instruction has, say, 5 phases before it is done executing: instruction A finishes phase 1 and starts phase 2 while instruction B starts phase 1; then instruction A finishes phase 2 and starts phase 3 while instruction B finishes phase 1 and starts phase 2 and instruction C enters phase 1, and so on. This is a pipelined architecture, where the processing of instructions works a lot like an industrial assembly line. Now imagine instruction B is a conditional branch. The next instruction that should be processed after B depends on the final result of B, so how do you pipeline it? The CPU tries to predict which way the branch will go and just starts speculatively executing the instructions down that execution path. If it turns out the branch went the other way, it has to discard all that work and start over, but if it guessed right, it already has a bunch of instructions well on the way to completion.
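A cheap way to see branch prediction at work is the classic sorted-vs-unsorted loop: the branch is taken exactly as often either way, but with sorted data the predictor almost always guesses right and the loop runs noticeably faster. Rough sketch, not a rigorous benchmark; exact timings will vary by machine and compiler flags.

```c
/* Sum the same data with a data-dependent branch, once with the values
 * in random order and once sorted. The sorted pass makes the branch
 * highly predictable, so the pipeline rarely has to throw away
 * speculative work. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1 << 20)
#define REPEATS 100

static long sum_big_values(const int *data, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        if (data[i] >= 128)       /* the branch the CPU tries to predict */
            sum += data[i];
    return sum;
}

static int cmp_int(const void *a, const void *b)
{
    return *(const int *)a - *(const int *)b;
}

static double time_it(const int *data, int n)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    long total = 0;
    for (int r = 0; r < REPEATS; r++)
        total += sum_big_values(data, n);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    volatile long sink = total;   /* keep the work from being optimized away */
    (void)sink;
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    int *data = malloc(N * sizeof(int));
    for (int i = 0; i < N; i++)
        data[i] = rand() % 256;

    printf("random branches:      %.3f s\n", time_it(data, N));
    qsort(data, N, sizeof(int), cmp_int);
    printf("predictable branches: %.3f s\n", time_it(data, N));

    free(data);
    return 0;
}
```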
So what is the issue? It seems to be that Intel CPUs don't enforce the ring 0 kernel mode protections when memory is accessed during speculative execution. So if your user mode code tries to access kernel virtual address 10 while being executed speculatively, rather than the CPU throwing a fault, the TLB will happily give it the physical memory address associated with it and the access goes ahead. This lets exploit writers effectively map where stuff is located in the kernel. I'm not sure if this bug allows them to actually read the contents of kernel memory, or if it only lets them resolve where kernel data sits in physical memory.
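For what it's worth, the published attacks turn the speculative access into something readable using a cache timing side channel: the speculatively-read byte is used as an index into a probe array, and the attacker then times accesses to the array to see which line got pulled into the cache. Below is a sketch of just that measurement half (Flush+Reload), with an ordinary load standing in for the speculative one; a real attack additionally needs the speculative kernel read, exception suppression, lots of retries, and careful thresholds.

```c
/* Flush+Reload sketch: flush a probe array from the cache, let one entry
 * get touched, then time every entry to see which one is now fast. The
 * "secret" access here is an ordinary load for demonstration purposes.
 * x86-only; build with gcc/clang. */
#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>

#define LINES  256
#define STRIDE 4096          /* one page per value so lines don't share */

static uint8_t probe[LINES * STRIDE];

static uint64_t time_access(volatile uint8_t *p)
{
    unsigned int aux;
    uint64_t start = __rdtscp(&aux);
    (void)*p;
    uint64_t end = __rdtscp(&aux);
    return end - start;
}

int main(void)
{
    uint8_t secret = 42;     /* stand-in for a byte the attacker wants */

    /* 1. Flush every probe line out of the cache. */
    for (int i = 0; i < LINES; i++)
        _mm_clflush(&probe[i * STRIDE]);

    /* 2. The access that depends on the secret value. */
    volatile uint8_t sink = probe[secret * STRIDE];
    (void)sink;

    /* 3. Reload: the line indexed by the secret is the fast one.
     * (Real implementations randomize the probe order and compare
     * against a threshold instead of taking the minimum.) */
    uint64_t best_time = UINT64_MAX;
    int best_index = -1;
    for (int i = 0; i < LINES; i++) {
        uint64_t t = time_access(&probe[i * STRIDE]);
        if (t < best_time) {
            best_time = t;
            best_index = i;
        }
    }
    printf("fastest probe line: %d (secret was %d)\n", best_index, secret);
    return 0;
}
```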
What is the solution, and why does it suck? Like I said before, caches like the TLB are important because memory access is slow compared to the CPU. The solution is to force a full context switch whenever a kernel system call is made: the kernel is removed from the virtual address space of the user process, and every time a system call happens you have to flush the TLB and load a new page table for the kernel. These are comparatively slow memory operations that significantly increase the time taken by each system call. Any process that makes many system calls will be hit hard by this. Overall system performance will suffer too, however, since hardware interrupts (such as, say, receiving a network packet) also force the switch into the kernel. Hell, pressing a key on your keyboard causes a hardware interrupt, although I don't think any human could ever type fast enough to see a measurable difference in performance from keyboard interrupts.
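If you want to get a feel for the cost, one crude approach is to time a tight loop of a near-trivial system call; syscall-heavy workloads pay roughly this per-call overhead on every transition. Linux sketch; on kernels that support it, comparing runs with the page-table isolation patches enabled versus booting with pti=off shows the difference (with the obvious caveat that turning it off removes the protection).

```c
/* Crude per-syscall cost measurement: time a tight loop of getpid()
 * invoked through the raw syscall() wrapper, so each iteration is one
 * user->kernel->user round trip. */
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#define ITERATIONS 1000000

int main(void)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    for (int i = 0; i < ITERATIONS; i++)
        syscall(SYS_getpid);

    clock_gettime(CLOCK_MONOTONIC, &t1);
    double elapsed = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%d system calls in %.3f s (%.0f ns per call)\n",
           ITERATIONS, elapsed, elapsed / ITERATIONS * 1e9);
    return 0;
}
```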