Fiction | Overcoming the limitations of computer memory




Today we continue the story of computer memory, which began with an account of its hierarchy. Modern processors work at incredible speed, but all that power runs up against the limits of sluggish memory. Had engineers not devised techniques to overcome these annoying restrictions, even a powerful processor would spend most of its time waiting for the data it requested from storage, and developing fast, powerful chips would be pointless. Computer professionals already know all this, of course, but the material may be informative for the millions of sincere high-tech enthusiasts.

It is not possible to cover every kind of storage device in the world in a single overview. We will therefore focus on the types of memory found in most personal computers today: the first-level (L1) and second-level (L2) caches, system RAM, and virtual memory, which lives on the hard disk. Why does an ordinary computer need so many different kinds of memory? To answer that question, we need to say a little about what each type is for.
Slow and cheap virtual memory on your hard drive

A modern computer has a powerful processor, but that power is wasted if the data store it relies on is slow. A processor that had to deal with slow memory would spend most of its time waiting for the storage device to respond. A current processor needs billions of bytes of data per second.

Once the industry crossed the 1-gigahertz mark in processor speed, it ran into a problem: memory that can keep up with such a processor is very expensive. But no obstacle holds back technological progress for long. A solution was found: modern machines combine a small amount of fast, expensive memory with larger amounts of slower, cheaper storage.

The cheapest (and slowest) type of rewritable computer memory is the hard drive. Access to it is slow, but it provides large, permanent storage for reasonable money. The cost of storing one megabyte on disk is negligible, but reading that megabyte back takes significantly longer than it would from a more expensive (and faster) medium. Because the hard disk is cheap and capacious, it forms the lowest level of the memory hierarchy that the processor accesses, known as virtual memory.

One step higher in the hierarchy is random access memory (RAM). We have already discussed how this type of memory works. Some details stayed behind the scenes in that earlier story, however, and they have less to do with RAM itself than with the CPU and how it interacts with RAM. Today we shed light on those details.

A processor’s bit width tells us how many bits of information held in memory it can access at once. Take the legacy 16-bit processors as an example: they could work with two bytes of data at a time (1 byte = 8 bits, so 16 bits = 2 bytes). By the same reasoning, modern 64-bit processors handle 8 bytes of data at a time.

A processor’s speed is measured in megahertz (MHz, millions of cycles per second) and gigahertz (GHz, billions of cycles per second), that is, in how many processing cycles it can execute in one second. To avoid drowning in unimaginably large numbers, take as an example the old Pentium III processor (once the peak of perfection and power) with a clock frequency of 800 megahertz. Its 32-bit architecture means that it can work with four bytes of information at a time. Not impressed? Consider that it can perform this operation 800 million times per second. Keeping up with such a processor and supplying it with data on time is no easy task for RAM. If the memory falls behind, all the chip’s amazing capabilities sit idle while it waits for the next portion of bytes.
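The Pentium III arithmetic above can be written out as a small sketch. It assumes the idealized case in which the processor consumes one full word on every clock cycle; the function name is made up for illustration.

```python
def peak_demand_bytes_per_second(word_bits: int, clock_hz: int) -> int:
    """Upper bound on the bytes a processor could consume per second.

    Assumption: one full word is used on every clock cycle, which is
    an idealization rather than a measured figure.
    """
    bytes_per_word = word_bits // 8  # 8 bits per byte
    return bytes_per_word * clock_hz


# 32-bit Pentium III at 800 MHz: 4 bytes, 800 million times per second
pentium_iii = peak_demand_bytes_per_second(word_bits=32, clock_hz=800_000_000)
print(pentium_iii)  # 3200000000
```

In other words, under this assumption the memory would have to deliver about 3.2 billion bytes every second to keep the chip busy.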

System memory alone cannot cope with this problem. One more type of ultra-fast data store is needed: the cache (which we discuss below). Still, the faster the memory, the better for the system as a whole. Read and write speed depends on the type of RAM used in the computer. So let us return to main memory, this time looking at its specifications.
System memory

The speed of system memory is determined by the bandwidth of its bus. Bus bandwidth, in turn, is determined by two things. The first is bus width: the number of bits that can be sent to the CPU simultaneously. The second is how many times per second such a set of bits can be sent, a number called the bus clock rate (bus speed). One bus cycle is one transfer of data from RAM to the processor.

For example, a 32-bit, 100-MHz bus is theoretically capable of transmitting 4 bytes (32 bits divided by 8 bits per byte) 100 million times per second. A 16-bit, 66-MHz bus can transmit only 2 bytes at a time, 66 million times per second. Simple arithmetic shows that the first bus delivers roughly three times as much data to the processor each second (400 million bytes versus 132 million bytes).
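The comparison of the two buses is easy to check with a few lines of code. This is only a sketch of the theoretical peak figure (width times clock rate); the function and variable names are invented for the example.

```python
def bus_bandwidth(width_bits: int, clock_hz: int) -> int:
    """Theoretical peak bus bandwidth in bytes per second."""
    return (width_bits // 8) * clock_hz  # bytes per transfer * transfers per second


fast_bus = bus_bandwidth(width_bits=32, clock_hz=100_000_000)
slow_bus = bus_bandwidth(width_bits=16, clock_hz=66_000_000)

print(fast_bus)  # 400000000
print(slow_bus)  # 132000000
print(round(fast_bus / slow_bus, 2))  # 3.03
```

The ratio comes out at just over three, which matches the "roughly three times" estimate.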

In reality, of course, memory does not usually run at full capacity. It is time to introduce another term that limits the actual speed of data transfer. Memory latency is the number of clock cycles required to read a bit of data. For example, memory clocked at 100 MHz would seem capable of delivering a bit in one hundred-millionth of a second. In fact, it may take five hundred-millionths of a second to begin reading the first bit. To reduce the impact of memory latency, the processor uses a special technique called burst mode.
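To make these fractions of a second easier to picture, cycles can be converted into nanoseconds. The sketch below assumes that the five hundred-millionths of a second in the example correspond to five cycles of the 100-MHz clock; the helper name is hypothetical.

```python
def cycles_to_ns(cycles: int, clock_hz: int) -> float:
    """Convert a number of clock cycles into nanoseconds."""
    return cycles * 1e9 / clock_hz


print(cycles_to_ns(1, 100_000_000))  # 10.0  (one cycle at 100 MHz)
print(cycles_to_ns(5, 100_000_000))  # 50.0  (assumed five-cycle delay before the first bit)
```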

Burst mode relies on the expectation that the data the processor requests will sit in consecutive memory cells. The memory controller therefore reads several consecutive bits of data at once. As a result, the full latency affects only the first bit; reading the bits that follow takes significantly less time. A memory’s burst-mode rating is usually written as four numbers separated by dashes. The first number is the number of cycles needed to begin a read operation. The second, third and fourth numbers show how many cycles are needed to read each subsequent bit in the group.

For example, the rating “5-1-1-1” says that reading the first bit takes five cycles and reading each subsequent bit takes one cycle. The lower the numbers, the better the memory.
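A burst rating can be turned into a cycle count with a short sketch. It compares one burst against the hypothetical case where every read pays the full first-access latency; both function names are invented for illustration.

```python
def burst_cycles(rating: str) -> int:
    """Total cycles for one burst, given a rating such as '5-1-1-1'."""
    return sum(int(part) for part in rating.split("-"))


def cycles_without_burst(rating: str) -> int:
    """Cycles for the same number of reads if each one paid the
    full first-access latency (hypothetical comparison case)."""
    parts = rating.split("-")
    return int(parts[0]) * len(parts)


print(burst_cycles("5-1-1-1"))          # 8
print(cycles_without_burst("5-1-1-1"))  # 20
```

Under these assumptions, burst mode completes the four reads in 8 cycles instead of 20.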

Burst mode is often combined with another means of reducing the effects of latency, known as pipelining. This method organizes data retrieval into a kind of assembly line: at the same time, the memory controller reads one or more words from memory, sends the current word or words to the processor, and writes one or more words to memory. Used together, burst mode and pipelining significantly reduce the slowdown caused by latency.

The reader may ask: why not simply buy the fastest memory with the highest throughput available? This is where the system bus of the PC’s motherboard imposes its own limit. You can, of course, install 100-MHz memory on a motherboard with a 66-MHz system bus, but it will still run at no more than 66 MHz, and you gain nothing. Likewise, 32-bit memory does not fit a 16-bit bus.
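The bus constraint amounts to taking the slower of the two clock rates. A minimal sketch, with a made-up function name and treating the clock rate as the only limit:

```python
def effective_clock_mhz(memory_mhz: int, bus_mhz: int) -> int:
    """Memory cannot run faster than the system bus it sits on,
    so the effective rate is the smaller of the two."""
    return min(memory_mhz, bus_mhz)


print(effective_clock_mhz(memory_mhz=100, bus_mhz=66))  # 66
print(effective_clock_mhz(memory_mhz=66, bus_mhz=100))  # 66
```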

Even with very fast memory, bandwidth still falls short of the speed at which the processor can process data. That is why extremely fast cache memory is needed.
Cache memory and processor registers

Cache memory was developed to close this gap. It is very small but very fast. For example, the AMD Jaguar processor has only 32 kilobytes of first-level instruction cache and 32 KB of first-level data cache. The latest game consoles are built on this processor, including the Xbox One, whose creators our readers recently visited. In modern processors, the first-level (L1) cache typically ranges from 2 to 64 kilobytes.

The secondary cache, or second-level (L2) cache, can be implemented as a memory card located near the processor and connected directly to it. A dedicated chip on the motherboard, the L2 controller, regulates the processor’s use of this memory. Depending on the processor model, the L2 cache can range from 256 kilobytes to 2 megabytes. Technology develops quickly, though, so there may be exceptions to this “rule”. The Jaguar processor has exactly 2 MB of L2 cache. In early April 2013, the designer of the PlayStation 4 explained why this chip became the basis of that console.

A modern computer is designed so that about 95% of the time the processor gets its data from the cache, without having to turn to slower storage. Some low-cost systems have no L2 cache at all. In many high-performance processors, by contrast, the L2 cache is built directly into the chip. The size of the L2 cache and whether it is integrated into the CPU are among the most important factors in processor performance.
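The effect of a 95% hit rate can be illustrated with a weighted average. The access times used below (1 ns for the cache, 60 ns for main memory) are hypothetical round numbers chosen only for the example, not figures from the article or from any particular system.

```python
def average_access_ns(hit_rate: float, cache_ns: float, memory_ns: float) -> float:
    """Average access time: hits are served by the cache,
    misses by the slower main memory."""
    return hit_rate * cache_ns + (1.0 - hit_rate) * memory_ns


# Hypothetical latencies: 1 ns cache, 60 ns main memory
with_cache = average_access_ns(hit_rate=0.95, cache_ns=1.0, memory_ns=60.0)
without_cache = average_access_ns(hit_rate=0.0, cache_ns=1.0, memory_ns=60.0)

print(round(with_cache, 2))     # about 3.95 ns on average
print(round(without_cache, 2))  # 60.0 ns when every access goes to main memory
```

With these assumed numbers, the average access is roughly fifteen times faster than going to main memory every time, even though only the cache hits are actually fast.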

Technically, the cache is static random access memory (SRAM). Each cell of this type of memory is built from several transistors, usually four to six. The cell has an external gate array known as a bistable multivibrator (flip-flop), which switches between two stable states. This means that memory of this type, unlike dynamic memory (DRAM), does not need constant refreshing.

Each cell can hold its data for as long as needed, until the power is turned off. Because it needs no constant refreshing, SRAM is very fast. But the complex structure of each cell makes it too expensive to use as a computer’s standard memory.

Static cache memory comes in asynchronous and synchronous varieties. Synchronous SRAM is designed to match the speed of the processor exactly, which cannot be said of asynchronous SRAM. This seemingly small difference affects performance. We will not go into the technical details; suffice it to say that synchronous memory is preferred.

At the top of the memory hierarchy are the processor registers. They are built directly into the chip and hold the specific data the processor needs, such as the operands of arithmetic and logical expressions. As an integral part of the processor, the registers are managed directly by the compiler, which sends information to the CPU for processing.

We have used legacy systems as examples of how memory and CPU interact because the calculations are relatively simple. Modern computers are built on multi-core processors, but the principle by which a processor accesses memory remains the same, and anyone who wishes can work through the same calculations for the most complex modern systems. In a brief overview, such calculations would only clutter the material with detail without adding anything new.

Based on materials from computer.howstuffworks