The performance of a hierarchical memory is defined by the effective access time, which is a function of the hit ratio and the relative access times between successive levels of the hierarchy. For example, suppose the cache access time is 10ns, main memory access time is 100ns, and the cache hit rate is 98%. Then the average time for the processor to access an item in memory is
\[
t_{\mathrm{eff}} = 0.98 \times 10\,\mathrm{ns} + 0.02 \times 100\,\mathrm{ns} = 11.8\,\mathrm{ns}
\]
Over a long period of time the system performs as if it had a single large memory with an 11.8ns cycle time, hence the term ``effective access time.'' With a 98% hit rate the system performs nearly as well as if the entire memory were constructed from the fast chips used to implement the cache, i.e. the average access time is 11.8ns, even though most of the memory is built using a less expensive technology with an access time of 100ns.
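The weighted-average calculation above can be sketched in a few lines; this is a minimal illustration, with the 10ns, 100ns, and 98% figures taken from the example in the text and the function name chosen for clarity:

```python
# Effective access time of a two-level memory hierarchy: hits are
# served at cache speed, misses at main-memory speed.

def effective_access_time(t_cache_ns, t_mem_ns, hit_rate):
    """Weighted average of cache and main-memory access times."""
    return hit_rate * t_cache_ns + (1 - hit_rate) * t_mem_ns

# Values from the example: 10ns cache, 100ns main memory, 98% hit rate.
print(round(effective_access_time(10, 100, 0.98), 1))  # 11.8
```

Rerunning with a 90% hit rate gives 19ns, showing how sensitive the effective access time is to the hit ratio.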
Although a memory hierarchy adds to the complexity of a memory system, it does not necessarily add to the latency of any particular request. Efficient hardware algorithms exist both for the logic that looks up an address to see whether the item is present at a given level and for the logic that implements replacement policies, and in most cases these circuits operate in parallel with other circuits, so the total time spent in the fetch-decode-execute cycle is not lengthened.
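The lookup-and-replace bookkeeping described above can be modeled in software. The following sketch simulates a direct-mapped cache; the class and parameter names are illustrative assumptions, not from the text, and the hardware performs this lookup in parallel whereas the model runs it sequentially:

```python
# A minimal model of per-access cache bookkeeping: look up the address,
# record a hit or miss, and on a miss fill the line from main memory.

class DirectMappedCache:
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.tags = [None] * num_lines  # one stored tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, address):
        index = address % self.num_lines   # line the address maps to
        tag = address // self.num_lines    # identifies the block in that line
        if self.tags[index] == tag:
            self.hits += 1                 # item already present in the cache
        else:
            self.misses += 1               # fetch from memory, replace the line
            self.tags[index] = tag

cache = DirectMappedCache(num_lines=4)
for addr in [0, 1, 2, 3, 0, 1, 2, 3, 4, 0]:
    cache.access(addr)
print(cache.hits, cache.misses)  # 4 6
```

The repeated pass over addresses 0 through 3 hits every time, while address 4 evicts the line holding address 0, causing the final access to miss. In a direct-mapped cache the replacement policy is trivial (each address can occupy only one line); set-associative designs add a choice among lines, which is where policies like LRU come in.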