To cache data from main memory, the L1, L2 and L3 caches use what’s known as set associative mapping. Here the cache is divided into 64-byte lines (8192 of them in a 512K cache), which are grouped into sets. Depending on whether the cache is two-way, four-way, eight-way, or 16-way set associative, that 512K cache will have 4096, 2048, 1024, or 512 sets respectively. This associativity (you’re making up words again –ed) specifies how main memory is mapped to the cache: in a four-way set associative cache each set contains four lines, in an eight-way cache eight lines, and so on. Every 64-byte chunk of main memory maps to exactly one set, and as data is requested it’s loaded in 64-byte chunks into the relevant set, where any of that set’s two, four, eight, or 16 lines can hold it. If data is read from memory and all the lines in its set are already full, one of them is evicted to make room for the new data. But if the CPU requests data from main memory locations already held in the cache, it can be retrieved from the cache instead of main memory, negating the latency of accessing main memory.
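To make that mapping concrete, here’s a minimal sketch (our own illustration, not modelled on any particular CPU) showing how an address is split into a byte offset, a set index, and a tag for the 512K, eight-way example above:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical parameters matching the 512K example in the text:
   64-byte lines, eight-way set associative -> 8192 lines / 8 ways = 1024 sets. */
#define CACHE_SIZE (512 * 1024)
#define LINE_SIZE  64
#define WAYS       8
#define NUM_SETS   (CACHE_SIZE / (LINE_SIZE * WAYS))  /* 1024 */

int main(void)
{
    uint64_t addr = 0x7ffe12345678ULL;  /* an arbitrary example address */

    /* The low bits select a byte within the 64-byte line, the next
       bits select the set, and everything above that forms the tag. */
    uint64_t offset = addr % LINE_SIZE;
    uint64_t set    = (addr / LINE_SIZE) % NUM_SETS;
    uint64_t tag    = addr / (LINE_SIZE * NUM_SETS);

    printf("address 0x%llx -> set %llu, offset %llu, tag 0x%llx\n",
           (unsigned long long)addr, (unsigned long long)set,
           (unsigned long long)offset, (unsigned long long)tag);
    return 0;
}
```

On a lookup, the cache only compares the stored tags of the lines in that one set, which is what makes the scheme cheap in hardware.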
By the sounds of it, opting for 16-way set associative would be better than eight-way, four-way, or two-way, as there are more lines available to cache data for any given set. But unless the size of the cache also increases, more ways means fewer sets, so each set now has to cover a larger chunk of main memory, reducing the granularity of the cache. Which is why there’s no golden rule here, and both Intel and AMD approach associativity differently depending on their processors.
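To put numbers on that tradeoff, this short sketch (again our own illustration, using the 512K cache from the text and an assumed 4GB of main memory) works out how many sets each configuration has and how much memory ends up competing for each set:

```c
#include <stdio.h>

int main(void)
{
    /* The 512K cache from the text; 4GB of main memory is an
       assumed figure, purely for illustration. */
    const long long cache_size = 512LL * 1024;
    const long long line_size  = 64;
    const long long mem_size   = 4LL * 1024 * 1024 * 1024;
    const int ways[] = { 2, 4, 8, 16 };

    for (int i = 0; i < 4; i++) {
        long long sets  = cache_size / (line_size * ways[i]);
        long long slice = mem_size / sets;  /* memory competing per set */
        printf("%2d-way: %4lld sets, %2d lines each, %5lld KB of memory per set\n",
               ways[i], sets, ways[i], slice / 1024);
    }
    return 0;
}
```

At two-way, 1MB of memory competes for each set’s two lines; at 16-way, 8MB competes for each set’s 16 lines. The total cache is the same either way, which is exactly why neither configuration wins outright.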

Typically, L1 caches are two-way set associative, while L2 and L3 caches, depending on the processor, are four-way, eight-way, or 16-way set associative.
Hit and miss
A cache isn’t a wonder tool, much though it may seem like one. After all, it’s only as useful as the data it contains: if what the CPU is looking for isn’t in a cache, the data has to be fetched from the next cache down and, ultimately, main memory. A lookup that finds its data in a cache is called a ‘hit’, while one that comes up empty is called a ‘miss’. It’s interesting to note that your machine, clean as it is when you turn it on, starts on a miss for every cache until the first data is read in from main memory. After that, it’s a jolly old game of cache management logic attempting to predict what will be needed next while your operating system and programs scatter themselves across main memory and issue instructions as they run.
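You can watch that cold-start behaviour with a toy simulator. This one (our own sketch: a deliberately tiny two-way cache with simple LRU eviction, nothing like a real CPU’s implementation) runs over a working set that fits in the cache twice; the first pass misses on every access, the second hits on every one:

```c
#include <stdint.h>
#include <stdio.h>

/* A toy cache: 4 sets, 2 ways, 64-byte lines, LRU eviction. */
#define LINE_SIZE 64
#define NUM_SETS  4
#define WAYS      2

struct line { uint64_t tag; int valid; int age; };
static struct line cache[NUM_SETS][WAYS];
static long hits, misses;

static void cache_access(uint64_t addr)
{
    uint64_t set = (addr / LINE_SIZE) % NUM_SETS;
    uint64_t tag = addr / (LINE_SIZE * NUM_SETS);
    struct line *s = cache[set];

    for (int w = 0; w < WAYS; w++)   /* everyone ages on each access */
        s[w].age++;

    for (int w = 0; w < WAYS; w++) {
        if (s[w].valid && s[w].tag == tag) {  /* hit: refresh and return */
            s[w].age = 0;
            hits++;
            return;
        }
    }

    /* Miss: evict the least recently used (or an empty) line in the set. */
    int victim = 0;
    for (int w = 1; w < WAYS; w++)
        if (!s[w].valid || s[w].age > s[victim].age)
            victim = w;
    s[victim].tag   = tag;
    s[victim].valid = 1;
    s[victim].age   = 0;
    misses++;
}

int main(void)
{
    /* 8 lines of addresses exactly fill the 4x2 cache: the cache
       starts cold, so pass one is all misses, pass two all hits. */
    for (int pass = 0; pass < 2; pass++)
        for (uint64_t addr = 0; addr < 8 * LINE_SIZE; addr += LINE_SIZE)
            cache_access(addr);
    printf("hits: %ld, misses: %ld\n", hits, misses);
    return 0;
}
```

It prints “hits: 8, misses: 8”: the eight compulsory misses of the cold first pass, then eight hits once the data is resident. Grow the working set beyond the cache’s capacity and the second pass starts missing too, which is the granularity problem from earlier in miniature.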