the Core 2 has used a level 1 TLB that is extremely small (16 entries) but also very fast for loads only, and a larger level 2 TLB (256 entries) that handled loads missed in the level 1 TLB, as well as stores.
Nehalem now has a true two-level TLB: the first level of TLB is shared between data and instructions. The level 1 data TLB now stores 64 entries for small pages (4K) or 32 for large pages (2M/4M), while the level 1 instruction TLB stores 128 entries for small pages (the same as with Core 2) and seven for large pages. The second level is a unified cache that can store up to 512 entries and operates only with small pages. The purpose of this improvement is to increase the performance of applications that use large sets of data. As with the introduction of two-level branch predictors, this is further evidence of the architecture’s server orientation.