



















![](_page_5_Figure_0.jpeg)

**Enter TLP**

- Again, not a new idea
  - been around for > 10 years
    - **Tullsen, UW, 1995**: publishes the SMT idea
    - Tera MTA & IBM Pulsar show up in the late 90's, both MT
- Thread vs. process **confusion**
  - a process runs in its own virtual memory space
    - no shared memory
    - lots of OS protection & overhead
    - communicate via "message-like channels", e.g. pipes in Unix
  - threads
    - share memory, and therefore synchronization is needed
  - both are independent entities
    - with their own sets of registers and process state
  - TLP difference
    - multiple threads can run concurrently or interleaved on the same processor
    - processes run one at a time, with a context switch between them

School of Computing, University of Utah, CS6810
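The thread vs. process distinction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `threaded_count` and `process_sum` are hypothetical and not from the slides, and the child process is started through the standard `subprocess` module so that its only channel to the parent is a pair of pipes.

```python
import subprocess
import sys
import threading


def threaded_count(num_threads: int = 4, increments: int = 10_000) -> int:
    """Threads share one address space, so updates to `total` are guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal total
        for _ in range(increments):
            with lock:  # synchronization is needed because memory is shared
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def process_sum(values: list[int]) -> int:
    """A child process has its own memory; data crosses only through pipes."""
    # The child reads numbers from its stdin pipe and writes the sum to stdout.
    child_code = "import sys; print(sum(int(tok) for tok in sys.stdin.read().split()))"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        input=" ".join(map(str, values)),
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)
```

Calling `threaded_count(4, 1000)` has every thread update the same variable directly, while `process_sum([1, 2, 3])` must serialize its data into a pipe because parent and child share no memory.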

![](_page_6_Figure_0.jpeg)

![](_page_6_Figure_1.jpeg)

![](_page_7_Figure_0.jpeg)

![](_page_7_Figure_1.jpeg)

![](_page_8_Figure_0.jpeg)

| CPU | Microarchitecture | Fetch / Issue / Execute | Functional units | Clock (GHz) | Transistors & die area | Power (W) |
|---|---|---|---|---|---|---|
| Pentium 4 Extreme | Speculative dynamic issue, deep pipeline, 2-way SMT | 3/3/4 | 7 Int, 1 FP | 3.8 | 125M, 122 mm<sup>2</sup> | 115 |
| Athlon 64 FX-57 | Speculative dynamic issue | 3/3/4 | 6 Int, 3 FP | 2.8 | 114M, 115 mm<sup>2</sup> | 104 |
| 1 core of Power5 | Speculative dynamic issue, SMT | 8/4/8 | 6 Int, 2 FP | 1.9 | 200M, 300 mm<sup>2</sup> | 80 |
| Itanium 2 | EPIC, mostly static scheduling | 6/5/11 | 9 Int, 2 FP | 1.6 | 592M, 423 mm<sup>2</sup> | 130 |

Power5 is dual core; area, transistor count, and power are estimated for a single core.

Large die size is due to 9 MB L3 cache on chip.
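To make the table easier to compare across rows, the sketch below derives two simple ratios from it: power density (watts per mm²) and transistor density (millions of transistors per mm²). The numbers are copied from the table; the dictionary layout and the function names `power_density` and `transistor_density` are assumptions made for illustration, and the Power5 row inherits the single-core estimates noted above.

```python
# Values copied from the table; Power5 figures are single-core estimates.
CPUS = {
    "Pentium 4 Extreme": {"clock_ghz": 3.8, "transistors_m": 125, "area_mm2": 122, "power_w": 115},
    "Athlon 64 FX-57":   {"clock_ghz": 2.8, "transistors_m": 114, "area_mm2": 115, "power_w": 104},
    "Power5 (1 core)":   {"clock_ghz": 1.9, "transistors_m": 200, "area_mm2": 300, "power_w": 80},
    "Itanium 2":         {"clock_ghz": 1.6, "transistors_m": 592, "area_mm2": 423, "power_w": 130},
}


def power_density(name: str) -> float:
    """Watts dissipated per square millimetre of die."""
    cpu = CPUS[name]
    return cpu["power_w"] / cpu["area_mm2"]


def transistor_density(name: str) -> float:
    """Millions of transistors per square millimetre of die."""
    cpu = CPUS[name]
    return cpu["transistors_m"] / cpu["area_mm2"]


if __name__ == "__main__":
    for name in CPUS:
        print(f"{name:18s} {power_density(name):.2f} W/mm^2  "
              f"{transistor_density(name):.2f} Mtransistors/mm^2")
```

On these figures the two high-clock designs (Pentium 4 Extreme and Athlon 64 FX-57) come out at roughly three times the power density of the Power5 core and the Itanium 2.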

![](_page_9_Figure_0.jpeg)

![](_page_9_Figure_1.jpeg)

![](_page_10_Figure_0.jpeg)

![](_page_10_Figure_1.jpeg)