# OCP-IP Network-on-chip benchmarking workgroup

# Erno Salminen, Tampere University of Technology; Krishnan Srinivasan, Sonics Inc.; Zhonghai Lu, Royal Institute of Technology

#### December 2010

# ABSTRACT

This article presents a summary of the work and infrastructure developed by the OCP-IP Network-on-Chip benchmarking workgroup. Network-on-chip (NoC) is an emerging paradigm for interconnecting resources in a complex integrated circuit. However, despite numerous published research efforts, no common NoC benchmarks are yet available. This makes it very hard to compare NoCs in a fair manner and to repeat experiments done by other researchers. To solve these problems, OCP-IP has formed a workgroup with members from academia and industry. Since 2007, the workgroup has produced 2 tool releases, 2 specifications, and several white papers as well as other articles.

# 1.Introduction

Multiprocessor-system-on-chip (MP-SoC) devices [1,2] integrate multiple processing elements, memories, peripherals, and off-chip interfaces into a single silicon chip. This allows higher performance with reasonable power consumption, which is critical in mobile devices but also in many other embedded systems. Efficient parallel processing places heavy demands on the interconnect network inside the chip, which is also called a Network-on-chip (NoC) [3-6]. The practical implementation and adoption of the NoC design paradigm faces multiple unresolved issues related to design methodology/technology, analysis of architectures, test strategies, and dedicated CAD tools. Although benchmarking has a long tradition in CPU and compiler design (see for example [7-8]), NoC benchmarking has been lacking so far. To advance and accelerate the state of the art of NoC R&D, the community needs widely available reference benchmarks.

Open Core Protocol International Partnership (OCP-IP) [9] is dedicated to proliferating a common standard for intellectual property (IP) core interfaces, or sockets, that facilitate "plug and play" System-on-Chip (SoC) design. Making complex SoC design more efficient for the widest audience, OCP-IP provides the tools and services to its members that are necessary for convenient implementation, maintenance and support of the standard OCP socket interface. There are several workgroups each concentrating on a certain topic, such as socket specification, system-level design, debug, and NoC benchmarking. This article presents the goals and deliverables provided by the Network-on-chip benchmarking workgroup.

# 2.Goals of the workgroup

Our workgroup seeks to define common NoC benchmarks [10]. It has formalized a set of relevant metrics, associated measurement methodologies, and a set of parameterized reference inputs for NoC benchmarks. These enable meaningful comparison between results from various sources, and the resulting overall view can be built up in incremental steps. The workgroup aims to provide the academic and industrial research and development (R&D) communities with a set of characteristic benchmark designs and guidelines that will serve as a common repository of relevant information with the following objectives:

- Enable the sharing and comparison of NoC-related R&D efforts and findings
- Enable and accelerate the NoC paradigm development
- Increase the reproducibility of R&D claims and results
- Bridge the gap between academic and industrial state of the art
- Accelerate development and analysis

The members of the international workgroup are mostly from academia, with strong backgrounds in NoC design and benchmarking; see for example [11-13]. The following universities, research institutes, and companies have contributed to the group:

- KTH Royal Institute of Technology, Sweden
- Tampere University of Technology, Finland
- Sonics Inc, USA
- University of British Columbia, Canada
- Washington State University, Boston University, and Carnegie-Mellon University, USA
- ENSTA, France

From our experience in NoC design and analysis, we have identified the following requirements for a NoC benchmark set [10,11]:

- Openness to allow comparison and wide adoption.
- High accuracy both in timing and the amount of data.
- Multiple test cases and scalable workload to generalize the results and to estimate future application requirements.
- Modularity: several applications can be combined to model heterogeneous behavior.
- Expandability: researchers can contribute new test cases easily to keep the set up-to-date.
- Standard interface to allow wide portability.
- Fast simulation to allow design space exploration.
- Measures several performance factors.
- Detects corrupted, duplicated, and missing data: the benchmark set is also a NoC testbench.
- Allows various component allocations and application mappings: the optimal allocation-mapping pair depends heavily on topology and other NoC parameters. This also measures the performance of NoC design tools, which have a profound impact on system performance.
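
The requirement that the benchmark set also act as a NoC testbench can be illustrated with a small sketch. The Python below is not part of the OCP-IP tools; `Packet`, `make_packet`, and `check_delivery` are hypothetical names, and the use of per-source sequence numbers together with a CRC-32 checksum is an assumption made only for illustration of how corrupted, duplicated, and missing data could be told apart.

```python
import zlib
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: int        # id of the sending node (hypothetical field)
    seq: int        # per-source sequence number (hypothetical field)
    payload: bytes  # data carried through the network
    crc: int        # CRC-32 of the payload, computed by the sender


def make_packet(src: int, seq: int, payload: bytes) -> Packet:
    """Build a packet and attach the checksum of its payload."""
    return Packet(src, seq, payload, zlib.crc32(payload))


def check_delivery(sent, received):
    """Compare sent and received packets and classify delivery errors."""
    sent_ids = {(p.src, p.seq) for p in sent}
    counts = Counter((p.src, p.seq) for p in received)
    # Corrupted: the payload no longer matches the checksum from the sender.
    corrupted = sorted({(p.src, p.seq) for p in received
                        if zlib.crc32(p.payload) != p.crc})
    # Duplicated: the same (source, sequence number) pair arrived twice or more.
    duplicated = sorted(k for k, n in counts.items() if n > 1)
    # Missing: sent but never seen at the receiver.
    missing = sorted(sent_ids - set(counts))
    return {"corrupted": corrupted, "duplicated": duplicated, "missing": missing}


# Example: packet 1 arrives twice, packet 2 is lost, packet 3 is altered.
sent = [make_packet(0, i, bytes([i]) * 4) for i in range(4)]
received = [sent[0], sent[1], sent[1],
            Packet(0, 3, b"\xff\xff\xff\xff", sent[3].crc)]
report = check_delivery(sent, received)
print(report)
```

A real testbench would run such a check at every receiving interface; the sketch only shows that the three error classes need different evidence (checksum, repeat count, and absence).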

# 3.Benchmarking methodology

In view of the proprietary issues involved, we propose that the interested community work toward the development of a set of *synthetic* benchmarks (see Specifications 1 and 2). The term synthetic refers to some level of abstraction, in this case to a task graph with known computation times and communication loads instead of actual application code. Such models are characterized by the following sets of orthogonal parameters:

- i) A set of relevant metrics and their associated measurement methodologies
- ii) A set of parameterized reference inputs for the NoC benchmarks, consisting of:
  - a. NoC functional cores composition (number of Processing Elements (PEs), number and size of memories, number of I/Os)
  - b. Interconnect architectures
  - c. Data communication requirements
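
As a rough illustration of what "a task graph with known computation times and communication loads" could look like in code, the following Python sketch models tasks and their data transfers. It does not follow the format defined in the workgroup's specifications; the class names, field names, and the three-task example values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    compute_cycles: int  # known computation time per activation


@dataclass
class Edge:
    src: str
    dst: str
    nbytes: int  # known communication load per activation


@dataclass
class TaskGraph:
    tasks: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_task(self, name, compute_cycles):
        self.tasks[name] = Task(name, compute_cycles)

    def add_edge(self, src, dst, nbytes):
        # Reject transfers between tasks that have not been declared.
        if src not in self.tasks or dst not in self.tasks:
            raise KeyError("both endpoints must be existing tasks")
        self.edges.append(Edge(src, dst, nbytes))

    def total_compute(self):
        return sum(t.compute_cycles for t in self.tasks.values())

    def total_traffic(self):
        return sum(e.nbytes for e in self.edges)

    def scaled(self, factor):
        """Return a copy whose communication load is multiplied by factor."""
        g = TaskGraph()
        for t in self.tasks.values():
            g.add_task(t.name, t.compute_cycles)
        for e in self.edges:
            g.add_edge(e.src, e.dst, e.nbytes * factor)
        return g


# Hypothetical three-stage pipeline: source -> filter -> sink.
graph = TaskGraph()
graph.add_task("source", 100)
graph.add_task("filter", 400)
graph.add_task("sink", 50)
graph.add_edge("source", "filter", 64)
graph.add_edge("filter", "sink", 32)
heavier = graph.scaled(2)
print(graph.total_compute(), graph.total_traffic(), heavier.total_traffic())
```

The `scaled` helper hints at how a parameterized reference input might provide a scalable workload: the same graph can be replayed with a heavier communication load without touching any application code.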

Synthetic traffic is preferred over real applications because it is easier to port to different systems. At the moment, the workgroup is concentrating on SystemC simulation, but other types of evaluation might be considered later. Simulation is carried out using a Transaction Generator (TG) tool. The TG generates traffic for the network-on-chip according to abstract software and hardware models. During simulation, the TG measures performance metrics from the application and platform models, and from the traffic routed through the network-on-chip. Because this freely available, highly versatile tool works on the transaction level, simulation of larger systems is substantially faster than simulation at the clock-cycle-accurate level. Fig. 1 shows the concept of the TG. It models both the processing elements and their tasks and, based on those, generates traffic to the NoC under evaluation. After the simulation, the statistics can be visualized and inspected using a tool called Execution monitor.
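
To make the idea of transaction-level evaluation more concrete, here is a minimal Python sketch that estimates transfer latencies for tasks mapped onto a 2D mesh. It is not the TG tool and does not reproduce its models: the contention-free latency formula, the parameters `hop_cycles` and `link_bytes`, and the example mappings are all assumptions chosen for illustration.

```python
import math
from statistics import mean


def mesh_hops(a, b, width):
    """Manhattan distance between nodes a and b on a mesh with `width` columns."""
    ax, ay = a % width, a // width
    bx, by = b % width, b // width
    return abs(ax - bx) + abs(ay - by)


def transfer_latency(src, dst, nbytes, width, hop_cycles=2, link_bytes=4):
    """Assumed model: per-hop delay plus serialization time, no contention."""
    return mesh_hops(src, dst, width) * hop_cycles + math.ceil(nbytes / link_bytes)


def evaluate_mapping(transfers, mapping, width):
    """Summarize latency for (src_task, dst_task, nbytes) transfers under a mapping."""
    latencies = [transfer_latency(mapping[s], mapping[d], n, width)
                 for s, d, n in transfers]
    return {
        "mean_latency": mean(latencies),
        "max_latency": max(latencies),
        "total_bytes": sum(n for _, _, n in transfers),
    }


# Hypothetical workload on a 2x2 mesh (nodes numbered 0..3, row by row).
transfers = [("source", "filter", 64), ("filter", "sink", 32)]
near = {"source": 0, "filter": 1, "sink": 3}  # communicating tasks are neighbours
far = {"source": 0, "filter": 3, "sink": 0}   # every transfer crosses two hops
result_near = evaluate_mapping(transfers, near, width=2)
result_far = evaluate_mapping(transfers, far, width=2)
print(result_near, result_far)
```

Even this crude model shows why the same workload can score differently on the same network: the mapping with neighbouring tasks yields a lower mean latency than the one that places communicating tasks at opposite corners.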

The tool is freely available to OCP-IP members and non-members alike under the GNU LGPL, and is useful for all system-level designers evaluating various interconnection solutions in a simulation model of a real, complex system. It can also be used to simulate IP blocks before real implementations are available. This enables the design of the interconnect and the implementation of IP blocks and processor software to advance in parallel, saving time and resources and ensuring a faster time-to-market. Traffic modeling has been analyzed to some extent (see Specification 2 and White paper 2), and currently the workgroup is evaluating other appropriate traffic models to be used in benchmarking. In addition, the impact of external DRAM memories is being investigated and modeling techniques are being developed.

# 4.Milestones


This section lists the milestones and output of the workgroup from 2007 to 2010.

#### **Specifications (\*)**

- 1. Zhonghai Lu, A. Jantsch, E. Salminen, C. Grecu, "Network-on-Chip Benchmarking Specification Part 2: Micro-Benchmark Specification Version 1.0", OCP-IP, May 26, 2008, 16 pages.
- 2. E. Salminen, C. Grecu, T.D. Hämäläinen, A. Ivanov, "Network-on-chip benchmarks specifications Part I: application modeling and hardware description", OCP-IP, April 4, 2008, 15 pages.



Fig. 1 Transaction Generator creates traffic to the benchmarked network-on-chip. Traffic is generated according to an abstract data-flow model that mimics the application behavior.

#### Tool releases (\*)

- 1. Enhanced Transaction Generator release, incl. GUI + Tutorial, November 2010
- 2. Original Transaction Generator release, August 2010

#### White papers (\*)

- 1. K. Srinivasan, E. Salminen, "A Memory Subsystem Model for Evaluating Network-on-Chip Performance," OCP-IP white paper, September 2010, 9 pages.
- 2. K. Srinivasan, E. Salminen, "A Methodology for Performance Analysis of Network-on-Chip Architectures for Video SoC," OCP-IP white paper, April 2009, 10 pages.
- 3. E. Salminen, A. Kulmala, T.D. Hämäläinen, "Survey of Network-on-Chip Proposals," OCP-IP white paper, March 2008, 13 pages.
- 4. C. Grecu, A. Ivanov, P. Pande, A. Jantsch, E. Salminen, U. Ogras, R. Marculescu, "An Initiative Towards Open Network-on-Chip Benchmarks," OCP-IP white paper, February 2007, 16 pages.

#### Articles (\*)

- 1. E. Salminen, C. Grecu, T.D. Hämäläinen, A. Ivanov, "Application modeling and hardware description for Network-on-chip benchmarking", IET Computers & Digital Techniques, September 1, 2009, Vol. 3, Issue 5, Special issue on Network-on-chip, pp. 539-550.
- 2. E. Salminen, C. Grecu, T.D. Hämäläinen, A. Ivanov, "An application modeling & hardware description for network-on-chip benchmarking", Embedded.com, [online]: <u>http://www.embedded.com/design/multicore/212900324</u>, January 14, 2009, 3 pages.
- 3. Zhonghai Lu, A. Jantsch, E. Salminen, C. Grecu, "Using micro-benchmarks to evaluate & compare Networks-on-chip MPSoC designs", Embedded.com, [online]: <u>http://www.embedded.com/design/multicore/210604311</u>, September 28, 2008, 5 pages.
- 4. C. Grecu, A. Ivanov, A. Jantsch, P.P. Pande, E. Salminen, U. Ogras, R. Marculescu, "Towards Open Network-on-Chip Benchmarks", First International Symposium on Networks-on-Chip (NOCS'07), Princeton, New Jersey, USA, May 7-9, 2007, pp. 205-205, IEEE.
- 5. Research Bibliography, <u>http://www.ocpip.org/university\_research\_bibliography.php</u>

#### Presentations (\*)

- 1. O. Hammami, Mini-Keynote: "Automatic MPSOC Generation and Design Space Exploration from Automatic Parallelizers," MPSoC'09.
- 2. E. Salminen, "Network-on-Chip benchmarking workgroup, status update," DATE '09.
- 3. Xinyu Li, O. Hammami, "Fast Design Productivity for Embedded Multiprocessor through Multi-FPGA Emulation: The case of a 48-way Multiprocessor with NOC," IP '08.
- 4. C. Grecu, "Network-on-Chip Benchmarks Status Update of OCP-IP Network on Chip Benchmarking Working Group," DATE '08.
- 5. A. Ivanov, "An Initiative Towards Open Network-on-Chip Benchmarking," VLSI Test Symposium '07.
- 6. I. Mackintosh, D. Wingard, J. Bainbridge, R. Marculescu, T. Pinkston, "Proliferating the Use and Acceptance of NoC Benchmark Standards," NOCS '07.

#### Press releases (\*)

- 1. November 29, 2010 "OCP-IP Announces Availability of New Memory Modeling White Paper"
- 2. August 24, 2010 "OCP-IP Delivers Transaction Generator Package"
- 3. January 19, 2010 "Tampere University of Technology Wins OCP-IP Contributor of the Year Award"
- 4. June 03, 2009 "OCP-IP Announces Availability of New Network on Chip Benchmarking White Paper"
- 5. May 27, 2008 "Part 2 of OCP-IP's Network-on-Chip Benchmarking Specification Released to Member Review"
- 6. May 13, 2008 "OCP-IP Releases Survey of Network-on-Chip Architectures"
- 7. March 03, 2008 "OCP-IP Announces Part 1 of Network-on-Chip Benchmarking Specification"
- 8. March 06, 2007 "OCP-IP Introduces Open Network-on-Chip Benchmarks Initiative"

(\*) Note: All of these items may be found and downloaded from <u>www.ocpip.org</u>. If you experience any difficulty obtaining these materials, please contact us at <u>admin@ocpip.org</u>

# 5.Concluding remarks

Our NoC benchmarking initiative has achieved good momentum, with a good deal of interest and some notable outcomes to date. For example, the work has been noted in journals such as EETimes and EDA Tech forum. In addition, Tampere University of Technology was the recipient of the OCP-IP Contributor of the Year award for 2009, which was in part recognition of its commitment to the workgroup. At the time of this writing, a SystemC model for DRAM (see White paper 1) is being finalized at KTH and its publication is expected in 2011. In many cases, an external DRAM and its controller create a bottleneck which affects large parts of the system. Hence, this model will help evaluators achieve more realistic and dependable performance estimates, as the critical DRAM parameters are modeled.

A major objective is to obtain more traffic models for the TG. At the moment, the TG comes with a few very simple examples, and more complex and realistic cases are needed. The workgroup is currently analyzing the traffic patterns in real SoCs based on the literature and plans to start profiling after that.

#### ACKNOWLEDGEMENTS

The authors would like to acknowledge the contribution of all the members of the Network-on-Chip Benchmarking Workgroup and especially OCP-IP president Ian R. Mackintosh.

# 6.References

[1] A.L. Sangiovanni-Vincentelli, Quo vadis SLD: Reasoning about trends and challenges of system-level design, Proceedings of the IEEE, Vol. 95, Iss. 3, Mar. 2007, pp. 467-506.

[2] W. Wolf, A. Jerraya, G. Martin, Multiprocessor System-on-Chip (MPSoC) Technology, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 27, Iss. 10, Oct. 2008, pp. 1701-1713.

[3] W. Dally, B. Towles, Route packets not wires: on-chip interconnection networks, DAC, 2001, pp. 684-689.

[4] L. Benini, G. de Micheli, Networks on chip: A new SoC paradigm, IEEE Computer, Vol. 35, Iss. 1, Jan. 2002, pp. 70-78.

[5] A. Jantsch and H. Tenhunen, editors, Networks on Chip, Kluwer Academic Publishers, 2003.

[6] T. Bjerregaard, S. Mahadevan, A survey of research and practices of Network-on-chip, ACM Computing Surveys, Vol. 38, Iss. 1, 2006.

[7] The Standard Performance Evaluation Corporation, SPEC Web site, [online] http://www.spec.org/hpg/

[8] Embedded Microprocessor Benchmark Consortium (EEMBC) Web site, [online] http://www.eembc.org

[9] Open Core Protocol International Partnership (OCP-IP) Web site, [online] http://www.ocpip.org

[10] C. Grecu, A. Ivanov, A. Jantsch, P.P. Pande, E. Salminen, U.Y. Ogras, R. Marculescu, Towards Open Network-on-Chip Benchmarks, First International Symposium on Networks-on-Chip (NOCS'07), May 7-9, 2007, pp. 205-205.

[11] E. Salminen, T. Kangas, T.D. Hämäläinen, J. Riihimäki, Requirements for Network-on-Chip Benchmarking, Norchip, Oulu, Finland, Nov., 2005, pp. 82-85.

[12] Zhonghai Lu. Design and Analysis of On-Chip Communication for Network-on-Chip Platforms. PhD thesis. Royal Institute of Technology, March 2007.

[13] E. Salminen, On Design and Comparison of On-Chip Networks, PhD Thesis, Tampere University of Technology, Publication 872, March 2010, 230 pages.