Monthly Archives: February 2016

GEMM and STREAM Results on Intel Edison

Intel Edison is a tiny computer (smaller than a credit card) targeted at the Internet of Things. Its CPU provides two Silvermont Atom cores running at 500 MHz, and the module is offered for around 70 US dollars. Even though Intel Edison is not designed for high performance computing, its design goal of low power consumption nevertheless makes it interesting from a high performance computing perspective. Let us have a closer look.

Sparse Matrix Transposition: Datastructure Performance Comparison

While processor manufacturers repeatedly emphasize the importance of their latest innovations, such as the vector extensions (AVX, AVX2, etc.) of the processing elements, proper placement of data in memory is at least equally important. At the same time, generic implementations of many different data structures make it easy to quickly (re)use whichever structure seems most appealing. However, the intuitively most appropriate data structure may not be the fastest.
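As an illustration of how much the choice of data structure matters: transposing a matrix stored in the common compressed sparse row (CSR) format can be done with flat arrays and a single counting pass, with no tree- or map-based containers at all. The following is an illustrative sketch (function and parameter names are my own, not from any particular library):

```c
#include <stdlib.h>

/* Transpose an M x N sparse matrix given in CSR form (row_ptr, col_idx,
 * values) into CSR form of the transpose. Runs in O(nnz + N) time.
 * The output arrays must be preallocated by the caller:
 * t_row_ptr has N+1 entries, t_col_idx and t_values have nnz entries. */
void csr_transpose(int M, int N,
                   const int *row_ptr, const int *col_idx, const double *values,
                   int *t_row_ptr, int *t_col_idx, double *t_values)
{
    int nnz = row_ptr[M];

    /* Count entries per column; these become the rows of the transpose. */
    for (int i = 0; i <= N; ++i) t_row_ptr[i] = 0;
    for (int k = 0; k < nnz; ++k) t_row_ptr[col_idx[k] + 1] += 1;

    /* Exclusive prefix sum turns the counts into row offsets. */
    for (int i = 0; i < N; ++i) t_row_ptr[i + 1] += t_row_ptr[i];

    /* Scatter each entry to its place; offset[] tracks insertion points. */
    int *offset = calloc(N, sizeof(int));
    for (int row = 0; row < M; ++row)
        for (int k = row_ptr[row]; k < row_ptr[row + 1]; ++k) {
            int dest = t_row_ptr[col_idx[k]] + offset[col_idx[k]]++;
            t_col_idx[dest] = row;
            t_values[dest]  = values[k];
        }
    free(offset);
}
```

The counting pass touches each nonzero a constant number of times, whereas building the transpose in, say, a map of maps incurs allocation and logarithmic lookup overhead per insertion, even though the latter may feel like the more natural container for the job.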

Strided Memory Access on CPUs, GPUs, and MIC

Optimization guides for GPUs discuss at length the importance of contiguous ("coalesced", etc.) memory access for achieving high memory bandwidth (e.g. this parallel4all blog post). But how does strided memory access compare across different architectures? Is this something specific to NVIDIA GPUs? Let's shed some light on these questions with a few benchmarks.
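One way to set up such a benchmark is to time a simple read kernel while varying the stride: as the stride grows, each cache line (or memory transaction on a GPU) contributes fewer useful bytes, so the effective bandwidth drops. A minimal CPU-side sketch of the kernel (illustrative only, not the benchmark code behind the measurements in the post):

```c
/* Sum every 'stride'-th element of x[0..n-1]. With stride 1 every byte of
 * each fetched cache line is used; with stride 8 (for doubles on a typical
 * 64-byte cache line) only one element per line is useful. */
double strided_sum(const double *x, long n, long stride)
{
    double s = 0.0;
    for (long i = 0; i < n; i += stride)
        s += x[i];
    return s;
}
```

Timing strided_sum for strides 1, 2, 4, ... over a cache-exceeding array and dividing the bytes actually requested by the elapsed time yields a bandwidth-versus-stride curve that can be compared across CPUs, GPUs, and MIC.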

Join the PETSc User Meeting 2016!

PETSc, the Portable, Extensible Toolkit for Scientific Computing, is one of the world's most widely used software libraries for high-performance computational science. Since most of the PETSc core team is employed at Argonne National Laboratory in Illinois, USA, exchange with the European user community has been hampered by geographic distance. This year, the PETSc team will reach out to Europe and hold the PETSc User Meeting 2016 on June 28-30 in Vienna, Austria.