High Performance Computer Architecture

A GT OMSCS Course Review – High Performance Computer Architecture (CS6290)

For me, the biggest draw of Georgia Tech’s OMSCS program was its extensive machine learning and artificial intelligence curriculum. There are other online Master’s programs from well-regarded schools (University of Texas and University of Illinois immediately come to mind), but none as established as Georgia Tech’s and none with classes that felt worth the time and investment. However, through a quirk of scheduling (most of the ML/AI courses are in high demand and fill up quite quickly), three of my first four classes at GT have focused on computing systems.

Now, it would be unfair to blame this purely on scheduling. It would have been quite easy to take different courses from different specializations, but I got into this program to learn and to challenge myself, and the computing systems offerings come highly recommended by the community of OMSCS students and are known for their difficulty. High Performance Computer Architecture (HPCA) certainly belongs in that conversation, and as with the other computing systems courses that I’ve taken so far (Graduate Introduction to Operating Systems and Advanced Operating Systems), I left the class with a far better grasp of, and appreciation for, the internals of computers.

The Business of Bits

If your vocation is one that manages computers, you’re in the business of bits. That is to say, you’re somehow responsible for the writing and/or reading of binary digits. Ones and zeros. Bits.

A bit is a fundamental unit for computers. One bit represents a binary logical state, as it can be one of two values (again, 0 or 1). Alone, a bit doesn’t tell us much (the value of its information has been measured), but if you string bits together, magic happens. If you work with computers at a higher level, like writing a web app in JavaScript or PHP, it’s easy to forget this, although you’ll certainly encounter bits from time to time. If you work with computers at an even higher level, like, say, just opening Excel now and then, you’re apt to think most of this is gibberish. However, at the lowest levels of software, it’s impossible to escape bits.
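
To make the "string them together" idea concrete, here is a minimal Python sketch. The helper name bits_to_int is my own invention for illustration, and reading the result as an ASCII character is just one of many possible interpretations of the same eight bits.

```python
def bits_to_int(bits: str) -> int:
    """Interpret a string of '0'/'1' characters as an unsigned integer."""
    value = 0
    for b in bits:
        # Shift what we have so far one place left, then add the new bit.
        value = (value << 1) | (1 if b == "1" else 0)
    return value


byte = "01001000"               # eight bits strung together
as_number = bits_to_int(byte)   # the same bits read as a number
as_char = chr(as_number)        # the same bits read as a character
print(as_number, as_char)       # prints: 72 H
```

The bits never change; only the agreed-upon interpretation does.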

The Engineering Art of Balancing Desire with Reality (as told by processor caches)

In a course about high performance computer architecture, it’s no surprise that most of the time is spent discussing how to speed up computers using their architecture. It’s almost as though the name of the course tells you exactly what to expect.

This week in CS6290 at Georgia Tech, we’ve moved on to caches, which play a key role in speeding up the retrieval of information. The processor’s goal is to crunch data, which is held either in main memory (RAM) or on disk (an SSD or HDD). To get that data, the processor issues requests for memory addresses and retrieves the data from whichever storage unit holds it.
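
As a rough illustration of what a cache does with those address requests, here is a toy direct-mapped cache in Python. It is a sketch under my own assumptions (4 lines, 16-byte blocks, and it tracks only tags, not the data itself), not a model of any real processor. The class name DirectMappedCache and its parameters are hypothetical.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that records hits and misses (tags only)."""

    def __init__(self, num_lines: int = 4, block_size: int = 16):
        self.num_lines = num_lines      # assumed size, chosen for illustration
        self.block_size = block_size    # bytes per block, also an assumption
        self.tags = [None] * num_lines  # one stored tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, load the block and return False."""
        block_number = address // self.block_size
        index = block_number % self.num_lines   # which line the block maps to
        tag = block_number // self.num_lines    # identifies the block in that line
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag                  # evict whatever was there
        self.misses += 1
        return False


cache = DirectMappedCache()
results = [cache.access(a) for a in (0, 4, 64, 0)]
print(results)  # prints: [False, True, False, False]
```

Address 4 hits because it shares a block with address 0. Address 64 maps to the same line and evicts that block, so the final access to address 0 misses again.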

Processor Pipelines and the Foundation of Computing Systems

This will probably be my last semester at Georgia Tech that includes a computing systems course (unless high performance computing becomes available again online). The rest of my coursework will be focused on my specialization – machine learning – and while I’m excited to focus more on the questions that brought me to this program, I will undoubtedly miss computing systems.

The beautiful part of this area of computer science is that it is where the rubber meets the road. Theory meets application and provides lessons to feed back into theory which then feeds into other applications.

Everything In Its Right Place – A Primer on Hardware Support for High Performance Computer Architecture

My wife and I have a running joke in the house when either one of us moves something to its “correct” resting place, usually punctuated by breaking out into song.

Computer science is the practical application of many other sciences (solid state physics, calculus, linear algebra, information science, etc., etc., etc.), but it is at its most exacting and least forgiving the closer to the hardware you get. Here, everything truly does have its right place.

Checklists Are Important for Everything – Especially Processors

Every so often, a few posts come across my desk at the same time, and it reminds me of how at some basic level, all work is the same work, just manifested in different ways. Checklists and agendas, which are near and dear to my heart, are crucial for communicating and getting things done correctly across a team. They represent an agreement, a contract, reflections of expectations.

When you enter a meeting that has gone off the rails, it’s likely that either someone has torpedoed the agenda or one was never established. Likewise, any time I’ve needed to get a project back in a manageable state, a forced prioritized to-do list is my weapon of choice.

Similarly, sequential logical steps are the bread and butter of processors. Most of my High Performance Computer Architecture course is focused on how processors squeeze every possible optimization out of a program’s instructions. There are dozens of ways that they do this (branch prediction, loop unrolling, data caches, etc.), but perhaps the most approachable is how a processor issues and executes instructions.