NVIDIA Tools Extension Library (NVTX) Scoped Timer 2015-07-20

NVIDIA’s Tools Extension library (NVTX) is an easy way to add profiling information to your code if you use their Nsight profiler. Beyond the simple example shown here, timers can be colored and set to a specific version, and there are language-specific functions for CUDA and OpenCL. See the reference below for more information.

The example here is extremely simple but shows how NVTX can be used to create a lightweight scoped timer. A scoped timer is useful because it lets you time a section of code without stopping the timer by hand: when the timer goes out of scope, its destructor is called and the timer stops itself automatically.

Timing data will automatically show up in Nsight when NVTX tracing is enabled.
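
As a rough illustration of the idea, here is a minimal sketch of a scoped range built on `nvtxRangePushA` and `nvtxRangePop`. The names `ScopedRange`, `sum_squares`, `nvtx_log` and the `USE_NVTX` macro are assumptions made for this sketch and are not taken from the original post. When `USE_NVTX` is not defined, the two NVTX calls are replaced by stand-ins that only record what was called, so the example builds without the CUDA toolkit.

```cpp
#include <string>
#include <vector>

#ifdef USE_NVTX
#include <nvToolsExt.h>  // real NVTX header; typically linked with -lnvToolsExt
#else
// Hypothetical stand-ins, NOT part of NVTX: they only log the calls so the
// sketch can be compiled and checked on a machine without the toolkit.
static std::vector<std::string> nvtx_log;
inline int nvtxRangePushA(const char *name) {
  nvtx_log.push_back(std::string("push ") + name);
  return 0;
}
inline int nvtxRangePop() {
  nvtx_log.push_back("pop");
  return 0;
}
#endif

// Opens a named range on construction and closes it on destruction.
struct ScopedRange {
  explicit ScopedRange(const char *name) { nvtxRangePushA(name); }
  ~ScopedRange() { nvtxRangePop(); }
  // Copying would pop the same range twice, so it is disabled.
  ScopedRange(const ScopedRange &) = delete;
  ScopedRange &operator=(const ScopedRange &) = delete;
};

// Hypothetical workload: the range covers the whole function body.
inline double sum_squares(int n) {
  ScopedRange timer("sum_squares");
  double total = 0.0;
  for (int i = 1; i <= n; ++i) total += static_cast<double>(i) * i;
  return total;  // the destructor runs here and closes the range
}
```

Because the range is closed in the destructor, early returns and exceptions also end the range, which is the main reason to prefer this over paired push/pop calls written by hand.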


NVIDIA Tools Extension


struct NVTXTimer
    NVTXTimer(const char *name){

Add GCC compiler to Xcode 6 2015-02-25

A few years ago I wrote a guide on how to add a newer version of GCC to Xcode. At the time, OS X/Xcode shipped with GCC 4.2, and that setup could be modified to run a different version of GCC. Newer versions of Xcode no longer include the GCC compiler, which makes it more difficult to add a custom compiler.

The plugin is based on a post on Stack Overflow and on an older xcode-gcc plugin, which hasn’t been updated for Xcode 6. The modifications themselves are pretty simple, and as long as the plugin structure doesn’t change in future versions of Xcode, they should work with newer versions of GCC. You can see the modifications in the revision history.

Reading a Matlab Matrix in C++ 2015-02-16

While doing a performance comparison between several linear algebra libraries, I had to read in several large sparse matrices (more than 21 million nonzero values). I’m not going to claim that this is the fastest way to read in a matrix that is stored on disk, but for me it was fast enough.

The Data Structure

This struct contains three std::vectors which store the row, column and value entries from each line in the file. The Matlab ASCII sparse matrix format does not store the number of rows and columns (see Reference), so some assumptions are made about the matrix: namely, that there are no rows with all zero entries and that the last column with data is the last column in the matrix. If your matrix is larger than this, then you will need to manually modify the data structure that you store your matrix in.

#include <iostream>
#include <vector>
#include <algorithm>
struct COO {
  std::vector<size_t> row;     // Row entries for matrix
  std::vector<size_t> col;     // Column entries for matrix
  std::vector<double> val;     // Values for the non zero entries
  unsigned int num_rows;       // Number of Rows
  unsigned int num_cols;       // Number of Columns
  unsigned int num_nonzero;    // Number of non zeros
  // Once the data has been read in, compute the number of rows, columns, and nonzeros
  void update() {
    num_rows = row.back();
    num_cols = *std::max_element(col.begin(), col.end());
    num_nonzero = val.size();
    std::cout << "COO Updated: [Rows, Columns, Non Zeros] [" << num_rows << ", " << num_cols << ", " << num_nonzero << "] " << std::endl;
  }
};

Updating python eggs using pip and easy_install 2015-02-01

I use buildbot to manage our lab’s build/testing infrastructure. I wrote up a guide a while back on how to set it up on different platforms. In this post I wanted to document how to keep the setup updated.

Note: If you are using a sandbox, source that sandbox first

source sandbox/bin/activate

Update using pip

Using a shell command

pip freeze --local | grep -v '^\-e' | cut -d = -f 1  | xargs pip install -U

Using a python file

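import subprocess
import sys
from importlib.metadata import distributions  # Python 3.8+

# pip.get_installed_distributions() was removed in pip 10,
# so the installed packages are listed with importlib.metadata instead
for dist in distributions():
    # Run pip through the current interpreter so the active sandbox is the one upgraded
    subprocess.call([sys.executable, "-m", "pip", "install", "--upgrade", dist.metadata["Name"]])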

Make vs Ninja Performance Comparison 2015-01-31

Ever since I started using CMake to generate my build files, I have relied on Makefiles. Most Linux distributions come with the make command, so getting up and running doesn’t require too much effort. Make and its derivatives have been around for almost 40 years, and it’s an extremely powerful tool that can do many things beyond simply compiling code. There are cases where the flexibility and power of make are overkill for compiling code, and if you are willing to trade them for improved performance, Ninja might be what you are looking for.

Ninja, written by Evan Martin, is a build system that is focused on performance. It was designed for fast incremental builds and large projects in general. To quote the Chromium project, “Ninja is a build system written with the specific goal of improving the edit-compile cycle time”.

Ninja prints the current progress of the build on a single line rather than across many lines. Warnings and errors are output as normal. Make, on the other hand, outputs a line for every single .cpp file that is compiled and linked. So beyond the performance improvements, Ninja has a higher signal-to-noise ratio than Make.