Senior Software Engineer, Intel
July 2010 – Present
Intel's Waterloo, Ontario site was established at the end of 2009
through its acquisition of RapidMind, a startup selling a
data-parallel JIT programming system.
Worked on bringing the Intel Labs Ct project, with some additions
from RapidMind, up to product quality as Intel Array Building
Blocks.
Adapted Intel's internal OpenCL vectorizer for an LLVM-based
successor to Array Building Blocks designed around high-level
data-parallel intrinsics in LLVM IR.
Added Cilk Plus elemental functions to Clang, using an IR
interface modelled on OpenCL SPIR, in support of a Clang-based front
end for ICC.
Addressed problems related to present and future use of LLVM in
Android, including RenderScript-related bugs and other issues in
building the x86 Android tree with LLVM.
Planned and prototyped support for Intel Memory Protection
Extensions in LLVM. (MPX is a set of new instructions aimed at
allowing pointer bounds checking in C/C++ without breaking
existing ABIs and data layouts.)
Senior Software Engineer, Sandvine
July 2005 – July 2010
Sandvine develops carrier-grade network equipment providing flexible
congestion management, network integrity, and operational support.
During most of my tenure at Sandvine, I was primarily or solely
responsible for designing, developing, and maintaining software
running on, and related to, the load-balancing network processors
crucial to the current product architecture.
Optimized the network processor's assembly-language software to
40% of its original length to meet the essential hard real-time
target imposed by 10GigE wire speed.
Enhanced the software to support IP-based load balancing, MPLS and
L2TP tunneling protocols, and new network deployment variations,
while maintaining the critical-path time constraint.
Ported the assembly code to two new hardware platforms.
Worked on FreeBSD kernel drivers for the network processor and
Maintained and ultimately redesigned the controlling C++ host
Wrote a debugging shell that was ultimately adopted as the system
Senior Software Engineer, Archelon
April 1989 – January 2004
Archelon, previously Bit Slice Software, began by developing
microcode compilers for high-performance graphics systems, and later
broadened to support other processors outside the general-purpose
mainstream, with recent emphasis on DSPs (digital signal processors)
and embedded controllers. Clients have ranged from start-ups to the
top computer and semiconductor makers.
Except in my earliest years at Archelon, I was normally responsible
for all technical aspects of the tasks on which I worked – design,
implementation, testing, maintenance, and documentation.
Designed, wrote, maintained, and enhanced the register allocator for
the current compiler, including advances beyond the published
state of the art in handling incongruous processors. This project
led to Archelon's compiler outperforming GCC on customer benchmarks.
Targeted the compiler to several processors: the AOX QT Engine, the
Clarkspur CD2458, the CSEM CoolRISC 816, the Oxford A236, and the
United Technologies HS1600. These tasks typically included
designing the compiler ABI, and hand-coding low-level library
routines, as well as retargeting the code generator.
Contributed to targeting the AOX QT32, the Oasis Cougar, the OnSpec
90C36, the Oxford A436, the 3DSP Waimea, the Atmel AT76C202, the
Clarkspur CD2450, the Dspfactory RCORE, the Inicore iniDSP, the
Motorola 56000, the ST WDSP and ST Emerald, the Streamachine SMDSP,
and the Zucotto Xpresso.
Worked, at various times, with a wide variety of other processors,
including the David Sarnoff Research Center's Princeton Engine (a
2048-way SIMD architecture designed for real-time video processing),
a 4-way SIMD graphics engine by a major semiconductor manufacturer,
a microcoded engine designed for a major computer manufacturer's
midrange systems, and the NSC HPC (an early 16-bit microcontroller).
Added SIMD conditional locking to the retargetable compiler to
support the Oxford A236.
Designed and wrote an output rewriting layer for the compiler, to
allow it to support arbitrary third-party assemblers.
Added COFF symbolic debugging to the C compiler.
Designed, wrote, benchmarked, and maintained a standalone
instruction scheduler for the Intel i960 family of processors.
(The i960 CA was the first superscalar microprocessor.) The
scheduler operates at the assembly code level, and can deal
equally with hand-written and compiler-generated code.
Enhanced the microcode compaction tool to better schedule
instructions on processors with delayed branches.
Worked on instruction scheduling for the pipelined Intel i860.
Wrote a FORTRAN 77 front end, wrote the FORTRAN intrinsic libraries,
and wrote diagnostics. Interfaced the front end to three local code
generators (for the Trancept/Sun TAAC and successor Jet, for TI bit
slice parts, and for the TI TMS34082) and to a third-party back end.
Later added support for Sun and DEC extensions. Modified the linker
to support FORTRAN common blocks.
Modified the linker to support libraries, and wrote an object
library manipulation tool. Other additions to the linker include
support for arbitrary relocation expressions, and user-defined debug
Added general code relaxing to the assembler, with user-level
directives for conditional assembly by address range. Other
additions to the assembler included debugging tables for
hand-written assembly code; C-style expressions; and user-defined
Worked on a custom assembler, with continually evolving
specifications, for a major computer manufacturer's VLIW research
Designed and wrote a parameterized simulator in support of testing
an earlier retargetable code generator.
Wrote an m4 clone.
Built a floating-point compiler support library, derived from the
contemporary Cephes library.
Ported the GNU assembler, linker, and object tools to 16-bit MS-DOS.
Control Data Corporation,
Fall 1985 and Summer 1986
At CDC, I primarily performed acceptance testing for VX/VE, a UNIX
environment for the Cyber 180, and the associated C compiler. I wrote
and ran tests and prepared bug reports for the vendor; I also filtered
or handled customers' issues.
I have long-standing interests in programming languages and systems,
and in computer architecture.
I own a variety of interesting hardware, including a DEC PDP-11/45, a
Data General Nova 2, and a Xerox 8010, along with various other more
I am gradually learning more electronics, in part to repair and
maintain the older machines, and hold an amateur radio license.
My other (less closely related) interests include typography and type
design, photography, and epistemology.