With the advent of CUDA (well, at least as long as I’ve been looking at it), I keep getting the question: “Can I use CUDA with MATLAB?” The short answer: yes! The long answer: yes…and you can do it for free or pay some money for extra features. If you’re interested in spending a little extra for the most optimized, feature-rich tool set, check out MATLAB’s Parallel Computing Toolbox.
For those (like myself) who are a little tight on money, I recommend GP-you’s GPUmat MATLAB library, which can be found on their site: http://gp-you.org/. And for fun, I’ll walk you through installation and a basic GPU-enabled script!
1) Download and install the newest NVIDIA drivers for your computer (note for Alienware m11x users: download the ones from Dell’s site). Additionally, you can install the developer drivers for CUDA: http://developer.nvidia.com/cuda-toolkit-32-downloads
2) Download and install the CUDA Toolkit (if you’re using 64-bit Windows and are running 64-bit MATLAB, I recommend the 64-bit version of the toolkit).
3) Download the latest GPUmat from http://gp-you.org/ (once again, I recommend the 64-bit version if everything else is 64-bit).
4) Extract GPUmat zip file into a working directory (you will need to access it from MATLAB).
5) Navigate to the /etc directory in the extracted GPUmat folder, and run vcredist_x64.exe (or the 32-bit one, if you choose) to install the Microsoft Visual C++ redistributable package.
6) Start MATLAB
7) Change working directory to the extracted GPUmat directory
8) Start the GPUmat program by typing the following into the MATLAB command window (it should complete with no errors):

GPUstart
9) Add GPUmat to MATLAB path (optional). Go to File->Set Path. Add extracted GPUmat directory. Save and close.
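Steps 7 and 8 above boil down to two commands at the MATLAB prompt. A quick sketch (the path here is just an example; substitute wherever you extracted GPUmat):

```matlab
% Example path only -- use your own GPUmat extraction directory
cd('C:\tools\GPUmat')   % step 7: change to the GPUmat directory
GPUstart                % step 8: initialize GPUmat (prints device info; expect no errors)
```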
1) Every time you start MATLAB and want to use GPU features, run GPUstart
2) Read up on the supported MATLAB functions, which are listed in the User’s Guide for GPUmat on GP-you’s site.
3) Try running a test script, like the one here:
% File: GPU_FFT_test.m
for i = 1:1:6
    % Store variables on host side
    A = single(rand(1, round(10^i)));
    % Executed on CPU
    tic;
    FFT_A = fft(A);
    timecpu(i) = toc;
    % Initialize variables on GPU
    A_d = rand(1, round(10^i), GPUsingle);
    % Executed on GPU
    tic;
    FFT_A_d = fft(A_d);
    timegpu(i) = toc;
    x(i) = 10^i;
end
speedup = timecpu./timegpu;
semilogx(x, speedup)    % log scale on x, since sizes span 10^1 to 10^6
title('1D FFT GPU vs. CPU speedup')
xlabel('Number of samples')
ylabel('Speedup (CPU time / GPU time)')
which should result in a graph similar to the one below. Notice that you really only start seeing benefits from the GPU with large data sets (at least for the FFT function).
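One last convenience: since GPUstart has to run every session (usage step 1 above), you can let MATLAB do it for you by putting the call in a startup.m file somewhere on your MATLAB path. A minimal sketch, assuming you added GPUmat to the path in install step 9:

```matlab
% File: startup.m (anywhere on the MATLAB path)
% Runs automatically when MATLAB launches; assumes GPUmat is on the path.
try
    GPUstart
catch
    warning('GPUstart failed; GPU features will be unavailable this session.');
end
```

The try/catch just keeps MATLAB usable on machines without a CUDA-capable GPU.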