
The primary focus of my PhD thesis is finding analytical solutions to the Flierl-Petviashvili (FP) equation using the method of group foliation. As a complement to the search for analytical solutions, I have also created a simple program to numerically solve the equation. The FP equation has two spatial dimensions and a time dimension. The program uses spectral differentiation to differentiate the spatial dimensions in the frequency domain by performing a forward fast fourier transform (FFT), differentiating, then performing the inverse transform. The time step is iterated using a fourth order Runge-Kutta method. The program itself is written in C, for the numerical methods, and Objective-C for the interface and object interaction. The program makes heavy use of latest technologies in Cocoa and also uses the Accelerate Framework for the FFTs and other vector operations.
Please note that this is active research, the program and code are not designed for user friendly use, it is a research tool, and there is no guarantee of accuracy. Further, some classes like the XYPlotView are quick and dirty hacks written in 15 minutes to get the job done, while other functions, like those in SpectralMethods.c are much more well thought out over a day or two. If you find errors, please email me.
The project and binary were last updated April 23, 2008. Speed increase over original, and the plots work a litlte better now. No badness in the y.
In the context that we are using the equation, you can think of psi as sea surface height, with typical heights of about 20 centimeters and the size of the domain is thousands of kilometers on each side. The basic idea is to set initial conditions for psi, start the simulation and let the equation progress in time.

The primary view in the program is the simulation view which shows value of psi at each point. The reddest values are the maximum, while the bluest values are the minimum. When you first open the program there are no initial conditions set, so the simulation is solid red. You change the initial conditions by using the initial condition builder on the left-hand side. At this point there are two types of initial conditions you can add, a monopole, or white noise. The initial condition builder simply adds the monopole or white noise to the current state, so you can add as many monopoles or as much noise as you would like. Just hit clear to reset the intial conditions to zero.
Other conditions related to the equation can also be set. You can change the grid size, turn on and off terms in the equation, and change the time step.
On the right-hand side of the primary window the log of the power spectral density is shown in a contour plot, as well as cross-sections along the axes.
Several interesting initial conditions are possible. You can try adding a couple of different Tan and Boyd monopoles of varying speeds and watch them interact. Typically you will want speeds between -1.01 and -1.10. Another interesting scenario is to adds low-frequency white noise. Set the amplitude of the white noise to 12.0 and the band pass to 0.05.
The program runs on Mac OS X 10.5 only and will not run on earlier versions. This is because the program uses Objective-C 2.0 and also several of the newer APIs, like NSOperation for some basic multithreading. When I first wrote the program in June of 2007 it ran on 10.4, but I quickly migrated the codebase because Objective-C 2.0 properties rule -- there is just no sense in wasting time writing all that boilerplate code.

The basic structure is typical for Cocoa program and follows the standard practices of model-view-controller. There are two controllers in this application, the AppController and the Diagnostics controller. The FPEquation is launched by the AppController onto its own thread. As the FPEquation object runs through the equation loop it hands time step data back to the AppController, which in turn hands it off to the DiagnosticsController to process.
The AppController is responsible for all of the necessary conditions required to send the FPEquation object off on its own to run through the time step loop. This includes the initial conditions, the domain size, what terms will be used in the equation and how long it should run for. As the user adjusts the initial conditions, the AppController adjust the time step (t=0) model object, and sends it to the DiagnosticsController. When the user clicks Start Simulation, the AppController creates an FPEquation object, sets all the right conditions, and then adds it to the NSOperationQueue to let it run on its own thread.
The DiagnosticsController is reponsible for taking each piece of time step data and analyzing it. This includes creating a contour plot of the data, finding the power spectral density and reporting minimum and maximum values to the user. The DiagnosticsController adds its processing objects to the same NSOperationQueue to be threaded.
The FPEquation object runs through a time step loop. At each iteration it creates an array of bytes, the value of psi at each point, and returns it to the AppController.
Early exeperiments showed absolutely no advantage to the additional accuracy afforded by using double precision floating point, rather than single precision. Given the substantial speed advantage in some cases, everything is therefore written with single precision floats.
The scalar field psi is completely specified by nx*ny floating point numbers where nx and ny are the number of points in the horizontal and vertical domain, respectively. In the C code the standard buffer is therefore allocated on the heap with malloc( nx * ny * sizeof(float) ) and must later be freed. This allocates a continuous chunk of memory that can advantageously be treated as a single vector for use with functions like vDSP_vmul and the FFT routines found in the vDSP part of the Accelerate framework. In the parts of code that use Cocoa, such a buffer is allocated using [NSMutableData dataWithLength: nx * ny * sizeof(float)]. Because this application uses garbage collection the buffer allocated as an NSData object will be automatically freed when there are no more references to it.
The files SpectralMethods.h and SpectralMethods.c contain my basic functions for spectral differentiation. At the highest level there are only three functions you need only to call. The function initSpectralMethods() must be called to initialize various differentiation arrays and freeSpectralMethods() needs to be called when you are done, to free the arrays. The function spectralDifferentiate() does all the hard work. You pass it your function, the number of times you want to differentiate in the x and y directions, and it returns the differentiated function. In practice this is certainly the simplest way to implement to code necessary to solve the FP equation, but unfortunately, it is not the fastest.
The routines in SpectralMethods.c fundamentally use the two-dimensional FFT routines in vDSP. In addition to the higher level functions SpectralMethods.c also has more primitive functions to for the forward and inverse FFT and for creating differentiation matrices. A few other functions have been created to implement various filters.
Given psi we can use the function spectralDifferentiate() to find the value of all the terms in the FP equation and add them as appropriate. After implementing the Runge-Kutta method to advance the time step, we have essentially solved the equation. This works fine, but isn’t the fastest approach. Each time we call the function spectralDifferentiate() it performs two FFTs in addition to two multiplications in the frequency domain. Each FFT is a very expensive operation and whether we are differentiating psi by x once, or by y twice, we are doing the same forward FFT. The best approach is therefore to transform psi into the frequency domain once, do all of our operations there, and then transform back. There is a huge disadvantage to this in that it makes the code far more complicated and therefore more prone to errors.
An additional complication arises with nonlinear partial differentiatial equations like the FP equation, multiplication. Multiplication in one domain is a convolution in the other domain. So in theory, we could be performing a number of convolutions in the implementation of the FP equation. However, Apple’s framework are deficient in this regard (no 2D convolutions, only 1D). So rather than piece together a 2D convolution from the 1D API or write one from scratch, it turns out to be just as easy and computationally expensive to perform the reverse FFT and multiply in the other domain. That’s the choice made here.
Performance testing of this code with Shark showed that far and away the largest performance hit came from allocating buffers each time step. So again, at the high cost of cluttering the code, a very good optimization is to preallocate buffers before starting the time step loop and freeing them afterwards.
I hope to create two parallel sets of functions, one that uses the fastest implementation I can muster, and another that uses the cleanest code possible.
Jeffrey J. Early