Programming Assignment -- CS 7810
The following resources will help you learn the basics of Pthreads
programming (details are also provided about MPI and TM). After
experimenting with the toy example programs,
create the following non-trivial multi-threaded, shared-memory application
using Pthreads. This is due on Friday, February 27th, 2009.
You are being provided with the following template
C program (here's the corresponding
Java program -- you are welcome to write
your program in Java as well, but don't have to). This program sets up
a connection with
a database server and invokes MySQL queries to gather a large amount of
stock price data. In essence, this template program only performs some of
the I/O required of your program. It is up to you to take this data and
populate your own data structures. In the template program, the function
getTickers() shows you how to collect the names of each stock and the
function getSingleStockData() shows you how to collect daily stock prices
for a given stock.
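As one possible starting point for "your own data structures", here is a minimal sketch of a per-stock record with a growable price array. Everything in it is an assumption made for illustration: the names stock_t, stock_init(), stock_append() and stock_free(), the 16-byte ticker limit, and the initial capacity of 64 are hypothetical and are not part of the template program.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TICKER_MAX 16            /* assumed upper bound on ticker length, incl. '\0' */

/* Hypothetical per-stock record: one ticker plus its daily prices, oldest first. */
typedef struct {
    char    ticker[TICKER_MAX];
    double *prices;              /* heap array of daily prices */
    size_t  n_days;              /* number of prices stored so far */
    size_t  capacity;            /* allocated slots in prices[] */
} stock_t;

/* Initialize an empty record. Returns 0 on success, -1 if the ticker is too long. */
int stock_init(stock_t *s, const char *ticker)
{
    assert(s != NULL && ticker != NULL);
    if (strlen(ticker) >= TICKER_MAX)
        return -1;
    strcpy(s->ticker, ticker);
    s->prices = NULL;
    s->n_days = 0;
    s->capacity = 0;
    return 0;
}

/* Append one daily price, doubling the array when it is full.
 * Returns 0 on success, -1 if memory could not be allocated. */
int stock_append(stock_t *s, double price)
{
    assert(s != NULL);
    if (s->n_days == s->capacity) {
        size_t cap = s->capacity ? 2 * s->capacity : 64;
        double *p = realloc(s->prices, cap * sizeof *p);
        if (p == NULL)
            return -1;
        s->prices = p;
        s->capacity = cap;
    }
    s->prices[s->n_days++] = price;
    return 0;
}

/* Release the price array and reset the record to empty. */
void stock_free(stock_t *s)
{
    assert(s != NULL);
    free(s->prices);
    s->prices = NULL;
    s->n_days = 0;
    s->capacity = 0;
}
```

If each stock_t is filled by exactly one thread, these functions need no locking; a record shared between threads would need its own synchronization.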
- The first problem for you is to produce a multi-threaded version
of the basic I/O program. You must fork off N threads that first read the
list of stock names from the database server and populate your data
structures. You must then fork off M threads that read daily stock prices
for a year for each of these stock names and again populate your data structures.
Experimentally determine the optimal values for N and M for some
state-of-the-art multi-core system. Note that this problem deals with
I/O parallelization with CPU threads, not computation parallelization.
- Now perform the following computation on the data. For each stock,
compute a running average (over the last 5 days) of its stock price for
each day of the year. For each day of the year, identify (and print) the
stock that on that day has the highest percentage gain over its 5-day average.
Break up this entire computation into Q threads such that performance is
optimized. Note that this deals with computation parallelization, but largely
deals with only reads to shared variables. There is almost no read-write
sharing.
- Finally, conjure up another computation on this database that does
involve some non-trivial read-write sharing, i.e., you should acquire locks
when accessing some critical section and there should be a producer-consumer
relationship between some of the threads. Again, experimentally determine
the optimal number of threads R.
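To make the second problem concrete, here is a rough sketch of one way to split the 5-day-average computation across Q threads by giving each thread a contiguous range of days. It rests on several assumptions that the assignment does not fix: prices are already loaded into a flat array indexed as prices[s * n_days + d], the "5-day average" for day d is taken over the five preceding days (d-5 through d-1), and days without five prior prices get no winner (-1). The names day_range_t, scan_days() and best_gainers() are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define WINDOW 5                 /* length of the running average, in days */

/* Hypothetical per-thread work description. */
typedef struct {
    const double *prices;        /* shared, read-only: prices[s * n_days + d] */
    int     n_stocks, n_days;
    int     day_lo, day_hi;      /* this thread's half-open range of days */
    int    *best_stock;          /* output, one slot per day */
    double *best_gain;           /* output, percentage gain of the winner */
} day_range_t;

/* Thread body: for each day in the range, find the stock with the highest
 * percentage gain over the average of its previous WINDOW prices. */
static void *scan_days(void *arg)
{
    day_range_t *r = arg;
    for (int d = r->day_lo; d < r->day_hi; d++) {
        r->best_stock[d] = -1;
        r->best_gain[d] = 0.0;
        if (d < WINDOW)
            continue;            /* not enough history for an average */
        for (int s = 0; s < r->n_stocks; s++) {
            const double *p = r->prices + (size_t)s * (size_t)r->n_days;
            double sum = 0.0;
            for (int k = 1; k <= WINDOW; k++)
                sum += p[d - k];
            double avg = sum / WINDOW;
            if (avg <= 0.0)
                continue;        /* skip stocks with unusable data */
            double gain = 100.0 * (p[d] - avg) / avg;
            if (r->best_stock[d] < 0 || gain > r->best_gain[d]) {
                r->best_stock[d] = s;
                r->best_gain[d] = gain;
            }
        }
    }
    return NULL;
}

/* Split the days across q threads. Returns 0 on success, -1 on failure. */
int best_gainers(const double *prices, int n_stocks, int n_days, int q,
                 int *best_stock, double *best_gain)
{
    assert(prices && best_stock && best_gain);
    if (q < 1 || n_stocks < 1 || n_days < 1)
        return -1;
    if (q > n_days)
        q = n_days;              /* never more threads than days */
    pthread_t   *tids   = malloc((size_t)q * sizeof *tids);
    day_range_t *ranges = malloc((size_t)q * sizeof *ranges);
    if (!tids || !ranges) {
        free(tids);
        free(ranges);
        return -1;
    }
    int started = 0, rc = 0;
    for (int t = 0; t < q; t++) {
        ranges[t] = (day_range_t){
            .prices = prices, .n_stocks = n_stocks, .n_days = n_days,
            .day_lo = (int)((long)n_days * t / q),
            .day_hi = (int)((long)n_days * (t + 1) / q),
            .best_stock = best_stock, .best_gain = best_gain };
        if (pthread_create(&tids[t], NULL, scan_days, &ranges[t]) != 0) {
            rc = -1;             /* stop creating; join what was started */
            break;
        }
        started++;
    }
    for (int t = 0; t < started; t++)
        pthread_join(tids[t], NULL);
    free(tids);
    free(ranges);
    return rc;
}
```

Because every thread only reads prices and writes to its own days in best_stock and best_gain, this partitioning needs no locks, which matches the observation that the second problem has almost no read-write sharing. Partitioning by stock instead would require merging per-thread results for each day.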
What you need to submit back to me (email me the tarball if that is most
convenient): the C programs (appropriately titled and commented), a
README file that provides details on the application, how to run it, and
any analysis that you may have carried out. Be detailed when discussing
the performance observations and describing what problem you solved in
part 3 and how it involved non-trivial synchronization.
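For the performance observations, one simple approach is to time the same workload at several thread counts and compare wall-clock times. The sketch below is only an illustration under stated assumptions: time_threads() and demo_work() are hypothetical names, demo_work() is a placeholder loop standing in for your real per-thread function, and every thread receives the same argument, so a real program would pass each thread its own slice of the work.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() and CLOCK_MONOTONIC */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Current monotonic time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Run `work` on n threads (all given the same arg) and return the elapsed
 * wall-clock time in seconds, including thread creation and joining.
 * Returns -1.0 if n < 1, on allocation failure, or if a thread could not
 * be created. */
double time_threads(int n, void *(*work)(void *), void *arg)
{
    assert(work != NULL);
    if (n < 1)
        return -1.0;
    pthread_t *tids = malloc((size_t)n * sizeof *tids);
    if (tids == NULL)
        return -1.0;

    double start = now_seconds();
    int started = 0;
    while (started < n &&
           pthread_create(&tids[started], NULL, work, arg) == 0)
        started++;
    for (int t = 0; t < started; t++)
        pthread_join(tids[t], NULL);
    double elapsed = now_seconds() - start;

    free(tids);
    return started == n ? elapsed : -1.0;
}

/* Placeholder workload: a busy loop standing in for real per-thread work. */
void *demo_work(void *arg)
{
    (void)arg;
    volatile unsigned long x = 0;
    for (unsigned long i = 0; i < 1000000UL; i++)
        x += i;
    return NULL;
}
```

Calling time_threads() in a loop over candidate thread counts (1, 2, 4, 8, ...) and repeating each measurement a few times gives the kind of data the README analysis can be built on. A monotonic clock is used here because CPU-time clocks would add up time across threads rather than report elapsed time.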
Resources:
Optional reading: