Skip to content


Multi-threaded Homework Problem

Occasionally I have been asked about learning multithreaded programming.

My standard suggested problem is to create a “simplistic” TCP/IP block ordering library. This is a variant of the standard producer/consumer problem that shows up in interviews. This problem is a standard multiple-producer/multiple-consumer problem with a twist and goes way beyond what can be handled in an interview.

Problem #1

Use case:

  • each block as a conversation Id and a sequence id within the conversation.
  • All blocks are received on a single queue.
  • Blocks are separated by conversation and then ordered by sequence number.
  • Blocks arrive semi-random order – the sequence id will be within +/-2 of the previous block (in the same conversation)’s sequence id. But there will be no gaps in the sequence id for a given conversation.
  • sequence id 0 starts the conversation, sequence id 99 ends the conversation

Sample sequence: (1,0), (1,2), (2,0), (1,3), (2,2), (1,1), (2,1), ….

Problem #2:

Extending Problem #1 with these additional requirements:

  • Conversations where the 0 block has not been seen are discarded.
  • Blocks may be repeatedly sent. Duplicates are discarded.
  • Conversations are now buffered when doing the ordering ( represents buffering within the TCP/IP stack )
  • Each conversation chunk can only hold 3 blocks.
  • Each buffer when complete is dispatched in order to a dummy thread ( represents the application ) – which has a random delay and then tests and discards the data.

Sample data (for single conversation):

(1,0) – buffer#1; (1, 5) – buffer#2; (1,6) – buffer#3; (1,1) – buffer #1; (1,2) -buffer #1 (dispatched); (1,1) – duplicate (discarded); (1,3) – buffer #2; (1,7) – buffer #3 ( cannot be dispatched because buffer #2 has not been dispatched); (1,4) – buffer #2 and #3 are both dispatched; (1,10) – buffer #1 is now reavailable; ….

Problem #3:

Additional requirements:

  • Each conversation now has the sequence variation by +/-10 (bell curve)
  • Each conversation (on the consumer side) has only 2 buffers available to it
  • Blocks that can not be sequentially placed in the conversation buffers are discarded.
  • When the consumer threads detect 3 failures to complete the buffer, the consumer notifies the producers that the missing block needs to be resent.
  • Conversations have an indeterminate length. Additional flag signals the last block in the conversation.
  • Conversations that have not sent data in the last 30 seconds are terminated. All uncompleted buffers are discarded. Any further TCP/IP blocks are discarded.
  • Producers have a 5 buffers per conversation available

Things to look for:

  • Starvation – conversation is not being processed in spite of buffers being available
  • Effective I/O rate
  • Retry rate because of dropped buffers

Update:

Some resource suggestions from Ronald Vermeij:

Simple very basic tutorial

Concurrency: State Models & Java Programs Book

Nice “Concurrency RefCard” at Devzone

Update 2

The homework problem is actually fiendishly difficult. The problem is similar to a real world problem I had constructing a network switch monitoring program. The switches run a http server on a low priority thread. The http server respond slowly, occasionally they just drop a request. In the test lab, the switches are not under high load but in the field they are. Under high load, the http server may not respond in a timely manner. Different switch models responded differently. Determining the monitoring program’s response to a switch’s behavior or (non) behavior was “interesting” – panic and tell the operator that the switch has crashed? Do we delay the failure notification (turning switch status to red)? How often to poll the switch ( we don’t want to add to the overloaded switch’s work ). We also had to process the incoming data as fast as possible so the switch doesn’t drop the connection because the response is not being processed.

Extra credit:

  • Separate out the producer and consumer portions into 2 different programs
  • Each conversation is on a separate Java NIO connection
  • 100+ producer threads (in the producer program), 5 consumer threads in the consumer program
  • each conversation’s producer produces data at a random rate (will be bursty).
  • The conversation’s consumer must consume the produced data fast enough so the producer always has an available buffer to write to ( producer has 2 buffers – but try reducing the producer’s available buffers ). Consumer’s per conversation buffers can be higher than previous problem.
  • Producer may arbitrarily stop sending data
  • Use Java NIO

Notice:

  • When setting up, the connection the consumer must quickly complete establishment of the connection otherwise the connection will be dropped. [Hint: Have a dedicated consumer thread do just the connection establishment. Many Java NIO sample programs try to have the same thread establish connections and do the read operations those connections.]
  • Consumer must be aggressive about processing incoming data.
  • Consumer needs to have a “give up and retry” mechanism

Posted in how to, technical.


0 Responses

Stay in touch with the conversation, subscribe to the RSS feed for comments on this post.



Some HTML is OK

or, reply to this post via trackback.