Assignment 1: Memcached-lite
Due: 2024-01-31

1. Summary

In this assignment, you will be designing and implementing a simple key-value store. The main task is to implement a server, that stores and retrieves data for different clients. For each key, there is a unique value that the server stores, using the set <key> <value> command, and retrieves it using a get <key> command.

Memcached is a popular key-value store, and you will try to mimic its client protocol https://github.com/memcached/memcached/blob/master/doc/protocol.txt

A higher level explanation of the ASCII protocol can also be useful https://github.com/memcached/memcached/wiki/Commands#standard-protocol

In contrast to regular Memcached that stores data in memory, your server will use stable storage. That is, the data should persist even after the server process dies.

Note: This is part of a semester-long series of assignments, and you will be adding more advanced functionality to the same key-value store in later assignments. It is thus important to use good abstractions and think about the general case.

2. What to Implement

You will be implementing a simpler version of memcached server, along with a client, for testing the server implementation.

2.1. TCP-socket server

The server listens on a port, say 9889 for incoming client connections and commands, and stores all the keys and values. In this assignment, the server should store all data in a file system.

You must implement these commands:

  1. get <key> \r\n
  2. set <key> <value> \r\n

Keys are text strings (without spaces, newlines, or other special characters). Values are text strings. Note that the exact synatax of set is slightly different and specificied below.

A get will fetch the data corresponding to the key and return it to the client. A set should store the value for later retrieval.

2.1.1. Set

Specifically, the set command is whitespace delimited, and consists of two lines:

set <key> <value-size-bytes> \r\n

<value> \r\n

Note that this is a simpler version of the memcached protocol, in which the set command also accepts flags and expiry time, which we will ignore for this assignment.

The server should respond with either "STORED\r\n", or "NOT-STORED\r\n".

2.1.2. Get

Retrieving data is simpler: get <key>\r\n

The server should respond with two lines:

VALUE <key> <bytes> \r\n

<data block>\r\n

After all the items have been transmitted, the server sends the string "END\r\n"

2.1.3. Sleep

Get/set operations can be very fast. For interesting test-cases, you should add a small random delay for all of them on the server. This is typically done by sleeping for a random time less than, say 1 second before actually reading or writing.

2.2. Clients

You should also implement client programs, that connect to the server and make a series of get and set requests. This will be used for testing the server implementation.

Example client programs make a sequence of set/get requests to a small set of keys.

You should test with atleast 2 clients.

2.3. Advanced : Concurrency

A concurrent server should be able to handle more than one request at a time. Each set/get request essentially spawns a new thread. This requires careful synchronization.

2.4. Advanced : Memcached compatibility

As a bonus, you can implement the server such that regular, off-the-shelf memcached clients can connect to your server. For this, you will have to tweak the command parsing etc to support memcached protocol specification such as flags etc.

There are many Memcached clients for testing, for instance https://pymemcache.readthedocs.io/en/latest/getting_started.html

3. What to Submit

You must submit your code, examples, test-cases, and a report.

Your code should run on the SICE Linux servers (sharks, silo, etc.). You should submit all the code in a zip file on canvas. The submission should be self contained as much as possible—if you are using a non-standard library, then try to include it.

It is strongly encouraged to use Makefiles (if required) to compile and run the client and servers. You can use any widely available programming language (C, C++, Python, Java,…), but you must ensure that it compiles and runs on

Think carefully of how you can test the code for correctness, and provide at least one interesting test case.

The report should have describe the design details, and some experimental evaluation (what is the performance of your server? how many concurrent clients have you tested with?, etc). You should also describe the limitations of your server. What are the key and value size limits? Concurrency limits? Kind of errors that you do not handle? Future improvements etc.

3.1. 434 grading scheme

Component Weight
Basic Server 50%
Clients and testcases 30%
Report 20%
Advanced: Concurrency 10%
Advanced: Memcache compatibility 10%

The advanced parts are bonus for undergraduate students in 434.

3.2. 534/510 grading scheme

The advanced parts (concurrency and memcache compatibility) are compulsory for grad students (534/510).

Component Weight
Basic Server 30%
Clients and testcases 30%
Report 20%
Advanced: Concurrency 10%
Advanced: Memcache compatibility 10%

4. Future Assignment Tasks

These are the improvements you will make to memcached-lite (in future assignments):

  • Distributed strong and weak consistency

Author: Prateek Sharma

Created: 2024-01-23 Tue 07:05

Validate