Assignment 1: Memcached-lite
Due: 2023-01-31
1 Summary
In this assignment, you will be designing and implementing a simple key-value store. The main task is to implement a server, that stores and retrieves data for different clients. For each key
, there is a unique value
that the server stores, using the set <key> <value>
command, and retrieves it using a get <key>
command.
Memcached is a popular key-value store, and you will try to mimic its client protocol https://github.com/memcached/memcached/blob/master/doc/protocol.txt
A higher level explanation of the ASCII protocol can also be useful https://github.com/memcached/memcached/wiki/Commands#standard-protocol
In contrast to regular Memcached that stores data in memory, your server will use stable storage. That is, the data should persist even after the server process dies.
Note: This is part of a semester-long series of assignments, and you will be adding more advanced functionality to the same key-value store in later assignments. It is thus important to use good abstractions and think about the general case.
2 What to Implement
You will be implementing a simpler version of memcached server, along with a client, for testing the server implementation.
2.1 TCP-socket server
The server listens on a port, say 9889 for incoming client connections and commands, and stores all the keys and values. In this assignment, the server should store all data in a file system.
You must implement these commands:
get <key> \r\n
set <key> <value> \r\n
Keys are text strings (without spaces, newlines, or other special characters). Values are text strings.
A get
will fetch the data corresponding to the key and return it to the client.
A set
should store the value for later retrieval.
2.1.1 Set
Specifically, the set command is whitespace delimited, and consists of two lines:
set <key> <value-size-bytes> \r\n
<value> \r\n
Note that this is a simpler version of the memcached protocol, in which the set command also accepts flags and expiry time, which we will ignore for this assignment.
The server should respond with either "STORED\r\n", or "NOT-STORED\r\n".
2.1.2 Get
Retrieving data is simpler: get <key>\r\n
The server should respond with two lines:
VALUE <key> <bytes> \r\n
<data block>\r\n
After all the items have been transmitted, the server sends the string "END\r\n"
2.1.3 Sleep
Get/set operations can be very fast. For interesting test-cases, you should add a small random delay for all of them on the server. This is typically done by sleeping for a random time less than, say 1 second before actually reading or writing.
2.2 Clients
You should also implement client programs, that connect to the server and make a series of get and set requests. This will be used for testing the server implementation.
Example client programs make a sequence of set/get requests to a small set of keys.
You should test with atleast 2 clients.
2.3 Advanced : Concurrency
A concurrent server should be able to handle more than one request at a time. Each set/get request essentially spawns a new thread. This requires careful synchronization.
2.4 Advanced : Memcached compatibility
As a bonus, you can implement the server such that regular, off-the-shelf memcached clients can connect to your server. For this, you will have to tweak the command parsing etc to support memcached protocol specification such as flags etc.
There are many Memcached clients for testing, for instance https://pymemcache.readthedocs.io/en/latest/getting_started.html
3 What to Submit
You must submit your code, examples, test-cases, and a report.
Your code should run on the SICE Linux servers (sharks, silo, etc.). You should submit all the code in a zip file on canvas. The submission should be self contained as much as possible—if you are using a non-standard library, then try to include it.
It is strongly encouraged to use Makefiles (if required) to compile and run the client and servers. You can use any widely available programming language (C, C++, Python, Java,…), but you must ensure that it compiles and runs on
Think carefully of how you can test the code for correctness, and provide at least one interesting test case.
The report should have describe the design details, and some experimental evaluation (what is the performance of your server? how many concurrent clients have you tested with?, etc). You should also describe the limitations of your server. What are the key and value size limits? Concurrency limits? Kind of errors that you do not handle? Future improvements etc.
3.1 434 grading scheme
Component | Weight |
---|---|
Basic Server | 50% |
Clients and testcases | 30% |
Report | 20% |
Advanced: Concurrency | 10% |
Advanced: Memcache compatibility | 10% |
The advanced parts are bonus for undergraduate students in 434.
3.2 534/510 grading scheme
The advanced parts (concurrency and memcache compatibility) are compulsory for grad students (534/510).
Component | Weight |
---|---|
Basic Server | 30% |
Clients and testcases | 30% |
Report | 20% |
Advanced: Concurrency | 10% |
Advanced: Memcache compatibility | 10% |
4 Future Assignment Tasks
These are the improvements you will make to memcached-lite (in future assignments):
- Distributed strong and weak consistency