MIT 6.824: Lecture 2 - RPC and Threads

The course uses Go because it is a modern language that is well suited to building concurrent and distributed applications. It is type safe and memory safe, garbage collected, and not too complex. It also has a convenient RPC library and good support for threads.

Threads

Most programming languages have some support for threads, which are a useful structuring tool for concurrent programs. A thread is a "thread of execution" that allows a program to do many things at once. Each thread executes serially, just like a non-threaded program, but threads share memory. Each thread includes some per-thread state, such as a program counter, registers, and a stack.

Go calls threads goroutines, but everyone else calls them threads :)

Why threads?

Threads are useful for three reasons: I/O concurrency, multicore performance, and convenience.

I/O concurrency lets a program make progress while some of its operations are blocked. For example, a client can send requests to many servers in parallel and wait for the replies. A server can process many simultaneous client requests, each of which may block: while waiting for the disk to read data for client X, the server can process a request from client Y.

Multicore performance comes from executing code in parallel on several cores. Convenience covers background tasks, such as checking once per second whether each worker is still alive.
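A minimal sketch of the I/O concurrency case, where a client contacts several servers in parallel and waits for all replies. The fetch helper and the server names are hypothetical stand-ins for a real network request.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fetch stands in for a blocking request to one server (hypothetical helper).
func fetch(server string) string {
	time.Sleep(10 * time.Millisecond) // simulate network latency
	return "reply from " + server
}

// fetchAll sends one request per server, each in its own goroutine,
// and returns once every goroutine has stored its reply.
func fetchAll(servers []string) []string {
	replies := make([]string, len(servers))
	var wg sync.WaitGroup
	for i, s := range servers {
		wg.Add(1)
		go func(i int, s string) {
			defer wg.Done()
			replies[i] = fetch(s) // each goroutine writes only its own slot
		}(i, s)
	}
	wg.Wait()
	return replies
}

func main() {
	for _, r := range fetchAll([]string{"s1", "s2", "s3"}) {
		fmt.Println(r)
	}
}
```

The requests overlap, so the total wait is roughly one request's latency rather than the sum of all of them.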

Alternatives to threads

Yes, there is an alternative to threads: write code that explicitly interleaves activities in a single thread. This is usually called "event-driven" programming.

In event-driven programming, you keep a table of state about each activity, such as each client request.

There is one "event" loop that:

checks for new input for each activity (e.g., arrival of a reply from a server),
does the next step for each activity,
updates state.

Event-driven programming gets you I/O concurrency and eliminates thread costs, which can be substantial. However, it does not get multicore speedup, and it can be painful to program.
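The event loop described above can be sketched as follows. This is only an illustration: the event and activity types are hypothetical, and the events come from a fixed slice rather than from real network or disk input.

```go
package main

import "fmt"

// event is a hypothetical input, e.g. a reply arriving for one activity.
type event struct {
	id   int    // which activity (e.g. client request) the input belongs to
	data string // the input itself
}

// activity is the per-activity state kept in the loop's table.
type activity struct {
	step  int      // how many steps this activity has completed
	input []string // inputs seen so far
}

// runLoop handles events one at a time in a single thread,
// interleaving the activities explicitly.
func runLoop(events []event) map[int]*activity {
	table := make(map[int]*activity)
	for _, ev := range events { // check for new input
		a, ok := table[ev.id]
		if !ok {
			a = &activity{}
			table[ev.id] = a
		}
		a.input = append(a.input, ev.data) // do the next step for this activity
		a.step++                           // update state
	}
	return table
}

func main() {
	table := runLoop([]event{{1, "request"}, {2, "request"}, {1, "disk read done"}})
	fmt.Println(table[1].step, table[2].step) // prints "2 1"
}
```

All state lives in the table instead of on per-thread stacks, which is what makes this style cheap but awkward to program.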

Threading challenges

Threads introduce challenges around sharing data safely and coordinating with one another:

Race conditions: What if two threads try to increment a variable at the same time? An increment is a read followed by a write, so one thread's update can be lost. One way to avoid such bugs is to use locks (e.g., Go's sync.Mutex); another is to avoid sharing mutable data.
Coordination between threads: How can one thread wait for another thread to produce data? How can the producer wake up the consumer? Go provides channels, sync.Cond, and sync.WaitGroup to help with coordination.
Deadlock: A cycle of threads waiting for each other can lead to deadlock. This can happen via locks, channels, or RPC.
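A small sketch of the first two points: a counter protected by sync.Mutex, and a consumer that waits on a channel for a producer. The counter type and the function names are made up for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// counter guards n with a mutex so concurrent increments are not lost.
type counter struct {
	mu sync.Mutex
	n  int
}

func (c *counter) inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

func (c *counter) value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countTo starts `workers` goroutines that each increment the counter once,
// then waits for all of them with a WaitGroup.
func countTo(workers int) int {
	var c counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.inc()
		}()
	}
	wg.Wait()
	return c.value()
}

// produceAndConsume shows coordination through a channel: the receive
// blocks until the producer goroutine sends a value.
func produceAndConsume() int {
	ch := make(chan int)
	go func() { ch <- 42 }()
	return <-ch
}

func main() {
	fmt.Println(countTo(1000))       // prints 1000
	fmt.Println(produceAndConsume()) // prints 42
}
```

Without the mutex, countTo could return less than the number of workers, and Go's race detector (go run -race) would report the conflict.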

Remote Procedure Call

Remote Procedure Call (RPC) is a key piece of distributed system machinery. RPC makes client/server communication easy to program by hiding the details of network protocols and by converting data (e.g., strings, arrays, maps) to "wire format." This also makes portability and interoperability easier to achieve.

Binding: how does the client know which server computer to talk to?

For Go's RPC, server name/port is an argument to Dial
Big systems have some kind of name or configuration server
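A sketch of binding with Go's standard net/rpc package, assuming that is the RPC library in question. The Arith service and its Args/Reply types are hypothetical; the server listens on a loopback port chosen by the OS, and the client binds to it by passing that address to rpc.Dial.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/rpc"
)

// Args and Reply are hypothetical message types for this example.
type Args struct{ A, B int }
type Reply struct{ Sum int }

// Arith is a hypothetical RPC service.
type Arith struct{}

// Add has the shape net/rpc expects: an exported method with an argument,
// a pointer reply, and an error result.
func (*Arith) Add(args Args, reply *Reply) error {
	reply.Sum = args.A + args.B
	return nil
}

// startServer registers Arith and serves it on a free loopback port,
// returning the address that clients should dial.
func startServer() (string, error) {
	srv := rpc.NewServer()
	if err := srv.Register(new(Arith)); err != nil {
		return "", err
	}
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	go srv.Accept(l)
	return l.Addr().String(), nil
}

// callAdd binds to the server at addr via Dial, then makes one call.
func callAdd(addr string, a, b int) (int, error) {
	client, err := rpc.Dial("tcp", addr)
	if err != nil {
		return 0, err
	}
	defer client.Close()
	var reply Reply
	if err := client.Call("Arith.Add", Args{a, b}, &reply); err != nil {
		return 0, err
	}
	return reply.Sum, nil
}

func main() {
	addr, err := startServer()
	if err != nil {
		log.Fatal(err)
	}
	sum, err := callAdd(addr, 2, 3)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(sum) // prints 5
}
```

Here the address is handed to the client directly; in a big system it would come from a name or configuration server.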

Marshalling: format data into packets

Go's RPC library can pass strings, arrays, objects, maps, etc.
Go passes pointers by copying the pointed-to data
Cannot pass channels or functions
Marshals only exported fields (i.e., fields whose names start with a capital letter)
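The marshalling rules can be seen with encoding/gob, the default encoding used by Go's net/rpc package. This sketch round-trips a hypothetical Record through a buffer instead of a network connection.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Record is a hypothetical message type.
type Record struct {
	Name   string
	Tags   map[string]int
	secret string // unexported, so gob does not marshal it
}

// roundTrip marshals the value in points to and unmarshals it into a
// fresh Record, as an RPC library would on the sending and receiving sides.
func roundTrip(in *Record) (Record, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return Record{}, err
	}
	var out Record
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	in := &Record{Name: "x", Tags: map[string]int{"a": 1}, secret: "hidden"}
	out, err := roundTrip(in)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	out.Tags["a"] = 2 // changes the copy only
	fmt.Printf("%q %d %q\n", out.Name, in.Tags["a"], out.secret)
}
```

The receiver gets a copy of the pointed-to data, so later changes on one side are invisible to the other, and the unexported field arrives empty.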

RPC problems

How to handle errors?

e.g. lost packet, broken network, slow server, crashed server

What does a failure look like to the client RPC library?

The client never sees a response from the server.
The client does not know if the server saw the request.
- Maybe the server never saw the request.
- Maybe the server executed but crashed just before sending the reply.
- Maybe the server executed, but the network died just before delivering the reply.
Remote procedure call doesn't behave the same as a procedure call on a single machine.
- This is a recurring challenge in implementing distributed systems.

The simplest failure-handling scheme is "best-effort RPC":

Call() waits for a response for a while.
If none arrives, re-send the request.
Do this a few times.
Then give up and return an error.
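The retry loop above can be sketched as follows. The send function, the newFlaky helper, and ErrNoReply are assumptions made for the example; a real Call() would send over the network.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrNoReply is a hypothetical error returned when every attempt times out.
var ErrNoReply = errors.New("no reply after retries")

// callBestEffort sends the request, waits up to timeout for a reply,
// and re-sends on timeout, giving up after the given number of attempts.
// send is a hypothetical function that transmits the request and returns
// a channel on which the reply arrives (or never arrives, if lost).
func callBestEffort(send func() <-chan string, timeout time.Duration, attempts int) (string, error) {
	for i := 0; i < attempts; i++ {
		select {
		case reply := <-send():
			return reply, nil
		case <-time.After(timeout):
			// no reply in time: loop around and re-send
		}
	}
	return "", ErrNoReply
}

// newFlaky simulates a network that loses the first `drops` requests.
func newFlaky(drops int) func() <-chan string {
	sent := 0
	return func() <-chan string {
		sent++
		ch := make(chan string, 1)
		if sent > drops {
			ch <- "ok"
		}
		return ch
	}
}

func main() {
	reply, err := callBestEffort(newFlaky(2), 10*time.Millisecond, 3)
	fmt.Println(reply, err) // prints "ok <nil>"
}
```

Note that a timeout does not tell the client whether the server executed the request, so each re-send may cause the server to execute it again.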

Q: is best effort ever OK?

read-only operations
operations that are harmless to repeat, e.g. the DB checks whether the record has already been inserted

RPC semantics under failure:

At least once
- client retries the request until it sees a reply
- server executes request at least once, possibly several times
- client may see duplicate replies
- e.g. DB inserts record, crashes, restarts, inserts again
- safe only for read-only or idempotent operations
At most once (duplicate detection)
- server executes request at most once
- client may retry request; server detects the duplicate and does not execute it again
- client may see no reply, and then does not know whether the request executed
- e.g. DB checks if record already inserted
- needs a way to recognize duplicates, e.g. a unique ID on each request
Exactly once (hard)
- server executes request exactly once
- client never sees duplicate replies
- hard to achieve: duplicate detection has to survive crashes and restarts
- e.g. DB inserts record, crashes, restarts, doesn't insert again
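A sketch of at-most-once duplicate detection on the server side, assuming each request carries a unique ID chosen by the client. The Server type, the ID format, and the reply text are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// Server is a hypothetical service that remembers the reply it gave
// for each request ID.
type Server struct {
	mu      sync.Mutex
	seen    map[string]string // request ID -> saved reply
	records []string
}

func NewServer() *Server {
	return &Server{seen: make(map[string]string)}
}

// Insert executes at most once per request ID. A retried request gets
// the saved reply back and the record is not inserted again.
func (s *Server) Insert(reqID, record string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	if reply, ok := s.seen[reqID]; ok {
		return reply // duplicate: do not re-execute
	}
	s.records = append(s.records, record)
	reply := fmt.Sprintf("inserted #%d", len(s.records))
	s.seen[reqID] = reply
	return reply
}

func main() {
	s := NewServer()
	fmt.Println(s.Insert("client1-1", "x")) // prints "inserted #1"
	fmt.Println(s.Insert("client1-1", "x")) // retry: prints "inserted #1" again
	fmt.Println(len(s.records))             // prints 1
}
```

The seen table lives only in memory here, so a crash and restart would forget it; keeping it across failures is part of what makes exactly-once hard.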

Summary

Threads are a useful structuring tool for concurrent programs.
Go provides goroutines for concurrency.
Threads are useful for I/O concurrency, multicore performance, and convenience.
Event-driven programming is an alternative to threads.
Remote Procedure Call (RPC) is a key piece of distributed system machinery.
RPC makes it easy to program client/server communication.
RPC hides the details of network protocols and converts data to "wire format."
RPC can introduce challenges, such as handling errors and ensuring the semantics of RPC calls under failure.