Created
February 19, 2015 14:04
erlang-talk.slide
Erlang
A brief introduction
17 Feb 2015
Tags: introduction, fun
Jesper Louis Andersen
[email protected]
@jlouis666
* About me
- Functional programmer since 1996 (Standard ML, OCaml, Haskell, …)
- Erlang programmer since 2007
- Other stuff: Twelf and its sibling, CELF
- Research: formal methods, type theory, semantics, …
- Nowadays: distribution, concurrency, p2p, …
- Writes about the subject on Medium as @jlouis666
* Goal
I will not teach Erlang
- Too much stuff and too big a system
- I can't cover everything
- I have cherry-picked some things which are rarely told
Have to make a choice:
- Ideology, or
- Code: syntax, semantics
* Setting the stage
* Traits of software: bugs
- Bugs are _always_ complex run-time events
- Nobody knew about them
- In hindsight, the bug is easily explained
- No root cause
- Cognitive failure: blaming the programmers
* Traits of software: robustness
- Is your system always on the verge of breaking down?
- Does it always run in degraded mode?
- Does it require daily tinkering to work?
* Traits of software: dynamic nature
Constantly moving operating point:
- Do you alter the code daily?
- Do users?
- Do administrators?
- Do seemingly benign changes lead to catastrophe?
* Erlang
* An illustration of fault tolerance
.image 2015-feb-erlang-intro/fault-tolerance-1.gif
* Erlang roots
Bjarne Däcker, head of Ericsson CSLab, set the 10 requirements:
- Handling of a very large number of concurrent activities
- Actions to be performed at a certain point in time or within a certain time
- Systems distributed over several computers
- Interaction with hardware
- Very large software systems
* Erlang roots (continued)
- Complex functionality such as feature interaction
- Continuous operation for many years
- Software maintenance (reconfiguration etc.) without stopping the system
- Stringent quality and reliability requirements
- Fault tolerance both to hardware failures and software errors
* Constraints
- Performance is not a requirement
- Interaction with existing code is not a requirement
- Producing academic papers is not a requirement
* Another illustration of fault tolerance
.image 2015-feb-erlang-intro/fault-tolerance-3.gif
* Erlang birth
1986 is the birth year. Principal architects: Joe Armstrong, Robert Virding, Mike Williams.
- Many iterations and reconfigurations since then.
- Erlang/OTP is runtime + libraries, latest version is 17.4.1
Many other people have influenced it over the years: Tony Rogvall, Claes Wikström, Ulf Wiger, Patrick Nyblom, Björn Gustavsson, Richard Carlsson, Hans Nilsson, Håkan Mattsson, …
* Some design choices
- Readability: low abstraction
- Low level messaging primitives. Build high-level on top (many iterations)
- Dynamic: dynamically typed, late binding, hot code loading (Prolog heritage, age means no static types)
- Integers are `BigInt` by default: no word sizes, fewer errors
- Functional: isolated state, fewer errors, excellent for FSM programming
- _Seamless_Distribution_ without local/distributed distinction
- Bytecode interpreted: portability
- Low latency more important than throughput
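A minimal sketch of the `BigInt` point: integer arithmetic does not wrap at a machine word size. The module and function names (`bigint_demo`, `fact/1`) are made up for illustration.

```erlang
-module(bigint_demo).
-export([fact/1]).

%% Factorial on plain integers. The runtime moves to arbitrary-precision
%% integers on its own once a result no longer fits in a machine word.
fact(0) -> 1;
fact(N) when is_integer(N), N > 0 -> N * fact(N - 1).
```

`bigint_demo:fact(25)` is already larger than `1 bsl 64`, and dividing it by `bigint_demo:fact(24)` still gives exactly 25.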
* Being functional matters: shared memory
.image 2015-feb-erlang-intro/shared-memory-1.gif
* It works!
Top stories, but there are way more:
- Ericsson AXD 301 telecom switch
- Ejabberd - Jabber Server
- Tail-f (sold to Cisco)
- Bluetail (sold to Nortel)
- WhatsApp (sold to Facebook)
- RabbitMQ (sold to Pivotal)
- Klarna (not sold yet)
* Why does it work?
My thoughts:
From a holistic point of view:
- Engineered for high sustained load: worst case focus
- Avoid writing for the unknown case
- Time constraint: Erlang performs well
- Forced run-time modularity: process is barrier
- Separate trusted and untrusted code
- Concurrency comes first
- Maintenance mode support
- "How do you want your program to crash?"
* Key implementation choices
- Focus on preemption: cannot monopolize the machine
- Focus on testing: Common Test. Concurrent testing is easy.
- Profiling
- Tracing on live production systems (think `DTrace`)
- Introspection capabilities are state-of-the-art
Mistake:
- The standard library was created on-the-fly.
Bottom line: *Industrial* language, has to work tomorrow. Academic questions must come later.
* Language
- The language is _algorithmically_ functional.
- Pure language core.
- Imperative message passing on top.
- Access to other imperative primitives (process dictionary, ETS, ports, …)
* Typical programming examples
.code 2015-feb-erlang-intro/z.erl /^fib.*$/,/\./
# .code 2015-feb-erlang-intro/z.erl /^perms.*$/,/\./
.code 2015-feb-erlang-intro/z.erl /^example_1.*$/,/\./
.code 2015-feb-erlang-intro/z.erl /^example_2.*$/,/\./
    case file:open(FName, [read, binary, raw, ...]) of
        {ok, FD} -> ...;
        {error, Reason} -> ...
    end,
    % becomes
    {ok, FD} = file:open(FName, [read, binary, raw, ...]),
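The second form asserts the happy path through a pattern match. A small runnable sketch of that style follows; the module `assertive`, the function `size_of/1`, and the concrete option list are assumptions chosen for illustration, not taken from `z.erl`.

```erlang
-module(assertive).
-export([size_of/1]).

%% Happy path only: any {error, Reason} makes a match fail, and the
%% process crashes with a badmatch error that carries the reason.
size_of(FName) ->
    {ok, FD} = file:open(FName, [read, binary, raw]),
    {ok, Size} = file:position(FD, eof),   % seek to the end to learn the size
    ok = file:close(FD),
    Size.
```

Calling `assertive:size_of/1` on a missing file raises `{badmatch, {error, enoent}}`, so the crash report names the failing line and the reason without any explicit error-handling code.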
* IPv4 Headers from RFC 791
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |Version|  IHL  |Type of Service|          Total Length         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |         Identification        |Flags|      Fragment Offset    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  Time to Live |    Protocol   |         Header Checksum       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Source Address                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    Destination Address                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    Options                    |    Padding    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
* Parsing IPv4 headers in Erlang
    %% Field widths follow the RFC 791 diagram: Flags is 3 bits, Fragment Offset 13.
    %% Simplification: options are matched as one fixed 32-bit word.
    parse_ipv4(<<Version:4/integer, IHL:4/integer, TOS:8/integer, Len:16/integer,
                 Ident:16/integer, Flags:3/integer, FragOffset:13/integer,
                 TTL:8/integer, Protocol:8/integer, CheckSum:16/integer,
                 Source:32/integer,
                 Dest:32/integer,
                 Options:24/integer, Padding:8/integer,
                 Payload/binary>> = Packet)
      when Len == byte_size(Packet) ->
        case checksum(CheckSum, ...) of
            ok ->
                {ok, Version, IHL, TOS, Len, Ident, Flags, FragOffset,
                 TTL, Protocol, CheckSum, Source, Dest, Options, Padding,
                 Payload};
            error -> {error, checksum_mismatch}
        end;
    parse_ipv4(_Otherwise) ->
        {error, length_mismatch}.
To create packets, we do the opposite, e.g.:
    frame(Data) ->
        L = byte_size(Data),
        <<L:32/integer, Data/binary>>.
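For symmetry, here is a sketch of the decoding side of that length-prefixed frame. `unframe/1`, the module name `framing`, and the `more` return value are assumptions added for illustration; `frame/1` is the function from the slide with a guard added.

```erlang
-module(framing).
-export([frame/1, unframe/1]).

%% Prefix the payload with its length as a 32-bit big-endian integer.
frame(Data) when is_binary(Data) ->
    L = byte_size(Data),
    <<L:32/integer, Data/binary>>.

%% Split one frame off the front of a buffer. The length field L is bound
%% first and then used as the size of the payload in the same pattern.
%% Returns {ok, Payload, Rest}, or `more` if the buffer is still incomplete.
unframe(<<L:32/integer, Payload:L/binary, Rest/binary>>) ->
    {ok, Payload, Rest};
unframe(Buffer) when is_binary(Buffer) ->
    more.
```

`framing:unframe(framing:frame(<<"hello">>))` gives `{ok, <<"hello">>, <<>>}`, and a truncated buffer gives `more` so a caller can wait for further bytes.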
* ETS
Erlang Term Storage is a tuple-space implementation in Erlang:
    {value, key1, value, value}
    {value, key2, value, value}
    {value, key3}
    {value, key4, value}
- Shared data store among many users/processes
- _NOT_ garbage collected
- Sub μs key lookup
- No kernel context switch for access
You don't need memcached, redis, ...
The built-in mnesia database uses ETS as the basic building block.
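A short sketch of basic ETS use with the key in the second tuple position, as in the tuples shown on the slide. The table name, keys, and values are made up for illustration.

```erlang
-module(ets_demo).
-export([run/0]).

run() ->
    %% {keypos, 2} makes the second tuple element the key.
    Tab = ets:new(demo_tab, [set, public, {keypos, 2}]),
    true = ets:insert(Tab, {alice, key1, 42}),
    true = ets:insert(Tab, {bob, key2}),
    %% Lookup returns a list of matching tuples, empty for a missing key.
    [{alice, key1, 42}] = ets:lookup(Tab, key1),
    [] = ets:lookup(Tab, key3),
    %% Tables are not garbage collected: delete them explicitly, or let
    %% them go away when the owning process terminates.
    true = ets:delete(Tab),
    ok.
```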
* Concurrency primitives
* Processes
    Pid = spawn(fun() ->
        S = init(),
        loop(S)
    end),
- Spawns new activity, `Pid`, concurrent with the current process.
- Isolated state from caller.
- When a process terminates, owned files close, network sockets close, etc.
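To make the `init`/`loop` shape concrete, here is one possible filling-in. The counter state and the message formats (`{add, N}`, `{get, From}`, `stop`) are assumptions for illustration only.

```erlang
-module(spawn_demo).
-export([start/0]).

%% Start a process with the init/loop shape from the slide.
start() ->
    spawn(fun() ->
        S = init(),
        loop(S)
    end).

%% Hypothetical initial state: a counter starting at zero.
init() -> 0.

%% The state lives only in the argument of loop/1, so no other process
%% can read or change it except by sending a message.
loop(S) ->
    receive
        {add, N}    -> loop(S + N);
        {get, From} -> From ! {value, S}, loop(S);
        stop        -> ok
    end.
```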
* Process isolation
.image 2015-feb-erlang-intro/process-isolation-1.gif
* Pids/Mailboxes
Processes:
- Every process has a *process* *ID* (`Pid`)
- `Pid` is seamlessly distributable in the cluster: `<X.Y.Z>`
- Pids can be *registered* under names
- Access by proxy through the registry
- Different registries with different failure semantics
Mailboxes:
- Every process has a *mailbox*
- Messages enter the mailbox in FIFO order
System is dual to message channels and rendezvous. More Actor-like, less CSP/π-calc.
* Explaining message passing
.image 2015-feb-erlang-intro/message-passing-1.gif
* Asynchronous messaging
Assume S → R, syntax:
    % Inside the code of S
    R ! Msg,
- Returns `Msg`
- Sending _never_ fails and _never_ blocks†
- Message might never arrive (network failure)
- Order is preserved among any `(S,R)` pair, but may interleave other senders
The send operator is _rare_ in real programs.
Asynchronous is _key_: the real world is!
* Receive
On the R side, we use a receive expression:
    receive
        Pat₀ -> Exp₀;
        Pat₁ -> Exp₁;
        …
    after Timeout ->
        TExp
    end,
- _Rare_ in real programs
- The first message matching one of the patterns is chosen
- Implements selective receive
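A small sketch of send, selective receive, and the `after` clause working together. The module name and the message shapes are assumptions; the process sends to its own mailbox to keep the example self-contained.

```erlang
-module(selective).
-export([demo/0]).

demo() ->
    Self = self(),
    %% A unique reference keeps unrelated mailbox content out of the patterns.
    Ref = make_ref(),
    %% Two messages go into our own mailbox, `low` first.
    Self ! {Ref, low, 1},
    Self ! {Ref, high, 2},
    %% Selective receive: the pattern skips the `low` message and picks
    %% the `high` one, although it arrived later.
    First = receive {Ref, high, H} -> H end,
    %% The skipped message is still in the mailbox.
    Second = receive {Ref, low, L} -> L end,
    %% Nothing tagged with Ref is left, so the timeout fires after 10 ms.
    Third = receive {Ref, _, _} -> unexpected after 10 -> timeout end,
    {First, Second, Third}.
```

`selective:demo()` returns `{2, 1, timeout}`.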
* Lifetime handling
- Processes can fail and will exit
- Processes can terminate normally
- Processes can be killed by other processes
Termination also cleans up owned resources, as mentioned earlier
Termination is central to error handling through links/monitors
* Links/monitors
Side channels for process messaging:
Links:
- bidirectional
- form a "web" in which all processes exit together
- supervisors override the web to implement restart strategies
Monitors:
- unidirectional
- used by janitors/managers to monitor other subsystems and clean up on exit
- often cross-cutting
These two primitives yield fault tolerance.
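A minimal sketch of the link side, roughly the way a supervisor observes a child: the parent traps exits, so the death of a linked process arrives as an `'EXIT'` message instead of killing the parent. The module name, the exit reason `boom`, and the 1000 ms timeout are made up for illustration.

```erlang
-module(link_demo).
-export([watch_crash/0]).

watch_crash() ->
    %% Trap exits so that a linked process dying becomes a message
    %% rather than taking this process down with it.
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(fun() -> exit(boom) end),
    Result = receive
                 {'EXIT', Pid, Reason} -> {crashed, Reason}
             after 1000 ->
                 no_exit_seen
             end,
    %% Restore the previous setting before returning.
    process_flag(trap_exit, Old),
    Result.
```

`link_demo:watch_crash()` returns `{crashed, boom}`.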
* Using links/monitors
    process_flag(trap_exit, true), % for links
Turns termination into message delivery of an `'EXIT'` message.
_Rarely_ used.
    erlang:monitor(process, Pid)
Termination of `Pid` delivers a `'DOWN'` message.
Set a monitor before communicating, remove it afterwards:
    erlang:demonitor(MonRef, [flush])
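The "monitor, communicate, demonitor" pattern can be sketched as a synchronous call. The message formats, the 5000 ms timeout, and the echo server are assumptions for illustration; in practice `gen_server:call` does this job for you.

```erlang
-module(mon_demo).
-export([call/2, start_echo/0]).

%% Send Request to Pid and wait for the reply, with a monitor as safety net.
call(Pid, Request) ->
    MonRef = erlang:monitor(process, Pid),
    %% The monitor reference doubles as a unique tag for the reply.
    Pid ! {request, self(), MonRef, Request},
    receive
        {reply, MonRef, Reply} ->
            %% [flush] also drops a 'DOWN' message that may already be queued.
            erlang:demonitor(MonRef, [flush]),
            {ok, Reply};
        {'DOWN', MonRef, process, Pid, Reason} ->
            {error, Reason}
    after 5000 ->
        erlang:demonitor(MonRef, [flush]),
        {error, timeout}
    end.

%% Hypothetical server that replies with whatever it was sent.
start_echo() ->
    spawn(fun echo_loop/0).

echo_loop() ->
    receive
        {request, From, Tag, Msg} ->
            From ! {reply, Tag, Msg},
            echo_loop()
    end.
```

If the server is dead or dies mid-call, the caller gets `{error, Reason}` right away instead of waiting for the timeout.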
* Example, a pinger
.code 2015-feb-erlang-intro/pinger.erl /^ponger.*$/,/\./
.code 2015-feb-erlang-intro/pinger.erl /^ponger_loop.*$/,/\./
.code 2015-feb-erlang-intro/pinger.erl /^ping.*$/,/\./
.code 2015-feb-erlang-intro/pinger.erl /^run_ping.*$/,/\./
* dragon.lan
Dragon is a 32-bit ARM, in Big Endian mode:
    [jlouis@dragon ~]$ erl -setcookie foobar -name '[email protected]'
    Erlang/OTP 17 [erts-6.2] [source] [async-threads:10] [kernel-poll:false]
    Eshell V6.2 (abort with ^G)
    ([email protected])1> c("pinger.erl").
    {ok,pinger}
    ([email protected])2> pinger:ponger().
    <0.45.0>
    ([email protected])3> global:registered_names().
    [ponger]
* lady-of-pain
Lady-of-pain is an x86-64 Core i7: 64-bit mode, Little Endian:
    [jlouis@lady-of-pain 2015-feb-erlang-intro]$ erl -setcookie foobar -name 'pinger@lady-of-pain'
    Erlang/OTP 17 [erts-6.3.1] [source] [64-bit] [smp:8:8] [ds:8:8:10]
    [async-threads:10] [kernel-poll:false]
    Eshell V6.3.1 (abort with ^G)
    (pinger@lady-of-pain)1> l(pinger).
    {module,pinger}
    (pinger@lady-of-pain)2> pinger:ping(foo).
    ** exception error: no function clause matching pinger:run_ping(foo,undefined) (pinger.erl, line 24)
    (pinger@lady-of-pain)3> net_adm:ping('[email protected]').
    pong
    (pinger@lady-of-pain)4> pinger:ping(foo).
    {pong,foo}
    (pinger@lady-of-pain)5>
* Messaging can be complex:
Typical examples are A → Proxy → B, then B → A for a proxy service.
Or A → Call ← B for a call handler
Or, for cancellable requests:
    A → B % Request
    A ← B % Request accept/deny with scheme
    ...
    A ← B % Result
* Why message passing works:
- One message at a time
- Process internal state is small, and invariant between messages
- On error, functional persistence means we have the state from _before_ the crash:
Given this _state_ and this _message_ the system crashes with this _backtrace_
* OTP
* What is it?
OTP - misnamed the Open Telecom Platform
- Libraries for common concurrent patterns
- Move 90% of the code into a library
- Handles subtle concurrency bugs for you
Performs many tasks:
- Supervision
- Event handling, logging support, alarms
- Generic implementations of Servers and FSMs
For the Robin Milner nerds: Erlang programs are bigraphs.
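To illustrate the "generic server" item, here is a minimal `gen_server` callback module. The counter behaviour and the module name `counter_server` are made up; the point is that the receive loop, the call protocol, and the monitoring live in the library, and only the state transitions are written by hand.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Client API
-export([start_link/0, increment/1, value/1]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link()   -> gen_server:start_link(?MODULE, 0, []).
increment(Pid) -> gen_server:cast(Pid, increment).   % asynchronous
value(Pid)     -> gen_server:call(Pid, value).       % synchronous

%% The state is just the current count.
init(Start) -> {ok, Start}.

handle_call(value, _From, Count) -> {reply, Count, Count}.

handle_cast(increment, Count) -> {noreply, Count + 1}.

%% Ignore stray messages; nothing to clean up; state survives upgrades as is.
handle_info(_Msg, Count) -> {noreply, Count}.
terminate(_Reason, _Count) -> ok.
code_change(_OldVsn, Count, _Extra) -> {ok, Count}.
```

The module contains no `!` and no `receive`, which is one reading of the earlier remark that both are rare in real programs.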
* Typical code smell
- Any program not using OTP is an invitation for subtle concurrency errors
- Be afraid
- Either the programmer is a total newbie, or Claes `klacke` Wikström level
Not using OTP is akin to:
- Java with no OOP
- Haskell without type classes and monads
- ML without the module system
- F# or Prolog written as if it were C
* Hot code loading
.image 2015-feb-erlang-intro/hot-code-loading-1.gif
* Hot code loading
Two variants:
- Upgrade a whole system to a new release. Rarely used in modern software
- Load a new module into a running system. _Very_ common
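A sketch of how a long-running process can pick up a newly loaded module without losing its state. The module `hot_loop` and its messages are assumptions for illustration; the mechanism shown is the fully qualified call.

```erlang
-module(hot_loop).
-export([start/0, loop/1]).

start() -> spawn(fun() -> loop(0) end).

%% loop/1 is exported so that it can be called as ?MODULE:loop/1.
loop(Count) ->
    receive
        {bump, N} ->
            %% A fully qualified call always enters the newest loaded
            %% version of the module, so a reload takes effect here while
            %% Count is carried across.
            ?MODULE:loop(Count + N);
        {get, From} ->
            From ! {count, Count},
            ?MODULE:loop(Count);
        stop ->
            ok
    end.
```

After editing the file, `c(hot_loop).` in the shell compiles and loads the new version; the running process switches to it on its next message.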
* Implementation
* The secret
- Erlang implements message passing by _copying_
- "The big secret of Erlang" (Dan Sahlin)
- Fully embraced, desirable properties follow
- Copying is like allocation in functional programming
* Preemptive scheduling
- Low latency operation over throughput
- Everything costs _reductions_
- At 2000 reductions, forced context switch
- A function call is 1 reduction, other things cost more
- Advanced scheduler interactions with ports, balancing CPU load with I/O load.
Ports also have a reduction scheme!
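Reduction counts can be observed from inside the system. A rough sketch, with a made-up helper `cost/1`; the number it reports includes the overhead of the measurement itself and varies between runs and releases.

```erlang
-module(reds).
-export([cost/1]).

%% Run Fun in the calling process and report roughly how many
%% reductions the process was charged while doing so.
cost(Fun) when is_function(Fun, 0) ->
    {reductions, Before} = erlang:process_info(self(), reductions),
    _ = Fun(),
    {reductions, After} = erlang:process_info(self(), reductions),
    After - Before.
```

For example, `reds:cost(fun() -> lists:seq(1, 10000) end)` gives a feel for how quickly ordinary code uses up a time slice.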
* SMP scalability
- State of the art: process migration, carrier migration
- Scheduler binding to cores
- ETS is scalable to at least 64 cores, nearing 128.
- Most optimization on congestion avoidance. Screw the single core!
* Interpreter
- Bytecode interpreter,
- Threaded code,
- Macro peephole optimizer,
- Optimizing bytecode compiler written in Erlang,
- No inlining by default
Typical Erlang programs spend 20% of their time in `beam_emu.c` according to Linux `perf(1)`. Yet work on a JIT is ongoing.
To answer @lbalker's question: The images used in the presentation were taken from the tumblr blog This OTP Life.
Hi Jesper, are the image files available online?