@jlouis
Created February 19, 2015 14:04
erlang-talk.slide
Erlang
A brief introduction
17 Feb 2015
Tags: introduction, fun
Jesper Louis Andersen
[email protected]
@jlouis666
* About me
- Functional programmer since 1996 (Standard ML, OCaml, Haskell, …)
- Erlang programmer since 2007
- Other stuff: Twelf and its sibling, CELF
- Research: formal methods, type theory, semantics, …
- Nowadays: distribution, concurrency, p2p, …
- Writes about the subject on Medium as @jlouis666
* Goal
I will not teach Erlang
- Too much stuff and too big a system
- I can't cover everything
- I have cherry-picked some things which are rarely told
Have to make a choice:
- Ideology, or
- Code: syntax, semantics
* Setting the stage
* Traits of software: bugs
- Bugs are _always_ complex run-time events
- Nobody knew about them
- In hindsight, the bug is easily explained
- No root cause
- Cognitive failure: blaming the programmers
* Traits of software: robustness
- Is your system always on the verge of breaking down?
- Does it always run in degraded mode?
- Does it require daily tinkering to work?
* Traits of software: dynamic nature
Constantly moving operating point:
- Do you alter the code daily?
- Do users?
- Do administrators?
- Do seemingly benign changes lead to catastrophe?
* Erlang
* An illustration of fault tolerance
.image 2015-feb-erlang-intro/fault-tolerance-1.gif
* Erlang roots
Bjarne Däcker, head of Ericsson CSLab, set the 10 requirements:
- Handling of a very large number of concurrent activities
- Actions to be performed at a certain point in time or within a certain time
- Systems distributed over several computers
- Interaction with hardware
- Very large software systems
* Erlang roots (continued)
- Complex functionality such as feature interaction
- Continuous operation for many years
- Software maintenance (reconfiguration etc.) without stopping the system
- Stringent quality and reliability requirements
- Fault tolerance both to hardware failures and software errors
* Constraints
- Performance is not a requirement
- Interaction with existing code is not a requirement
- Producing academic papers is not a requirement
* Another illustration of fault tolerance
.image 2015-feb-erlang-intro/fault-tolerance-3.gif
* Erlang birth
1986 is the birth year. Principal architects: Joe Armstrong, Robert Virding, Mike Williams.
- Many iterations and reconfigurations since then.
- Erlang/OTP is runtime + libraries, latest version is 17.4.1
Influenced by many other people through the years: Tony Rogvall, Claes Wikström, Ulf Wiger, Patrick Nyblom, Björn Gustavsson, Richard Carlsson, Hans Nilsson, Håkan Mattsson, …
* Some design choices
- Readability: low abstraction
- Low level messaging primitives. Build high-level on top (many iterations)
- Dynamic: untyped, late binding, hot code loading (Prolog heritage, age means no types)
- Integers are `BigInt` by default: no word sizes, fewer errors
- Functional: isolated state, fewer errors, excellent for FSM programming
- _Seamless_Distribution_ without local/distributed distinction
- Bytecode interpreted: portability
- Low latency more important than throughput
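A minimal sketch of the `BigInt` point, assuming a hypothetical module name `bigint_demo`: integers grow as needed, so a factorial never wraps around a machine word.

```erlang
-module(bigint_demo).
-export([fact/1]).

%% Plain recursive factorial. Erlang integers are arbitrary precision,
%% so the result simply grows past 64 bits instead of overflowing.
fact(0) -> 1;
fact(N) when is_integer(N), N > 0 -> N * fact(N - 1).
```

For example, `bigint_demo:fact(25)` evaluates to `15511210043330985984000000`, which is larger than `1 bsl 64`.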
* Being functional matters: shared memory
.image 2015-feb-erlang-intro/shared-memory-1.gif
* It works!
Top stories, but there are way more:
- Ericsson AXD 301 telecom switch
- Ejabberd - Jabber Server
- Tail-f (sold to Cisco)
- Bluetail (sold to Nortel)
- WhatsApp (sold to Facebook)
- RabbitMQ (sold to Pivotal)
- Klarna (not sold yet)
* Why does it work?
My thoughts:
From a holistic point of view:
- Engineered for high sustained load: worst case focus
- Avoid writing for the unknown case
- Time constraint: Erlang performs well
- Forced run-time modularity: process is barrier
- Separate trusted and untrusted code
- Concurrency comes first
- Maintenance mode support
- "How do you want your program to crash?"
* Key implementation choices
- Focus on preemption: cannot monopolize the machine
- Focus on testing: Common Test. Concurrent testing is easy.
- Profiling
- Tracing on live production systems (think `DTrace`)
- Introspection capabilities are state-of-the-art
Mistake:
- The standard library was created on-the-fly.
Bottom line: *Industrial* language, has to work tomorrow. Academic questions must come later.
* Language
- The language is _algorithmically_ functional.
- Pure language core.
- Imperative message passing on top.
- Access to other imperative primitives (process dictionary, ETS, ports, …)
* Typical programming examples
.code 2015-feb-erlang-intro/z.erl /^fib.*$/,/\./
# .code 2015-feb-erlang-intro/z.erl /^perms.*$/,/\./
.code 2015-feb-erlang-intro/z.erl /^example_1.*$/,/\./
.code 2015-feb-erlang-intro/z.erl /^example_2.*$/,/\./
  case file:open(FName, [read, binary, raw, ...]) of
      {ok, FD} -> ...;
      {error, Reason} -> ...
  end,
  % becomes
  {ok, FD} = file:open(FName, [read, binary, raw, ...]),
* IPv4 Headers from RFC 791
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |Version|  IHL  |Type of Service|          Total Length         |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |         Identification        |Flags|      Fragment Offset    |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |  Time to Live |    Protocol   |         Header Checksum       |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                       Source Address                          |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                    Destination Address                        |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                    Options                    |    Padding    |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
* Parsing IPv4 headers in Erlang
  %% Field widths follow the RFC 791 layout: Flags is 3 bits, Fragment Offset 13.
  %% Simplification: the pattern assumes exactly one 32-bit word of options.
  parse_ipv4(<<Version:4/integer, IHL:4/integer, TOS:8/integer, Len:16/integer,
               Ident:16/integer, Flags:3/integer, FragOffset:13/integer,
               TTL:8/integer, Protocol:8/integer, CheckSum:16/integer,
               Source:32/integer,
               Dest:32/integer,
               Options:24/integer, Padding:8/integer,
               Payload/binary>> = Packet)
      when Len == byte_size(Packet) ->
      case checksum(CheckSum, ...) of
          ok ->
              {ok, Version, IHL, TOS, Len, Ident, Flags, FragOffset,
                   TTL, Protocol, CheckSum, Source, Dest, Options, Padding,
                   Payload};
          error -> {error, checksum_mismatch}
      end;
  parse_ipv4(_Otherwise) ->
      {error, length_mismatch}.
To create packets, we do the opposite, e.g.:
  frame(Data) ->
      L = byte_size(Data),
      <<L:32/integer, Data/binary>>.
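Framing and its inverse can be sketched together. The module name `frame_demo` and the function `unframe/1` are assumptions for illustration, not part of the talk. The point is that a length matched early in a binary pattern can size a later segment of the same pattern.

```erlang
-module(frame_demo).
-export([frame/1, unframe/1]).

%% Prefix the payload with its length as a 32-bit big-endian integer.
frame(Data) when is_binary(Data) ->
    L = byte_size(Data),
    <<L:32/integer, Data/binary>>.

%% Inverse operation: L is bound by the first segment and then used as
%% the size of the payload segment. Anything after the frame is returned
%% untouched so the caller can keep parsing.
unframe(<<L:32/integer, Data:L/binary, Rest/binary>>) ->
    {ok, Data, Rest};
unframe(Bin) when is_binary(Bin) ->
    %% Not enough bytes yet for a complete frame.
    {more, Bin}.
```

Usage: `frame_demo:unframe(frame_demo:frame(<<"hello">>))` gives `{ok, <<"hello">>, <<>>}`.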
* ETS
Erlang Term Storage is a tuple-space implementation in Erlang:
{value, key1, value, value}
{value, key2, value, value}
{value, key3}
{value, key4, value}
- Shared data store among many users/processes
- _NOT_ garbage collected
- Sub μs key lookup
- No kernel context switch for access
You don't need memcached, redis, ...
The built-in mnesia database uses ETS as the basic building block.
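A small sketch of the ETS calls involved, assuming a hypothetical module `ets_demo` and table name `demo_cache`. The key sits in position 2 of each tuple, as in the picture above, which is what the `keypos` option selects.

```erlang
-module(ets_demo).
-export([run/0]).

run() ->
    %% A public set table keyed on the second tuple element.
    Tab = ets:new(demo_cache, [set, public, {keypos, 2}]),
    true = ets:insert(Tab, {v, key1, 1, 2}),
    true = ets:insert(Tab, {v, key3}),
    Hit = ets:lookup(Tab, key1),          % list with the matching tuple
    Miss = ets:lookup(Tab, no_such_key),  % empty list
    %% Tables are not garbage collected: delete explicitly, or rely on
    %% the table going away when its owner process terminates.
    true = ets:delete(Tab),
    {Hit, Miss}.
```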
* Concurrency primitives
* Processes
  Pid = spawn(fun() ->
      S = init(),
      loop(S)
    end),
- Spawns new activity, `Pid`, concurrent with the current process.
- Isolated state from caller.
- When a process terminates, owned files close, network sockets close, etc.
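The fragment above leaves `init/0` and `loop/1` undefined. One possible filling, as a sketch only: a counter process in a hypothetical module `counter_demo`, whose message shapes are invented for the example.

```erlang
-module(counter_demo).
-export([start/0, increment/1, value/1]).

start() ->
    spawn(fun() ->
                  S = init(),
                  loop(S)
          end).

init() -> 0.

%% The state S exists only as the argument of loop/1; nothing outside
%% this process can touch it.
loop(S) ->
    receive
        increment ->
            loop(S + 1);
        {value, From, Ref} ->
            From ! {Ref, S},
            loop(S);
        stop ->
            ok
    end.

increment(Pid) ->
    Pid ! increment,
    ok.

value(Pid) ->
    Ref = make_ref(),
    Pid ! {value, self(), Ref},
    receive
        {Ref, S} -> S
    after 1000 ->
        timeout
    end.
```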
* Process isolation
.image 2015-feb-erlang-intro/process-isolation-1.gif
* Pids/Mailboxes
Processes:
- Every process has a *process* *ID* (`Pid`)
- `Pid` is seamlessly distributable in the cluster — `<X.Y.Z>`
- Pids are *registered* to names
- Access by proxy through the registry
- Different registries with different failure semantics
Mailboxes:
- Every process has a *mailbox*
- Messages enter the mailbox in FIFO order
System is dual to message channels and rendezvous. More Actor-like, less CSP/π-calc.
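A sketch of the simplest registry, the node-local one reached through `register/2` and `whereis/1`. The module name `registry_demo` and the registered name `demo_echo` are made up for the example.

```erlang
-module(registry_demo).
-export([run/0]).

run() ->
    Pid = spawn(fun() ->
                        receive {hello, From} -> From ! {self(), hi} end
                end),
    %% Associate the atom demo_echo with Pid in the local registry.
    true = register(demo_echo, Pid),
    Pid = whereis(demo_echo),
    %% Send through the name instead of the pid.
    demo_echo ! {hello, self()},
    receive
        {Pid, hi} -> ok
    after 1000 ->
        timeout
    end.
```

The name is released automatically when the registered process terminates.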
* Explaining message passing
.image 2015-feb-erlang-intro/message-passing-1.gif
* Asynchronous messaging
Assume S → R, syntax:
% Inside the code of S
R ! Msg,
- Returns `Msg`
- Sending _never_ fails and _never_ blocks†
- Message might never arrive (network failure)
- Order is preserved among any `(S,R)` pair, but may interleave other senders
The send operator is _rare_ in real programs.
Asynchrony is _key_: the real world is asynchronous too!
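To illustrate that sending to a pid neither fails nor blocks, here is a sketch (hypothetical module `send_demo`) that sends to a process known to be dead. Sending to an unregistered name is a different case and raises `badarg`.

```erlang
-module(send_demo).
-export([run/0]).

run() ->
    %% Start a process that exits at once, and wait until it is gone.
    {Pid, MRef} = spawn_monitor(fun() -> ok end),
    receive {'DOWN', MRef, process, Pid, _} -> ok end,
    false = is_process_alive(Pid),
    %% The send still succeeds: the message is silently dropped and the
    %% expression evaluates to the message itself.
    Msg = {hello, world},
    Msg = (Pid ! Msg),
    ok.
```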
* Receive
On the R side, we use a receive expression:
  receive
      Pat₀ -> Exp₀;
      Pat₁ -> Exp₁
  after Timeout ->
      TExp
  end,
- _Rare_ in real programs
- The first message matching one of the patterns is chosen
- Implements selective receive
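Selective receive in a few lines, as a sketch in a hypothetical module `selective_demo`. Messages are tagged with a fresh reference so that unrelated mail in the caller's mailbox is left alone.

```erlang
-module(selective_demo).
-export([run/0]).

run() ->
    Ref = make_ref(),
    %% Two messages arrive in the order low, high.
    self() ! {low, Ref, 1},
    self() ! {high, Ref, 2},
    %% Only {high, ...} matches here, so the older {low, ...} message is
    %% skipped and stays in the mailbox.
    First = receive {high, Ref, H} -> {high, H} end,
    Second = receive {low, Ref, L} -> {low, L} after 0 -> none end,
    %% Nothing tagged with Ref is left, so the timeout clause fires.
    Third = receive {_, Ref, _} -> unexpected after 0 -> empty end,
    [First, Second, Third].
```

Calling `selective_demo:run()` returns `[{high,2},{low,1},empty]`.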
* Lifetime handling
- Processes can fail and will exit
- Processes can terminate normally
- Processes can be killed by other processes
Termination also cleans up owned context, as mentioned earlier
Termination is central to error handling through links/monitors
* Links/monitors
Side channels for process messaging:
Links:
- bidirectional
- form a "web" in which all processes exit together
- supervisors override the web to implement restart strategies
Monitors:
- unidirectional
- used by janitors/managers to monitor other subsystems and clean up on exit
- often cross-cutting
These two primitives yield fault tolerance.
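A sketch contrasting the two side channels, in a hypothetical module `link_demo`. The link is observed by trapping exits, which turns the exit signal into an `'EXIT'` message. The monitor delivers a `'DOWN'` message without any flag.

```erlang
-module(link_demo).
-export([run/0]).

run() ->
    %% Trap exits so the death of a linked process arrives as a message
    %% instead of killing this process too. Restore the flag afterwards.
    Old = process_flag(trap_exit, true),
    Linked = spawn_link(fun() -> exit(boom) end),
    ExitMsg = receive
                  {'EXIT', Linked, R1} -> {exit, R1}
              after 1000 ->
                  timeout
              end,
    process_flag(trap_exit, Old),
    %% A monitor is one-way: the watcher is told, but never killed.
    {Watched, MRef} = spawn_monitor(fun() -> exit(bang) end),
    DownMsg = receive
                  {'DOWN', MRef, process, Watched, R2} -> {down, R2}
              after 1000 ->
                  timeout
              end,
    {ExitMsg, DownMsg}.
```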
* Using links/monitors
process_flag(trap_exit, true), % for links
Turns termination into message delivery of an `'EXIT'` message.
_Rarely_ used.
erlang:monitor(process, Pid)
Termination delivers a `'DOWN'` message.
Set a monitor before communicating, remove it afterwards:
erlang:demonitor(MonRef, [flush])
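The "monitor, communicate, demonitor" pattern can be sketched as a small synchronous call. The module `call_demo` and its message shapes are assumptions for the example. OTP's `gen_server:call` does the same job more carefully.

```erlang
-module(call_demo).
-export([call/3, run/0]).

%% Synchronous request to Pid, guarded by a monitor. The monitor
%% reference doubles as a unique tag for the reply.
call(Pid, Request, Timeout) ->
    MRef = erlang:monitor(process, Pid),
    Pid ! {request, self(), MRef, Request},
    receive
        {reply, MRef, Reply} ->
            erlang:demonitor(MRef, [flush]),
            {ok, Reply};
        {'DOWN', MRef, process, Pid, Reason} ->
            {error, Reason}
    after Timeout ->
        erlang:demonitor(MRef, [flush]),
        {error, timeout}
    end.

run() ->
    %% An echo process that serves exactly one request and then exits.
    Echo = spawn(fun() ->
                         receive
                             {request, From, Tag, Req} ->
                                 From ! {reply, Tag, Req}
                         end
                 end),
    {ok, ping} = call(Echo, ping, 1000),
    %% The second call sees 'DOWN' rather than waiting for the timeout.
    {error, _Reason} = call(Echo, ping, 1000),
    ok.
```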
* Example, a pinger
.code 2015-feb-erlang-intro/pinger.erl /^ponger.*$/,/\./
.code 2015-feb-erlang-intro/pinger.erl /^ponger_loop.*$/,/\./
.code 2015-feb-erlang-intro/pinger.erl /^ping.*$/,/\./
.code 2015-feb-erlang-intro/pinger.erl /^run_ping.*$/,/\./
* dragon.lan
Dragon is a 32bit ARM, in Big Endian mode:
[jlouis@dragon ~]$ erl -setcookie foobar -name '[email protected]'
Erlang/OTP 17 [erts-6.2] [source] [async-threads:10] [kernel-poll:false]
Eshell V6.2 (abort with ^G)
([email protected])1> c("pinger.erl").
{ok,pinger}
([email protected])2> pinger:ponger().
<0.45.0>
([email protected])3> global:registered_names().
[ponger]
* lady-of-pain
Lady-of-pain is an x86-64 core i7: 64bit mode, Little Endian:
[jlouis@lady-of-pain 2015-feb-erlang-intro]$ erl -setcookie foobar -name 'pinger@lady-of-pain'
Erlang/OTP 17 [erts-6.3.1] [source] [64-bit] [smp:8:8] [ds:8:8:10]
[async-threads:10] [kernel-poll:false]
Eshell V6.3.1 (abort with ^G)
(pinger@lady-of-pain)1> l(pinger).
{module,pinger}
(pinger@lady-of-pain)2> pinger:ping(foo).
** exception error: no function clause matching pinger:run_ping(foo,undefined) (pinger.erl, line 24)
(pinger@lady-of-pain)3> net_adm:ping('[email protected]').
pong
(pinger@lady-of-pain)4> pinger:ping(foo).
{pong,foo}
(pinger@lady-of-pain)5>
* Messaging can be complex:
Typical examples are A → Proxy → B, then B → A for a proxy service.
Or A → Call ← B for a call handler
Or, for cancellable requests:
A → B % Request
A ← B % Request accept/deny with scheme
...
A ← B % Result
* Why message passing works:
- One message at a time
- Process internal state is small, and invariant between messages
- On error, functional persistence means we have the state from _before_ the crash:
Given this _state_ and this _message_ the system crashes with this _backtrace_
* OTP
* What is it?
OTP - misnamed the Open Telecom Platform
- Libraries for common concurrent patterns
- Move 90% of the code into a library
- Handles subtle concurrency bugs for you
Performs many tasks:
- Supervision
- Event handling, logging support, alarms
- Generic implementations of Servers and FSMs
For the Robin Milner nerds: Erlang programs are bigraphs.
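For comparison with a hand-written process loop, a minimal `gen_server` sketch: a counter in a hypothetical module `counter_server`. The library owns the receive loop, replies, and timeouts; only the callbacks are application code.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Client API
-export([start_link/0, increment/1, value/1]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link() -> gen_server:start_link(?MODULE, 0, []).
increment(Pid) -> gen_server:cast(Pid, increment).
value(Pid) -> gen_server:call(Pid, value).

%% The state is just the current count.
init(Start) -> {ok, Start}.

handle_call(value, _From, Count) ->
    {reply, Count, Count}.

handle_cast(increment, Count) ->
    {noreply, Count + 1}.

%% Ignore stray messages instead of crashing on them.
handle_info(_Msg, Count) ->
    {noreply, Count}.

terminate(_Reason, _Count) -> ok.

code_change(_OldVsn, Count, _Extra) -> {ok, Count}.
```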
* Typical code smell
- Any program not using OTP is an invitation for subtle concurrency errors
- Be afraid
- Either the programmer is a total newbie, or Claes `klacke` Wikström level
Not using OTP is akin to:
- Java with no OOP
- Haskell without type classes and monads
- ML without the module system
- F# or Prolog written as if it were C
* Hot code loading
.image 2015-feb-erlang-intro/hot-code-loading-1.gif
* Hot code loading
Two variants:
- Upgrade a system to a new one. Rarely used in modern software
- Load a new module into a running system. _Very_ common in use
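How a running process picks up a newly loaded module, as a sketch in a hypothetical module `hot_demo`: a local call stays in the version the process is already running, while a fully qualified call (`?MODULE:loop/1`) jumps to the newest loaded version.

```erlang
-module(hot_demo).
-export([start/0, loop/1]).

start() ->
    spawn(fun() -> loop(0) end).

%% loop/1 must be exported for the fully qualified call below to work.
loop(N) ->
    receive
        {get, From} ->
            From ! {count, N},
            %% Fully qualified call: after a new version of this module
            %% has been loaded, the next iteration runs the new code.
            ?MODULE:loop(N + 1);
        stop ->
            ok
    end.
```

After recompiling and loading the module from the shell (for instance with `c(hot_demo).`), the process switches to the new code when it handles its next message, keeping `N`.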
* Implementation
* The secret
- Erlang implements message passing by _copying_
- "The big secret of Erlang"—Dan Sahlin
- Fully embraced, desirable properties follow
- Copying is like allocation in functional programming
* Preemptive scheduling
- Low latency operation over throughput
- Everything costs _reductions_
- At 2000 reductions, forced context switch
- A function call is 1 reduction, other things cost more
- Advanced scheduler interactions with ports, balancing CPU load with I/O load.
Ports also have a reduction scheme!
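Reduction counts are visible from inside the system. A sketch, in a hypothetical module `reds_demo`, that measures roughly how many reductions a piece of work costs the calling process; the exact number varies between runs and releases.

```erlang
-module(reds_demo).
-export([run/0]).

run() ->
    {reductions, Before} = process_info(self(), reductions),
    %% Some work: builds and sums a list, far more than one time slice.
    _ = lists:sum(lists:seq(1, 100000)),
    {reductions, After} = process_info(self(), reductions),
    After - Before.
```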
* SMP scalability
- State of the art: process migration, carrier migration
- Scheduler binding to cores
- ETS is scalable to at least 64 cores, nearing 128.
- Most optimization on congestion avoidance. Screw the single core!
* Interpreter
- Bytecode interpreter,
- Threaded code,
- Macro peephole optimizer,
- Optimizing bytecode compiler written in Erlang,
- No inlining by default
Typical Erlang programs spend 20% time in `beam_emu.c` according to Linux `perf(1)`. Yet work on a JIT is ongoing.

lbalker commented Feb 20, 2015

Hi Jesper, are the image files available online?


gausby commented Feb 21, 2015

To answer @lbalker's question: The images used in the presentation were taken from the tumblr blog This OTP Life.