Notes from StrangeLoop 2014.

The Unix style of programming involves editing the program from outside; the Smalltalk style is editing the program from inside the program itself.

Also, in the C/Unix style, your program operates on opaque bytes, while Smalltalk deals with described objects.

Demo: run xclock; attach gdb to it; execute putenv to change the TZ while running.

So: it is possible to write and edit Unixy applications from inside them.

Complexity has arisen from languages not being able to talk to each other (shades of the keynote's point about reducing applications that do the same thing; part of the complexity is due to re-writing things in different languages?)

FFI is evil, and unnecessary: we have to describe the internal implementation details of the foreign function in the other language before we can access it from our language.

The real difference between Smalltalk and C is that everything in Smalltalk (compiler, interpreter, browser and debugger) communicates through shared data structures. In C, things interact through explicit interfaces (debugging symbols).

liballocs can provide introspection on applications written in C. Showed a version of node that can import arbitrary C modules and call functions on them, but without having to define an FFI.

Could be expanded: node could export the same metadata, so that C could call into Node, etc. GC also causes some additional problems.

https://github.com/stephenrkell/

How to add types to existing code:

First, import; we'll get errors and messages that things aren't annotated.

Annotate the function signatures. Define type aliases.

Used type checking to detect stack overflow/infinite recursion bug in a simple summarization function.

Can refactor/extend by changing types; type checking, errors indicate where bugs might exist.

Cursive Clojure: IDE for Clojure; people have reportedly dropped Emacs and Vim for it. Can use Typed Clojure for type annotations in the IDE.

Keyword maps are modeled in Typed Clojure using HMaps. Can define mandatory keys: (HMap :mandatory {:a Num :b Str}) means a map type that must have a key :a whose value is a Num and a key :b whose value is a Str.
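A minimal sketch of what the annotations might look like, assuming core.typed's t/ann, t/defalias, and t/HMap forms (the function and field names here are made up):

```clojure
(ns example.core
  (:require [clojure.core.typed :as t]))

;; Type alias for a keyword map with mandatory keys (hypothetical example).
(t/defalias Point
  (t/HMap :mandatory {:x t/Num :y t/Num}))

;; Annotate the function signature, then define the function as usual.
(t/ann dist [Point Point -> t/Num])
(defn dist [a b]
  (Math/sqrt (+ (Math/pow (- (:x b) (:x a)) 2)
                (Math/pow (- (:y b) (:y a)) 2))))
```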

CircleCI use case:

20% of the code is type checked. 22% of the code is a library that needs to be annotated. 59% of their type aliases are HMaps.

What's next:

Typed ClojureScript, a GSoC project. Using TypeScript annotations to annotate JS interop functions for ClojureScript.

Gradual typing: good error messages when we interoperate between typed and untyped Clojure. Available in Typed Racket.

Three metrics of evaluation: productivity, safety and performance

Confession of a Used Language Salesman. Author left Haskell and moved to MS, to add FP to existing languages, to ease adoption.

Coders in the popular imagination are solitary, but the reality is we're on GitHub, and we program socially.

Ecological model of adoption: there is not enough time to listen to all music, so you listen to music your friends listen to, so there is an intersection of the music you and your friends listen to. You can use this to see how adoption grows through a social network.

Analyze niche usage: what languages are only used in a few niches, versus being used across the board. Niche languages can live in the "long tail" of adoption, and can last.

Most of the reasons people pick a language for a project are about library availability and past usage (I/my team knows it, has used it before, or we're just extending an existing code base already in that language).

Languages actually aren't preferred by particular age groups (even distribution). Most people say they "know well" about 4 languages and "know somewhat" another 4-5.

Diffusion of innovation: how technology is dispersed. Knowledge, persuasion, decision, trial, confirmation.

Diffusion catalysts; safe sex case study: relatively good and simple, but not observable, trialable or compatible.

Can improve adoption by increasing observability, by convincing opinion leaders and then teaching them to promote a practice.

Tools that could improve adoption of good practices: make it easier for people to see what libraries others use; validate programs against actual input data from users, etc.

Expert systems and rule engines made some successes, but by the 80s, there were a lot of failures that caused a backlash.

Most rule engines try to allow non-developers to build applications, but that doesn't end up working for much beyond simple example problems. Often, when things get complex, you end up inserting a general purpose programming language into your rule language.

Rule authoring is software development.

We often program by building transformations (functions) and wiring them up together manually. This manual wiring is often the cause of a lot of complexity and of difficulty changing the requirements/software.

Instead, what if we wire them automatically? Can do this based on types.

Rules are a type of control structure.

Clara is a Clojure implementation of this concept.
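A hedged sketch of what a rule might look like in Clara; the records, rule, and discount logic below are invented for illustration:

```clojure
(ns example.rules
  (:require [clara.rules :refer [defrule defquery insert insert!
                                 fire-rules mk-session query]]))

;; Hypothetical fact types.
(defrecord Order [total])
(defrecord Discount [reason amount])

(defrule large-order-discount
  "Orders over 100 get a 10% discount (made-up business rule)."
  [Order (> total 100) (= ?total total)]
  =>
  (insert! (->Discount :large-order (* 0.1 ?total))))

(defquery get-discounts []
  [?d <- Discount])

;; Usage: the engine wires rules together based on the facts they consume/produce.
(-> (mk-session 'example.rules)
    (insert (->Order 250))
    (fire-rules)
    (query get-discounts))
```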

Can generate specifications from the rules, to help domain experts validate that the rules are correct.

We are trying to get rid of global mutable state in our applications, but the state is just in the DB, so still part of the system.

We want a magically consistent cache, off the DB. Can use the replication log, and pipe it into a materialized view of that data.

Solves some problems:

  1. Better data - can structure data differently for writes and reads. For example: can write normalized data, but read from denormalized data in the views.
  2. Each view is a fully pre-computed cache. There are no cache misses, because all the data is in the view. Invalidation happens automatically by reading off the replication log.
  3. The entire application could be conceived of this way, not just the DB component.

Since the materialized views are built in a stream processing framework, they can then stream out their changes to an application that renders the view to the end user.
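A minimal sketch of the idea in plain Clojure, assuming a made-up change-event shape rather than any particular stream processing framework:

```clojure
;; The materialized view is just a fold over the replication log:
;; each change event is applied to a map of id -> row.
(defn apply-change [view {:keys [op id row]}]
  (case op
    (:insert :update) (assoc view id row)
    :delete           (dissoc view id)))

(def view (atom {}))

(defn consume! [change-log-events]
  (doseq [event change-log-events]
    (swap! view apply-change event)))

;; Example: replaying three (hypothetical) events builds the read view.
(consume! [{:op :insert :id 1 :row {:name "a"}}
           {:op :insert :id 2 :row {:name "b"}}
           {:op :delete :id 1}])
@view  ;; => {2 {:name "b"}}
```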

Finagle - RPC for building reliable distributed systems. Written in Scala, good debugging tools (Zipkin), built on Netty (async library).

Finagle adds futures, with three states:

  • undefined: no response from server
  • successful
  • failed

Since it's async, you shouldn't block (IO/locking/waiting for future). You compose futures to avoid blocking waiting for them.

Finagle also has Service, which takes a request and returns a future response. Clients & servers both use Service.

Protocol agnostic; mostly uses Thrift and HTTP, but others are available.

Scala interop in Clojure

A single Scala component might compile down to multiple JVM classes, which makes it hard to expose the Scala component back into Clojure.

finagle-clojure

Wraps JVM interface to Finagle without worrying about Scala/Clojure interop.

Composing futures:

To compose successful responses, can use map or flatmap. map is for synchronous transformations, flatmap for async.

handle and rescue do the same for failure conditions: both transform an unsuccessful response into a successful one (handle returns a value, like map; rescue returns another future, like flatmap).

future-pool for calling blocking code without blocking the IO Loop.
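A hedged sketch of composing futures with these combinators; the finagle-clojure.futures namespace and the exact macro shapes below are assumptions recalled from the library's README and may not be exact:

```clojure
(require '[finagle-clojure.futures :as f])

(f/await
  (-> (f/value 1)                         ;; an already-successful future
      (f/map [v] (inc v))                 ;; synchronous transformation
      (f/flatmap [v] (f/value (* v 10)))  ;; async step returning another future
      (f/handle [t] :fallback)))          ;; failure -> value (not hit here)
;; => 20
```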

Using Thrift for communication; Scrooge to compile Thrift interface to Java.

Code examples: https://t.co/finagle-clojure-example

First Order FRP

Signal graphs: signals come from the UI based on user events and enter the program as inputs; the program processes them, then sends results back to the UI.

Inputs are of Signal type; not just the current value of, e.g. mouse position, but the stream of all mouse positions over time.

Transformations transform signals. Their input is a signal, and their output is a signal that outputs the transformed values.

State - foldp (fold from the past). Takes an initial state and an input signal, and folds over the values on the input signal.
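A rough sketch of foldp in plain Clojure, modeling a signal as just a sequence of values over time (names are made up):

```clojure
;; foldp: fold over a signal's history, emitting the accumulated state each step.
(defn foldp [step init signal]
  (rest (reductions step init signal)))

;; Counting clicks: each :click event increments the state.
(foldp (fn [clicks _event] (inc clicks)) 0 [:click :click :click])
;; => (1 2 3)
```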

Merge takes multiple signals and merges them together. merge just merges values, but lift2 merges them via your own function.

Core Design

  • Signals are connected
  • Signals are infinite (cannot be deleted; inputs are fixed)
  • Signal graphs are static
  • Signals are synchronous by default

Higher order FRP

Don't want to force signal graphs to be static. join takes a group of signals and returns one signal as output, without transforming it. Run into issues when a signal isn't being watched: pause it? throw values away? This is a hard problem; it brings up issues such as infinite lookback. Can work around it by only switching to signals that have a "safe" amount of history.

Not totally obvious that this is a useful thing, especially

Asynchronous Data Flow

Often the kind of FRP done in imperative languages. Removes the constraints that signal graphs are static, that signals are infinite, and that signals are synchronous by default.

flatten is the mechanism. It takes a list of signals and picks one. Does this by being able to create/destroy signals, which solves the "what to do when a signal isn't being listened to" problem.

Arrowized FRP

Removes the constraints that signal graphs are static and that they are connected to the world. In fact, it doesn't really have Signal at all.

pure creates an "Automaton" that takes an input and produces an output.

state is similar to foldp: it takes an initial value; when a new value comes in, it reduces it into the current state and returns that.

Can join Automatons with >>>
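A hedged Clojure sketch of the idea, modeling an automaton as a function from an input to [output, next-automaton] (all names here are made up):

```clojure
;; pure: lift a plain function into an automaton.
(defn pure-a [f]
  (fn step [x] [(f x) step]))

;; state: carry accumulated state across steps, like foldp.
(defn state-a [init f]
  (fn [x]
    (let [s (f init x)]
      [s (state-a s f)])))

;; >>> : compose two automatons in sequence.
(defn >>> [a b]
  (fn [x]
    (let [[y a'] (a x)
          [z b'] (b y)]
      [z (>>> a' b')])))

;; Run an automaton over a sequence of inputs, collecting its outputs.
(defn run-a [a inputs]
  (lazy-seq
    (when-let [[x & xs] (seq inputs)]
      (let [[y a'] (a x)]
        (cons y (run-a a' xs))))))

(run-a (>>> (state-a 0 +) (pure-a #(* 2 %))) [1 2 3])
;; => (2 6 12)
```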

Works very well, but because signals aren't connected to the world, you can't use it for things like UIs. It's for structuring code, not dealing with user inputs.

What is Mesos?

Originally developed to allow a cluster to be time-shared with different workloads. Provides OS-like features:

  • Isolation: uses Linux containers to isolate processes.
  • Resource sharing: scales apps up and down on a cluster based on what is running.
  • Common infrastructure: common ways to launch/kill and get status of a task, docker for package distribution

Architecture

Coordinator/worker model. Coordinators for clustering (Hadoop, Storm, etc.) communicate with Mesos, and Mesos divides out the workers. Can then dynamically re-allocate machines between clusters in Mesos.

How can you use Mesos?

Existing apps/frameworks: Jenkins, Hadoop, etc.

Marathon: PaaS on Mesos; init.d for your cluster. Supports Docker, scales easily, manages edge routers using HAProxy.

Chronos: distributed cron. Supports restarts, job dependencies, has a REST API.

Aurora: Advanced PaaS on Mesos. Phased rollouts, automated rollback based on health checks. Supports complex deployments, can isolate apps, tell it about racks/datacenters/etc.

http://mesosphere.io/learn

How can I build on Mesos?

How management works: the scheduler requests the resources it needs; Mesos tries to provide them, but will respond with what can be provided if it's overcommitted.

What are Transducers?

Make map/filter/etc. more reusable for different input types (seqs, streams, channels, etc.). They are "process transformations".

What kind of processes?

Ones that can be defined in terms of a succession of steps. Each step ingests input; sometimes they build collections. A left fold (reduce) is the generalized case.

Etymology

  • reduce: to "lead back"
  • ingest: to carry into
  • transduce: to carry across

Conveyances, sources and sinks of a process are irrelevant. It doesn't matter where your inputs come from, where they go to, or how you move them around.

The problem with map/filter/etc. is that they talk about the whole job. Instead, just describe the steps.

Using Transducers

mapcatting, filtering, mapping return transducers.

into, sequence now also take transducers. transduce reduces using a transducer. chan can take a transducer that is applied to values going through the channel.

Things which can accept a transducer to do work are called "Transducible Processes".
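A short sketch of the usage described above, using the Clojure 1.7 APIs (the particular transformation is arbitrary):

```clojure
(def xf (comp (filter odd?) (map inc)))

(into [] xf (range 10))        ;; => [2 4 6 8 10]
(sequence xf (range 10))       ;; => (2 4 6 8 10)
(transduce xf + 0 (range 10))  ;; => 30

;; With core.async, the same xf can be given to a channel:
;; (clojure.core.async/chan 1 xf)
```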

Theory

map/filter/etc. can be defined as foldl or foldr. The problem, though, is that internally you do that with a conj, which is seq-specific.

Transducers take the fold implementation, but accept a step function that replaces the conj, so that the interface to the outer process can be injected later.
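For example, here is the standard shape of map defined as a transducer: the concrete step (conj, a channel put, etc.) is supplied later by the transducible process. A sketch of the general pattern, not Clojure's exact implementation:

```clojure
(defn mapping [f]
  (fn [step]
    (fn
      ([] (step))                          ;; init
      ([result] (step result))             ;; completion
      ([result input] (step result (f input))))))

;; Inject conj as the step to build a vector:
(reduce ((mapping inc) conj) [] [1 2 3])   ;; => [2 3 4]
```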

Early Termination

Sometimes there are infinite streams, or just more data than needs to be processed. How does the transducing function signal that processing should stop?

Transducers support reduced, which is a value that a transducer may return to signal that the transduction should stop.
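For instance, take's transducer wraps its result in reduced once enough items have been seen, so even an infinite input terminates (a small sketch of the behavior):

```clojure
;; take's transducer signals early termination via reduced:
(transduce (take 3) conj [] (range))  ;; => [0 1 2]

;; The same mechanism with plain reduce:
(reduce (fn [acc x] (if (>= x 3) (reduced acc) (conj acc x)))
        [] (range))                   ;; => [0 1 2]
```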

State

Some sequence functions require state (counting, summing, etc.). It has to go inside the transducer. This state must be created each time the transducer is applied: a built transducer has no state, and an applied transducer creates the state.

From outside, you don't need to worry about state.

Completion

Some inputs have an end, and some don't. Sometimes, the innermost stepping function, or the transformation functions, might want to perform some operation at completion.

Implemented as: all step functions have an arity-1 (completion) definition, which will be called exactly once with the accumulated value. Transducers must also have these; they may flush pending values before completing.

Initialization

Sometimes the reducing function supports an arity-0, which takes nothing and returns an initial accumulated value.

Transducers must support it, but they don't do anything themselves; they just call the nested step function's arity-0.

In Clojure

Clojure doesn't actually add a mapping function; instead, map called with arity 1 (no collection) returns a transducer rather than a mapped seq.

Why ClojureScript

Per Rich Hickey: easy and simple aren't the same thing. Often, it's easier to build something complicated.

JavaScript was designed to be easy for amateurs, but that makes the language complicated.

One of the fundamental issues, besides the silly things, is that it's easy, or even encouraged, to have global mutable state, which is difficult to reason about.

ClojureScript's immutable data moves you away from the concept of global mutable state.

ClojureScript also can be updated on your schedule, instead of the browser vendors' schedules.

JS interop is there and is simple, so it's easy to use external JS libraries and interact with the programming environment.

Why React

React works around the fact that the DOM is a giant lump of global mutable state. You can't remove that (it's how the browser works), so we're stuck with it. React lets you write your code as if there isn't this lump of global mutable state.

A React component has a render function, which is a functional mapping of data -> DOM elements. It's outputting virtual/shadow DOM elements, so there's still no mutable state in your code.

When state changes, React re-computes the virtual DOM, which is very fast, then diffs it with the real DOM, and only writes the minimal DOM change set.

You don't worry about where the DOM needs to be updated, or how to do it; React does that part.

Event handling is also taken care of for you; you just say you need to listen for an event on a DOM element, and React adds/removes event handlers to real DOM elements as needed (based on diff).

Why Om

Om provides a single mutable application state. Provides cursors, which allow you to create components that operate on a small part of the overall app state.

Part of your app can be concerned with updating the app state (i.e. from the server) and then your rendering code is concerned with rendering the state to the DOM.

Om can use a fast reference check to figure out the sub-trees of the app state that have changed, and then React does the same thing for the DOM, so rendering stays pretty efficient even with immutable data.
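A minimal ClojureScript sketch of a component and cursor using Om's classic om.core/om.dom API; the state shape, component, and element id are invented, and exact option names may differ:

```clojure
(ns example.core
  (:require [om.core :as om]
            [om.dom :as dom]))

(def app-state (atom {:counter {:count 0}}))

(defn counter-view [cursor owner]
  (reify
    om/IRender
    (render [_]
      ;; Pure function of the cursor's data to (virtual) DOM elements.
      (dom/div nil
        (dom/span nil (str "Count: " (:count cursor)))
        (dom/button
          #js {:onClick #(om/transact! cursor :count inc)}
          "+")))))

;; :path scopes this component's cursor to a sub-tree of the app state.
(om/root counter-view app-state
         {:target (.getElementById js/document "app")
          :path [:counter]})
```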

Prismatic went from 26k loc to 5k loc, porting from JS to ClojureScript.

Haste is a Haskell -> JS compiler, plus libraries for DOM manipulation, etc.

Another goal of the project is to build an application as a single codebase, with both frontend and backend together.

Talk was almost all code examples. One big takeaway was that the ancillary libraries and stuff are probably not ready quite yet.
