Thoughts on Monoids and Monads

Trying to consolidate my understanding of monoids and monads, after encountering them for a second time in Just Enough Math.

I first learnt about monads while learning Scala, and have used them since, because they looked really cool. The definition given by Martin Odersky in the Reactive Programming course was in code, so it was pretty unambiguous.
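For reference, here is a minimal sketch of the general shape of that kind of definition, written from memory rather than quoted from the course: a parametric type with `flatMap`, plus a `unit` function, which together have to satisfy three laws. The names `MonadLawsSketch`, `f`, `g` and `m` are my own, and `Option` is just a convenient stand-in for a concrete monad.

```scala
object MonadLawsSketch {
  // The abstract shape: a type M[T] that knows how to flatMap.
  trait M[T] {
    def flatMap[U](f: T => M[U]): M[U]
  }

  // unit for Option: wrap a plain value in the monad.
  def unit[T](x: T): Option[T] = Some(x)

  // Two illustrative functions from a plain value into Option.
  val f: Int => Option[Int] = x => if (x > 0) Some(x * 2) else None
  val g: Int => Option[String] = x => Some("n=" + x)

  val m: Option[Int] = Some(21)

  // Associativity: nesting the flatMaps differently gives the same result.
  val associativity: Boolean =
    m.flatMap(f).flatMap(g) == m.flatMap(x => f(x).flatMap(g))

  // Left unit: wrapping a value and then flatMapping is just applying f.
  val leftUnit: Boolean = unit(21).flatMap(f) == f(21)

  // Right unit: flatMapping with unit changes nothing.
  val rightUnit: Boolean = m.flatMap(x => unit(x)) == m
}
```

Checking the laws on one value of `Option` proves nothing in general, of course; it only shows what each law says.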

I learnt more about monoids in Just Enough Math. The definition was in English, so it left a little room for interpretation in my mind, though integers and polynomials are monoids. There’s a mathematical definition on Stack Overflow (SO), which would be pretty unambiguous if only I understood what the symbols meant. Odersky covered monoids extremely briefly while explaining the associativity law of monads, saying that they were “a simpler form of monads that doesn’t bind anything. For instance integers are a monoid.”

In any case they look to be good for the same thing, which is parallelisation. This is because:

  1. You can group the operations on them pretty much any way you like (associativity)*
  2. The operation on two values gives you another value of the same monoid/monad type

*as long as you keep the operands in their original left-to-right order; the operation doesn’t have to be commutative
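To make the parallelisation point concrete, here is a minimal sketch of a monoid and a chunked fold. The names `Monoid`, `intAddition`, `stringConcat`, `foldAll` and `foldChunked` are my own, not from any library, and the chunks are folded one after another here; the idea is only that each chunk could be handed to a separate worker, because the grouping doesn't affect the result.

```scala
object MonoidFoldSketch {
  // A monoid: an associative combine plus an identity element.
  trait Monoid[A] {
    def empty: A                // combine(empty, x) == x == combine(x, empty)
    def combine(x: A, y: A): A  // combine(combine(x, y), z) == combine(x, combine(y, z))
  }

  // Integers under addition, with 0 as the identity.
  val intAddition: Monoid[Int] = new Monoid[Int] {
    def empty: Int = 0
    def combine(x: Int, y: Int): Int = x + y
  }

  // Strings under concatenation: associative but NOT commutative.
  val stringConcat: Monoid[String] = new Monoid[String] {
    def empty: String = ""
    def combine(x: String, y: String): String = x + y
  }

  // Plain left-to-right fold.
  def foldAll[A](xs: Seq[A], m: Monoid[A]): A =
    xs.foldLeft(m.empty)(m.combine)

  // Fold each chunk on its own, then fold the partial results.
  // Chunks stay in their original order, so operand order is preserved.
  def foldChunked[A](xs: Seq[A], m: Monoid[A], chunkSize: Int): A = {
    val partials = xs.grouped(chunkSize).map(chunk => foldAll(chunk, m)).toList
    foldAll(partials, m)
  }
}
```

With `stringConcat`, folding `Seq("a", "b", "c", "d", "e")` in chunks of two gives the same string as folding it in one go, but only because the chunks are recombined in their original order.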

And it’s worth a link to the original article which started the SO thread: “A monad is just a monoid in the category of endofunctors, what’s the problem?”
