Thoughts on development & design: coursera

Showing posts with label coursera. Show all posts

I have recently started the Coursera Functional Programming with Scala course (taught by Martin Odersky - the creator of Scala) - which is actually serving as an introduction to both FP and Scala at the same time having done neither before. The course itself is great, however, trying to watch the videos and take in both the new Scala syntax and the FP concepts at the same time can take a bit of effort.

I wanted to work through some of the core FP concepts in a more familiar context, so am going to apply some of the lessons/principles/exercises in Groovy.

Functions:

If you have done any Groovy programming then you will have come across Groovy functions/closures. As Groovy is dynamically typed (compared to Scala's static typing), you can play it fairly fast and loose.

For example, if we take the square root function that is demonstrated in the Scala course, it is defined as follows:

As you can see, Scala expects the values to be typed (aside, you don't actually always need to provide a return type in Scala). But in the Groovy function it is:

A groovy function can be defined and assigned to any variable, thereby allowing it to be passed around as a first class object. If we look at the complete solution for calculating the square root of a number (using Newton's method - To compute the square root of "x" we start with an estimate of the square root, "y" and continue to improve the the guess by taking the mean of x and x/y )

So that's in Scala, if we try Groovy we will see we can achieve pretty much the same thing easily:

Recursion (and tail-recursion):

As FP avoids having mutable state, the most common approach to solve problems is to break the problem down in to simple functions and call them recursively - This avoids having to maintain state whilst iterating through loops, and each function call is given its input and produces an output.

If we again consider an example from the Scala course, with the example of a simple function that calculates the factorial for a given number.

This simple function recursively calculates the factorial, continuing to call itself until all numbers to zero have been considered. As you can see, there is no mutable state - every call to the factorial function simply takes the input and returns an output (the value of n is never changed or re-assigned, n is simply used to calculate output values)

There is a problem here, and that is as soon as you attempt to calculate the factorial of a significantly large enough number you will encounter a StackOverflow exception - this is because in the JVM every time a function is called, a frame is added to the stack, so working recursively its pretty easy to hit upon the limit of the stack and encounter this problem. The common way to solve this is by using Tail-Call recursion. This trick is simply to have the last code that is evaluated in the function to be the recursive call - normally in FP languages the compiler/interpreter will recognise this pattern and under the hood, it will really just run the code as a loop (e.g. if we know the very last piece of code in the block of code is calling itself, its really not that different to just having the block of code/function inside a loop construct)

In the previous factorial example, it might look like the last code to be executed is the recursive call factorial(n-1) - however, the value of that call is actually returned to the function and THEN multiplied by n - so actually the last piece of code to be evaluated in the function call is actually n * return value of factorial(n-1).
Let's have a look at re-writing the function so it is tail-recursive.

Now, using an accumulator, the last code to be evaluated in the function is our recursive function call. In most FP languages, including Scala, this is enough - however, the JVM doesn't automatically support tail-call recursion, so you actually need to use a rather clunkier approach in Groovy:

The use of the trampoline() method means that the function will now be called using tail-call recursion, so there should never be a StackOverflow exception. It's not as nice as in Scala or other languages, but the support is there so we can continue.

Currying:

This is like function composition - the idea being you take a generic function, and then you curry it with some value to make a more specific application of the function. For example, if we look at a function that given values x and y, it returns z which is the value x percent of y (e.g. given x=10, y=100, it returns the 10 percent of 100, z=10)

The above simple function is a generic mechanism to get a percentage value of another, but if we consider that we wanted a common application of this function was to always calculate 10% of a given value - rather than write a slightly modified version of the function we can simply curry the function as follows:

Now, if the function tenPercent(x) is called, it uses the original percentage() function, but curries the value 10 as the first argument. (If you need to curry other argument positions you can also use the rcurry() function to curry the right most argument, or ncurry() which also takes an argument position - check the Groovy docs on currying for more info)

Immutability:

Immutability is partially supported in Java normally with use of the final keyword (meaning variables can't be changed after being initially set on object instantiation). Groovy also provides a quick and easy @Immutable annotation that can be added to a class to easily make it immutable. But really, there is more to avoiding immutable state than just having classes as immutable - As we have functions as first class objects, we can easily assign variables and mutate them within a function - so this is more of a mindset or philosophy that you have to get used to. For example:

The first example is probably more like the Groovy/Java code we are used to writing, but that is mutating the state of the list - where as the second approach leaves the original list unchanged.

Map Reduce:

As a final note, there are some functions in FP that are pretty common techniques - the most famous of which these days (in part thanks to Google) is Map-Reduce, but the trio of functions are actually Map, Reduce(also known as Fold) & Filter - you can read more about the functions here (or just google them!), but these functions actually correlate pretty nicely to core Groovy functions that you probably use a lot of (assuming you are groovy programmers).

Map

map is the easiest to understand of the three. It takes in two inputs - a function, and a list. It then applies this function to every element in the list. You can basically do the same thing with a list comprehension however.

Sound familiar? This is basically the .collect{} function in Groovy

Reduce/Fold

fold takes in a function and folds it in between the elements of a list. It's a bit hard to understand at first

This one is a bit more complicated to descibe, but is the same as the .inject{} function in groovy

Filter

filter is easy. It takes in a 'test' and a list, and it chucks out any elements of the list which don't satisfy that test.

And another simple one - filtering out a list for desired elements, this is Groovy's .findAll{} function

As I said at the start, I am new to FP and coming from an OO background, but hopefully the above isn't too far from the truth! As I get further through the Coursera course I will try to post again, maybe with some of the assignments attempted in Groovy to see how it really stands up.

Some useful references:

Groovy docs on FP: http://groovy.codehaus.org/Functional+Programming+with+Groovy

Coursera Scala course: https://www.coursera.org/course/progfun

I have just signed up for the Coursera Algorithm Design & Analysis course from Stanford. I have been meaning to take it up for a while but kept missing when it was running. As it happens, this time I have all but missed it, with the final assessments due already, but I have signed up and am going through the material and lectures on my own.

I am only on week one so far, but it is all good - the lecturer seems good and articulates the content well (so far the presentation and sophistication of the course is better than the Scala & FP course and the Logic courses I have previously done on Coursera).

After I completed the Functional Programming & Scala course, I wrote up some notes on some of the techniques and problems in Groovy and thought it might be fun to write up some groovy notes from the algos course.

Week one is covers an introduction to algorithm analysis, divide and conquer (with merge sort) and some other stuff.

Mergesort - Analysis

Mergesort has three key steps

Split the list in to two halves
Recursively sort the first half
Recursively sort the second hald
Merge the two sorted halves

This is a well known example of the divide&conquer paradigm, the recursive sorting continues to recurse and divide the list into two halves until sorting is trivial (e.g. base case of one item in each list).

Let's look at the merge step:

(pseudocode taken from the course)
result = output [length = n]
leftList = 1st sorted array [n/2]
rightList = 2nd sorted array [n/2]
i = 1
j = 1
for k = 1 to n
if leftList(i) < rightList(j)
    result(k) = leftList(i)
    i++
else [rightList(j) < leftList(i)]
    result(k) = rightList(j)
    j++
end

The above should be fairly straight forward to understand:

result stores our final merged list, and will be length n (number of elements in starting array)
leftList/rightList - these are the two inputs to the merge, and are simple the two halves of the list that need to be merged. This assumes that both lists are already sorted prior to merging
i/j - These are pointers to what we consider the "head" of each list. Rather than removing items from lists we will simply keep track of a pointer for each list to track how many items of each list have been merged
We then iterate through all "n" items, comparing the "head" of our two lists, the smaller value getting added to our result list

Complexity analysis of merge step:

In the course, the lecturer states that the worst case running time = 4n + 2 which he then further simplifies to = 6n (this simplification is simply because n must always be at least 1, so 4n + 2 will always be no worse than 6n.

Let's look at how we get to that figure (we will then later look at what the asymptotic run time is). This figure is simply the number of operations that need to be executed when this algorithm is run so let's count those:

result = output [length = n]
leftList = 1st sorted array [n/2]
rightList = 2nd sorted array [n/2]
i = 1
j = 1
In the above, the only operations that are actually set are i & j (the others are just describing the function input/output). So that costs us 2 (constant regardless of how big "n" is)

for k = 1 to n
if leftList(i) < rightList(j)
    result(k) = leftList(i)
    i++
else [rightList(j) < leftList(i)]
    result(k) = rightList(j)
    j++
end

In the above, we have one condition checking/incrementing "k" in the for statement (executed every iteration, so this will be done "n" times); we then have an IF conditional check, this will also be done in every iteration (so this is another "n" operations); then depending on which branch of that condition is executed, there are always a further two operations executed (the assignment of the head to result and the increment of the head pointer) - another two operations executed "n" times.

So the above block executes 4 operations for each of the "n" iterations. The first block executes 2 operations =

4n + 2

Seems straight forward enough. Really though, this is just a fairly simplified piece of pseudo code, and there is a bunch of additional code that is needed to handle other stuff, edge cases etc (e.g. what happens if you have added all of the left list to the result and only have the right list left? obviously that can just be appended to the results, but the above code doesn't handle this). Then there are further questions about different language implementations that might have more operations to do this (not to mention difference when it gets compiled down to machine code) - we will get more into this later when we talk about asymptotic performance. For the time being we will just say the run time (worst case) for merging two lists is 4n + 2.

Now, let's look at recursively sorting..

This part of the algorithm is also relatively simple, we just keep recursing through the list, splitting the list in to two halves until each half of the list is just one element (one element is already sorted!), then we can merge those lists easily.

So, if we are recursing through and splitting the list into two halves, how many times will we have to perform the merge?

Let's step through this, with an example input list of 8 elements (n=8)

One list of 8 elements
Two lists of 4 elements
Four lists of 2 elements
Eight lists of 1 elements

this is just a basic binary tree - see the illustration below taken from wikipedia:

As you can see in the above tree, it splits down into individual elements, then it re-combines them using the merge technique. To work out the complexity of the algorithm we need to know two things:

How deep is our tree going to be?
What is the runtime of a given depth in the tree (e.g. for a given depth "j" in the tree, what is the cost of re-merging all the lists at that level

The first question is relatively easy, as it is a binary tree we know the depth is simply going to be log2(n)+1 - How do we know that? the log of a number is simply how many times it needs to be divided by 2 (assuming 2 is the logarithm base) before the number is <= 1 - which is exactly what we are doing when we are recursively dividing our lists in half until we reach the single element lists.

E.g. the log2(8) = 3 (8/2=4; 4/2=2; 2/2=1)

So now lets calculate the running time for any given level of the tree. Lets consider a list of length "n" and we want to know a few things:

How many merge operations do we need to perform for depth "j" of a list "n"?
In a similar way to how we calculated the depth, we can work out the number of merges (or number of leaves at the level of the tree). If you have halved the lists each step 0..j then the number of leaves will be 2 to the power j.

E.g. The first level down, after just halving the original list, we have j=1, so expect 2 power 1 = 2 leaves.
Second level down, we again halve each of those two lists, we have j=2, so expect 2 power 2 = 4 leaves

Note that j cannot exceed the depth of the tree, which we already know.

What is the complexity of each of these merge operations?
We know the complexity of a merge from our earlier analysis of the merge step as 6m (where m is the size of the halved lists - not the original value "n" of the starting list size), and we know we have 2 to the power j of these merges, which means we know at any level of the tree, the run time performance is:

(2 power j) * (6m)

Now the value of "m" will also be dependent on the value of "j" - because as we get further down the tree, we have a greater number of merges to perform, but the size of the lists are also getting smaller (so the actual value of "m" is decreasing), so what is the value of 6m when we are at level "j" of the tree? We can work that out in a similar fashion - we know the original "n" (size of starting list) has been halved "j" times - so it is simply

(n)/(2 power j)

So now we can combine those and resolve the algebra.. we can see that the 2 power j cancels each other out (intuitively really, as we are increasing the number of merges, but at the same time and at the same rate reducing the size of the lists to be merged:

(2 power j) * 6 * (n~~/(2 power j)~~) = 6n

What is the overall complexity of the tree?
So we now already know the complexity of any given level of the tree (independent of the depth) and we know the depth of the tree (number of levels):

6n * (log2(n)+1) = 6nlog2(n) + 6n

So there we have it - we have calculated the worst case running time for the merge sort.

There are a few further things that we will need to consider in later posts: In one post I will have a quick look at asymptotic analysis (e.g. "Big-Oh" notation) and what this means; in another post I will look at a Groovy implementation of it and the analysis of that.

Is one of your New Year Resolutions to re-skill? Thinking about re-training for a new career (or even just a new hobby) in tech? Then you're in luck!

Today, more than ever, the barrier to entry for starting to learn a new technology or programming language is all but nonexistent, all you really need is a computer (or even a mobile device) and a web connection and you are pretty much good to go - just choose your preferred technology, an IDE and get started. Almost everything is open source or at least free to use for a single developer just out to learn and there is a wealth of blogs, articles and Q'n'A sites ready to help you with tutorials, walk-through's and helpful advice - many of these backed with ready and running code bases on GitHub free for you to play with and generally work out what is going on.

However, with all these resources it can sometimes be a bit daunting with so much content. Once you have chosen a language how do you know where to start? Here are some of our favourite sites and resources that we have discovered and found useful in learning new skills:

iTunes U - a lesser known category on iTunes is their academic section, iTunes U(niversity) featuring loads of podcasts and lectures from a range of academic organisations, and some of this stuff is serious! Several large universities have uploaded full lecture series there, and by and large they are free to download (yes, you have to install iTunes, which sucks, we know). Want to take the full term of Stanford university's iOS course? Its up there. Want to learn AI for chess playing from Cambridge uni? Yep, got that too. And for free.
MIT OpenWare - MIT have been one of the strongest advocates of open sourced education. A lot of there lecture series are online (can also be found on iTunes, but can be avoided). Is it just us who thinks its amazing that anyone around the world with a web connection can get educated by the most prestigious academic organisations around?
Khan Academy - there is a lot of hype around this one, well funded with some pretty big names supporting it (jQuery creator John Resig is a Dean there), a not-for-profit aiming at providing free education for everyone. The academy provides lots of video based courses as well as interactive challenges and detailed stats on how you are doing.
Udacity - this is another recent, well-funded startup trying to tackle free higher education for all. Founded my three robotocists it is slowly building a very respectable catalogue of uni level courses ranging from CS101 to AI for robotics. As with the Khan academy, the lectures are purely for the web so the videos are clear and designed for remote learning (different from the filmed university lectures which are targeting classroom based learning). We have recently created and Open Sourced the Spring-Social implementation of the Khan Academy API - so if you are working with the JVM and want to have a play with the Khan Academy API then check it out on GitHub
CodeAcademy - we have mentioned before we are fans of code academy, code academy is an in-browser development environment that walks you through programming exercises to help you learn with your hands - currently supporting JavaScript, HTML, Ruby and Python
Free eBooks - there are loads of great free eBooks available online, so many there is no point listing them, instead I will just point you here. Which leads nicely on to the next point..
StackOverflow - what really needs to be said about SO? It is the definitive q'n'a site for tech. If you are just starting learning head over and sign up, the help from the incredibly active community over there will be invaluable (although be sure to read the posting guides, they can be a little unforgiving at times!).
Coursera - Another massively popular online learning resource, this one recently generated a lot of interest with its recent Scala course taught by the original creator of the language! We are currently working on some secret integration with Coursera at NerdAbility, and you will soon be able to integrate your Coursera account and show off which courses you have completed!

Hopefully the above resources help on your path to re-skilling. In reality, getting your hands dirty with code and trying to solve problems and fix errors is the best way to learn, so don't forget to get stuck in - and maybe when you are more confident try answering questions on StackOverflow!

Of course, with these new found skills you will want to show them off, so we'd recommend heading to NerdAbility and registering (if you haven't as already) and update your skills, add your StackOverflow profile and even add a custom section talking about what you are learning (employers always love to know that candidates are proactive and motivated when it comes to learning new things and keeping up with technology).

Leave your comments with any other tools and resources you have found useful in your journey of becoming a life long learner.

Thoughts on development & design

Tech: Functional programming in Groovy

Functions:

Recursion (and tail-recursion):

Currying:

Immutability:

Map Reduce:

Map

Reduce/Fold

Filter

Algorithm analysis & complexity: MergeSort - part 1

Mergesort - Analysis

Essential Resources to Become a Life Long Learner (in tech)

ROB HINDS

Popular Posts

Blog Archive

DZone MVB