Declarative vs. Imperative

Image result for coding

Declarative & Imperative code paradigms, are common buzz-words in the tech industry. In this post we’ll discuss these programming paradigms, what’s so good/bad about them and most important: a simple practical suggestion that can turn imperative code into declarative. Ready?

Let’s start with the more intuitive paradigm: Imperative. Imagine you want to get from point A to point B, like most of us, you’ll turn your phone’s GPS on, and ask for directions.

The list of directions may looks something like the following:
* drive straight for 0.5 miles
* turn right into street X
* continue for another 0.3 miles
* turn right into street Y
* on the first intersection turn left into street Z
* drive another 0.6 miles and the destination will be to your left

Imperative coding means creating an “ordered list of instructions” that, once followed, the expected result should be calculated.

In a CS degree, students run into imperative coding examples very often. For example, let’s take the pseudo-code for BFS

(credit to: https://www.ics.uci.edu/~eppstein/161/960215.html):

unmark all vertices
    choose some starting vertex x
    mark x
    list L = x
    tree T = x
    while L nonempty
    choose some vertex v from front of list
    visit v
    for each unmarked neighbor w
        mark w
        add it to end of list
        add edge vw to T

Pros:
* to most of us, it feels more natural to write imperative code
* once we wrote such a code, we’re familiar with it and it’s easy for us to follow it, modify if needed, etc

Cons:
* Unless someone tells you what this code is doing, you’ll have to read the whole thing, might even need to think about it for a few seconds, until you figure out what the code is doing
* If you’re not very familiar with the code, you’ll need to “re-learn” it every time you’ll encounter it, again and again until you know it almost by heart

Q: Why is this considered to be so bad?
A: Since you (usually) write code only once, but read it many times, as well as have to maintain & debug it. Optimizing for “easy-writing” is not ideal.

Q: So how do we optimize for reading/maintaining/debugging?

I’m glad you asked!

That’s where declarative code paradigm comes into the picture!
Declarative code means that the code expresses what you want to achieve without nitpicking on the details of what steps need to be taken in order to achieve it. Sounds ideal, right? that’s also why it’s so hard for most of us to accomplish.

Let’s explore a few examples:

Example 1
Shamelessly stealing from Reed

var odds = collection.Where(num => num % 2 != 0);

The intention is very clear: we want to grab elements from a collection, but not all of them: we want only the odd numbers.

Example 2
Shamelessly stealing from Mark

SELECT score FROM games WHERE id < 100;

Here we say that we want all the scores from a table called games, but only those that have id smaller than 100.

Pay close attention that in both examples we can’t tell how the result will be achieved: in the first example we don’t know how the function Where() is implemented. And on the second example we don’t have any idea how these scores will be collected, which steps will be taken in order to provide us with these scores: will it require a full table-scan? is the ID column indexed and hence more performant?

Pop quiz: consider the two approaches of writing a program “top-down” vs. “bottom-up”, which approach is Imperative and which one is Declarative?

The answer to this question also implies how can we turn imperative code into declarative: the secret is refactoring!

Why refactoring?

Again, let’s take an example: say we want to print every prime number < 1000. Let’s start with the imperative approach, we’ll use javascript:

function prms() {
    let i, j;
    for (i = 2; i < 1000; i++) {
        for (j = 2; j < i; j++) {
            if (i % j === 0) {
                break;
            }
        }
        if (j === i) {
            console.log(i);
        }
    }
}

not complicated right? still, it requires a few moments of going over the implementation details to understand what prms() does.
Now let’s refactor it just a bit:

function printPrimesBelow(N) {
    for (let i = 2; i < N; i++) {
        if (isPrime(i)) {
            console.log(i);
        }
    }
}

awesome, now it’s much clearer: just changing the name of the function already makes it clearer to the reader. Not to mention how the code became shorter and more readable, the code flow doesn’t require break and the last if-condition that we had in the previous implementation. And before some of you start yelling at me that isPrime was not implemented, here we go:

function isPrime (n) {
    for (let i = 2; i < n; i++) {
        if (n % i === 0) {
            return false;
        }
    }
    return true;
}

Some people dislike this kind of refactoring, they’ll tell you that the original implementation is more performant and that it’s also shorter (12 lines vs. 15 lines). These sounds like two good arguments, so why should we go through this trouble?

1. It makes our code declarative, hence more readable & easier to maintain. Only for this reason it’s worthy: one person writes code but the whole team is maintaining it, why make them work harder?

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” – Martin Fowler, 2008.

2. Unit-tests: breaking your code to smaller units of logic, allows you to easily test each one of them. Most of the time, when you break the code, you also reduce the complexity. An example would be an imperative function that has 16 edge-cases that needs to be tested, but breaking this function into 3 smaller functions might allow you to handle only two edge-cases in each one of them, thus reducing the number of test cases you need to write from 16 to 6. This is a naive example, in reality, when there is a long function that contains many code-paths, even if you try hard to write good tests to cover them, you most likely won’t be able to cover all of them, and we all know where the bugs like to hide (hint: it’s not in the area of the code that you tested thoroughly…)

Note: unit-tests are not the purpose, they are a means to an end of making your code more reliable, and of providing you with the confidant & security to apply code changes without worrying about breaking anything. Having this kind of safety net not only gives you delight, it also improves your velocity tremendously!

Ok, but we didn’t answer the claims about length and performance:

Length: today we deal with GBs and TBs of data, another couple of lines of code are translated into a few more bytes which is practically nothing. But even if there was a real trade off between length and readability – I’d choose the latter almost every single time.

Performance: this is my favorite because most people that try to use the “performance” card don’t know much about it. The retort would be: can you benchmark it and prove your claim that the first is more performant? this will get 95% of them off your back. The other 5% that will bother to write a benchmark (which is not a trivial thing to do if you don’t know what you’re doing) will most certainly find that even if they were right, the difference between the two versions is negligible. Further, many popular languages are not very performant to begin with (JS, Python, Ruby, PHP…) which makes the “performance” claim even more ridiculous. And last, in some rare cases you will want to make your code less readable in favor of performance (real-time systems, library that is popular, batch/long running jobs that process a lot of data) but even in these cases, making your code more performant and less readable is usually the last thing you’ll do.

To sum up, we usually want to optimize for code-reading, not for writing, and since declarative code is easier to read & understand (though it might require more effort in writing), we’ll prefer to work with declarative code. That said, we usually start by writing imperative code, and then refactoring it until it becomes declarative.

Happy coding!

Declarative vs. Imperative

Negative Space, and How Does it Apply to Coding

There’s a term in drawing called “negative space”. The idea is very simple: you don’t draw what’s there – you draw what’s not there.

neg_mugchair

Credit: http://www.artinstructionblog.com/an-introduction-to-negative-drawing-with-mike-sibley

NEGATIVE SPACE by Leonardo Santamaria for Satellite Citi.

In software architecture realm, we usually come up with a basic design, implement a “layout” (infrastructure of your application) and then start augmenting it by adding features and providing better coverage for different use-cases.

Defining the negative space of your program means that not only you define what it should do – but also declare what it shouldn’t. By understanding what should be the limitations of your program you can simplify the design and easily reject feature requests that are crossing these borders.

In the world or micro-services it is very easy to create a new service and sometimes the right decision is to not support a new capability in “service A” but rather push that capability into a more qualifying “service B” or even spawn up a new dedicated service.

During my work at Netflix I ran a few times into a case where we needed to implement a new capability into existing services. That capability had a complex spec and was latency-sensitive, so we created new services in order to be able to:

  1. Keep the current services simple
  2. Add a new dedicated caching layer, but more important:
  3. Make our services resilient by applying the bulkhead design pattern.

So next time you architect a piece of software don’t forget to take into account the negative space!

Negative Space, and How Does it Apply to Coding

Implementing a Multi-Iterator in Java

Say you had to implement a class that takes as an argument (in the constructor) Iterator<Iterator<E>> iterator and returns an iterator that can iterate all the elements under Iterator<Iterator<E>> iterator in a round-robin manner, meaning, according to the order of the numbers in the following screenshot:

sc1

I searched online and found implementations such as thisthis and even Guava but they are not helpful in our case since the order they are iterating is vertical: 1,7,13,17,2,8,14,18,3,9,4,…

So I decided to come up with my solution to the problem, and here’s what I got:

import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Created by alfasin on 11/15/15.
 */


public class MultiIterator<E> implements Iterator {

    List<Iterator<E>> iterators = new LinkedList<>();
    Iterator<E> current = null;

    public MultiIterator(Iterator<Iterator<E>> iterator) {
        // copy the iterators into a list
        while (iterator.hasNext()) {
            iterators.add(iterator.next());
        }
    }

    @Override
    public boolean hasNext() {
        boolean result = false;
        if (iterators.isEmpty() && (current == null || !current.hasNext())) {
            return false;
        }

        if (current == null) {
            current = iterators.remove(0);
        }

        while (!current.hasNext() && !iterators.isEmpty()) {
            current = iterators.remove(0);
        }

        if (current.hasNext()) {
            result = true;
        }
        return result;
    }

    @Override
    public E next() {
        if (current == null) {
            try {
                current = iterators.remove(0);
            } catch (IndexOutOfBoundsException e) {
                throw new NoSuchElementException();
            }
        }
        E result = current.next(); // if this method was called without checking 'hasNext' this line might raise NoSuchElementException which is fine
        iterators.add(current);
        current = iterators.remove(0);
        return result;
    }

    // test
    public static void main(String[] args) {
        List<Integer> a = new LinkedList<>();
        a.add(1);
        a.add(7);
        a.add(13);
        a.add(17);
        List<Integer> b = new LinkedList<>();
        b.add(2);
        b.add(8);
        b.add(14);
        b.add(18);
        List<Integer> c = new LinkedList<>();
        c.add(3);
        c.add(9);
        List<Integer> d = new LinkedList<>();
        d.add(4);
        d.add(10);
        d.add(15);
        List<Integer> e = new LinkedList<>();
        e.add(5);
        e.add(11);
        List<Integer> f = new LinkedList<>();
        f.add(6);
        f.add(12);
        f.add(16);
        f.add(19);
        List<Iterator<Integer>> iterators = new LinkedList<>();
        iterators.add(a.iterator());
        iterators.add(b.iterator());
        iterators.add(c.iterator());
        iterators.add(d.iterator());
        iterators.add(e.iterator());
        iterators.add(f.iterator());
        MultiIterator<Integer> it = new MultiIterator<>(iterators.iterator());
        while (it.hasNext()) {
            System.out.print(it.next() + ","); // prints: 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,
        }
    }
}

A different approach would be to read all the elements, during initialization, into one list and then return an iterator of that list!

Let’s see how it works:

import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

/**
 * Created by alfasin on 11/15/15.
 */

public class MultiIterator<E> {

    Iterator<Iterator<E>> iterator = null;
    List<E> elements = new LinkedList<>();

    private MultiIterator(Iterator<Iterator<E>> iterator) {
        this.iterator = iterator;
    }

    private void copyElementsInOrder() {
        List<Iterator<E>> iterators = new LinkedList<>();
        // copy the iterators into a list
        while (iterator.hasNext()) {
            iterators.add(iterator.next());
        }
        // go over the list, round-robin, and grab one
        // element from each sub-iterator and add it to *elements*
        // empty sub-iterators will get dropped off the list
        while (!iterators.isEmpty()) {
            Iterator<E> subIterator = iterators.remove(0);
            if (subIterator.hasNext()) {
                elements.add(subIterator.next());
                iterators.add(subIterator);
            }
        }
    }

    public static <E> Iterator<E> iterator(Iterator<Iterator<E>> iterator) {
        MultiIterator<E> instance = new MultiIterator<>(iterator);
        instance.copyElementsInOrder();
        return instance.elements.iterator();
    }

    // test
    public static void main(String[] args) {
        List<Integer> a = new LinkedList<>();
        a.add(1);
        a.add(7);
        a.add(13);
        a.add(17);
        List<Integer> b = new LinkedList<>();
        b.add(2);
        b.add(8);
        b.add(14);
        b.add(18);
        List<Integer> c = new LinkedList<>();
        c.add(3);
        c.add(9);
        List<Integer> d = new LinkedList<>();
        d.add(4);
        d.add(10);
        d.add(15);
        List<Integer> e = new LinkedList<>();
        e.add(5);
        e.add(11);
        List<Integer> f = new LinkedList<>();
        f.add(6);
        f.add(12);
        f.add(16);
        f.add(19);
        List<Iterator<Integer>> iterators = new LinkedList<>();
        iterators.add(a.iterator());
        iterators.add(b.iterator());
        iterators.add(c.iterator());
        iterators.add(d.iterator());
        iterators.add(e.iterator());
        iterators.add(f.iterator());
        Iterator<Integer> it = MultiIterator.<Integer>iterator(iterators.iterator());
        while (it.hasNext()) {
            System.out.print(it.next() + ","); // prints: 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,
        }
    }
}

There are pros and cons to each one of these approaches:

The first approach is more complex because it requires checking and handling of many edge-cases (including catching an IndexOutOfBoundsException in case that our list of iterators is empty and throwing the proper NoSuchElementException instead).

The second approach is more concise on one hand, but on the other hand it requires copying, in-advance, all the elements (greedy) vs. the first approach which is lazy and should be preferred in cases of large datasets.

Note that the second approach implements a factory and as such it doesn’t implement the Iterator interface!

EDIT:

Now with Java 9, it’s much more elegant to create such lists (using the new collections APIs). I re-did this exercise and here’s what I got:

class MultiIterator<E> implements Iterator {

    private LinkedList<Iterator<E>> iterators = new LinkedList<>();
    private int current = 0;

    public MultiIterator(Iterator<Iterator<E>> iterator) {
        while(iterator.hasNext()) {
            iterators.add(iterator.next());
        }
    }
    /**
     * Returns {@code true} if the iteration has more elements.
     * (In other words, returns {@code true} if {@link #next} would
     * return an element rather than throwing an exception.)
     *
     * @return {@code true} if the iteration has more elements
     */
    @Override
    public boolean hasNext() {
        if (iterators.size() == 0) {
            return false;
        }
        if (iterators.get(current).hasNext()) {
            return true;
        }
        return false;
    }

    /**
     * Returns the next element in the iteration.
     *
     * @return the next element in the iteration
     * @throws NoSuchElementException if the iteration has no more elements
     */
    @Override
    public E next() {
        E next;
        try {
            next = iterators.get(current).next(); // might throw IndexOutOfBoundsException if wasn't called with hasNext()
            Iterator<E> iter = iterators.remove(current);
            if (iter.hasNext()) {
                iterators.add(iter);
            }
        } catch (IndexOutOfBoundsException e) {
            throw new NoSuchElementException();
        }
        return next;
    }

    public static void main(String[] args) {
        List<Integer> l1 = List.of(1, 2, 3, 4, 5, 6);
        List<Integer> l2 = List.of(5, 6, 7, 8);
        List<Integer> l3 = List.of(9, 10, 11, 12);
        Iterator<Iterator<Integer>> iterators = List.of(l1.iterator(), l2.iterator(), l3.iterator()).iterator();
        MultiIterator<Integer> iters = new MultiIterator<>(iterators);
        while (iters.hasNext()) {
            System.out.println(iters.next());
        }
    }
}
Implementing a Multi-Iterator in Java

Python: Six, ABC and functools.wraps

Python
Over the past week I went through some Python code in Stackstorm (hosted in GitHub), and by doing so I learned a few cool new things. Maybe ‘new’ is not the right term, but these things were new to me and I wanted to share them with you.

The code I am referring to resides in a few places in Stackstorm, the first item:

@six.add_metaclass(abc.ABCMeta)
class Access(object):
Two things here:

1. six.add_metaclass – makes Access a metaclass in a way which is compatible both with Python2 and Python3. In general, six package has a purpose of helping developers migrate their code from Python 2 to Python 3 by writing code which is compatible with both.
Hence the name: six = 2 * 3. To read more: https://pypi.python.org/pypi/six

2. abc.ABCMeta – abc is another builtin package, its purpose is to provide support for Abstract Base Classes. By making a class abstract you ensure that nobody can instantiate objects out of it.
This is useful for (at least) three purposes:
a. Maintain logic which is common to multiple classes in one place
b. force the users to implement methods which are annotated by @abc.abstractmethod (you can also force the inheriting classes to create specific class members by using the annotation @abc.abstractproperty). For more info in regards see the docs.
c. creating a singleton. I’m still not convinced about the usefulness of creating a singleton in Python (because we have modules), but assuming you want to do so, you can either create a metaclass and call its methods as ClassName.method() or extend a metaclass but not implement the abstractmethod or abstractproperty. By not implementing the abstract-methods/properties the inheriting class will become abstract as well. This technique is used in a few different places in StackStorm, if you want to see some examples, search for classes that inherit from Access.

3. This one took me a while to wrap my head around (requires a “functional programming” mind if you will):
def decorate(f):
        @functools.wraps(f)
        def callfunction(*args, **kwargs):
By calling @functools.wraps(f) we’re using a decorator inside a decorator. Not trivial…
To understand the main idea of ‘functools.wraps’, it’s easier to go through the following points/process (at least, easier for me…):
– When you decorate a function, you’re creating a new function that wraps the ‘original’ one, and return the wrapper
– The wrapper is used in order to be able to perform all kinds of tasks before/after the ‘original’ function is applied. Examples: preparing the arguments for the ‘original function’, timing how long does it take for the function to execute, add log printings before and after and etc.
– Say you have a wrapped function f, returned from a decorator, the user will then see the description (metadata) of the decorator instead of f, the original function.
– In order to avoid this confusion, @functools.wraps(f) comes into the picture: by annotating the wrapper function with @functools.wraps(f) – you’re making sure that the user of the wrapper will see the same description/metadata as the original function has. By ‘description’ I mean the function-signature (the name of the function and the names of the arguments it takes).

Hope this is helpful!
Python: Six, ABC and functools.wraps

How to calculate square root

The other day I ran into a question in Stackoverflow asking how to implement square-root calculation. I started googling it up and some implementations seemed to me very cumbersome but then I found Newton’s method which is pretty easy to implement. Translating it to Java was a matter of a few lines of code:

 
class Sqrt {
    public static void main(String[] args) {
        System.out.println(sqrt(17)); // 4.123105985575862
    }

    final static double DELTA = 0.0001;
    /**
     * Newton's method:
     *
     * X1 = X0 - f(x)/f'(x)
     * x^2 = num
     * f(x) = x^2 - num
     * f'(x) = 2*x
     *
     * X1 = X0 - (X0^2 - num) / 2*X0
     * ...
     * Xn = Xn-1 - ((Xn-1)^2 - num)/2Xn-1
     *
     * Stop condition: |num - (Xn)^2| < DELTA 
     * */ 

    public static double sqrt(double num) { 
        double x = num / 2; // starting point
        while(Math.abs(num - x * x) > DELTA) {
            x = x - ((x * x - num) / (2 * x));
        }
        return x;
    }
}
How to calculate square root

A simple calculator in Python

A while back, I wrote a simple calculator in Ruby. A few minutes ago I ran into a similar question in Stackoverflow, and decided to implement the Python version:

def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False


def calc(expr):
    if is_number(expr):
        return float(expr)    
    arr = expr.split('+')
    if len(arr) > 1:
        return sum(map(calc, arr))
    arr = expr.split('-')
    if len(arr) > 1:
        return reduce(lambda x,y: x-y, map(calc, arr))
    arr = expr.split('*')
    if len(arr) > 1:
        return reduce(lambda x,y: x*y, map(calc, arr), 1)
    arr = expr.split('/')
    if len(arr) > 1:
        return reduce(lambda x,y: x/y, map(calc, arr))


print calc("3+4-2 *2/ 2")    # 5

Nice, isn’t it?

A simple calculator in Python

Mac productivity tools

Alfred

I would recommend using Alfred, even without buying the power-pack it has very useful clipboard snippets:

 sc1

This way you can save commands that you’re using frequently and find them by typing “keywords” and when you press “enter” it’s paste into your clipboard/terminal. Very useful – especially when you work remotely on nodes in the cloud where you don’t have all your aliases defined.

I think that only if you buy the power-pack – you can use the workflows as well:

 sc1

The great thing about it is, that with one command/shortcut you can open multiple windows in your browser, open a terminal and connect somewhere, and basically do multiple operations simultaneously.

This is very useful in different scenarios, for example, when you are in emergency situation – it can both save time *and* make sure you’re not forgetting anything. For example, when I get paged for an incident I open the terminal and connect to our production bastion, open our public channel in hipchat to make sure we didn’t get any reports about the incident from other people in the company, open a tab in the browser with our dashboard and etc.

But Workflows are great not only in emergency situations, for instance, we have a service that, given a node-id – it provides all the information about the node (aws-account/region/instance-type/asg/etc) as well as provides links to Asgard and some other stuff. So in Alfred I have a workflow with a shortcut, and now all I have to do is type:
“loc <node-id>” (short for “locate”) and it opens that service in a new tab in my browser.

Alfred has many other features which I haven’t yet explored, but in case you decide to try it out here’s one very important tip:

under “advanced” tab there’s a “set sync folder” button. Set it to save your stuff under dropbox (or any other cloud-drive). This way you’re making sure that:

  1. you can use your Alfred setup with multiple computers
  2. it’s always backed-up – if anything happens to your computer – you haven’t lost anything (all your shortcuts, clipboard, workflows etc).
 sc3

Even without using multiple screens I find this app very useful: it provides keyboard shortcuts to resize windows, center them in the screen, and when you’re using multiple monitors – move them from one screen to another and back really easily. And yes, I agree, it is annoying you have to pay for something that you get for free as a Windows user, but I think that Apple users (unlike Windows users) are used to pay for stuff they need/want…

sc4

Captures screenshots of all/part of the screen, menus, etc. And then lets you edit and save/embed them. This app is free and I’m using it all the time – the screenshots in this post were done by Skitch!

Thanks to JS who introduced me to Alfred & Sizeup!

Disclaimer: I’m not affiliated with any of the applications above, nor I am receiving any kind of commission/royalties/etc!

 

Mosh is a command-line tool that replaces SSH. The main advantage of this terminal application is that it keeps the connection active, so even if you close your laptop, pack it in your bag, travel somewhere, open your laptop again – and you’re still connected to the same session!

Note: this may be an advantage for the lazy developer (me) but it’s also a disadvantage from security perspective!

Mac productivity tools

foo.bar: Rusty calculator

Last week I wrote about the code challenge foo.bar: Peculiar balance. Today I’ll tease you with another one:

Rusty calculator
================

Lab minion Rusty works for Professor Boolean, a mad scientist. He’s been stuck in this dead-end job crunching numbers all day since 1969. And it’s not even the cool type of number-crunching – all he does is addition and multiplication. To make matters worse, under strict orders from Professor Boolean, the only kind of calculator minions are allowed to touch is the Unix dc utility, which uses reverse Polish notation.

Recall that reverse Polish calculators such as dc push numbers from the input onto a stack. When a binary operator (like “+” or “*”) is encountered, they pop the top two numbers, and then push the result of applying the operator to them.

For example:
2 3 * => 6
4 9 + 2 * 3 + => 13 2 * 3 + => 26 3 + => 29

Each day, Professor Boolean sends the minions some strings representing equations, which take the form of single digits separated by “+” or “*”, without parentheses. To make Rusty’s work easier, write function called answer(str) that takes such a string and returns the lexicographically largest string representing the same equation but in reverse Polish notation.

All numbers in the output must appear in the same order as they did in the input. So, even though “32+” is lexicographically larger than “23+”, the expected answer for “2+3″ is “23+”.

Note that all numbers are single-digit, so no spaces are required in the answer. Further, only the characters [0-9+*] are permitted in the input and output.

The number of digits in the input to answer will not exceed 100.

Test cases
==========

Inputs:
(string) str = “2+3*2″
Output:
(string) “232*+”

Inputs:
(string) str = “2*4*3+9*3+5″
Output:
(string) “243**93*5++”

The solution (in Python) is surprisingly short:

def answer(normal):
    res = ''
    muls = normal.split('+')
    for mul in muls:
        if len(mul) > 1:
            local_mul = mul.split('*')
            res += ''.join(local_mul)
            res += (len(local_mul) - 1) * '*'
        else:
            res += mul 
    res += (len(muls)-1) * '+'    
    return res

That’s definitely NOT the way you should write your code…
Can you figure it out?

Hint: a similar technique was used here

foo.bar: Rusty calculator

foo.bar: Peculiar balance

In the past few months Google pulled “Matrix” on many Python and Java engineers: if you google terms like “python lambda” you might get pulled into “The Matrix” as well…
To me it happened on a Saturday around midnight, I googled something related to Python and suddenly the search page collapsed and became dark. There was a single sentence that said something like:
“You speak our language”.
and then it invited me to participate in a series of code-challenges.
As a veteran member of CodeEval I gladly stepped into the Matrix and started solving different challenges. All was good until I ran into the following:

Peculiar balance
================
Can we save them? Beta Rabbit is trying to break into a lab that contains the only known zombie cure – but there’s an obstacle. The door will only open if a challenge is solved correctly. The future of the zombified rabbit population is at stake, so Beta reads the challenge: There is a scale with an object on the left-hand side, whose mass is given in some number of units. Predictably, the task is to balance the two sides. But there is a catch: You only have this peculiar weight set, having masses 1, 3, 9, 27, … units. That is, one for each power of 3. Being a brilliant mathematician, Beta Rabbit quickly discovers that any number of units of mass can be balanced exactly using this set.
To help Beta get into the room, write a method called answer(x), which outputs a list of strings representing where the weights should be placed, in order for the two sides to be balanced, assuming that weight on the left has mass x units.
The first element of the output list should correspond to the 1-unit weight, the second element to the 3-unit weight, and so on. Each string is one of:

“L” : put weight on left-hand side
“R” : put weight on right-hand side
“-” : do not use weight

To ensure that the output is the smallest possible, the last element of the list must not be “-“.
x will always be a positive integer, no larger than 1000000000.

Test cases
==========
Inputs:
(int) x = 2
Output:
(string list) [“L”, “R”]
Inputs:
(int) x = 8
Output:
(string list) [“L”, “-“, “R”]

After playing a bit with numbers, I figured out my strategy:
Given a number x I’ll try to find the minimal integer y >=x so that y can be presented as a sum of distinct powers of 3.
I say “distinct” because the coefficients for any power of 3 can be either 0 or 1.
Example:
A good solution: 12 = 0*3^0 + 1*3^1 + 1*3^2
A bad solution: 12 = 3*3^0 + 0*3^1 + 1*3^2
The second solution is bad because one of the coefficients is 3.
So I need to find the smallest y that respects this rule. Once I’ve found such a y, and since I already know how to represent it as a series of distinct powers of 3 – it’s a good candidate for ‘R’.
Now I still need to figure out how to represent the difference y-x as a “series of distinct powers of 3″ and I’m done.
As you might already started suspecting: we cannot find such a representation for any y-x. And indeed, in case we fail, we’ll have to look for the “next y” and on and forth.
The code looked like this:
from itertools import combinations

def answer(x):
    """
    Main logic:
    find the closest number y >= x so such that:
    1. y is a sum of powers of 3
    2. x + z = y where z is sum of numbers, each of which is a power of 3
    3. z will be decomposed to "L" and y will be decomposed to "R" in build_list()
    """
    next = find_next_sum_of_powers(x)        
    left = represent_as_powers(next - x)   
    while list(set(left)) != left: # meaning we have the same item more than once in "left"
        # look for the next "sum of powers" and try to use it instead
        next = find_next_sum_of_powers(next + 1)        
        left = represent_as_powers(next - x)   
    right = represent_as_powers(next)
    return build_list(left, right)


def find_next_sum_of_powers(x):
    """
    Given a number x, find the closest number y
    such that y >= x and y is one of the numbers 
    returned by get_list_of_sums() and there is no number z,
    which is also returned by get_list_of_sums() such that:
    y > z > x
    """
    next = find_next_power(x)
    if x == next:
        return x
    if x <= next // 2:
        sums = []
        for list_of_sums in get_list_of_sums(x):
            for sm in list_of_sums:
                if sm not in sums:                
                    if x == sm:
                        return x                
                    if sm - x in sums:
                        return sm
                    sums.append(sm)
    else:
        return next

def is_power_of_3(x):
    """
    Checks if x is a power of 3
    """
    n = 0
    b = 0
    while n < x:
        n = 3 ** b    
        b += 1
    return n == x


def get_list_of_sums(x):
    """
    Generates a list of sums of powers of 3:
    1, 3, 4, 9, 10, 12, 13...
    4 = 3 + 1; 10 = 9 + 1; 12 = 9 + 3; 13 = 9 + 3 + 1; ...
    """
    workset = []
    for i in gen_next_power():         
        if i > x:
            break        
        workset.append(i) 
    res = []
    for i in range(len(workset)+1):
        for j in combinations(workset, i):
            res.append(j)      
            yield map(sum, res[1:])

def gen_next_power():
    """
    Generator for powers of 3
    each iteration will get the next power
    """
    pow = 0
    while True:
        yield 3 ** pow
        pow += 1


def find_next_power(x):
    """
    Given a number x find the closest 
    power of 3, y such that y > x and there
    is no power of 3, z such that  x < z < y
    """
    if x == 1:
        return x
    t = 3
    while t < x:
        t *= 3
    return t

def represent_as_powers(num):
    res = []
    while num > 0:
        t = find_next_power(num)
        if t > num:
           t /= 3
        res.append(t)
        num -= t
    return res[::-1]

def build_list(left, right):
    stop = right[-1]
    res, i = [],  0
    while 3 ** i <= stop :
        if 3 ** i in left:
            res.append("L")
        elif 3 ** i in right:
            res.append("R")
        else:
            res.append("-")
        i += 1
    return res    
Now, this works. Only problem is – this approach of trying different y’s could get really expensive.
In fact, if I limited the number “retries” to a small number – I could get almost all the test-cases to pass – all but one…
But I wasn’t ready to give up 🙂
So I started googling and found this post which provided a good direction but to be honest – I didn’t understand:
1. how balanced ternary representation helps to solve our problem
2. how to translate from ternary representation to balanced ternary
These questions are coupled, and it took me some extra-reading to figure it out and what I’d like to do now – is save you the time that I spent, by trying to explain it a little better than that post, wikipedia and other resources I ran into during my investigation.
Let’s start with the second question which is pure technical, by comparing numbers in different bases, we’ll represent “balanced” with 1,0,-1:
decimal      1    2      3     4      5
ternary      1    2      10    11     12
balanced     1    1-1    10    11     1-1-1
Trying to follow up on the “balanced” gave me a headache… but after playing with it for a bit, and reading some search results like this answer (which explains how to add numbers in balanced-ternary form), made things much clearer:
in “balanced” we’re still using ternary-base, and the numbers 1/0/-1 are simply the coefficients of the respective power of 3. Lets demonstrate:
2 = 1*3^1 -1*3^0 = 1-1
3 = 1*3^1 + 0*3^0 = 10
4 = 1*3^1 + 1*3^0 = 11
5 = 1*3^2 1*3^1 1*3^0 = 9 – 3 – 1 = 1-1-1
etc
Once you realize the answer to #2 it’s only another small step to understand the connection to the code challenge:
Given that number x, once we find its balanced-ternary representation we can express it as a… sum of distinct powers of 3!
Now we have only a small obstacle: this sum has positive and negative elements – how do we handle it?
The challenge was to balance weights, which means that instead of “subtracting” the left side we can “add” the weight to the right side – like balancing equations!
The meaning of this is that the balanced-ternary form is actually the answer we’re looking for, lets take the number 5 from the previous example:
5 = 1*3^2 1*3^1 1*3^0 = 9 – 3 – 1 = 1-1-1
in the form of our answer – we’re looking at the result from right to left, which means that
1-1-1 = [-1, -1, 1] = 1*3^0 1*3^1 + 1*3^2
and the way it helps us solving the problem is by using ‘R’ as 1*3^2 and balancing it with ‘L’ by adding to ‘L’ 1*3^0 + 1*3^1 which ends up as simply replacing letters in the reversed version of the balanced-ternary form
from: [-1, -1, 1]
to:   [L,  L,  R]
 
Further, if we have a zero in the balanced-ternary form it means that neither L nor R should add the respective power of 3, so we can simply replace any zero with “-” (dash).
We’re now ready to look at the code:
def answer(x):
    return to_balanced(to_ternary(x))

def to_ternary(x):
    """
    convert decimal into ternary
    """
    res = ""
    while x > 0:
        res = str(x % 3) + res
        x /= 3
    return int(res) 

def to_balanced(t):
    """
    convert ternary number into balanced-ternary
    and then to L/-/R representation
    """
    res = map(int, str(t))[::-1] # iterating the digit from right --> left
    for i in range(0, len(res)):
        # zeros will remain zeros and ones remain ones
        # the only thing we need to convert is 2 into +- (== +1-1)
        if res[i] == 2:
            res[i] = -1
            res = inc(res, i + 1)

    # convert from (1,0,-1) representation to (R,-,L)
    return ['-' if x == 0 else ('R' if x == 1 else 'L') for x in res]


def inc(arr, ind):
    """
    calculate the carryover
    """
    if len(arr) == ind:
        arr += [1] # prepend the last carryover
    elif arr[ind] == -1:
        arr[ind] = 0
    elif arr[ind] == 0:   
        arr[ind] = 1
    elif arr[ind] == 1: 
        arr[ind] = -1
        arr = inc(arr, ind + 1)
    else: # arr[ind] == 2
        arr[ind] = 0
        arr = inc(arr, ind + 1)
    return arr

Needless to say that this solution worked perfectly.
foo.bar: Peculiar balance

Facebook interview questions

FB

Last week a friend asked me to help him prepare for a job-interview at FB. The first thing I did was pointing him to Cracking the Coding Interview which provides an excellent overview of all the main subjects an Engineer might be asked during an interview. My recommendation was to take the time and go through each and every chapter, and try to solve the questions. It’s very important not to “cheat” and try hard to solve them by yourself – otherwise you won’t improve!

Another important thing is, in case you weren’t able to solve a question and you checked out the answer – DO NOT try to memories it! instead, go on and continue solving other questions, and after a few days, try again to solve that same question. You might be surprised to find out that even though you saw the answer – it’s not always trivial to re-write it. Further, this is a good exercise that sharpens another important skill: being able to write code that expresses the algorithm we have in mind.

There’s plenty of good information available online, just Google it and you’ll find excellent blog posts such as this one, and many others.

Another smart thing my friend did was to go over job-interview websites and forums and copy questions that other candidates wrote they were asked during their interviews in FB. The following is a list of such questions:

  1. Calculate the square root of a double
  2. Given n intervals [si, fi], find the maximum number of overlapping intervals.
  3. Print all the paths from root to every leaf in a binary tree.
  4. Print the sum of all the numbers at every vertical level in a binary tree
  5. Given a set of n jobs with [start time, end time, cost] find a subset so that no 2 jobs overlap and the cost is maximum ?
  6. Given 1 trillion messages on FB and each message has at max 10 words, how do you build the index table and how many machines do you need on the cluster to store the index table ?
  7. Given an input array and another array that describes a new index for each element, mutate the input array so that each element ends up in their new index. Discuss the runtime of the algorithm and how you can be sure there won’t be any infinite loops.
  8. Find all the anagrams in an array of strings
  9. Combinations(n, k) – print all combinations of k out of n items
  10. Printing a binary tree L-R,
  11. Implementing a comparator function to sort files based on a certain naming convention.
  12. Implement a LRU cache
  13. Design FB newsfeed
  14. How would you query for all the Places near a given coordinate? The focus is on how to scale this to a large number of places while keeping response time to within acceptable user expectations
  15. Given a list of words, group anagrams.
  16. Find all 3 items that sum to 0 in an array.
  17. Write a function that calculates input strings with operators +,-,*,/ eg. “5+5*6″ should output 35
  18. Printing the nodes of a linked list in reverse
  19. Finding the longest palindrome in a given string
  20. Finding maximum subarray sum (similar to Kadane’s Algorithm) with the constraint that two numbers in the array that form the max sum cannot be next to each other
  21. How do you use Facebook app and what are the problems with it? How would you fix it?
  22. Design a Facebook travel app
  23. designing a system to do spam detection work and describing it in a huge flowchart, as might be done in an early but detailed product planning session. Be prepared to think on large scales.
  24. Write a palindrome-checking function
  25. Print a tree, level by level.
  26. Write all solutions for a^3+b^3 = c^3 + d^3, where a, b, c, d lie between [0, 10^5]
  27. Print a list in reverse
  28. Compute maximum profit for buying selling a stock given an array of prices for n days.
  29. Check if two binary trees are Isomorphic
  30. How would you come to an agreement if you and another senior level engineer couldn’t agree on a technical design?
  31. Write a function that takes 2 arguments: a binary tree and an integer n, it should return the n-th element in the inorder traversal of the binary tree.
  32. Add up two big integers represented in arrays.
  33. Design a system to support Facebook status update.
  34. Design the recommendation system for search keywords
  35. Given 2 very large numbers, each of which is so large it can only be represented as an array of integers, write a function to multiply them.
  36. Write a function to display a JSON string in pretty format.
  37. Write a function to check if polygon is simple based on given list of points
  38. Function to compute the number of ways to climb a flight of n steps. Taking 1, 2, or 3 steps at a time. Do it in Linear time and constant space. Example:
    n = 3.
    1 1 1
    1 2
    2 1
    3
    Ans = 4
  39. Interweave a linked list. Do it in Linear time and constant space.
    Input: A->B->C->D->E
    Output: A->E->B->D->C
  40. Given a dictionary based simple password, create all possible (special character) passwords based on a provided mapping.
    Input: face
    Map: {a -> @, 4, A}
    Output: f@ce, f4ce, fAce
  41. Get numeric number out of a roman string, linear time
    Given: mapping I = 1, V = 5, X = 10, L = 50, C = 100, D = 500, M = 1000
    Input/Output:
    II = 2, III = 3, IV = 4
    VI, VII, VIII
    IX = 9
    XL = 40,
    XLIX = 49
  42. Merge ‘k’ sorted arrays, each array may have max ‘n’ elements
  43. What would you change in Facebook?
  44. Given a sorted array, write a program to decide if two elements sum up to a third.
  45. Determine the 10 most frequent words given a terabyte of strings

Good Luck!

Facebook interview questions