My Depressed Brain is an Asshole

I'm sitting in my car. My heart's pounding, and I'm scared. Not like, mildy aprehensive, but literally full on panicked, scared. I'm confused too. This isn't me. Who is this person? This girl, who's on the brink of tears at the prospect of this. I'm fearless. There's very little I'm scared of, and most of those fears I can reason about. I can convince myself that there's really nothing to be afraid of.

But this? This is something else. This is paralyzing. This is completely unwarranted, and completely out of character. I can't reason about it, I can't even understand. All I know is that my brain is shouting at me. "No. No. No. Go away. Run. No."

How did I get here? Who is this girl in my skin?

That was me, today, as I tried to go into a coffee shop. Yep, that's right. I had a panic attack and nearly burst into tears over the idea of working in a coffee shop for a few hours.

See, this is something that my new therapist has been telling me to do since November, when I started seeing her. It's something that my old therapist recommended back in September, until I stopped seeing her. Something that I know will help me. My husband's suggested it, my boss has recommended it. I've said I'm going to do it.

But I never do. I am amazingly good at coming up with excuses not to. This comes up, or that comes up, or I need the three monitor set up that I have in my home office. I can come up with excuses for days. And that's how it's been, for months on end now. I leave my house to shop, generally for groceries, and to take my kids to and from their respective daytime activites. That's it. And it's been this way since I started working remotely in June.

I know who that girl walking around in my skin is. I've met her before. I've fought her before, I've won before. But now, it seems more and more that she's in control. That scares me too. I call her my depressed brain, and she's a complete asshole.

She's the one that sits by as I slowly put on weight, and then laughes uproariously when I look in the mirror and hate what I see. She reminds me of everything that's wrong with me, and every mistake I've ever made. She hates me, and she makes me hate me. She's loud, too, and persuasive.

Just like anyone would, she really only cares about one thing: her own survival. Which means, it's her job to make sure I stay like this. That I stay isolated, and I stay scared, and I forget more and more the woman that I used to be.

A few months ago, she convinced me that I wasn't worth the air I was breathing. I was convinced that I was only good at hurting people, and breaking things. That I was a detriment to anyone that knew me, and that by simply existing I was making their lives worse.

She convinced me that I should stop doing that. Hurting people, breaking things. The clear solution was just to stop existing. And everyone I'd ever known or loved, or broken, would be so much better off once that happened.

And so I tried.

Clearly, since I'm typing this, I didn't succeed. My husband found me, and stopped me, and then took me to the hospital. I was numb, in a daze. I don't remember the first day, much. I spent 10 days there, in total. I learned a lot. I saw other people, there for the same reason or similar reasons. I felt less alone. I even made a friend or two. And most importantly, I got some help. They set me up with a therapist, and a psychiatrist. Now I see my therapist twice a month, and I take medicine to help me sleep, and to help regulate my mood.

I still struggle, some days more than others. But I never forget what happened. Because she nearly won. That can't happen. There are people that I love, who love me, and not only would they not be better off without me, some of them need me just as much as I need them.

All that to say: My depressed brain is an asshole. She gives zero fucks about me, or my family, or my friends, or my well being. She'll try as hard as she can to win. And I have to try harder to make sure she doesn't.

If you have a depressed brain, I can just about guarantee that they're an asshole too. Don't forget that. Ever. Don't let them win, not by a mile or an inch. Don't give them anything, because you matter. You matter a whole lot.

Thanks to some good friends, I went into that coffee house today, one small baby step at a time. It was lovely. There were people there, none of whom I was related or married to. There was music, and coffee, and I got some work done. Today, I won a small victory against the bitch who sometimes walks around in my skin. Tomorrow I'll wake up and start that battle over again. Hopefully, someday, it'll get easier.

The Product of Permutations in Cycle Notation and Other Words That Don't Make Sense.

Permutations.

What is a permutation you ask?

According to Mathworld [1]:

A permutation, also called an "arrangement number" or "order," is a rearrangement of the elements of an ordered list S into a one-to-one correspondence with S itself. The number of permutations on a set of n elements is given by n!

Great, now in english please? It's essentially a rearrangement of a set of things.

So, you have a set S = {a, b, c} a permutation of that could be one of 6 different options (which is, 3 factorial or 3!) : abc, acb, bac, bca, cab, cba

When you think of permutations from the point of "rearrangment" or "renaming" of these objects, it makes sense to show the transition from before, to after. This is often done in two-line notation which looks something like this:

image

The top line here is the "before", and the bottom line is the "after". So you would read this as "a becomes c, b becomes d, c becomes f" etc.

You can also apply permutations one after the other. This is actually called "multiplying permutations". Personally, I think that's rather odd, as I don't see any actual multiplication going on, but I'll defer to brains bigger than my own and just go with it. You can feel free to do the same. Moving on.

If you have:

image

You apply the first permutation, and then the second.

You end up with this:
image

If that's still a bit fuzzy, let me explain it in a different way.
Step 1: we start with a b c d e f
Step 2: That becomes c d f b e a
Step 3: Step 2 becomes c a e d f b

So the final result of that multiplication can be read as a b c d e f becomes c a e d f b

Also, it ought to be noted that this multiplication is not commutative. That is,

image is not equal to image

In this case, the order matters!

Cycle Notation

As I stated earlier, the notation we've been using up until now is called two-line notation, for obvious reasons. There is another notation which is a little harder to read at first, but I actually prefer it: cycle notation.

Going back to our first example:

image

This can be written in cycle notation as: (acf)(bd) This represents two "cycles", each enclosed in (). Each character in a cycle becomes the character to it's right, with the last character in the cycle becoming the first character, in a circular fashion.

It looks something like this:
image

So again, just as with the two-line notation: a becomes c, c becomes f, f becomes a.

Permutations in cycle notation can be written in a number of different ways. (bd)(acf), (cfa)(bd) and (db)(fac) are all equivalent, however (afc)(db) wouldn't be because it says a becomes f.

Looking at that, you might think I'm mistaken. I mean, I JUST said that order in multiplication of permutations matters, didn't I? Generally, in cycle notation, that's still true. However, the two cycles above ((bd) and (acf)) don't intersect. That is, neither of them have any letters in common. So, in this case, order doesn't matter.

Now, if you compare the cycle notation representation of (acf)(bd) to the original set (abcdef) you should notice that one letter is missing: e. This is because in the permutation, e becomes e, or doesn't change. You can see that easily in the two line notation. This is called a singleton cycle, and can be included or left off in cycle notation.

Interesting side note: A permutation that's made up entirely of singleton cycles is called an identity permutation, and it's usually denoted as ().

Multiplication in Cycle form.

The above multiplcation problem that we did in two-line form can be easily translated into cycle notation as follows:
image

Becomes: (acf)(bd)(abd)(ef) = (acefb)

We don't need that pesky x sign here either, since it's pretty clear that (acf)(bd) means (acf) times (bd).

Let's walk through this together. We know from our earlier definition that the multiplication of permutations is just applying one after the other.

Our alorithm will terminate once each character in the input has been seen, so we 'tag' each character as they become the "current" character. Once the number of unique characters in the input == the number of tags, we have determined the final positions of each of the input characters, and so we're done. That's why I use the words "tag" and "untag" here.

  1. Starting at the left, Select the first (untagged) character of the input, in this case: a. We'll call this our "current" item. We're trying to find out what happens to that letter as each of the permutations is applied to it.
  2. Move to the right 1. We have a becomes c
    >The final position of a isn't necessarily c, though. We need to continue moving to the right along the input to see what c becomes, because that will be the final stop for a. In this case, c doesn't appear again. After step 2, our output is (ac.
  3. Remember that we read this partial output as "a becomes c", The next logical question is what happens to c? c becomes our 'current' character, and we move back to step 2.
  4. Eventually, you will come to a point where the last letter of the output "becomes" the first letter of the current cycle. This is a "closed cycle", you output a right parenthesis to indicate that the cycle is closed. If there is more input to be processed, you start a new cycle.

These 4 steps above are repeated until all of the input characters have been tagged, and we have found our answer.

We can walk through the above example: (acf)(bd)(abd)(ef)
a -> c
c -> f -> e
e -> f
f -> a -> b
b -> d -> a
d -> b -> d and so, our final answer is: (a c e f b), in this case, d is a singleton cycle and can be left off or included as you please.

This is easy enough to do by hand, and it doesn't take very long to do it. But, let's be honest: image

So instead, we create an algorithm that can handle this work for us.

Building an Algorithm

Let's find the product of the following permutations:

(acfg)(bcd)(aed)(fade)(bgfae)

**Step 1: Clean the input: ** Remember that once you reach the end of a cycle, the last letter points back in a circular fasion to the first letter of the cycle? That's a bit irritating to explain to a computer. Instead, we'll replace all right parentheses with the first character of that cycle. Remove all left parentheses. After this step, our original input looks like this:

acfga bcdb aeda fadef bgfaeb

Notice that all the parentheses are gone, and we've copied the first letter of each cycle to the end: acfg became acfga, etc.

In code this is pretty simple:

function parseInput($input)
{
    $input = explode(')(', $input);

    $input_array = array_map(
        function ($piece)
        {
            $piece = trim($piece, '()');
            $piece = $piece . $piece[0];

            return $piece;
        },
        $input
    );

    $character_array = preg_split('//', implode('', $input_array), -1, PREG_SPLIT_NO_EMPTY);

    return $character_array;
}

Step 2 Open the cycle: Searching from left to right, find the first untagged element of input. Define a variable $start equal to this value, and tag it. Add a left parentheses and the element to the output.

function openPermutation($character_array, $tags)
{
    list($initial_position, $start) = getNextUntaggedElement($character_array, $tags);
    $tags[] = $start;

    $position = $initial_position + 1;
    $current = $character_array[$position];

    $output[] = '(';
    $output[] = $start;
}

You can see from these first two steps that I've chosen to construct both the output and input as arrays. Why? Because I like arrays. I'd be very happy to hear other implementations using different data structures. (Yup, there are more ways to skin this cat, as with every cat)

At this point, output should be ['(', 'a']

Step 3: Set the current equal to the next untagged element in the input. This step was actually also performed above by $current = $character_array[$position]

Step 4: Find the next right instance Continuing left to right, scan the input until you either reach the end, or you find an element equal to the current. In the latter case, tag that element and return to step 3.

list($position, $current) = findNextRightInstance($character_array, $current, $position);

Step 5: Check Current Against start? At this point $current holds the character that we are currently searching for. $start holds the first character of the current cycle.

5a. If $current DOESN'T == start, we aren't ready to close the cycle. So, add $current to the output array and go back to step 4.

$output[] = $current;
$tags[$current] = $current;
$position = array_search($current, $character_array);

$position++;
$current = $character_array[$position];

5b: If $current DOES == $start, then we have reached the end of this cycle. Add a right parenthesis to the output and go back to Step 2.

if ($current == $start)
{
    $output[] = ")";
}

You keep going through that round about until all of the elements in the input appear in the tagged array. At that point, you're finished.

Given our initial input of (acfg)(bcd)(aed)(fade)(bgfae) you end up with (adg)(ceb)(f)

Because I'm a curious soul, I wanted to see how this work with different kinds of permutations. Knowing what we do about multiplication, we know that $a x $b x $c is equal to ($a x $b) x $c. Does the same hold true of Permutations?

I implemented my algorithm so that the multiply() function would work with a single string representation of the permutation, as seen above, or two separate permutations to be multiplied together.

The first method, with a single string Permutation:

$p6 = new Permutation('(acfg)(bcd)(aed)(fade)(bgfae)');
echo multiply($p6)->raw();

This indeed outputs what we expected:

(adg)(ceb)(f)

However, when I did it with each permutation separately:

$p1 = new Permutation("(acfg)");
$p2 = new Permutation("(bcd)");
$p3 = new Permutation("(aed)");
$p4 = new Permutation("(fade)");
$p5 = new Permutation("(bgfae)");
echo multiply(multiply(multiply(multiply($p1, $p2), $p3), $p4), $p5)->raw();

This time it gave a slightly different output:

(adg)(bce)(f)

Looking out those two outputs, the only difference is the second cycle ceb vs bce. As we discussed earlier, these are actually equivalent. In both cases, c becomes e, b becomes c, and e becomes b.

Tada! We've now written an algorithm to calculate the product of permutations in cycle notation. Go forth, conquer the world. As always, if you want to discuss this or you have questions, feel free to find me on the twitter: @kayladnls

If you'd like to see the full code that I used to implement this, go here.

References
1. Permutation, Wolfram Mathworld, http://mathworld.wolfram.com/Permutation.html
2. The majority of this information, including the algorithm and a deeper explanation of permutations was learned from The Art Of Computer Programming, Vol 1 by Donald E Knuth. Find it, buy it, read it, love it, and then come find me so we can talk about it.

Method of Least Squares: Using squares to calculate a hat.

In my previous post on Machine Learning, I mentioned supervised and unsupervised methods. We also discussed the two main kinds of supervised methods: continuous and categorical. Today we're going to talk about one of the continuous methods: Method of Least Squares. (LSM). LSM is a method of linear regression, which was also briefly mentioned in the previous post.

Linear Regression is the process of finding a line/curve that best fits a set of data.

LSM is probably the most popular technique in statistics, and also one of the oldest. How does it work? It operates off the principle that the mean of a distribution is the value that minimizes the sum of squared deviations of the scores. [1]

Now, maybe you're thinking: say what? Great! Then I'm not the only one.

So what is it really saying? The goal, when doing regression is to try to find a line (or curve) that best fits your training dataset. The idea being that this line will provide a reasonable estimate for future real world data, and that you will be able to make predictions based on that line. Given a value (x) you will be able to estimate the resultant (y) value, also referred to as Y-hat.

With LSM, you draw that line by minimizing the squared distance between each point in your distribution (dataset) and your estimate. Why squares and not just the distance value itself? Squares allow all of the residuals to be treated as a continuous differentiable quantity [2].

The values along the x axis are referrred to as the dependant variable, and the y axis is the independant variable.

You want to create a function that best estimates future y, given an x value. That function can be represented by a line. In slope intercept form:

(1) image

Where Y is the predicted value, m is the slope and b is the y-intercept of the line,

You want a line that is going to be closest to your data, that is, where the distance between the points and the estimation line is the smallest. The distance between the actual points and your estimation line is the error, and obviously, you want to minimize error, to increase accuracy.

(2) image
Where E represents the error. you're minimizing.

Knowing that a quadratic equation reaches it's minimum when as it's derivatives approaches zero, we'll take the derivative of E with respect to both unknowns, a and b and set each of them to 0. These are called the normal equations.

(3) image

and

(4) image

Solve those normal equations and you will end up with:

(5) image
Where My and Mx are the mean of x and y, respectively.

and

(6) image

Now, all of that is a rather math-y representation of the problem, with greek letters a-plenty. It's hard to read, and a bit hard to understand, in my opinion. I much prefer to look at it in code.

Show me the codez

  1. Calculate the mean (average) of x and y. (bar-x and bar-y, respectively. These values were My and Mx in the equations above)
  2. Take the distance from each observation in your training data, to the bar-x and the bar-y.
  3. Find the slope m.
  4. Use the point where bar-x and bar-y intersect (mean-intersect), along with the slope to calculate the y-intercept of your line.
  5. Tada! You have a function. Punch in x values and out pop estimated y values ( Y-Hat).

I want to show you what that looks like in code. I've included the table below as a reference, so you can see all of the values that I will be using for this example, as well as bar-x and bar-y.

X Y x - bar-x y - bar-y (x-bar-x)2 (x-bar-x)(y-bar-y)
1 2 -2 -2 4 4
2 4 -1 0 1 0
3 5 0 1 0 0
4 4 1 0 1 0
5 5 2 1 4 2

Step 1: Calculate bar-x and bar-y:

 $mean_x = $this->mean($x);
 $mean_y = $this->mean($y);

 function mean(array $vals) {
        return array_sum($vals) / count($vals);
}

Step 2: Calculate the distance between each point and bar-x and bar-y Here we're constructing two arrays that will hold the distance to bar-x and bar-y, for each value of x and y. Keep in mind that each x and y correspond to the x and y coordinates of points on a graph.

$x_minus_barx = array_map(function($x) use($mean_x) {
                return $x - $mean_x;
            }, $x);

$y_minus_bary = array_map(function($y) use($mean_y) {
                return $y - $mean_y;
            }, $y);

Step 3: Find the slope: To calculate the slope, you sum the square of $x_minus_barx, this becomes your denimonator. You then sum the values of $x_minus_barx and $y_minus_bary, this becomes your numerator. This fraction becomes your slope.

This corresponds directly to equation (6) above.

$slope = $this->calcSlope($x_minus_barx, $y_minus_bary);

function calcSlope($x_minus_barx, $bary)
{
    $barxsq = $this->squareArr($x_minus_barx);

    $barx_times_bary = $this->multArrs($x_minus_barx, $y_minus_bary);

    return array_sum($barx_times_bary) / array_sum($barxsq);
}

Step 4: Use the mean-intercept and the slope to calculate the Y Intercept. The mean intercept is the point where bar-x and bar-y meet. All regression lines must pass through this point. Since we know that point is on our line, we can you it's X and Y values to calculate the Y-intercept of the line. You do this by rearranging the Y = mX+b function into : b = Y - Mx.

This corresponds directly to equation (5) above.

$y_int = $this->calcYInt($this->slope, $mean_intercept);
function calcYInt($slope, $mean_intercept)
{
    return $mean_intercept[1] - ($slope * $mean_intercept[0]);
}

Step 5: Tada; At this point, we know the values for our Y-intercept b, and our slope: m. Given any value for X, we can calculate Y.

function estimate($x)
{
    return $this->y_int + ($this->slope * $x);
}

Given the input points {1, 2}, {2, 4}, {3, 5}, {4, 4}, {5, 5}, we find that the slope = .6 and the y-intercept is 2.2.

Graphically this looks like this:
image

Each point on the graph represents a datapoint, and the line represents our estimation line. You can see that the line intercepts the y-axis just where we said it would (2.2).

What does this all mean?

image

Let me take a moment to illustrate with an arbitrary example. You have a house you want to sell, you also have the sales data for houses in your town, along with their square footage. Train your function on this dataset, such that square footage vales are along the x-axis and price is along the y-axis. Input the square footage of your soon to be sold home and out comes a reasonable estimate of what your home would sell for.

A little further reading
Ian Barber did a post covering Linear Regression that uses a different alogirthm. I wanted to verify my results against his. If you punch in his input data:

// data x, y
$data = array(
    array(5, 21),
    array(6, 25),
    array(7, 30),
    array(8, 31),
    array(10, 41),
    array(12, 50)
);

You can verify that the method I've outlined here results in y = 0.29 + 4.08x, which is consistent with the results that he got. The post he wrote is a very good one, and I highly suggest reading it.

There are many other flavors of Least Squares that I will cover in later posts. This method can be extended to non-linear functions, and also to apply to more than one variable, but that will have to wait for another day.

Hopefully you've learned a lot, and as always, feel free to ask questions if anything is unclear.

References
1. Abdi, Herve, The Method of Least Squares http://www.utd.edu/~herve/Abdi-LeastSquares06-pretty.pdf
2. Least Squares Fitting, Wolfram Math World, http://mathworld.wolfram.com/LeastSquaresFitting.html