Monday, November 17, 2014

Function Whose Second Derivative is the Square of Its First Derivative

I was working on different forms of the Laplacian of a general product of scalar functions (which I might share in a later post), and I came to a point where I wanted to ask: when is $$\nabla{f}\cdot\nabla{f}=\Delta{f}$$ This is similar to asking when $$\frac{\partial^2 f}{\partial x^2} = \left(\frac{\partial f}{\partial x}\right)^2$$ In fact, the latter is a special case of the former. And if the partials with respect to $x,y,z$ all satisfy this, then we have a term-wise equality between the gradient dotted with itself and the Laplacian. But when does this happen?

Well, let's start with a function of one variable. So we have the ODE $$f''(x)- \left(f'(x)\right)^2=0$$ First, substitute $g(x) = f'(x)$. $$g'-g^2=0$$ Clearly we can write the following $$\begin{align*} g&=g\\ g'&=g^2\\ g''=2gg'&=2g^3\\ g^{(3)}=6g^2g'&=6g^4\\ g^{(4)}=24g^3g'&=24g^5\\ &\vdots \end{align*}$$ And in general it appears that $$g^{(n)}=n!g^{n+1}$$ We can prove that with induction. The base cases are done, and the induction step is fairly straightforward: $$g^{(n+1)} = \left(g^{(n)}\right)' = \left(n!g^{n+1}\right)' = \left(n+1\right)n!g^ng' = \left(n+1\right)!g^{n+2}$$ Let's try to write the Taylor series for $g$ about $x_0$ now, letting $g_0=g(x_0)$. $$g(x) = \sum_{n=0}^\infty\frac{g^{(n)}(x_0)}{n!}\left(x-x_0\right)^n = \sum_{n=0}^\infty\left(g_0\right)^{n+1}\left(x-x_0\right)^n=g_0\sum_{n=0}^\infty\left(g_0\left(x-x_0\right)\right)^n$$ This last part looks an awful lot like the geometric series for $(1-x)^{-1}$, but with $x$ shifted by $x_0$ and the whole thing scaled by $g_0$. So really we have $$g(x) = \frac{g_0}{1-g_0\left(x-x_0\right)}=\frac{-a}{ax+b}$$ where $a=-g_0$ and $b=1+g_0x_0$. Now, since we said that $g=f'$, we just integrate once to find $f$ $$f(x) = \int\frac{-a}{ax+b}\,\mathrm dx$$ A simple $u$-substitution with $u=ax+b$ yields $$f(x) = c - \ln(ax+b)$$ Double checking reveals that this does, in fact, solve our ODE. We could also separate variables and integrate to arrive at this same answer.

Also, a buddy of mine shared the intuition almost immediately that $\ln(x)$ is a basic function that gets us close to the solution, and putting a negative out front fixes the one sign issue that shows up. So we can just guess that our function looks like $$f(x) = -\ln(g(x))$$ for some function $g(x)$. Putting this into our original equation yields $$\frac{-g''(x)g(x)+g'(x)^2}{g(x)^2}=\frac{g'(x)^2}{g(x)^2}$$ So either $g(x) = 0$ or $g''(x)=0$. The former means we'd be dividing by $0$ (plus it's a boring answer), but the latter just implies that our function is linear. So $g(x)=ax+b$. Lastly, we note that the equation only restricts the derivatives of $f$, so we can add an arbitrary constant. $$f(x) = c - \ln(ax+b)$$ The same thing we got before!

But how can we come up with a special case where, for $x,y,z$, we have $\nabla{f}\cdot\nabla{f}=\Delta{f}$? Well, a really simple solution would be $$f(x,y,z) = e - \ln(ax+by+cz+d)$$ with $a,b,c,d,e$ constants. The $by+cz+d$ group is a "constant" when taking the partial derivative with respect to $x$, and the rest of the partials follow similarly!
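As a sanity check, here is a small symbolic computation (a sketch assuming SymPy is available; the symbol names simply mirror the constants above) confirming that both the one-variable solution and the three-variable guess satisfy their respective equations:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
a, b, c, d, e = sp.symbols('a b c d e', positive=True)

# One-variable check: f'' - (f')^2 should vanish for f = c - ln(a x + b).
f1 = c - sp.log(a*x + b)
print(sp.simplify(sp.diff(f1, x, 2) - sp.diff(f1, x)**2))    # prints 0

# Three-variable check: grad(f).grad(f) - Laplacian(f) should vanish
# for f = e - ln(a x + b y + c z + d).
f3 = e - sp.log(a*x + b*y + c*z + d)
grad = [sp.diff(f3, v) for v in (x, y, z)]
lap = sum(sp.diff(f3, v, 2) for v in (x, y, z))
print(sp.simplify(sum(g**2 for g in grad) - lap))            # prints 0
```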

Now, I don't know whether this is the only solution to our original multivariable equation, since we worked up from the single-variable case. But it is a simple solution.

Tuesday, September 16, 2014

Every square matrix is the sum of a unique symmetric and a unique skew-symmetric matrix

This one is going to be another brief post. I came across this interesting property in my advanced solids course, as well.

Consider the square matrix $\mathbf{A}$. Clearly we can write $\mathbf{A}$ like this $$\begin{align*} \mathbf{A}&=\frac{1}{2}\left(\mathbf{A}+\mathbf{A}+\mathbf{A}^T-\mathbf{A}^T\right)\\ &=\frac{1}{2}\left(\mathbf{A}+\mathbf{A}^T\right)+\frac{1}{2}\left(\mathbf{A}-\mathbf{A}^T\right) \end{align*}$$ Let $\mathbf{B}=\frac{1}{2}(\mathbf{A}+\mathbf{A}^T)$ and $\mathbf{C}=\frac{1}{2}(\mathbf{A}-\mathbf{A}^T)$. So $\mathbf{A}=\mathbf{B}+\mathbf{C}$. Based on the property that $(\mathbf{A}+\mathbf{B})^T=\mathbf{A}^T+\mathbf{B}^T$ it's easy to verify that $\mathbf{B}$ is symmetric $(\mathbf{B}=\mathbf{B}^T)$ and that $\mathbf{C}$ is skew-symmetric $(\mathbf{C}=-\mathbf{C}^T)$. Therefore a solution exists.
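Here is a quick numerical illustration of the construction (a sketch using NumPy; the matrix is just a random example):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # any square matrix

B = 0.5 * (A + A.T)               # symmetric part
C = 0.5 * (A - A.T)               # skew-symmetric part

print(np.allclose(A, B + C))      # True: the two parts reconstruct A
print(np.allclose(B, B.T))        # True: B is symmetric
print(np.allclose(C, -C.T))       # True: C is skew-symmetric
```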

But, is this solution unique? Assume we have two solutions: $\mathbf{A}=\mathbf{B}_1+\mathbf{C}_1=\mathbf{B}_2+\mathbf{C}_2$ where $\mathbf{B}_i$ is symmetric and $\mathbf{C}_i$ is skew-symmetric. Since $\mathbf{B}_1+\mathbf{C}_1=\mathbf{B}_2+\mathbf{C}_2$ we know that $$\mathbf{B}_1-\mathbf{B}_2=\mathbf{C}_2-\mathbf{C}_1$$ It's easy to show that the sum (or difference) of two symmetric matrices is symmetric and the sum of two skew-symmetric matrices is skew-symmetric. Therefore, the left-hand side is symmetric and the right-hand side is skew-symmetric. Because they are equal, each side is a matrix $\mathbf{M}$ satisfying $\mathbf{M}=\mathbf{M}^T=-\mathbf{M}$, which forces it to be the zero matrix. $$\mathbf{B}_1-\mathbf{B}_2=\mathbf{C}_2-\mathbf{C}_1=\mathbf{0}$$ And lastly, we see that $$\mathbf{B}_1=\mathbf{B}_2$$ $$\mathbf{C}_1=\mathbf{C}_2$$ So really, we were looking at the same representation all along. Therefore, it is unique!

In the course we were really talking about how the displacement gradient $\left(\nabla\mathbf{u}\right)$ is really the sum of the infinitesimal strain tensor $\left(\mathbf{\varepsilon}\right)$ that is symmetric and the infinitesimal rotation tensor $\left(\mathbf{\omega}\right)$ that is skew-symmetric. But the same principle applies to the much easier concept of matrices.

Wednesday, September 10, 2014

Temperature Distribution in a Semi-Infinite Wall

Any introductory heat transfer class solves for temperature distributions in plates, solids, etc. The temperature distribution is a solution to the heat diffusion partial differential equation. The general form of the heat diffusion equation with no internal heat generation in Cartesian coordinates is $$\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} + \frac{\partial^2 T}{\partial z^2} = \frac{1}{\alpha}\frac{\partial T}{\partial t}$$ The first series of problems typically tackled in the intro classes are the steady-state solutions. These are the cases where $\partial T / \partial t = 0$. And the very first problems even restrict us to only one dimension. Slowly we add in dimensions and time dependence.

However, one problem that I find particularly interesting is the semi-infinite wall. The problem involves a half-space that represents a wall. A very thick wall. One might call it "infinite". So thick that the back end of the wall never receives any heat in the time span that we are concerned with. Therefore, we approximate it as an infinitely thick wall. Additionally, we will assume the wall's area is very large (or that its boundary is so far away from the region we're concerned with) that the face of the wall is effectively infinite. And due to symmetry, we can restrict the problem to only one spatial dimension: perpendicular to the face of the wall. Our partial differential equation reduces to $$\frac{\partial^2 T}{\partial x^2}=\frac{1}{\alpha}\frac{\partial T}{\partial t}$$

This is second order in position and first order in time. Therefore we require two boundary conditions and an initial condition. The initial condition is simple: we will assume that the wall starts off uniform in temperature. $$T(x,0)=T_0$$ Next we have to come up with the boundary conditions. In heat transfer there are three primary surface conditions: constant temperature, constant flux, and convection. The choice of surface condition depends on the physical context that you want to model. However, for the purpose of this solution we will assume a constant surface temperature, meaning that after $t=0$ the face of the wall is instantly held at $$T(0,t)=T_s$$ Assuming either of the other two surface conditions changes very little, but requires us to introduce new constants and more complex algebra. As for the last boundary condition, we have already hinted at its nature: the wall at infinity stays at its initial temperature. $$T(x\to\infty,t)=T_0$$

Now we have to solve the PDE. But what if we don't know how to solve PDEs? This is what I find fascinating about this problem: we can turn this PDE of two variables into an ODE of one variable with a similarity variable [1] $$\eta=\frac{x}{\sqrt{4\alpha t}}$$ We now try to get $\frac{\partial^2 T}{\partial x^2}$ and $\frac{\partial T}{\partial t}$, which are in terms of $T,x,t$, into terms of just $T$ and $\eta$.
$$\begin{align*}\frac{\partial^2 T}{\partial x^2}&=\frac{\partial}{\partial x}\left(\frac{\partial T}{\partial x}\right)=\frac{\partial}{\partial x}\left(\frac{\partial T}{\partial \eta}\frac{\partial \eta}{\partial x}\right)=\frac{\partial}{\partial x}\left(\frac{\partial T}{\partial \eta}\frac{1}{\sqrt{4\alpha t}}\right)\\ &=\frac{1}{\sqrt{4\alpha t}}\frac{\partial}{\partial x}\left(\frac{\partial T}{\partial \eta}\right)=\frac{1}{\sqrt{4\alpha t}}\left(\frac{\partial^2 T}{\partial x\partial \eta}\right)=\frac{1}{\sqrt{4\alpha t}}\left(\frac{\partial^2 T}{\partial \eta\partial x}\right)\\ &=\frac{1}{\sqrt{4\alpha t}}\frac{\partial}{\partial \eta}\left(\frac{\partial T}{\partial x}\right)=\frac{1}{\sqrt{4\alpha t}}\frac{\partial}{\partial \eta}\left(\frac{\partial T}{\partial \eta}\frac{\partial \eta}{\partial x}\right)\\ &=\frac{1}{\sqrt{4\alpha t}}\frac{\partial}{\partial \eta}\left(\frac{\partial T}{\partial \eta}\frac{1}{\sqrt{4\alpha t}}\right)=\frac{1}{4\alpha t}\frac{\partial^2 T}{\partial \eta^2} \end{align*}$$ The above depends on the continuity of $T$ and the independence of $x$ and $t$. Both are very appropriate assumptions for our problem. Likewise we can show $$\frac{\partial T}{\partial t}=\frac{\partial T}{\partial \eta}\frac{\partial \eta}{\partial t}=-\frac{x}{2t\sqrt{4\alpha t}}\frac{\partial T}{\partial \eta}$$ Putting these back into our heat equation $$\frac{1}{4\alpha t}\frac{\partial^2 T}{\partial \eta^2}=-\frac{1}{\alpha}\frac{x}{2t\sqrt{4\alpha t}}\frac{\partial T}{\partial \eta}$$ which can simplify to $$\begin{align*} \frac{\partial^2 T}{\partial \eta^2}&=-\frac{x\sqrt{4\alpha t}}{2\alpha t}\frac{\partial T}{\partial \eta}=-\frac{x}{x}\frac{x\sqrt{4\alpha t}}{2\alpha t}\frac{\partial T}{\partial \eta}=-2\frac{\sqrt{4\alpha t}}{x}\frac{x^2}{4\alpha t}\frac{\partial T}{\partial \eta}\\ &=-2\frac{x}{\sqrt{4\alpha t}}\frac{\partial T}{\partial \eta}=-2\eta\frac{\partial T}{\partial \eta} \end{align*}$$ And since it's all in terms of $\eta$ we can turn the partials back into total derivatives $$\frac{\mathrm{d}^2 T}{\mathrm{d} \eta^2}=-2\eta\frac{\mathrm{d}T}{\mathrm{d}\eta}$$ Now we are left to convert our boundary conditions into $\eta$. Our surface boundary condition, the condition where $x=0$, clearly corresponds to the case where $\eta=0$, and therefore $T(0)=T_s$. And the other two conditions, $t=0$ and $x\to\infty$, both correspond to the case where $\eta\to\infty$. Therefore, $T(\eta\to\infty)=T_0$. Thus the three conditions on the PDE are now two conditions on the ODE. I'm not going to solve this ODE because it's a standard separation of variables solution. I was only interested in sharing the conversion from the PDE of two variables into the ODE of one variable. I will, however, spoil the ending and share the solution: $$ \DeclareMathOperator\erf{erf} \frac{T(\eta)-T_s}{T_0-T_s}=\erf{\eta}$$ where $\erf{\eta}$ is the error function (Go figure!). However, we can always substitute back in $x$ and $t$ because $T(\eta)=T(x,t)$. Therefore, in conclusion $$\frac{T(x,t)-T_s}{T_0-T_s}=\erf{\left(\frac{x}{\sqrt{4\alpha t}}\right)}$$
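If you want numbers out of this, here is a short numerical sketch (the diffusivity and temperatures below are made-up illustrative values, not from any particular problem):

```python
from math import erf, sqrt

def wall_temperature(x, t, T0, Ts, alpha):
    """Temperature at depth x (m) and time t (s) in the semi-infinite wall."""
    eta = x / sqrt(4.0 * alpha * t)
    return Ts + (T0 - Ts) * erf(eta)

# Illustrative values: a concrete-like diffusivity, wall initially at 20 C,
# surface suddenly held at 100 C; sample the profile after one hour.
alpha = 7e-7                      # m^2/s (assumed)
T0, Ts = 20.0, 100.0
for x in (0.0, 0.01, 0.05, 0.10):
    print(x, wall_temperature(x, t=3600.0, T0=T0, Ts=Ts, alpha=alpha))
```

At $x=0$ this returns $T_s$, and for depths much larger than $\sqrt{4\alpha t}$ it returns essentially $T_0$, as expected.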

[1] Incropera, Frank P. Fundamentals of Heat and Mass Transfer. Hoboken, NJ: Wiley, 2013. Print.

Friday, September 5, 2014

Alternative Definition for a Skew-Symmetric Matrix

I was working on some advanced solid mechanics homework and a problem asked us to prove a particular identity for $2^{\text{nd}}$ order tensors. I really enjoyed the solution that I came up with. It's the type of proof that goes
Suppose some statement $A$ is true for all $x$. We want to show that statement $B$ is true. Consider the very special case where $x=(\text{randomness pulled out of thin air but makes the problem very simple})$... and the proof goes on very smoothly because of that particular choice of $x$.
Well, I finally managed to come up with one of those proofs! I believe the book only intended for the problem to be solved in the 3-dimensional case, where writing out the explicit possibilities is very feasible and is most likely the expected solution. I will share my (arguably prettier) solution, but I will translate it into terms of linear algebra, matrices, and vectors instead of tensors. But first, some background.

Definition 1. A skew-symmetric matrix is a matrix $\mathbf{A}$ such that $\mathbf{A}=-\mathbf{A}^T$.

Where $^T$ denotes the transpose operator (i.e. flip along the main diagonal). We will use the notation $A_{ij}$ to denote the element of $\mathbf{A}$ in the $i^{\text{th}}$ row and $j^{\text{th}}$ column. Therefore, this definition can be written as $$A_{ij}=-A_{ji}$$ Here are a few examples of skew symmetric matrices: $$\left[\begin{matrix}0&1&1\\ -1&0&1\\ -1&-1&0\end{matrix}\right] \quad\left[\begin{matrix}0&1&-2&3\\ -1&0&-4&5\\ 2&4&0&6\\ -3&-5&-6&0\end{matrix}\right] \quad\left[\begin{matrix}0&a\\ -a&0\end{matrix}\right]$$ Note that all have $0$'s down the main diagonal because $0$ is the only number that solves $x=-x$. Now, the alternative definition to this type of matrix is:

Definition 2. A square matrix $\mathbf{A}$ is skew-symmetric if $\mathbf{u}^T\mathbf{Au}=0$ for all vectors $\mathbf{u}$.

And to show that this definition is equivalent to the previous one, all we need to do is prove the following theorem, which is the problem from the book but in terms of matrices and vectors rather than tensors (maybe the tensor version another day!).

Theorem 1. $\mathbf{u}^T\mathbf{Au}=0$ for all vectors $\mathbf{u}$ if and only if $A_{ij}=-A_{ji}$.

Proof.
$(\Leftarrow)$ Assume that $A_{ij}=-A_{ji}$. We want to show that for all $\mathbf{u}$, $\mathbf{u}^T\mathbf{Au}=0$. Consider $\mathbf{u}^T\mathbf{Au}$. It can be shown that $$\mathbf{u}^T\mathbf{Au}=\sum_{i=1}^n\sum_{j=1}^nA_{ij}u_iu_j$$ where the vectors are of length $n$ and the matrix is of size $n\times n$. And since $A_{ij}=-A_{ji}$, we know that $$\sum_{i=1}^n\sum_{j=1}^nA_{ij}u_iu_j=\sum_{i=1}^n\sum_{j=1}^n-A_{ji}u_iu_j$$ But $i$ and $j$ are just dummy variable names and we can swap them, as long as we swap all of them. So: $$\sum_{i=1}^n\sum_{j=1}^nA_{ij}u_iu_j=\sum_{j=1}^n\sum_{i=1}^n-A_{ij}u_ju_i$$ But summation order doesn't matter, multiplication is commutative, and we can pull the negative out. $$\left(\sum_{i=1}^n\sum_{j=1}^nA_{ij}u_iu_j\right)=-\left(\sum_{i=1}^n\sum_{j=1}^nA_{ij}u_iu_j\right)$$ And, as stated above, the only solution to $x=-x$ is $0$. Thus $\mathbf{u}^T\mathbf{Au}=0$. That was the easy direction. Now for the fun one!
$(\Rightarrow)$ Assume $\mathbf{u}^T\mathbf{Au}=0$ for all $\mathbf{u}$. We want to show that $A_{ij}=-A_{ji}$. Going back to the summations we have $$\mathbf{u}^T\mathbf{Au}=\sum_{i=1}^n\sum_{j=1}^nA_{ij}u_iu_j=0$$ Now, consider the vectors $\mathbf{u}^{(r,s)}$ such that $u_i=\delta_{ri}+\delta_{si}$ where $\delta_{ij}$ is the Kronecker delta function. $$\delta_{ij}=\begin{cases}1&i=j\\ 0&i\ne j\end{cases}$$ Some examples of these vectors are $$\begin{align*} \mathbf{u}^{(1,1)}=\left[2,0,0\right]\\ \mathbf{u}^{(2,1)}=\left[1,1,0\right]\\ \mathbf{u}^{(3,1)}=\left[1,0,1\right] \end{align*}$$ Those were all examples in three dimensions, however, these can be in any number of dimensions. For instance this ten dimensional version $$\mathbf{u}^{(4,7)}=\left[0,0,0,1,0,0,1,0,0,0\right]$$ Basically, if $r=s$, then the vector has all zero components with the exception of the $r^{\text{th}}$ component which is $1+1=2$. Or if $r\ne s$ then the vector still has all zero components with the exception of the $r^{\text{th}}$ component which is $1+0=1$ and the $s^{\text{th}}$ component which is $0+1=1$.

Consider first, the case where $r=s$. We know that $u_i^{(r,r)}=2\delta_{ri}$ and thus $$0=\sum_{i=1}^n\sum_{j=1}^nA_{ij}u_iu_j=\sum_{i=1}^n\sum_{j=1}^nA_{ij}(2\delta_{ri})(2\delta_{rj})$$ Since $\delta_{ri}$ and $\delta_{rj}$ are zero unless $i=r$ and $j=r$, we can drop the summation and just use $i=j=r$ and the deltas become multiplications by $1$. $$0=A_{rr}(2)(1)(2)(1)=4A_{rr}$$ Thus $A_{ii}=0$ and we have zeros down the main diagonal. Step in the right direction!

Now consider the case where $r\ne s$. We know that $u_i^{(r,s)}=\delta_{ri}+\delta_{si}$. Thus $$0=\sum_{i=1}^n\sum_{j=1}^nA_{ij}u_iu_j=\sum_{i=1}^n\sum_{j=1}^nA_{ij}(\delta_{ri}+\delta_{si})(\delta_{rj}+\delta_{sj})$$ We can foil and distribute and get $$0=\sum_{i=1}^n\sum_{j=1}^n\left(A_{ij}\delta_{ri}\delta_{rj}+A_{ij}\delta_{ri}\delta_{sj}+A_{ij}\delta_{si}\delta_{rj}+A_{ij}\delta_{si}\delta_{sj}\right)$$ The summand is only non-zero for four cases: $(r=i=j)$, $(r=i,s=j)$, $(s=i,r=j)$, $(s=i=j)$. If one of these isn't true, then the summand is $0+0+0+0=0$. Thus, we can take just the four terms with those $i$ and $j$ values and drop the summation. $$\begin{align*} 0=&\left(A_{rr}\delta_{rr}\delta_{rr}+A_{rr}\delta_{rr}\delta_{sr}+A_{rr}\delta_{sr}\delta_{rr}+A_{rr}\delta_{sr}\delta_{sr}\right)+\\ &\left(A_{rs}\delta_{rr}\delta_{rs}+A_{rs}\delta_{rr}\delta_{ss}+A_{rs}\delta_{sr}\delta_{rs}+A_{rs}\delta_{sr}\delta_{ss}\right)+\\ &\left(A_{sr}\delta_{rs}\delta_{rr}+A_{sr}\delta_{rs}\delta_{sr}+A_{sr}\delta_{ss}\delta_{rr}+A_{sr}\delta_{ss}\delta_{sr}\right)+\\ &\left(A_{ss}\delta_{rs}\delta_{rs}+A_{ss}\delta_{rs}\delta_{ss}+A_{ss}\delta_{ss}\delta_{rs}+A_{ss}\delta_{ss}\delta_{ss}\right) \end{align*}$$ Now remember that $A_{ii}=0$ so the first and last lines are all $0$. And we can cross out any term with a $\delta_{rs}$ or $\delta_{sr}$. And we can replace any $\delta_{rr}$ or $\delta_{ss}$ with a multiplication by $1$. This leaves us with $$0=A_{rs}+A_{sr}$$ And this implies that $A_{ij}=-A_{ji}$.
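Here is a small numerical sketch of both directions (NumPy; the skew-symmetric matrix is random and the test vectors are the $\mathbf{u}^{(r,s)}$ from the proof):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# Forward direction: a skew-symmetric A gives u^T A u = 0 for any u.
M = rng.standard_normal((n, n))
A = 0.5 * (M - M.T)                        # skew-symmetric by construction
u = rng.standard_normal(n)
print(np.isclose(u @ A @ u, 0.0))          # True

# Reverse-direction idea: the test vector u^(r,s), with entries delta_ri + delta_si,
# picks out A_rr + A_rs + A_sr + A_ss from the double sum.
def test_vector(r, s, n):
    u = np.zeros(n)
    u[r] += 1.0
    u[s] += 1.0
    return u

r, s = 1, 3
urs = test_vector(r, s, n)
picked = A[r, r] + A[r, s] + A[s, r] + A[s, s]
print(np.isclose(urs @ A @ urs, picked))   # True (and both are 0 here, since A is skew-symmetric)
```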

I will say that this was both simpler and much prettier in Einstein tensor notation. But with this choice of $\mathbf{u}$, all but $4$ of the $n^2$ summand terms cancel out in one step, and then $14$ of the remaining $16$ terms cancel out in the second step. The remainder of the proof was easy!

Tuesday, August 19, 2014

The Geometric Mean is just the Arithmetic Mean of the Exponents

This one is going to be brief, but I find it interesting. A buddy of mine sent me a text a few months back about this realization: the geometric mean is just the arithmetic mean of the exponents! I'll explain in a second, but first a quick background.

An arithmetic mean is what we commonly refer to as the average: add all the elements up and divide by the number of elements. $$\overline{x}=\frac{x_1+x_2+\cdots+x_n}{n}$$ Alternatively, we have the geometric mean, which is the multiplicative analog of the arithmetic mean. The geometric mean is the $n$th root of the product of the elements. $$\overline{y}'=\left(y_1y_2\cdots y_n\right)^{\frac{1}{n}}$$ The connection is as follows. Each number $y_i$ (assuming they are all positive) can be expressed as $y_i=a^{b_i}$ for some fixed base $a>0$ ($a\ne1$) and exponents $b_i\in\mathbb{R}$. Therefore we have the following $$\begin{align*} \overline{y}'&=\left(y_1y_2\cdots y_n\right)^{\frac{1}{n}}\\ &=\left(a^{b_1}a^{b_2}\cdots a^{b_n}\right)^{\frac{1}{n}}\\ &=\left(a^{b_1+b_2+\cdots+b_n}\right)^{\frac{1}{n}}\\ &=a^{(b_1+b_2+\cdots+b_n)/n} \end{align*}$$ which states that if each $y_i$ can be expressed as $y_i=a^{b_i}$ for some $a\in\mathbb{R}$ and $b_i\in\mathbb{R}$ then the geometric mean of all $y_i$ is the base, $a$, raised to the arithmetic mean of all the exponents $b_i$. $$\overline{y}'=a^{\overline{b}}$$ Pretty interesting, eh?
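A quick numerical sketch of the same observation (the numbers and the base are arbitrary):

```python
import math

ys = [2.0, 8.0, 5.0, 1.5]             # arbitrary positive numbers
a = 3.0                               # arbitrary base (any a > 0, a != 1 works)

geo_mean = math.prod(ys) ** (1 / len(ys))

bs = [math.log(y, a) for y in ys]     # exponents b_i with y_i = a**b_i
arith_mean_b = sum(bs) / len(bs)

print(geo_mean, a ** arith_mean_b)    # the two values agree (up to rounding)
```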

Saturday, July 12, 2014

Battle Mechanic Computations (Part 1)

The other day on the Mathematics StackExchange, I came across a question about a closed form solution for a certain set of battle mechanics. The problem intrigued me and I found some interesting results. Here's some of what I found.

The battle mechanics are as follows. For battling armies $X$ and $Y$, a battle consists of rounds of (what I will call) conflicts. Each conflict is a $1$ on $1$ interaction between a member of $X$ and a member of $Y$ with an equal chance of victory for both sides. The loser of a conflict is "dead" and the winner can compete in the next conflict. The last army with men is the victor.

It turns out that it's possible to quickly compute the probable outcome. Let $P(x,y)$ be the probability that a battle, between army $X$ with $x$ men and army $Y$ with $y$ men, results in a victory for army $X$. The battle mechanics can be summarized in three rules: $$P(x,0) = 1$$ $$P(0,y) = 0$$ $$P(x,y) = \frac{1}{2}P(x,y-1) + \frac{1}{2}P(x-1,y)$$ The first two rules are pretty self-explanatory; if army $Y$ has no men, army $X$ wins, and vice versa. The third can be understood by looking at a conflict. A conflict has two results: army $X$ wins or army $Y$ wins. If army $X$ wins, then army $Y$ loses a man and the rest of the battle has the same odds as a battle of two armies of size $x$ and $y-1$, respectively (fatigue isn't considered, so the member of army $X$ in the first conflict is indistinguishable from any other member of army $X$ after that first conflict). Likewise, a conflict win for army $Y$ results in the rest of the battle having the odds of two armies of size $x-1$ and $y$. And since we have equal odds, there's a $50\%$ chance of the former and a $50\%$ chance of the latter.
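The three rules translate directly into a short recursive computation (a sketch with memoization; `p_win` is just a name I'm using here):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def p_win(x, y):
    """Probability that army X (x men) defeats army Y (y men)."""
    if y == 0:
        return 1.0
    if x == 0:
        return 0.0
    return 0.5 * p_win(x, y - 1) + 0.5 * p_win(x - 1, y)

print(p_win(3, 3))   # 0.5, by symmetry
print(p_win(5, 2))   # X is heavily favored
```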

I will show that $$P(x,y)=\frac{1}{2^{x+y-1}}\sum_{i=0}^{x-1}\dbinom{x+y-1}{i}$$ To start, take armies of size $x$ and $y$ and start expanding with our third rule. Each of the below are equivalent. $$\frac{1}{1}\left[P(x,y)\right]$$ $$\frac{1}{2}\left[P(x,y-1) + P(x-1,y)\right]$$ $$\frac{1}{4}\left[P(x,y-2) + 2P(x-1,y-1)+P(x-2,y)\right]$$ Let's stop here and explain what we're doing. From the first line to the second is pretty obvious: we just invoked our rule and pulled out the common factor of $1/2$. The progression from the second line to the third invokes the rule twice (once for each of the evaluations of $P$). The left $$P(x,y-1)=\frac{1}{2}\left[P(x,y-2)+P(x-1,y-1)\right]$$ and the right $$P(x-1,y)=\frac{1}{2}\left[P(x-1,y-1)+P(x-2,y)\right]$$ Putting them together involves pulling out the common factor of $1/2$ (and combining this with the already factored out $1/2$) and then combining both terms of the form $P(x-1,y-1)$. Let's proceed with the expansion, but for the sake of space, we will denote $P(x-a,y-b)=P_{a,b}$ $$\frac{1}{8}\left[P_{0,3} + 3P_{1,2}+3P_{2,1}+P_{3,0}\right]$$ $$\frac{1}{16}\left[P_{0,4} + 4P_{1,3}+6P_{2,2}+4P_{3,1}+P_{4,0}\right]$$ And by now, it should be strikingly apparent that the coefficients are rows of Pascal's triangle and we can rewrite $P(x,y)$ as $$P(x,y)=\frac{1}{2^j}\sum_{i=0}^j\dbinom{j}{i}P(x-i,y-j+i)$$ where $j$ is the number of conflicts so far.

However, there is a bound on the number of conflicts. For instance, the battle can't be over until at least $\min(x,y)$ conflicts have taken place (otherwise, no one is out of men). And as an upper bound, no more than $x+y-1$ conflicts can happen: if $x+y$ conflicts took place, then a total of $x+y$ men would be dead and both armies would be wiped out, which can't happen. So at most $x+y-1$ conflicts occur, and the lone survivor's army is the victor! With this upper bound, we can write $$\begin{align*}P(x,y)=&\frac{1}{2^{x+y-1}}\sum_{i=0}^{x+y-1}\dbinom{x+y-1}{i}P(x-i,y-(x+y-1)+i)\\ &=\frac{1}{2^{x+y-1}}\sum_{i=0}^{x+y-1}\dbinom{x+y-1}{i}P(x-i,1+i-x)\end{align*}$$ However, for all $i\geq x$ we have $P(0,1+i-x)=0$ so we can ignore any $i\geq x$. $$P(x,y)=\frac{1}{2^{x+y-1}}\sum_{i=0}^{x-1}\dbinom{x+y-1}{i}P(x-i,1+i-x)$$ And we can see that for $i\leq x-1$ the second argument $1+i-x$ is at most $0$, meaning army $Y$ is already out of men, so $P(x-i,1+i-x)=1$. Therefore we have removed the dependency on $P$ on the RHS. $$P(x,y)=\frac{1}{2^{x+y-1}}\sum_{i=0}^{x-1}\dbinom{x+y-1}{i}$$

We have shown what I wanted to show here. This equation, in words: the probability that army $X$ with $x$ soldiers wins over army $Y$ with $y$ soldiers is the sum of the first $x$ elements in the $(x+y-1)$th row of Pascal's triangle divided by the total sum of the $(x+y-1)$th row of Pascal's triangle! I will discuss the properties of and shortcuts to this equation in the next part. Just as a preview: $$P(x,x) = \frac{1}{2}$$ $$P(x,y) + P(y,x) = 1$$ $$P(x,1) = \frac{2^x-1}{2^x}$$ $$P(x,2) = \frac{2^{x+1}-x-2}{2^{x+1}}$$ $$P(x,3) = \frac{2^{x+3}-x^2-5x-8}{2^{x+3}}$$
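Here is a sketch that checks the closed form, along with the preview identities, for small armies:

```python
from math import comb

def p_closed(x, y):
    """Closed form: first x entries of row x+y-1 of Pascal's triangle, over 2^(x+y-1)."""
    return sum(comb(x + y - 1, i) for i in range(x)) / 2 ** (x + y - 1)

for x in range(1, 8):
    assert p_closed(x, x) == 0.5                                     # P(x, x) = 1/2
    for y in range(1, 8):
        assert abs(p_closed(x, y) + p_closed(y, x) - 1) < 1e-12      # P(x, y) + P(y, x) = 1
    assert p_closed(x, 1) == (2**x - 1) / 2**x
    assert p_closed(x, 2) == (2**(x + 1) - x - 2) / 2**(x + 1)
    assert p_closed(x, 3) == (2**(x + 3) - x**2 - 5*x - 8) / 2**(x + 3)

print("all checks pass")
```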

Monday, June 23, 2014

Sum of Three Squares and Factors of Two

Background

I've been working a lot recently with the sums of squares (I have a previous post about finding the Exhaustive List of the Sums of Four-Squares). This work has been for some research and also for Project Euler. However, I came across an interesting result: If $n$ is even and $a^2 + b^2 + c^2 = n^2$, then $a,b,$ and $c$ are also even.

To kind of tip-toe around the issue, we know that there is at least one solution $a,b,c$ such that this is true: $a=n, b=c=0$. Others may and do exist, but we won't fret too much right now about what they are. All we need to know is that there is at least one solution. Since all natural numbers are either even or odd, we have four possibilities: all $a,b,c$ are odd, two are odd and one is even, one is odd and two are even, or all three are even.

We can easily rule out the cases in which one or all three of $a,b,c$ are odd. Because we know $n$ is even, $n^2$ must also be even. However, in these cases there is an odd number of odd terms on the other side (remember, an odd number squared is still odd), which means the entire other side is odd. And, since an odd number cannot equal an even number, we have a contradiction. However, in order to eliminate the case where two are odd, we must do a little more work.

Assume, without loss of generality, that $a$ and $b$ are odd and $c$ is even; we will derive a contradiction. Let $d,e,f,g\in\mathbb{N}$ be such that $$\begin{array}{c} a = 2d + 1 \\ b = 2e + 1 \\ c = 2f \\ n = 2g \end{array} $$ We know that $$n^2 = (2g)^2 = 4g^2$$ And we know that $$\begin{array}{r l} n^2 &= a^2 + b^2 + c^2 \\ &= (2d+1)^2 + (2e+1)^2 + (2f)^2 \\ & = 4d^2 + 4d + 1 + 4e^2 + 4e + 1 + 4f^2 \\ &= 4(d^2 + d + e^2 + e + f^2) + 2 \end{array}$$ Putting the two together: $$4g^2 = 4(d^2 + d + e^2 + e + f^2) + 2$$ Clearly, the left-hand side is a multiple of $4$; however, the right-hand side is not. Therefore, these two cannot be equal and we cannot have exactly two odd terms in the sum. We won't try to eliminate the last possibility (that all three be even) because we know that at least one solution exists and it has to fall into one of those categories.

Now, this probably doesn't seem like that interesting of a result. And honestly speaking, it's really not particularly interesting on its own. However, we can generalize it:

Theorem


If $2^k\vert n$ for some $k\in\mathbb{N}$ and $a^2 + b^2 + c^2 = n^2$ for $a,b,c\in \mathbb{N}$, then $2^k\vert a,b,c$.


Proof

For $k=1$, this simplifies to the above proof and $k=0$ is trivial. As such, we have base cases and we will proceed by induction. So, we assume that if $2^{k-1}\vert n$ for some $k\in \mathbb{N}$ and $a^2 + b^2 + c^2 = n^2$, then $2^{k-1}\vert a,b,c$. Now assume that $2^k\vert n$ and that $a^2 + b^2 + c^2 = n^2$. We know that there exists $m$ such that $n=2^km$. We also can show that $$n = 2^km = 2^{k-1}(2m)$$ which means that $2^{k-1}\vert n$. And, since $a^2 + b^2 + c^2 = n^2$, by our induction hypothesis we can conclude that $2^{k-1}\vert a,b,c$. This implies there exist $i,j,l\in \mathbb{N}$ such that $$\begin{array}{c} a = 2^{k-1}i \\ b = 2^{k-1}j \\ c = 2^{k-1}l \\ \end{array}$$ We can now classify $a,b,c$ based on the respective parity of $i,j,l$ (because again, $i,j,l$ must each be either even or odd). We, again, have four possibilities: all three are odd, two are odd and one is even, one is odd and two are even, and all three are even.

Case 1

Assume that all three are odd. Therefore, there exists $i_0, j_0, l_0$ such that $$\begin{array}{c} i = 2i_0 + 1 \\ j = 2j_0 + 1 \\ l = 2l_0 + 1 \end{array}$$ and furthermore $$\begin{array}{c} a = 2^{k-1}(2i_0 + 1) = 2^ki_0 + 2^{k-1} \\ b = 2^{k-1}(2j_0 + 1) = 2^kj_0 + 2^{k-1} \\ c = 2^{k-1}(2l_0 + 1) = 2^kl_0 + 2^{k-1} \\ \end{array}$$ Consider $$\begin{array}{r l} n^2 &= a^2 + b^2 + c^2 \\ & = (2^ki_0 + 2^{k-1})^2 + (2^kj_0 + 2^{k-1})^2 + (2^kl_0 + 2^{k-1})^2 \\ &= 2^{2k}(i_0^2+i_0+j_0^2+j_0+l_0^2+l_0) + 3(2^{2k-2}) \\ \end{array}$$ Note that this is not divisible by $2^{2k}$ because of the last term. On the contrary, note that $$n^2 = (2^km)^2 = 2^{2k}m^2$$ and therefore $2^{2k}\vert n^2$; a contradiction. Therefore, all three cannot be odd.

Case 2

Assume that two are odd and one is even. Therefore there exist $i_0,j_0, l_0$ such that $$\begin{array}{c} i = 2i_0 + 1 \\ j = 2j_0 + 1 \\ l = 2l_0 \end{array}$$ As before, we can substitute and write $a,b,c$ in terms of $i_0, j_0,l_0$ and as a result $n^2$ in terms of $i_0, j_0,l_0$. $$n^2 = 2^{2k}(i_0^2+i_0+j_0^2+j_0+l_0^2) + 2(2^{2k-2})$$ Note that this is not divisible by $2^{2k}$ because the last term only has a factor of $2^{2k-1}$. But again, because we know that $2^{2k}\vert n^2$ we have a contradiction. Therefore, two cannot be odd.

Case 3

Assume that one is odd and two are even. Therefore there exist $i_0,j_0, l_0$ such that $$\begin{array}{c} i = 2i_0 + 1 \\ j = 2j_0 \\ l = 2l_0 \end{array}$$ Once again, we can substitute and write $n^2$ in terms of $i_0, j_0,l_0$. $$n^2 = 2^{2k}(i_0^2+i_0+j_0^2+l_0^2) + 2^{2k-2}$$ Note that this is not divisible by $2^{2k}$ because of the last term. And again, because we know that $2^{2k}\vert n^2$ we have a contradiction. Therefore, one cannot be odd.

Case 4

That only leaves the option that all three are even. Since $i,j,l$ are even, there exist $i_0,j_0, l_0$ such that $$\begin{array}{c} i = 2i_0 \\ j = 2j_0 \\ l = 2l_0 \end{array}$$ and we can now write $$\begin{array}{c} a = 2^{k-1}(2i_0) = 2^ki_0 \\ b = 2^{k-1}(2j_0) = 2^kj_0 \\ c = 2^{k-1}(2l_0) = 2^kl_0 \\ \end{array}$$ and we have shown that $2^k\vert a,b,c$!
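To convince myself, here's a brute-force check of the theorem for small $n$ (a sketch; the bound of $64$ is arbitrary):

```python
def v2(m):
    """Largest k such that 2^k divides m (for m > 0)."""
    k = 0
    while m % 2 == 0:
        m //= 2
        k += 1
    return k

for n in range(1, 65):
    k = v2(n)
    for a in range(n + 1):
        for b in range(a, n + 1):
            c2 = n*n - a*a - b*b
            if c2 < 0:
                break
            c = int(round(c2 ** 0.5))
            if c*c == c2 and c >= b:
                # Every representation n^2 = a^2 + b^2 + c^2 must have
                # a, b, c all divisible by 2^k.
                assert a % 2**k == 0 and b % 2**k == 0 and c % 2**k == 0

print("verified for n up to 64")
```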

Implications

What does this mean, though? It means that if we want to find three integers whose squares sum to $n^2$ (such as lattice points on a sphere), then we can factor the powers of two out of $n$ and search with much smaller numbers. This saves a great deal of computing time.

For instance, if we know that $2^k\vert n$ and we want to find $a,b,c$ such that $n^2 = a^2+b^2+c^2$, we can really just search for $d,e,f$ such that $m^2=d^2+e^2+f^2$ (where $n = 2^km$). We jump back to the original problem by multiplying both sides by $2^{2k}$: $$\begin{array}{r l} m^2 & = d^2 + e^2 + f^2\\ 2^{2k}m^2 & = 2^{2k}(d^2 + e^2 + f^2) \\ (2^km)^2&= 2^{2k}d^2 + 2^{2k}e^2 + 2^{2k}f^2 \\ n^2 & = (2^kd)^2 + (2^ke)^2 + (2^kf)^2\\ n^2 & = a^2 + b^2 + c^2 \end{array} $$ where clearly $d,e,f$ are the results of dividing $a,b,c$ by $2^k$.
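As a concrete sketch of the shortcut (the function names and the example $n=72$ are just for illustration):

```python
def three_square_reps(m):
    """All (d, e, f) with d <= e <= f and d^2 + e^2 + f^2 = m^2, by brute force."""
    reps = []
    for d in range(m + 1):
        for e in range(d, m + 1):
            f2 = m*m - d*d - e*e
            if f2 < 0:
                break
            f = int(round(f2 ** 0.5))
            if f*f == f2 and f >= e:
                reps.append((d, e, f))
    return reps

def three_square_reps_fast(n):
    """Strip the factor 2^k out of n, search the much smaller problem, scale back up."""
    k, m = 0, n
    while m % 2 == 0:
        m //= 2
        k += 1
    return [(d << k, e << k, f << k) for (d, e, f) in three_square_reps(m)]

n = 72                               # = 2^3 * 9, so the fast search only works with m = 9
print(three_square_reps_fast(n))     # [(0, 0, 72), (8, 32, 64), (24, 48, 48), (32, 32, 56)]
print(three_square_reps(n))          # same list, found the slow way
```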