How many quadratic equations with real roots are there? (Part II)

May 4, 2009

In the last post, we started looking at the question: of all possible quadratic equations with real coefficients, what fraction of them, f_{real}, have real roots?

To review, the roots of a quadratic equation, ax^2+bx+c=0, are given by the quadratic formula: x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}. The roots are real when the thing inside the square root, the discriminant D=b^2-4ac, is positive (or zero).

Last time, we tried using some probability arguments to see if we could make sense of f_{real}, but we didn’t get anywhere useful.  This time, let’s explore what we learn by noting that the condition that the discriminant is positive for real roots defines a surface in (a,b,c)-space that separates coefficient sets that give real roots from those that don’t, and so the ratio of the volume of the real-root space to the whole space ought to give us the fraction of possible quadratics with real roots.

How big is the volume of (a,b,c)-space? Infinite.  Let’s avoid that by assuming the space is finite and then we’ll take the limit to infinity later.  Specifically, let’s treat the space as a cube bounded in each direction by \pm r.  Then, the volume of (a,b,c)-space is:

V=(2r)^3=8r^3.

Now, to the space of real roots.  The condition that the discriminant is positive corresponds to the region defined by:

a\le \frac{b^2}{4c}.

The volume of this region is given by an integral:

\begin{array}{ll} V_{real} &=\displaystyle \int_{-r}^r dc \int_{-r}^r db \int_{-r}^{\frac{b^2}{4c}} da \\&\\&=\displaystyle \int_{-r}^r dc \int_{-r}^r db \left(\frac{b^2}{4c}+r\right) \\&\\ &=\displaystyle \int_{-r}^r dc \left(\frac{r^3}{6c}+2r^2\right) \\&\\& = 4r^3+\frac{r^3}{6}\left(\text{ln}(r)-\text{ln}(-r)\right) \\&\\& =4r^3+\frac{r^3}{6}\text{ln}(-1).\end{array}

Hmm, that’s funny.  What’s that \text{ln}(-1) doing there?  Doesn’t \text{ln}(-1)=i\pi?  But volume had better be real.  Did we make a silly mistake?

If we take a look at the calculation above, the only place shenanigans might show up is from the integration over c.  Let’s take a closer look.  One thing we ought to be able to do is split the integral like this:

\displaystyle \int _{-r}^r \frac{dc}{c}=\int_{-r}^0\frac{dc}{c}+\int_0^r\frac{dc}{c}.

Furthermore, in the integral over the negative range, we can make a change of variable:

\begin{array}{l} u=-c \\ dc=-du, \end{array}

and so we can rewrite the integral as:

\displaystyle \int_{-r}^r\frac{dc}{c}=\int_{r}^0\frac{du}{u}+\int_0^r\frac{dc}{c},

but there’s no reason we can’t also call the symbol u the symbol c instead!  So, let’s make that change and do a couple extra things to find:

\begin{array}{ll} \displaystyle\int_{-r}^r\frac{dc}{c} &=\displaystyle\int_{r}^0\frac{dc}{c}+\int_0^r\frac{dc}{c} \\ &\\&= \displaystyle-\int_{0}^r\frac{dc}{c}+\int_0^r\frac{dc}{c} \\&\\ & =0 .\end{array}

What the … happened?  Where’s the \text{ln}(-1)?  On the bright side, if this integral is zero, then V_{real}=4r^3, which is real.  And, on top of that, we find for the fraction of quadratics with real coefficients with real roots:

f_{real}=\frac{V_{real}}{8r^3}=\frac{4r^3}{8r^3}=\frac{1}{2},

which is to say that there are exactly as many quadratics with real roots as without!

Since we’re asking a question about volumes, the answer had to be real, and so the second version of the integral over c has got to be right.  It’s also the answer you’d get if you picture the volume defined by a\le\frac{b^2}{4c} and think hard for a couple of minutes.

But wait, so the integral of \frac{1}{c} over a symmetric domain should be zero you say, but we still have that result from before that \text{ln}(r)-\text{ln}(-r)=\text{ln}(-1)\not= 0.  What do we do about that?

Let’s take a look at what we mean by \text{ln}.  The logarithm is defined by:

\text{ln}(y)=x \quad \text{where} \quad e^x=y.

So, for what value of x do we get e^x=-1? Euler tells us that e^{i\theta} = \text{cos}(\theta)+i\text{sin}(\theta) (and it’s not too hard to prove it, but the proof is reasonably rigorous, and thus has no place in this post!).  We get -1 for \theta = \pi, and so \text{ln}(-1)=i\pi.

But wait! We also get e^x=-1 for x=-i\pi or, most generally, for x= (2n+1)i\pi for all integer n.

So, \text{ln}(-1)=(2n+1)i\pi!  So many answers, and none are zero!
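For anyone who wants to poke at this numerically, here is a minimal Python sketch. It assumes only the standard `cmath` module; the variable names are made up for illustration. It looks at the value `cmath.log` returns for -1, checks that several odd multiples of i\pi exponentiate back to -1, and sums a symmetric set of those branches.

```python
import cmath
import math

# The value cmath.log picks for ln(-1): the principal branch.
principal = cmath.log(-1)

# A symmetric handful of the other candidates, x = (2n+1)*i*pi for n = -3..2.
branches = [(2 * n + 1) * 1j * math.pi for n in range(-3, 3)]

# Each candidate should exponentiate back to -1 (up to rounding error).
values = [cmath.exp(x) for x in branches]

# The candidates come in +/- pairs, so this symmetric set sums to about zero.
branch_sum = sum(branches)

print(principal, branch_sum)
```

Nothing here settles which branch is "right"; it only shows that the library picks one of them and that the candidates pair off.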

I bet Euler would say, if we’ve got an infinite number of perfectly good answers, the average is really what we’re looking for.  What’s the average answer? For every positive answer, there’s a corresponding negative answer, and so the average is zero!

So there you have it, another reason to think that \displaystyle \int_{-r}^r\frac{dc}{c}=0. (In general, this answer for that integral is known as its Cauchy principal value.)
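One way to see why the integral comes out to zero some ways but not all ways is to cut a small hole around c=0 and integrate what is left. The sketch below is only an illustration: the midpoint rule, the cutoff sizes, and the function names are all my own choices. With equal cutoffs on both sides the two halves cancel; with unequal cutoffs they leave behind the log of the cutoff ratio.

```python
import math

def integral_one_over_c(lo, hi, steps=100_000):
    """Midpoint-rule estimate of the integral of 1/c over [lo, hi] (0 must lie outside)."""
    width = (hi - lo) / steps
    return sum(width / (lo + (k + 0.5) * width) for k in range(steps))

def excised_integral(r, eps_left, eps_right):
    """Integral of 1/c over [-r, -eps_left] plus the integral over [eps_right, r]."""
    return integral_one_over_c(-r, -eps_left) + integral_one_over_c(eps_right, r)

r = 1.0
symmetric = excised_integral(r, 1e-3, 1e-3)  # equal cutoffs on both sides
lopsided = excised_integral(r, 1e-3, 2e-3)   # right cutoff twice the left one

print(symmetric)                    # close to 0
print(lopsided, -math.log(2))       # close to ln(1e-3 / 2e-3) = -ln(2)
```

So the value depends on how the hole around zero is shrunk, and the symmetric choice is the one that gives zero.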

If it’s not clear by now, I don’t understand how modern mathematicians clear this confusion up.  For the problem of interest, we get zero for that integral three ways (change of variables, picturing the volume, and averaging the multivalued function), but not all ways.  Anyone care to explain?

To summarize, if you shenanigan the integral correctly, you find that the fraction of quadratics with real coefficients that have real roots is a half.

And if you’re still unhappy, there’s (at least) one more reason to believe that f_{real}=\frac{1}{2}.  For every choice of (b,c), a choice of a is either above or below the boundary set by a=\frac{b^2}{4c}.  Since the space of possible a is infinite, partitioning that space at any one point yields two semi-infinite spaces, and so half of the total space leads to real roots.

How fun is it that such a simple question leads to such sneaky mathematics?

Next time, let’s look at something a little less messy.

Update (5/4/2009):  The internet has a little chatter about this question and it seems f_{real}=1/2 is not the most popular answer.  Take a look at this thread or this one to see why

f_{real}=\frac{1}{2} + \frac{1}{12} \left(\frac{5}{6}+ \text{ln}(2)\right)\approx 0.627

is a good answer too.
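A quick way to compare the two candidate answers is to throw the darts directly. The sketch below assumes the same model as in the posts, with a, b and c each uniform on [-r, r]; the function name, sample count, and seed are arbitrary choices of mine. Since the discriminant just scales by r^2 when all three coefficients are scaled by r, the estimate should not depend on r.

```python
import random

def fraction_with_real_roots(r=1.0, samples=200_000, seed=0):
    """Monte Carlo estimate of the fraction of (a, b, c) in the cube with b^2 - 4ac >= 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        a = rng.uniform(-r, r)
        b = rng.uniform(-r, r)
        c = rng.uniform(-r, r)
        if b * b - 4 * a * c >= 0:
            hits += 1
    return hits / samples

estimate = fraction_with_real_roots()
print(estimate)
```

For this particular uniform-cube model, the sampled fraction should land near the second value, 0.627, rather than 0.5, up to sampling noise of roughly 0.001 at this sample size.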

Question for the masses: how can there be multiple good answers?


How many quadratic equations with real roots are there? (Part I)

May 3, 2009

The answer: infinity.

Okay, that wasn’t a useful question, so let’s ask a better one.  Of all possible quadratic equations with real coefficients, what fraction of them, f_{real}, have real roots?

As all children know, the quadratic equation is a x^2 +bx +c and its roots are the values of x for which the equation equals zero.  Everyone, adults included, also knows that the roots can be found with the quadratic formula:

x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}

For real coefficients, the roots are real only when the quantity inside the square-root is positive.  This quantity is called the discriminant, and so we’ll give it the symbol D and talk about it for the rest of the post:

D=b^2-4ac

The question about real roots is a question about the discriminant: of all possible choices of a, \, b,\, and c, how often will the discriminant be positive?

To get a feel for what the answer might be and to show an example of how to introduce probability into interesting places where it ought to have no business, as a first step, we might ask, what’s the average value of the discriminant?

More specifically, let’s assume a,\,b and c are each drawn from a uniform distribution, and, since infinite domains are a pain, let’s assume the allowed choices of each coefficient are bounded by \pm r.  We’ll take the limit as r\rightarrow \infty later.  This is a model for a process like: “I throw a dart at a number line three times, and each point of impact gives me a coefficient.”  We could choose other distributions, but this one is in some sense the most “symmetrical” and thus best.

Under the uniform selection assumption, the average value of the discriminant is easy to find:

\bar{D}=\frac{1}{8r^3}\displaystyle \int_{-r}^r dc\int_{-r}^r db\int_{-r}^r da \left(b^2-4ac\right)= \frac{r^2}{3}

Well, that’s interesting.  On the one hand, if we take r to infinity, we get \bar{D}\rightarrow \infty, which is bad—screwy stuff happens when we try to think about things with infinite mean.  That the mean is infinite probably isn’t surprising.  Unless it’s exactly zero, then either b^2 or 4ac is bigger on average, and since both can go to infinity, the only plausible answers for \bar{D} are 0,\pm\infty.

On the other hand, for any finite r, \bar{D} is positive, and so we might expect that quadratics with real roots are more common than quadratics with imaginary roots, and so maybe f_{real}>\frac{1}{2}.

Further ignoring 200 years of mathematics, let’s look at the standard deviation, \sigma_D, of the discriminant, since, if it’s small relative to the mean, maybe we can trust the mean anyway.  The idea is that sometimes things that give mathematicians strokes are still informative.  As they say in regard to somewhat related issues:

There are in this world optimists who feel that any symbol that starts off with an integral sign must necessarily denote something that will have every property that they should like an integral to possess. This of course is quite annoying to us rigorous mathematicians; what is even more annoying is that by doing so they often come up with the right answer.

McShane, E. J.
Bulletin of the American Mathematical Society, v. 69, p. 611, 1963.

Calculated in the usual way, we find:

\sigma_D= \frac{2}{15}\sqrt{105}r^2

Well, that’s infinite too in the limit, but how big of an infinite is it?  If we look at the standard deviation over the mean, we see:

\frac{\sigma_D}{\bar{D}}=\frac{2}{5}\sqrt{105}\approx 4.1

and so, infinite or not, the standard deviation is bigger than the mean, which tells us that I probably shouldn’t bet on f_{real}>\frac{1}{2} no matter how much I adore a nice, sloppy argument.
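The mean and standard deviation above can be spot-checked by sampling. This is a rough sketch under the same uniform assumption, with r=1 so that the expected mean is 1/3 and the expected ratio is about 4.1; the function name, sample count, and seed are my own arbitrary choices.

```python
import math
import random

def discriminant_stats(r=1.0, samples=200_000, seed=0):
    """Sample mean and standard deviation of D = b^2 - 4ac for a, b, c uniform on [-r, r]."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(samples):
        a, b, c = (rng.uniform(-r, r) for _ in range(3))
        d = b * b - 4 * a * c
        total += d
        total_sq += d * d
    mean = total / samples
    std = math.sqrt(total_sq / samples - mean * mean)
    return mean, std

mean_d, std_d = discriminant_stats()
ratio = std_d / mean_d
print(mean_d, std_d, ratio)
```

Because the standard deviation is about four times the mean, the sampled mean itself is noisy, so expect the ratio to wobble by a percent or so from run to run.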

Some of you may be screaming by now: this can’t be the best way to do this! And you’d be right.  The condition that the discriminant is positive defines a surface in (a,b,c)-space that separates coefficient sets that lead to real roots from those that don’t.  It seems plausible that the volume of real-root space relative to the volume of the whole space will tell us what fraction of quadratics have real roots.  We’ll pick this up in the next post, wherein we’ll also learn more about that quote by McShane.

Elliptical Chicken Problem

January 17, 2009

I referred to a problem the other day (the LN model of the QIF) as an “elliptical chicken” problem, after having solved the spherical cow problem of the LN model for an OU process.  Based on Google, cuil, and surfwax searches, no one has used that phrase on the internet.  This post serves as evidence that I coined the term “elliptical chicken” in reference to a slightly less simplified model.

Ditching this blog for awhile

January 8, 2009

The science blogging thing isn’t working out because it’s too much like work.

New cooking blog:


Tree Physics 1: capillary action, the height of trees, and the optimal placement of branches

August 5, 2008

One of the fundamental problems trees have to solve is how to transport water up to the leaves. The ability to solve this problem poses a constraint on the maximum height of trees. The standard freshman physics textbook answer to the question, “What’s the mechanism that lifts the water?”, is capillary action (see Halliday and Resnick for example). Capillary action is the term for the ability of a narrow tube (a capillary) to draw liquid into itself via the adhesive force between the liquid and the tube walls. In plants, the relevant tubes are known as xylem and the fluid is water. In trees, capillary action is not the only mechanism that transports water. First, without evaporation, capillary action would only transport water until mechanical equilibrium is reached and the flow stops. There’s also osmotic pressure at the roots drawing the water in from the soil. There are probably also osmotic effects throughout the tree itself.

Anyway, ignoring all that, let’s see where the freshman physics treatment gets us–what’s the maximum possible height of a tree if capillary action is the limiting factor? To solve this problem, we’d need to think about branching and the fact that pores in the leaves constitute the open ends of the “capillaries”. In the absence of any knowledge about trees, this is a hard problem. Let’s then ignore all the complications and just start with a single capillary tube. We’ll put branching back in to some extent a bit later.

So, what’s the height of a column of water in a single tube with a circular cross-section with radius R? The height of the column results from the balance of two forces: the force of gravity pulling down on the column of water and the adhesive force between the walls of the tube and the water column pulling up. The force of gravity is easy: it’s simply the weight of a column of water of height H:

F_{g,1}=\rho g \pi R^2 H

where \rho is the density of water and g is the acceleration due to gravity (and the “1” indicates that this result is for a single tube). The adhesion force is due to the surface tension at the top of the column of water pulling against the walls of the tube. The surface tension, \gamma, is the force per unit length at the interface of two materials. If the surface of the water makes an angle \theta with respect to the wall, the upward component of the adhesive force is:

F_{ad,1}=\gamma 2 \pi R \cos{\theta}

Since \theta is somewhere between 0 and 90 degrees, let’s ignore the \cos{\theta}. Equating the two forces, we find the height of the water column is:

H=\frac{2 \gamma}{\rho g R}\approx \frac{0.14 cm^2}{R} (1)

where the 0.14 cm^2 comes from plugging in lookupable values. The question now remaining is, “what’s R?” According to Emmanual Mapfumo at the Department of Horticulture, Viticulture and Oenology at the Waite Agricultural Research Institute in Australia, the radius of xylem in Vitis vinifera is 12 \mu m or so. While that’s not a tree, it’s a good enough estimate for me for now. So, the equilibrium height of a column of water with the dimensions of xylem is about H\approx 1.2 m, or, much too short to be the maximum height of a tree.
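Here is Equation (1) as a small Python sketch, in case you want to plug in your own numbers. The default values for \gamma, \rho and g are my assumptions (roughly water at room temperature, in SI units), not numbers taken from the post, and the function name is made up.

```python
def capillary_height(radius_m, gamma=0.072, rho=1000.0, g=9.8):
    """Equation (1) with cos(theta) dropped: H = 2*gamma / (rho*g*R), in metres.

    gamma [N/m], rho [kg/m^3] and g [m/s^2] are assumed textbook-style values.
    """
    return 2.0 * gamma / (rho * g * radius_m)

# Prefactor 2*gamma/(rho*g): evaluate at R = 1 m, then convert m^2 to cm^2.
prefactor_cm2 = capillary_height(1.0) * 1e4

# Column height for the 12 micrometre xylem radius quoted in the text.
height_m = capillary_height(12e-6)

print(prefactor_cm2, height_m)
```

Halving the radius doubles the height, which is the 1/R scaling that the branching argument later leans on.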

Okay, so a tree is not composed of a set of parallel tubes of constant diameter from top to bottom. If we bring back in some of the reality of trees, does capillary action postdict something observably true?

Branching should change the story because the height as calculated above is determined by the surface-to-volume ratio of the column of water. If that ratio goes up, the height goes up too. How can we crank the ratio up? If the xylem in upper branches are narrower than those lower down and there are enough of them (is this true? probably), that’d do the trick.

Let’s now model the tree as a very simplified system–a “tree” of height H, composed of a single bottom tube of height h_1 and radius R that branches once into N upper tubes of height h_2=H-h_1, each with radius r. To further simplify things, let’s assume the total radius of the assembly stays constant throughout the entire tree. This constrains r to satisfy:

N \pi r^2 = \eta \pi R^2

where \eta is the packing fraction. As an example, for a 2D hexagonal close packed arrangement, \eta= \frac{\pi}{2\sqrt{3}} \approx 0.9.* This model is consistent with the assumption that the mass per unit height of a tree is constant throughout. This assumption is not generally believed to be true (see next post), but anyway.

We can now calculate the forces for this model and see what the results might tell us about trees. The upward force—the total due to the adhesive force in each of the N upper tubes—is:

F_{ad,N}=N \gamma 2 \pi r=\gamma 2 \pi R \sqrt{\eta N}=\sqrt{\eta N} F_{ad,1} (2)

We see that the upward force scales like the square-root of the number of tubes in the upper branches and is independent of the height. The gravitational force is due to the mass of all the water in the branched tree and is given by:

F_{g,N}=\rho g \left(h_1 \pi R^2 + N h_2 \pi r^2\right) = F_{g,1} \left(1-\left(1-\eta\right)\frac{h_2}{H}\right) (3)

We see that the downward force is sensitive to the overall height of the tree and the location of branching, but it is not sensitive to the number of upper tubes.
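Equations (2) and (3) are easy to check numerically. The sketch below builds both forces straight from the model's definitions and returns them as ratios to the single-tube values. The function name, the sample numbers (N=100, h_1=8 m, h_2=2 m, R=0.1 mm), and the physical constants are illustrative assumptions on my part.

```python
import math

HCP_ETA = math.pi / (2 * math.sqrt(3))  # packing fraction used in the text

def branched_forces(N, h1, h2, R, eta=HCP_ETA, gamma=0.072, rho=1000.0, g=9.8):
    """Return (F_ad,N / F_ad,1, F_g,N / F_g,1) for the one-branching toy tree."""
    r = R * math.sqrt(eta / N)  # from N*pi*r^2 = eta*pi*R^2
    H = h1 + h2
    f_ad_1 = gamma * 2 * math.pi * R            # single-tube adhesive force
    f_g_1 = rho * g * math.pi * R**2 * H        # single-tube weight
    f_ad_n = N * gamma * 2 * math.pi * r        # N upper tubes pulling up
    f_g_n = rho * g * (h1 * math.pi * R**2 + N * h2 * math.pi * r**2)
    return f_ad_n / f_ad_1, f_g_n / f_g_1

up_ratio, down_ratio = branched_forces(N=100, h1=8.0, h2=2.0, R=1e-4)

# Compare with the closed forms in Equations (2) and (3).
print(up_ratio, math.sqrt(HCP_ETA * 100))
print(down_ratio, 1 - (1 - HCP_ETA) * 2.0 / 10.0)
```

The physical constants cancel in both ratios, so only N, \eta and h_2/H matter here.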

Actually making a numerical prediction about the maximum height would be futile here as trees don’t branch only once and I’d have to assume a few more numbers I don’t know, but we can use this analysis to make a qualitative prediction about branching.

Combining the results of Equations (2) and (3) makes a prediction for the structure of a tree if the tree’s structure tries to both maximize the height (as may be the case in dense forests, where maximizing the height may also maximize the amount of light getting to the leaves) and maximize the amount of water transported simultaneously. The upward force on the water grows like \sqrt{N} regardless of the height of branching as long as branching takes place. The volume of water taken up into the tree is maximized when the branching happens high up the tree–when h_2\rightarrow 0. Thus, to maximize both the height of the tree given that the tree is also trying to maximize its water uptake, the tree should have lots of branches very high up and none very low. Does that prediction pan out? Take a look at the Sequoia and see for yourself.**

*Here’s how to calculate the packing fraction for circles in a 2D hexagonal close packed (HCP) arrangement. In English, the HCP structure is the “honeycomb nesting”, in which a row of touching circles sits in the interstices of the adjacent rows (consider just the “A” sheet in this picture). The packing fraction, \eta, is the ratio of the area of the circles to the total area covered. Consider a lattice with M rows, each composed of N disks of radius r. The combined area of the disks in the lattice is M N \pi r^2. The total area covered is the area of the rectangle covered by the M rows of N disks, which is the length of the rows times M times the height per row. The height per row is the same as the distance between the centers of the rows. The centers of 3 adjacent disks mark the vertices of an equilateral triangle with side length 2r. The height of that triangle and thus the distance between rows is \sqrt{3}r. So, the area of the covered rectangle is (2 N r)(M \sqrt{3} r). The packing fraction is the ratio of those areas, \eta= \frac{\pi}{2\sqrt{3}}.
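The counting argument above translates directly into a few lines of Python. This is just the same ratio of areas written out, with a hypothetical function name and arbitrary lattice sizes; M, N and r all cancel, which is the point.

```python
import math

def hcp_packing_fraction(M, N, r=1.0):
    """Area of M*N disks divided by the area of the rectangle they cover."""
    disk_area = M * N * math.pi * r**2
    rectangle_area = (2 * N * r) * (M * math.sqrt(3) * r)  # row length times stacked row heights
    return disk_area / rectangle_area

eta = hcp_packing_fraction(M=50, N=40)
print(eta, math.pi / (2 * math.sqrt(3)))
```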

There’s probably lots of interesting things about packing of objects. I remember hearing about M&M packing in my undergrad condensed matter course, and that’s pretty fun. The big thing there is oblate spheroids randomly pack a lot tighter than spheres do.

**Do I really believe this is the mechanism behind the branch distribution of the sequoia? I don’t know. It’d help if I actually knew something about trees. It’s at least consistent with this paper that claims that tree branch ratios are all about getting light to the leaves efficiently, where efficiently is defined in the paper.