Mathematics Worksheet

Description

1. Simple answers are NOT sufficient; you need to show all the relevant work.
2. The exercises with a * require the use of a computer. (Which software you use is up to you.)
3. More information and notes will be provided after I confirm you.
Please see the file I provided first and make sure you can do it before you choose this work.
I will send you a message before I confirm you, so please check it.


Unformatted Attachment Preview



MAT 3380: Intro. to Numerical Methods
Winter 2022
Diane Guignard
HOMEWORK 1
• This assignment is due Friday, January 21 on Brightspace. It covers Sections 3.1, 3.2
and part of 3.3.
• Simple answers are NOT sufficient, you need to show all the relevant work to receive
full credit.
• Please write the following information at the top of the front page: your full name, the
course number (MAT 3380) and the number of the assignment.
• The exercises with a * require the use of a computer. For Exercises 1 and 2, you can
use the function bisection.m provided on Brightspace. Attach the code you wrote
to your report.
Exercise 1*
Let f(x) = (x + 2)(x + 1)^2 x (x − 1)^3 (x − 2). To which zero of f does the bisection method
converge when applied on the following intervals?
a. [−1.5, 2.5]
b. [−0.5, 2.4]
c. [−0.5, 3]
d. [−3, −0.5].
Exercise 2*
Use the bisection method on [1, 2] to find an approximation of √3 correct to within 10^{-4}.
Indicate which function f you used and report the values of x_0, x_1 and x_2, the final output and
the number of iterations.
Exercise 3*
Let f(x) = x^2 − 2x − 3. To find a root of f, the following three fixed point methods are proposed:
a. x_k = 3/(x_{k-1} − 2)
b. x_k = x_{k-1}^2 − x_{k-1} − 3
c. x_k = (x_{k-1}^2 + 3)/(2 x_{k-1} − 2).
For each method, compute (if possible) the iterates x_1, x_2, x_3 and x_4 starting from x_0 = 0.
Report the values you obtain in a table. Which methods seem to be appropriate? Among those,
which one seems to converge the fastest?
Exercise 4
Prove Corollary 3.4 of the lecture notes (see Section 3.3), namely the estimates
|x_k − x^*| ≤ ρ^k max{x_0 − a, b − x_0},   k ≥ 0,
and
|x_k − x^*| ≤ (ρ^k/(1 − ρ)) |x_1 − x_0|,   k ≥ 1.
Exercise 5
Consider the function
g(x) = (1/3)(x + 8)^{1/3}.
i) Show that g has a unique fixed point in [0, 1].
ii) Assuming that we start from x_0 = 1/2, find a bound for the number of fixed point
iterations needed to achieve 10^{-6} accuracy.
Chapter 3
Nonlinear Equations in One Variable
This chapter is concerned with finding solutions to the scalar, nonlinear equation
f (x) = 0,
where the variable x runs in an interval [a, b]. The topic provides us with an opportunity to discuss
various issues and concepts that arise in more general circumstances.
There are many canned routines that do the job of finding a solution to a nonlinear scalar
equation; the one in MATLAB is called fzero; type help fzero to see how this function is
used and what input parameters it requires.
Following an extended introduction to our topic in Section 3.1 are three sections, 3.2–3.4, each
devoted to a different method or a class of methods for solving our nonlinear equation. A closely
related problem is that of minimizing a function in one variable, and this is discussed in Section 3.5.
3.1 Solving nonlinear equations
Referring to our prototype problem introduced above, f (x) = 0, let us further assume that the function
f is continuous on the interval, denoted f ∈ C[a, b]. Throughout our discussion we denote a solution
of the equation (called root, or zero) by x ∗ .
Note: Why introduce nonlinear equations before introducing their easier comrades, the
linear ones?! Because one linear equation in one unknown is just too easy and not particularly illuminating, and systems of equations bring a wave of complications with them. We
have a keen interest in solving scalar nonlinear equations not only because such problems
arise frequently in many applications, but also because it is an opportunity to discuss various issues and concepts that arise in more general circumstances and highlight ideas that
are extensible well beyond just one equation in one unknown.
Example 3.1. Here are a few simple functions and their roots.
1. f (x) = x − 1, on the interval [a, b] = [0, 2].
Obviously there is one root for this linear equation: x ∗ = 1.
Figure 3.1. Graphs of three functions and their real roots (if there are any): (i) f(x) = sin(x) on [0, 4π], (ii) f(x) = x^3 − 30x^2 + 2552 on [0, 20], and (iii) f(x) = 10 cosh(x/4) − x on [−10, 10].
2. f (x) = sin(x).
Since sin(nπ) = 0 for any integer n, we have:
(a) On the interval [a, b] = [π/2, 3π/2] there is one root, x^* = π.
(b) On the interval [a, b] = [0, 4π ] there are five roots; see Figure 3.1.
3. f(x) = x^3 − 30x^2 + 2552,   0 ≤ x ≤ 20.
In general, a cubic equation with complex coefficients has three complex roots. But if the
polynomial coefficients are real and x is restricted to be real and lie in a specific interval, then
there is no general a priori rule as to how many (real) roots to expect. A rough plot of this
function on the interval [0, 20] is given in Figure 3.1. Based on the plot we suspect there is
precisely one root x ∗ in this interval.
4. f (x) = 10 cosh(x/4) − x,
−∞ < x < ∞, where the function cosh is defined by cosh(t) = et +e−t 2 . Not every equation has a solution. This one has no real roots. Figure 3.1 indicates that the function φ(x) = 10 cosh(x/4) − x has a minimum. To find the minimum we differentiate and equate to 0. Let f (x) = φ (x) = 2.5 sinh(x/4)−1 (where, analogously t −t to cosh, we define sinh(t) = e −e 2 ). This function should have a zero in the interval [0, 4]. For more on finding a function’s minimum, see Section 3.5. i i i i i i i i 3.1. Solving nonlinear equations 41 Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms Example 3.2. Here is the MATLAB script that generates Figure 3.1. t = 0:.1:4*pi; tt = sin(t); ax = zeros(1,length(t)); xrt = 0:pi:4*pi; yrt = zeros(1,5); subplot(3,1,1) plot(t,tt,’b’,t,ax,’k’,xrt,yrt,’rx’); xlabel(’x’) ylabel(’f(x)’) t = 0:.1:20; tt = t.^3 - 30*t.^2 + 2552; ax = zeros(1,length(t)); subplot(3,1,2) plot(t,tt,’b’,t,ax,’k’,11.8615,0,’rx’); xlabel(’x’) ylabel(’f(x)’) t = -10:.1:10; tt = 10 * cosh(t ./4) - t; ax = zeros(1,length(t)); subplot(3,1,3) plot(t,tt,’b’,t,ax,’k’); xlabel(’x’) ylabel(’f(x)’) This script should not be too difficult to read like text. Note the use of array arguments, for instance, in defining the array tt in terms of the array t. Iterative methods for finding roots It is not realistic to expect, except in special cases, to find a solution of a nonlinear equation by using a closed form formula or a procedure that has a finite, small number of steps. Indeed, it is enough to consider finding roots of polynomials to realize how rare closed formulas are: they practically exist only for very low order polynomials. Thus, one has to resort to iterative methods: starting with an initial iterate x0 , the method generates a sequence of iterates x1 , x2 , . . . , xk , . . . that (if all works well) converge to a root of the given, continuous function. In general, our methods will require rough knowledge of the root’s location. To find more than one root, we can fire up the same method starting from different initial iterates x0 . To find approximate locations of roots we can roughly plot the function, as done in Example 3.1. (Yes, it sounds a little naive, but sometimes complicated things can be made simple by one good picture.) Alternatively, we can probe the function, i.e., evaluate it at several points, looking for locations where f (x) changes sign. If f (a) · f (b) < 0, i.e., f changes sign on the interval [a, b], then by the Intermediate Value Theorem given on page 10 there is a number c = x ∗ in this interval for which f (c) = s = 0. To see this intuitively, imagine trying to graph a scalar function from a positive to a negative value, without lifting the pen (because the function is continuous): somewhere the curve has to cross the x-axis! Such simple procedures for roughly locating roots are unfortunately not easy to generalize to several equations in several unknowns. i i i i i i i i 42 Chapter 3. Nonlinear Equations in One Variable Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms Stopping an iterative procedure Typically, an iterative procedure starts with an initial iterate x0 and yields a sequence of iterates x1 , x2 , . . . , xk , . . . . Note that in general we do not expect the iterative procedure to produce the exact solution x ∗ exactly. 
We would conclude that the series of iterates converges if the values of | f (xk )| and/or of |xk − xk−1 | decrease towards 0 sufficiently fast as the iteration counter k increases. Correspondingly, one or more of the following general criteria are used to terminate such an iterative process successfully after n iterations: |xn − xn−1 | < atol and/or |xn − xn−1 | < rtol|xn | and/or | f (xn )| < ftol, where atol, rtol, and ftol are user-specified constants. Usually, but not always, the second, relative criterion is more robust than the first, absolute one. A favorite combination uses one tolerance value tol, which reads |xn − xn−1 | < tol(1 + |xn |). The third criterion is independent of the first two; it takes the function value into account. The function f (x) may in general be very flat, or very steep, or neither, near its root. Desired properties of root finding algorithms When assessing the qualities of a given root finding algorithm, a key property is its efficiency. In determining an algorithm’s efficiency it is convenient to concentrate on the number of function evaluations, i.e., the evaluations of f (x) at the iterates {xk }, required to achieve convergence to a given accuracy. Other details of the algorithm, which may be considered as overhead, are then generally neglected.5 Now, if the function f (x) is as simple to evaluate as those considered in Example 3.1, then it is hard to understand why we concentrate on this aspect alone. But in these circumstances any of the algorithms considered in this chapter is very fast indeed. What we really keep in the back of our minds is a possibility that the evaluation of f (x) is rather costly. For instance, think of simulating a space shuttle returning to earth, with x being a control parameter that affects the distance f (x) of the shuttle’s landing spot from the location of the reception committee awaiting this event. The calculation of f (x) for each value of x involves a precise simulation of the flight’s trajectory for the given x, and it may then be very costly. An algorithm that does not require too many such function evaluations is then sought. Desirable qualities of a root finding algorithm are the following: • Efficient—requires a small number of function evaluations. • Robust—fails rarely, if ever. Announces failure if it does fail. • Requires a minimal amount of additional data such as the function’s derivative. • Requires f to satisfy only minimal smoothness properties. • Generalizes easily and naturally to many equations in many unknowns. No algorithm we are aware of satisfies all of these criteria. Moreover, which of these is more important to honor depends on the application. So we study several possibilities in the following sections. 5 An exception is the number of evaluations of the derivative f (x) required, for instance, by Newton’s method. See Section 3.3. i i i i i i i i 3.2. Bisection method 43 Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms 3.2 Bisection method The method developed in this section is simple and safe and requires minimal assumptions on the function f (x). However, it is also slow and hard to generalize to higher dimensions. Suppose that for a given f (x) we know an interval [a, b] where f changes sign, i.e., f (a) · f (b) < 0. The Intermediate Value Theorem given on page 10 then assures us that there is a root x ∗ such that a ≤ x ∗ ≤ b. 
Now evaluate f ( p), where p = a+b 2 is the midpoint, and check the sign of f (a) · f ( p). If it is negative, then the root is in [a, p] (by the same Intermediate Value Theorem), so we can set b ← p and repeat; else f (a) · f ( p) is positive, so the root must be in [ p, b], hence we can set a ← p and repeat. (Of course, if f (a) · f ( p) = 0 exactly, then p is the root and we are done.) In each such iteration the interval [a, b] where x ∗ is trapped shrinks by a factor of 2; at the kth step, the point xk is the midpoint p of the kth subinterval trapping the root. Thus, after a total of n −n iterations, |x ∗ − xn | ≤ b−a 2 · 2 . Therefore, the algorithm is guaranteed to converge. Moreover, if required to satisfy |x ∗ − xn | ≤ atol for a given absolute error tolerance atol > 0, we may determine the number of iterations n needed
by demanding ((b − a)/2) · 2^{-n} ≤ atol. Multiplying both sides by 2^n/atol and taking log, we see that it
takes
n = ⌈log_2((b − a)/(2 atol))⌉
iterations to obtain a value p = x_n that satisfies
|x^* − p| ≤ atol.
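As a rough illustration of this bound, a minimal bisection loop might be sketched as follows; the name bisect_sketch, its interface, and the stopping strategy are illustrative assumptions, not the course-provided bisection.m or the textbook routine reproduced next.

function [p, n] = bisect_sketch(func, a, b, atol)
% Minimal bisection sketch. Assumes func is continuous on [a,b]
% and that func(a)*func(b) < 0, so a root x* is bracketed.
fa = func(a);
n = ceil(log2((b - a) / (2 * atol)));   % iteration count from the bound above
for k = 1:n
    p = (a + b) / 2;                    % midpoint of the current bracket
    fp = func(p);
    if fp == 0
        n = k;  return;                 % landed exactly on a root
    elseif fa * fp < 0
        b = p;                          % root lies in [a, p]
    else
        a = p;  fa = fp;                % root lies in [p, b]
    end
end
p = (a + b) / 2;                        % final midpoint; |p - x*| <= atol
end

For instance, [p, n] = bisect_sketch(@(x) 2*cosh(x/4) - x, 2, 4, 1e-8) brackets the root x1* of Example 3.5 below and returns n = 27, consistent with the iteration count reported there.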
The following MATLAB function does the job:
function [p,n] = bisect(func,a,b,fa,fb,atol)
%
% function [p,n] = bisect(func,a,b,fa,fb,atol)
%
% Assuming fa = func(a), fb = func(b), and fa*fb < 0, % there is a value root in (a,b) such that func(root) = 0. % This function returns in p a value such that % | p - root | < atol % and in n the number of iterations required. % check input if (a >= b) | (fa*fb >= 0) | (atol a and
g(b) < b. Then for the continuous function φ(x) = g(x) − x we have φ(a) > 0, φ(b) < 0. (Note that φ does not have to coincide with the function f that we have started with; there are lots of φ’s for a given f .) Hence, by the Intermediate Value Theorem given on page 10, just as before, there is a root a < x ∗ < b such that φ(x ∗ ) = 0. Thus, g(x ∗ ) = x ∗ , so x ∗ is a fixed point. We have established the existence of a root: f (x ∗ ) = 0. Next, suppose that g is not only continuous but also differentiable and that there is a positive number ρ < 1 such that |g (x)| ≤ ρ, a < x < b. Then the root x ∗ is unique in the interval [a, b], for if there is also y ∗ which satisfies y ∗ = g(y ∗ ), then |x ∗ − y ∗ | = |g(x ∗ ) − g(y ∗ )| = |g (ξ )(x ∗ − y ∗ )| ≤ ρ|x ∗ − y ∗ |, where ξ is an intermediate value between x ∗ and y ∗ . Obviously, this inequality can hold with ρ < 1 only if y ∗ = x ∗ . We can summarize our findings so far as the Fixed Point Theorem. i i i i i i i i Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms 3.3. Fixed point iteration 47 Theorem: Fixed Point. If g ∈ C[a, b], g(a) ≥ a and g(b) ≤ b, then there is a fixed point x ∗ in the interval [a, b]. If, in addition, the derivative g exists and there is a constant ρ < 1 such that the derivative satisfies |g (x)| ≤ ρ ∀ x ∈ (a, b), then the fixed point x ∗ is unique in this interval. Convergence of the fixed point iteration Turning to the fixed point iteration and the third question on the preceding page, similar arguments establish convergence (now that we know that there is a unique solution): under the same assumptions we have |xk+1 − x ∗ | = |g(xk ) − g(x ∗ )| ≤ ρ|xk − x ∗ |. This is a contraction by the factor ρ. So |xk+1 − x ∗ | ≤ ρ|xk − x ∗ | ≤ ρ 2 |xk−1 − x ∗ | ≤ · · · ≤ ρ k+1 |x0 − x ∗ |. Since ρ < 1, we have ρ k → 0 as k → ∞. This establishes convergence of the fixed point iteration. Example 3.4. For the function g(x) = e−x on [0.2, 1.2], Figure 3.2 shows the progression of iterates ∗ towards the fixed point x ∗ satisfying x ∗ = e−x . 1.4 1.2 y=x 1 0.8 0.6 0.4 g(x)=e−x 0.2 0 0 x1 0.2 0.4 x2 0.6 x0 0.8 1 1.2 1.4 x x2 = e Figure 3.2. Fixed point iteration for x = e−x , starting from x0 = 1. This yields x1 = e−1 , , . . . . Convergence is apparent. −e−1 Note the contraction effect. To answer the question about the speed of convergence, it should be clear from the bounds derived before Example 3.4 that the smaller ρ is, the faster the iteration converges. In case of the i i i i i i i i Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms 48 Chapter 3. Nonlinear Equations in One Variable bisection method, the speed of convergence does not depend on the function f (or any g, for that matter), and it is in fact identical to the speed of convergence obtained with ρ ≡ 1/2. In contrast, for the fixed point iteration, there is dependence on |g (x)|. This, and the (negative) answer to our fifth question on page 46, are demonstrated next. Example 3.5. Consider the function f (x) = α cosh(x/4) − x, where α is a parameter. For α = 10 we saw in Example 3.1 that there is no root. But for α = 2 there are actually two roots. Indeed, setting g(x) = 2 cosh(x/4) and plotting g(x) and x as functions of x, we obtain Figure 3.3, which suggests not only that there are two fixed points (i.e., roots of f )—let’s call them x1∗ and x2∗ —but also that we can bracket them, say, by 2 ≤ x1∗ ≤ 4, 8 ≤ x2∗ ≤ 10. 
Next we apply our trusted routine bisect, introduced in the previous section, to f (x) with α = 2. This yields x1∗ ≈ 2.35755106, x2∗ ≈ 8.50719958, correct to an absolute tolerance 1.e-8, in 27 iterations for each root. (You should be able to explain why precisely the same number of iterations was required for each root.) 14 12 10 8 6 y=2cosh(x/4) y=x 4 2 0 0 1 2 3 4 5 x 6 7 8 9 10 Figure 3.3. The functions x and 2 cosh(x/4) meet at two locations. i i i i i i i i Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms 3.3. Fixed point iteration 49 Now, it is very tempting, and rather natural here, to use the same function g defined above in a fixed point iteration, thus defining xk+1 = 2 cosh(xk /4). For the first root, on the interval [2, 4] we have the conditions of the Fixed Point Theorem holding. In fact, near x1∗ , |g (x)| < 0.4, so we expect faster convergence than with the bisection method (why?). Indeed, starting from x0 = 2 the method requires 16 iterations, and starting from x0 = 4 it requires 18 iterations to get to within 1.e-8 of the root x1∗ . For the second root, however, on the interval [8, 10] the conditions of the Fixed Point Theorem do not hold! In particular, |g (x2∗ )| > 1. Thus, a fixed point iteration using this g will not converge
to the root x2∗ . Indeed, starting with x0 = 10 the iteration diverges quickly, and we obtain overflow
after 3 iterations, whereas starting this fixed point iteration with x0 = 8 we do obtain convergence,
but to x1∗ rather than to x2∗ .
It is important to realize that the root x2∗ is there, even though the fixed point iteration with the
“natural g” does not converge to it. Not everything natural is good for you! There are other choices
of g for the purpose of fixed point iteration that perform better here.
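To make this behavior easy to reproduce, here is a small fixed point iteration sketch for the "natural" g(x) = 2 cosh(x/4) of Example 3.5; the iteration cap and the simple stopping test are illustrative choices.

% Fixed point iteration sketch for g(x) = 2*cosh(x/4) (Example 3.5).
g = @(x) 2*cosh(x/4);
x = 2;                          % try also x = 4, 8, 10
for k = 1:100
    xnew = g(x);
    if abs(xnew - x) < 1e-8     % simple absolute stopping test
        break;
    end
    x = xnew;
end
fprintf('stopped at x = %.8f after %d iterations\n', xnew, k);

Starting from x = 2 or x = 4 this loop converges to x1* ≈ 2.3576; starting from x = 10 it overflows, and starting from x = 8 it converges to x1* rather than x2*, in line with the discussion above.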
Note: A discussion of rates of convergence similar to that appearing here is also given in
Section 7.3. There it is more crucial, because here we will soon see methods that converge
faster than a rate of convergence can suitably quantify.
Rate of convergence
Suppose now that at a particular root of a given fixed point iteration, ρ = |g (x ∗ )| satisfies 0 < ρ < 1. Starting with x0 sufficiently close to x ∗ we can write xk − x ∗ ≈ g (x ∗ )(xk−1 − x ∗ ). Hence we get |xk − x ∗ | ≈ ρ|xk−1 − x ∗ | ≈ · · · ≈ ρ k |x0 − x ∗ |. To quantify the speed of convergence in terms of ρ we can ask, how many iterations does it take to reduce the error by a fixed factor, say, 10? To answer this we set |xk − x ∗ | ≈ 0.1|x0 − x ∗ |, obtaining ρ k ≈ 0.1. Taking log10 of both sides yields k log10 ρ ≈ −1. Let us define the rate of convergence as rate = − log10 ρ. (3.1) Then it takes about k = 1/rate iterations to reduce the error by more than an order of magnitude. Obviously, the smaller ρ the higher the convergence rate and correspondingly, the smaller the number of iterations required to achieve the same error reduction effect. Example 3.6. The bisection method is not exactly a fixed point iteration, but it corresponds to an iterative method of a similar sort with ρ = 0.5. Thus its convergence rate according to (3.1) is rate = − log10 0.5 ≈ 0.301. This yields k = 4, and indeed the error reduction factor with this many bisections is 16, which is more than 10. i i i i i i i i 50 Chapter 3. Nonlinear Equations in One Variable Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms For the root x1∗ of Example 3.5 we have ρ ≈ 0.3121 and rate = − log10 0.3121 ≈ 0.506. Here it takes only two iterations to roughly reduce the error by a factor of 10 if starting close to x ∗ . For the root x2∗ of Example 3.5 we obtain from (3.1) a negative rate of convergence, as is appropriate for a case where the method does not converge. Of course, it makes no sense to calculate ρ or the convergence rate so precisely as in Example 3.6: we did this only so that you can easily follow the algebra with a calculator. Indeed, such accurate estimation of ρ would require knowing the solution first! Moreover, there is nothing magical about an error reduction by a particularly precise factor. The rate of convergence should be considered as a rough indicator only. It is of importance, however, to note that the rate becomes negative for ρ > 1, indicating no convergence of the fixed point iteration, and that it becomes infinite
for ρ = 0, indicating that the error reduction per iteration is faster than by any constant factor. We
encounter such methods next.
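For a quick check of these numbers (assuming, purely for illustration, that ρ is known, which in practice it normally is not):

% Rate of convergence (3.1) for the two contraction factors discussed above.
rho  = [0.5, 0.3121];           % bisection-like factor, and |g'(x1*)| from Example 3.5
rate = -log10(rho);             % approximately 0.301 and 0.506
k    = ceil(1 ./ rate);         % iterations to reduce the error by at least a factor of 10
disp([rho(:), rate(:), k(:)])   % yields k = 4 and k = 2, as in Example 3.6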
Specific exercises for this section: Exercises 3–4.
3.4 Newton’s method and variants
The bisection method requires the function f to be merely continuous (which is good) and makes
no use of further information on f such as availability of its derivatives (which causes it
to be painfully slow at times). At the other end of the scale is Newton’s method, which requires
more knowledge and smoothness of the function f but which converges much faster in appropriate
circumstances.
Newton’s method
Newton’s method is the most basic fast method for root finding. The principle we use below to
derive it can be directly extended to more general problems.
Assume that the first and second derivatives of f exist and are continuous: f ∈ C 2 [a, b].
Assume also that f can be evaluated with sufficient ease. Let xk be a current iterate. By Taylor’s
expansion on page 5 we can write
f(x) = f(x_k) + f'(x_k)(x − x_k) + f''(ξ(x))(x − x_k)^2/2,
where ξ (x) is some (unknown) point between x and xk .
Now, set x = x^*, for which f(x^*) = 0. If f were linear, i.e., f'' ≡ 0, then we could find
the root by solving 0 = f(x_k) + f'(x_k)(x^* − x_k), yielding x^* = x_k − f(x_k)/f'(x_k). For a nonlinear
function we therefore define the next iterate by the same formula, which gives
x_{k+1} = x_k − f(x_k)/f'(x_k),   k = 0, 1, 2, . . . .
This corresponds to neglecting the term f''(ξ(x^*))(x^* − x_k)^2/2 when defining the next iterate; however, if x_k is already close to x^*, then (x^* − x_k)^2 is very small, so we would expect x_{k+1} to be much
closer to x ∗ than xk is.
A geometric interpretation of Newton’s method is that xk+1 is the x-intercept of the tangent
line to f at xk ; see Figure 3.4.
Figure 3.4. Newton's method: the next iterate is the x-intercept of the tangent line to f at the current iterate.
Algorithm: Newton’s Method.
Given a scalar differentiable function in one variable, f (x):
1. Start from an initial guess x0 .
2. For k = 0, 1, 2, . . . , set
x_{k+1} = x_k − f(x_k)/f'(x_k),
until x_{k+1} satisfies termination criteria.
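A minimal MATLAB sketch of this algorithm, assuming function handles f and df for f and f' and a simple step-size stopping test (the name and interface are illustrative):

function [x, k] = newton_sketch(f, df, x0, atol, maxit)
% Newton's method sketch: x_{k+1} = x_k - f(x_k)/df(x_k).
x = x0;
for k = 1:maxit
    dx = f(x) / df(x);          % Newton correction
    x = x - dx;
    if abs(dx) < atol           % stop when the update is small
        return;
    end
end
end

For the function of Example 3.7 below, a call such as newton_sketch(@(x) 2*cosh(x/4) - x, @(x) 0.5*sinh(x/4) - 1, 8, 1e-8, 50) converges to x2* ≈ 8.5072 in a handful of iterations.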
Example 3.7. Consider the same function as in Example 3.5, given by
f (x) = 2 cosh(x/4) − x.
The Newton iteration here is
x_{k+1} = x_k − (2 cosh(x_k/4) − x_k) / (0.5 sinh(x_k/4) − 1).
We use the same absolute tolerance of 1.e-8 and the same four initial iterates as in Example 3.5.
• Starting from x0 = 2 requires 4 iterations to reach x1∗ to within the given tolerance.
• Starting from x0 = 4 requires 5 iterations to reach x1∗ to within the given tolerance.
• Starting from x0 = 8 requires 5 iterations to reach x2∗ to within the given tolerance.
• Starting from x0 = 10 requires 6 iterations to reach x2∗ to within the given tolerance.
The values of f (xk ), starting from x0 = 8, are displayed below:
k        f(x_k)
0       −4.76e-1
1        8.43e-2
2        1.56e-3
3        5.65e-7
4        7.28e-14
5        1.78e-15
This table suggests (indirectly) that the number of significant digits essentially doubles at each iteration. Of course, when roundoff level is reached, no meaningful improvement can be obtained upon
heaping more floating point operations, and the improvement from the 4th to the 5th iteration in this
example is minimal.
Speed of convergence
How fast does a nonlinear iteration converge, if it does? Let us assume that the method indeed
converges and that xk is already “close enough” to the root x ∗ . We define convergence orders as
follows. The method is said to be
• linearly convergent if there is a constant ρ < 1 such that |xk+1 − x ∗ | ≤ ρ|xk − x ∗ | for all k sufficiently large; • quadratically convergent if there is a constant M such that |xk+1 − x ∗ | ≤ M|xk − x ∗ |2 for all k sufficiently large; • superlinearly convergent if there is a sequence of constants ρk → 0 such that |xk+1 − x ∗ | ≤ ρk |xk − x ∗ | for all k sufficiently large. Note that for quadratic convergence we can set ρk = M|xk − x ∗ | →k→∞ 0; thus the quadratic order implies superlinear order. Superlinear convergence may require less than this but more than linear convergence. Higher convergence orders (e.g., cubic) may be defined similarly to the quadratic order. However, normally there is really no practical reason to consider such higher convergence orders: methods of this sort would typically require more work per iteration, and roundoff error sets in quickly even for quadratically convergent methods, as we have seen in Example 3.7. For Newton’s method we have already noted that in the derivation a quadratic term was dropped. Indeed, it turns out that if f ∈ C 2 [a, b] and there is a root x ∗ in [a, b] such that f (x ∗ ) = 0, f (x ∗ ) = 0, then there is a number δ such that, starting with x0 from anywhere in the neighborhood [x ∗ − δ, x ∗ + δ], Newton’s method converges quadratically. See Exercise 9. How is the general fixed point iteration xk+1 = g(xk ), considered in the previous section, related to Newton’s method? We have seen that the speed of convergence may strongly depend on the choice of g (recall Example 3.5, for instance). It is not difficult to see (Exercise 3) that if, in addition to the assumptions that guarantee convergence in the Fixed Point Theorem, also g (x ∗ ) = 0, then the method converges linearly. In this case the size of |g (x ∗ )| matters and this is quantified by the rate of convergence, defined in Section 3.3. The method may converge faster than linearly if g (x ∗ ) = 0. This is precisely the case with Newton’s method. Here g(x) = x − f (x)/ f (x), and i i i i i i i i 3.4. Newton’s method and variants Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms 2 53 hence g = 1 − ( f () f− )2f f = ( ff f )2 . Now, since x = x ∗ is the root of f it follows immediately that g (x ∗ ) = 0. Newton’s method does have two disadvantages: (i) the need to know not only that the derivative f exists but also how to evaluate it, and (ii) the local nature of the method’s convergence. The first of these two aspects is addressed (or, rather, circumvented) in what follows. Secant method The requirement that not only the function f but also its derivative f be supplied by the user of Newton’s method is at times simple to satisfy, as in our Example 3.7. But at other instances it can be a drag. An extra effort is required by the user, and occasionally the evaluation of f can be much more costly than the evaluation of f at a given argument value. Moreover, sometimes the derivative is simply not available, for example, in certain experimental setups where it is possible to evaluate only the function itself. The secant method is a variant of Newton’s method, where f (xk ) is replaced by its finite difference approximation based on the evaluated function values at xk and at the previous iterate xk−1 . Assuming convergence, observe that near the root f (xk ) ≈ f (xk ) − f (xk−1 ) . xk − xk−1 (A similar formula was used for different purposes in Example 1.2.) 
Substitution of this approximation into the formula for Newton’s method yields the Secant method, also referred to as quasiNewton, given by xk+1 = xk − f (xk )(xk − xk−1 ) , f (xk ) − f (xk−1 ) k = 1, 2, . . . . Algorithm: Secant Method. Given a scalar differentiable function in one variable, f (x): 1. Start from two initial guesses x0 and x1 . 2. For k = 1, 2, . . . , set xk+1 = xk − f (xk )(xk − xk−1 ) f (xk ) − f (xk−1 ) until xk+1 satisfies termination criteria. It is worthwhile to compare at this point the Secant algorithm to the Newton algorithm on page 51. Example 3.8. Consider again the same function as in Example 3.5, given by f (x) = 2 cosh(x/4) − x. We apply the secant method with the same stopping criterion and the same tolerance as in Examples 3.5 and 3.7. Note that the secant method requires two initial iterates, x0 and x1 . • Starting from x0 = 2 and x1 = 4 requires 7 iterations to reach x1∗ to within the given tolerance. • Starting from x0 = 10 and x1 = 8 requires 7 iterations to reach x2∗ to within the given tolerance. i i i i i i i i 54 Chapter 3. Nonlinear Equations in One Variable Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms The values of f (xk ), starting from x0 = 10 and x1 = 8, are displayed below: k 0 1 2 3 4 5 6 f (xk ) 2.26 −4.76e-1 −1.64e-1 2.45e-2 −9.93e-4 −5.62e-6 1.30e-9 This table suggests that the number of significant digits increases more rapidly as we get nearer to the root (i.e., better than a linear order), but not as fast as with Newton’s method. Thus we observe indeed a demonstration of superlinear convergence. Theorem: Convergence of the Newton and Secant Methods. If f ∈ C 2 [a, b] and there is a root x ∗ in [a, b] such that f (x ∗ ) = 0, f (x ∗ ) = 0, then there is a number δ such that, starting with x0 (and also x1 in case of the secant method) from anywhere in the neighborhood [x ∗ − δ, x ∗ + δ], Newton’s method converges quadratically and the secant method converges superlinearly. This theorem specifies the conditions under which Newton’s method is guaranteed to converge quadratically, while the secant method is guaranteed to converge superlinearly. Constructing secant-like modifications for Newton’s method for the case of systems of equations is more involved but possible and practically useful; we will discuss this in Section 9.2. The case of a multiple root The fast local convergence rate of both Newton and secant methods is predicated on the assumption that f (x ∗ ) = 0. What if f (x ∗ ) = 0? This is the case of a multiple root. In general, if f (x) can be written as f (x) = (x − x ∗ )m q(x), where q(x ∗ ) = 0, then x ∗ is a root of multiplicity m. If m > 1, then obviously f (x ∗ ) = 0. As it
turns out, Newton’s method may still converge, but only at a linear (rather than quadratic) rate.
Example 3.9. For the monomial f(x) = x^m, m > 1, we get
f(x)/f'(x) = x/m.
Newton's method reads
x_{k+1} = x_k − x_k/m = ((m − 1)/m) x_k.
Since the root is x^* = 0, we can somewhat trivially write
|x_{k+1} − x^*| = ((m − 1)/m) |x_k − x^*|,
so the method is linearly convergent, with the contraction factor ρ = (m − 1)/m. The corresponding rate
of convergence is rate = − log10((m − 1)/m), and it is not difficult to calculate these values and see that
they deteriorate as m grows. If f''(x^*) ≠ 0, i.e., m = 2, then ρ = 0.5. See Example 3.6.
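A few lines confirming the contraction factor ρ = 1/2 for the double root case m = 2 (an illustrative experiment, not part of the text):

% Newton's method on f(x) = x^2 (double root at 0): the error halves each step.
x = 1;
for k = 1:5
    x = x - x^2 / (2*x);        % Newton step; equals x/2
    fprintf('k = %d, x_k = %.6f\n', k, x);
end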
Globalizing Newton’s method
Both Newton’s method and the secant method are guaranteed to converge under appropriate conditions only locally: there is the assumption in the theorem statement that the initial iterate x0 is
already “close enough” to an isolated root. This is in contrast to the situation for the bisection
method, which is guaranteed to converge provided only that a bracketing interval is found on which
f changes sign. The δ-neighborhood for which Newton’s method is guaranteed to converge may be
large for some applications, as is the case for Example 3.7, but it may be small for others, and it is
more involved to assess ahead of time which is the case for a given problem.
A natural idea, then, is to combine two methods into a hybrid one. We start with a rough plot
or probe of the given function f in order to bracket roots. Next, starting from a given bracket with
f (x0 ) · f (x1 ) < 0, initiate a Newton or a secant method, monitoring the iterative process by requiring sufficient improvement at each iteration. This “sufficient improvement” can, for instance, be that | f (xk+1 )| < 0.5| f (xk )| or the bisection guaranteed improvement ratio |xk+1 − xk | < 0.5|xk − xk−1 |. If a Newton or secant iteration is applied but there is no sufficient improvement, then we conclude that the current iterate must be far from where the superior fire power of the fast method can be utilized, and we therefore return to x0 and x1 , apply a few bisection iterations to obtain new x0 and x1 which bracket the root x ∗ more tightly, and restart the Newton or secant routine. This hybrid strategy yields a robust solver for scalar nonlinear equations; see Exercise 13. Convergence and roundoff errors Thus far in this chapter we have scarcely noticed the possible existence of roundoff errors. But rest assured that roundoff errors are there, as promised in Chapter 2. They are simply dominated by the convergence errors, so long as the termination criteria of our iterative methods are well above rounding unit, and so long as the problem is well-conditioned. In the last entry of the table in Example 3.7, the convergence error is so small that roundoff error is no longer negligible. The result is certainly not bad in terms of error magnitude, but the quadratic convergence pattern (which does not hold for roundoff errors, barring a miracle) is destroyed. Our essential question here is, given that the error tolerances are well above rounding unit in magnitude, do we ever have to worry about roundoff errors when solving a nonlinear equation? The answer is that we might, if the problem is such that roundoff error accumulates significantly. When could ill-conditioning occur for the problem of finding roots of a scalar nonlinear function? This can happen if the function f (x) is very flat near its root, because small changes to f may then significantly affect the precise location of the root. Exercise 2 illustrates this point. The general case is that of a multiple root. Specific exercises for this section: Exercises 5–12. 3.5 Minimizing a function in one variable A major source of applications giving rise to root finding is optimization. In its one-variable version we are required to find an argument x = x̂ that minimizes a given objective function φ(x).6 For a simple instance, we have already briefly examined in Example 3.1 the minimization of the function φ(x) = 10 cosh(x/4) − x over the real line. 6 We arbitrarily concentrate on finding minimum rather than maximum points. If a given problem is naturally formulated as finding a maximum of a function ψ(x), say, then define φ(x) = −ψ(x) and consider minimizing φ. i i i i i i i i Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms 56 Chapter 3. Nonlinear Equations in One Variable Another, general, instance is obtained from any root finding or fixed point problem, g(x) = x, by setting φ(x) = [g(x) − x]2 . But here we concentrate on cases where the given objective is not necessarily that of root finding. Note that in general φ(x ∗ ) = 0. Conditions for a minimum point Assume that φ ∈ C 2 [a, b], i.e., that φ and its first two derivatives are continuous on a given interval [a, b]. Denote f (x) = φ (x). An argument x ∗ satisfying a < x ∗ < b is called a critical point if f (x ∗ ) = 0. 
Now, for a parameter h small enough so that x ∗ + h ∈ [a, b] we can expand in Taylor’s series on page 5 and write φ(x ∗ + h) = φ(x ∗ ) + hφ (x ∗ ) + = φ(x ∗ ) + h 2 ∗ φ (x ) + · · · 2 h 2 ∗ [φ (x ) + O(h)]. 2 (See the discussion on page 7 regarding order O notation.) Since |h| can be taken arbitrarily small, it is now clear that at a critical point: • If φ (x ∗ ) > 0, then x̂ = x ∗ is a local minimizer of φ(x). This means that φ attains minimum
at x̂ = x ∗ in some neighborhood which includes x ∗ .
• If φ (x ∗ ) < 0, then x ∗ is a local maximizer of φ(x). This means that φ attains maximum at x̂ = x ∗ in some neighborhood which includes x ∗ . • If φ (x ∗ ) = 0, then a further investigation at x ∗ is required. Algorithms for function minimization It is not difficult to see that if φ(x) attains a minimum (or maximum) at a point x̂, then this point must be critical, i.e., f (x̂) = 0. Thus, the problem of finding all minima of a given function φ(x) can be solved by finding all the critical roots7 and then checking for each if it is a minimum by examining the sign of the second derivative of φ. Example 3.10. For the function φ(x) = 10 cosh(x/4) − x we have 10 sinh(x/4) − 1, 4 10 φ (x) = f (x) = cosh(x/4). 16 φ (x) = f (x) = 7 Note that for quadratic convergence of Newton’s method, φ(x) must have three continuous derivatives. i i i i i i i i Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms 3.5. Minimizing a function in one variable 57 Since φ (x) > 0 for all x, we can have only one minimum point. (Think of how the curve
must look like if we had more than one minimum.) Applying any of the algorithms described in
Sections 3.2–3.4, we obtain the unique minimum point as the root of f (x), given by
x̂ ≈ 1.5601412791.
At this point, φ(x̂) ≈ 9.21018833518615.
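As an illustrative check, this value can be reproduced with the hypothetical newton_sketch helper from Section 3.4, applied to f = φ' with the derivatives computed in this example:

% Minimizer of phi(x) = 10*cosh(x/4) - x as the root of f(x) = phi'(x).
dphi  = @(x) (10/4)*sinh(x/4) - 1;    % phi'(x)
ddphi = @(x) (10/16)*cosh(x/4);       % phi''(x) > 0, so the critical point is a minimum
xhat  = newton_sketch(dphi, ddphi, 2, 1e-10, 50);
phi   = @(x) 10*cosh(x/4) - x;
fprintf('xhat = %.10f, phi(xhat) = %.10f\n', xhat, phi(xhat));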
Example 3.11. For an example with many critical points, consider φ(x) = cos(x).
The critical points are where f(x) = φ'(x) = −sin(x) = 0; thus
x^* = jπ,   j = 0, ±1, ±2, . . . .
Since
φ''(x^*) = −cos(jπ) = (−1)^{j+1},   j = 0, ±1, ±2, . . . ,
we have minimum points where j is odd, i.e., at every second critical point: the local minimizers
are
x̂ = (2l + 1)π , l = 0, ±1, ±2, . . . .
The other critical points are maximum points.
For simple problems this could mark the end of this section. However, for problems where
function evaluations are really dear we note that there are two imperfections in the simple procedure
outlined above:
• We are investing a considerable effort into finding all critical points, even those that end up
being local maxima. Is there a way to avoid finding such unwanted points?
• No advantage is taken of the fact that we are minimizing φ(x). Can’t we gauge how good an
iterate xk is by checking φ(xk )?
These objections become more of an issue when considering the minimization of functions of several
variables. Nonetheless, let us proceed to quickly outline some general ideas with Newton’s method
on page 51 for the scalar case.
The kth iteration of Newton's method applied to f(x) = φ'(x) can be written here as
x_{k+1} = x_k + α_k ξ_k,   where   ξ_k = −φ'(x_k)/φ''(x_k),   α_k = 1.
Now, if φ''(x_k) > 0 and the step size α_k > 0 is small enough, then
φ(x_{k+1}) ≈ φ(x_k) + α_k ξ_k φ'(x_k) = φ(x_k) − α_k φ'(x_k)^2/φ''(x_k) < φ(x_k). Therefore, we have a decrease in φ, i.e., a step in the right direction. Thus, it is possible to design a procedure where we insist that φ(x) decrease in each iteration. This would prevent convergence to critical points which are local maxima rather than minima. In such a procedure we monitor φ''(x_k), replacing it by max{φ''(x_k), ε}, where ε > 0 is a parameter
chosen not large. We then select a step size αk by trying αk = 1 first and decreasing it if necessary
until a sufficient decrease is obtained in φ(x_{k+1}) relative to φ(x_k). We shall have more to say about
this sort of procedure when considering minimizing functions in several variables.
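A hedged sketch of such a safeguarded iteration is given below; the safeguard parameter eps0, the step-halving rule, and the names are illustrative assumptions rather than a prescription from the text.

function x = min_newton_sketch(dphi, ddphi, phi, x0, atol, maxit)
% Guarded Newton for minimization: insist that phi decreases each iteration.
eps0 = 1e-8;                            % small positive safeguard for phi''
x = x0;
for k = 1:maxit
    d2 = max(ddphi(x), eps0);           % monitor phi'' as described above
    xi = -dphi(x) / d2;                 % Newton-like direction
    alpha = 1;
    while phi(x + alpha*xi) >= phi(x) && alpha > 1e-12
        alpha = alpha / 2;              % shrink the step until phi decreases
    end
    if abs(alpha*xi) < atol
        break;                          % step is negligible: stop
    end
    x = x + alpha*xi;
end
end

For instance, min_newton_sketch(@(x) 2.5*sinh(x/4) - 1, @(x) 0.625*cosh(x/4), @(x) 10*cosh(x/4) - x, 8, 1e-8, 100) recovers x̂ ≈ 1.56 of Example 3.10 from a fairly distant starting point.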
Local and global minimum
The procedures outlined above find local minima. The problem of finding a global minimizer, i.e.,
a point x̂ such that φ(x̂) ≤ φ(x) for all x ∈ [a, b], is significantly more difficult in general. In the
worst situations, where not much is known about the objective function φ, a suitable algorithm may
involve a complete enumeration, i.e., finding all local minima and then comparing them. At the
other end of the scale, if the objective function is known to be convex, which means that
φ(θ x + (1 − θ )y) ≤ θ φ(x) + (1 − θ )φ(y) ∀ 0 ≤ θ ≤ 1
for any two points x, y ∈ [a, b], then there is a unique minimum and the problems of finding local and
global minima coincide. Such is the instance considered in Example 3.10, but not that considered in
Example 3.11.
Specific exercises for this section: Exercises 21–23.
3.6 Exercises
0. Review questions
(a) What is a nonlinear equation?
(b) Is the bisection method (i) efficient? (ii) robust? Does it (iii) require a minimal amount
of additional knowledge? (iv) require f to satisfy only minimum smoothness properties?
(v) generalize easily to several functions in several variables?
(c) Answer similar questions for the Newton and secant methods.
(d) State at least one advantage and one disadvantage of the recursive implementation of the
bisection method over the iterative nonrecursive implementation.
(e) In what way is the fixed point iteration a family of methods, rather than just one method
like bisection or secant?
(f) What is the basic condition for convergence of the fixed point iteration, and how does
the speed of convergence relate to the derivative of the iteration function g?
(g) Suppose a given fixed point iteration does not converge: does this mean that there is
no root in the relevant interval? Answer a similar question for the Newton and secant
methods.
(h) State at least two advantages and two disadvantages of Newton’s method.
(i) What are order of convergence and rate of convergence, and how do they relate?
(j) State at least one advantage and one disadvantage of the secant method over Newton’s.
(k) In what situation does Newton’s method converge only linearly?
(l) Explain the role that roundoff errors play in the convergence of solvers for nonlinear
equations, and explain their relationship with convergence errors.
(m) State a similarity and a difference between the problem of minimizing a function φ(x)
and that of solving the nonlinear equation φ'(x) = 0.
(n) State what a convex function is, and explain what happens if an objective function is
convex.
1. Apply the bisection routine bisect to find the root of the function
f(x) = √x − 1.1
starting from the interval [0, 2] (that is, a = 0 and b = 2), with atol = 1.e-8.
(a) How many iterations are required? Does the iteration count match the expectations,
based on our convergence analysis?
(b) What is the resulting absolute error? Could this absolute error be predicted by our convergence analysis?
2. Consider the polynomial function8
f(x) = (x − 2)^9
     = x^9 − 18x^8 + 144x^7 − 672x^6 + 2016x^5 − 4032x^4 + 5376x^3 − 4608x^2 + 2304x − 512.
(a) Write a MATLAB script which evaluates this function at 161 equidistant points in the
interval [1.92, 2.08] using two methods:
i. Apply nested evaluation (cf. Example 1.4) for evaluating the polynomial in the
expanded form x^9 − 18x^8 + · · · .
ii. Calculate (x − 2)^9 directly.
Plot the results in two separate figures.
(b) Explain the difference between the two graphs.
(c) Suppose you were to apply the bisection routine from Section 3.2 to find a root of this
function, starting from the interval [1.92, 2.08] and using the nested evaluation method,
to an absolute tolerance 10^{-6}. Without computing anything, select the correct outcome:
i. The routine will terminate with a root p satisfying |p − 2| ≤ 10^{-6}.
ii. The routine will terminate with a root p not satisfying |p − 2| ≤ 10^{-6}.
iii. The routine will not find a root.
Justify your choice in one short sentence.
3. Consider the fixed point iteration xk+1 = g(xk ), k = 0, 1, . . . , and let all the assumptions of
the Fixed Point Theorem hold. Use a Taylor’s series expansion to show that the order of
convergence depends on how many of the derivatives of g vanish at x = x ∗ . Use your result
to state how fast (at least) a fixed point iteration is expected to converge if g'(x^*) = · · · =
g^{(r)}(x^*) = 0, where the integer r ≥ 1 is given.
4. Consider the function g(x) = x^2 + 3/16.
(a) This function has two fixed points. What are they?
(b) Consider the fixed point iteration xk+1 = g(xk ) for this g. For which of the points you
have found in (a) can you be sure that the iterations will converge to that fixed point?
Briefly justify your answer. You may assume that the initial guess is sufficiently close to
the fixed point.
(c) For the point or points you found in (b), roughly how many iterations will be required to
reduce the convergence error by a factor of 10?
5. Write a MATLAB script for computing the cube root of a number, x = ∛a, with only basic
arithmetic operations using Newton's method, by finding a root of the function f(x) = x^3 −
a. Run your program for a = 0, 2, 10. For each of these cases, start with an initial guess
reasonably close to the solution. As a stopping criterion, require the function value whose
root you are searching to be smaller than 10−8 . Print out the values of xk and f (xk ) in each
iteration. Comment on the convergence rates and explain how they match your expectations.
8 This beautiful example is inspired by Demmel [21]. We gave a modified version of it as an exam question once.
Unfortunately, not everyone thought it was beautiful.
6.
(a) Derive a third order method for solving f(x) = 0 in a way similar to the derivation
of Newton's method, using evaluations of f(x_n), f'(x_n), and f''(x_n). The following
remarks may be helpful in constructing the algorithm:
• Use the Taylor expansion with three terms plus a remainder term.
• Show that in the course of derivation a quadratic equation arises, and therefore two
distinct schemes can be derived.
(b) Show that the order of convergence (under the appropriate conditions) is cubic.
(c) Estimate the number of iterations and the cost needed to reduce the initial error by a
factor of 10m .
(d) Write a script for solving the problem of Exercise 5. To guarantee that your program
does not generate complex roots, make sure to start sufficiently close to a real root.
(e) Can you speculate what makes this method less popular than Newton’s method, despite
its cubic convergence? Give two reasons.
7. Consider Steffensen’s method
x_{k+1} = x_k − f(x_k)/g(x_k),   k = 0, 1, . . . ,
where
g(x) = (f(x + f(x)) − f(x)) / f(x).
(a) Show that in general the method converges quadratically to a root of f (x).
(b) Compare the method's efficiency to the efficiency of the secant method.
8. It is known that the order of convergence of the secant method is p = (1 + √5)/2 = 1.618 . . . and
that of Newton's method is p = 2. Suppose that evaluating f' costs approximately α times
the cost of evaluating f. Determine approximately for what values of α Newton's method
is more efficient (in terms of number of function evaluations) than the secant method. You
may neglect the asymptotic error constants in your calculations. Assume that both methods
are starting with initial guesses of a similar quality.
9. This exercise essentially utilizes various forms of Taylor’s expansion and relies on expertise
in calculus.
(a) Prove that if f ∈ C^2[a, b] and there is a root x^* in [a, b] such that f(x^*) = 0, f'(x^*) ≠ 0,
then there is a number δ such that, starting with x0 from anywhere in the neighborhood
[x ∗ − δ, x ∗ + δ], Newton’s method converges quadratically.
(b) This is more challenging: prove the same conclusions under the same assumptions,
except that now f is only assumed to have a Lipschitz continuous first derivative. Thus,
while f'' may not exist everywhere in [a, b], there is a constant γ such that for any
x, y ∈ [a, b]
|f'(x) − f'(y)| ≤ γ|x − y|.
10. The function
f(x) = (x − 1)^2 e^x
has a double root at x = 1.
(a) Derive Newton's iteration for this function. Show that the iteration is well-defined so
long as x_k ≠ −1 and that the convergence rate is expected to be similar to that of the
bisection method (and certainly not quadratic).
(b) Implement Newton’s method and observe its performance starting from x0 = 2.
(c) How easy would it be to apply the bisection method? Explain.
11. In general, when a smooth function f (x) has a multiple root at x ∗ , the function
ψ(x) = f(x)/f'(x)
has a simple root there.
(a) Show that a Newton method for ψ(x) yields the iteration
x_{k+1} = x_k − f(x_k) f'(x_k) / ([f'(x_k)]^2 − f(x_k) f''(x_k)).
(b) Give one potential advantage and two potential disadvantages for this iteration as compared to Newton’s method for f (x).
(c) Try this method on the problem of Exercise 10. What are your observations?
12. Given a > 0, we wish to compute x = ln a using addition, subtraction, multiplication, division,
and the exponential function, e x .
(a) Suggest an iterative formula based on Newton’s method, and write it in a way suitable
for numerical computation.
(b) Show that your formula converges quadratically.
(c) Write down an iterative formula based on the secant method.
(d) State which of the secant and Newton’s methods is expected to perform better in this
case in terms of overall number of exponential function evaluations. Assume a fair
comparison, i.e., same floating point system, “same quality” initial guesses, and identical
convergence criterion.
13. Write a MATLAB program to find all the roots of a given, twice continuously differentiable,
function f ∈ C 2 [a, b].
Your program should first probe the function f (x) on the given interval to find out where it
changes sign. (Thus, the program has, in addition to f itself, four other input arguments: a,
b, the number nprobe of equidistant values between a and b at which f is probed, and a
tolerance tol.)
For each subinterval [ai , bi ] over which the function changes sign, your program should then
find a root as follows. Use either Newton’s method or the secant method to find the root,
monitoring decrease in | f (xk )|. If an iterate is reached for which there is no sufficient decrease
(e.g., if | f (xk )| ≥ 0.5| f (xk−1 )|), then revert back to [ai , bi ], apply three bisection steps and
restart the Newton or secant method.
The ith root is deemed “found” as xk if both
|xk − xk−1 | < tol(1 + |xk |) and | f (xk )| < tol hold. i i i i i i i i 62 Chapter 3. Nonlinear Equations in One Variable Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms Verify your program by finding the two roots (given in Example 3.5) of the function f (x) = 2 cosh(x/4) − x, starting your search with [a, b] = [0, 10] and npr obe = 10. Then use your program to answer other questions in this section where relevant. 14. Find all the roots of the function f (x) = sin(x) x , 1, x = 0, x = 0, in the interval [−10, 10] for tol = 10−7 . [This function is special enough to have a name. It is called the sinc function.] 15. For x > 0 consider the equation
x + ln x = 0.
It is a reformulation of the equation of Example 3.4.
(a) Show analytically that there is exactly one root, 0 < x ∗ < ∞. (b) Plot a graph of the function on the interval [0.1, 1]. (c) As you can see from the graph, the root is between 0.5 and 0.6. Write MATLAB routines for finding the root, using the following: i. The bisection method, with the initial interval [0.5, 0.6]. Explain why this choice of the initial interval is valid. ii. A linearly convergent fixed point iteration, with x0 = 0.5. Show that the conditions of the Fixed Point Theorem (for the function g you have selected) are satisfied. iii. Newton’s method, with x0 = 0.5. iv. The secant method, with x0 = 0.5 and x1 = 0.6. For each of the methods: • Use |xk − xk−1 | < 10−10 as a convergence criterion. • Print out the iterates and show the progress in the number of correct decimal digits throughout the iteration. • Explain the convergence behavior and how it matches theoretical expectations. 16. We saw in Example 3.5 that the function f (x) = α cosh(x/4) − x has two roots for α = 2 and none for α = 10. Is there an α for which there is precisely one root? If yes, then find such an α and the corresponding root; if not, then justify. 17. The derivative of the sinc function is given by f (x) = x cos(x) − sin(x) . x2 (a) Show that near x = 0, this function can be approximated by f (x) ≈ −x/3. The error in this approximation gets smaller as x approaches 0. (b) Find all the roots of f in the interval [−10, 10] for tol = 10−8 . i i i i i i i i 3.6. Exercises 63 0.8 0.6 0.4 f Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms 1 0.2 0 −0.2 −0.4 −0.6 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 x Figure 3.5. Graph of an anonymous function; see Exercise 18. 18. Consider finding the root of a given nonlinear function f (x), known to exist in a given interval [a, b], using one of the following three methods: bisection, Newton, and secant. For each of the following instances, one of these methods has a distinct advantage over the other two. Match problems and methods and justify briefly. (a) f (x) = x − 1 on the interval [0, 2.5]. (b) f (x) is given in Figure 3.5 on [0, 4]. (c) f ∈ C 5 [0.1, 0.2], the derivatives of f are all bounded in magnitude by 1, and f (x) is hard to specify explicitly or evaluate. 19. You are required by a computer manufacturer to write a library function for a given floating point system to find the cube root y 1/3 of any given positive number y. Any such relevant floating point number can be represented as y = a × 2e , where a is a normalized fraction (0.5 ≤ a < 1) and e is an integer exponent. This library function must be very efficient and it should always work. For efficiency purposes it makes sense to store some useful constants ahead of computation time, e.g., the constants 21/3 , 23 , and a/3, should these prove useful. (a) Show how y 1/3 can be obtained, once a 1/3 has been calculated for the corresponding fraction, in at most five additional flops. (b) Derive the corresponding Newton iteration. What is the flop count per iteration? (c) How would you choose an initial approximation? Roughly how many iterations are needed? (The machine rounding unit is 2−52 .) [You might find this exercise challenging.] i i i i i i i i Downloaded 01/15/22 to 137.122.8.73 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms 64 Chapter 3. Nonlinear Equations in One Variable 20. 
Suppose that the division button of your calculator has stopped working, and you have addition, subtraction, and multiplication only. Given a real number b = 0, suggest a quadratically convergent iterative formula to compute b1 , correct to a user-specified tolerance. Write a MATLAB routine that implements your algorithm, using |xk − xk−1 | < 10−10 as a convergence criterion, and apply your algorithm to b = π (that is, we compute π1 ), with two different initial guesses: (a) x0 = 1 ; and (b) x0 = 0.1. Explain your results. 21. Write a MATLAB program to minimize a smooth, scalar function in one variable φ(x) over a given interval. The specifications are similar to those in Exercise 13. Your program should first find and display all critical points, i.e., zeros of φ (x). Then it should determine which of these correspond to a local minimum by checking the sign of φ , and display the local minimum points. Finally, it should determine the global minimum point(s), value, and arguments by minimizing over the local minimum values. 22. Apply your program from Exercise 21 to find the (unique) global minimum of the function φ(x) = cos(x) + x/10 over the interval [−10, 10]. What are x̂ and φ(x̂)? 23. Use your program from Exercise 21 to minimize (a) φ(x) = sin(x) and (b) φ(x) = −sinc(x) = − sin(x) x over the interval [−10, 10] with tol = 10−8 . You might find that determining the global minimum of the second function is significantly more challenging, even though the global minimum for the first one is not unique. Explain the source of difficulties and how you have addressed them. [Exercises 17 and 14 are relevant here.] 3.7 Additional notes Just about any introductory text on numerical methods or numerical analysis covers the topics introduced in this chapter, although the goals and the manner of these treatments may vary significantly. For different flavors, see Heath [39], Cheney and Kincaid [12], and Burden and Faires [11], as well as the classics Conte and de Boor [13] and Dahlquist and Björck [15]. For us this chapter is mainly an opportunity to introduce and demonstrate, within a simple context, several basic notions and issues arising also in much more complex numerical computations. We have thus avoided methods that do not seem to shed much light on other problems as well. There is a (somewhat more complicated) analogue to the bisection method for function minimization. This classical algorithm is called golden section search and is described, together with extensions, in Press et al. [59]; alternatively, see the crisper Heath [39]. i i i i Purchase answer to see full attachment Tags: mathematics function methods User generated content is uploaded by users for the purposes of learning and should be used following Studypool's honor code & terms of service.
