Update step length selection in LAED4 overshoot fallback#1191

Open
angsch wants to merge 1 commit intoReference-LAPACK:masterfrom
angsch:issue-1166
Collaborator

@angsch angsch commented Feb 16, 2026

Description

On behalf of the NVIDIA cuSolver team, I am proposing a fix for #1166.

When LAED4 overshoots, it falls back to bisection. This fallback was introduced with LAPACK 3.0. This PR updates the heuristic to handle cases where the bisection step is a poor choice: the relative distance is large (details in the issue), so the step can be too long and increase the objective function value considerably. The new heuristic exploits the fact that the root is close to a cell boundary whenever we enter the fallback branch. Consequently, a plain Newton step is too short, while a bisection step is too long if we are far away from the cell boundary. We therefore combine both by taking their geometric mean for a more balanced step length. If we are close to the cell boundary, the bisection step and the Newton step are essentially the same, so the result is unchanged in that case.
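The step selection described above can be illustrated with a minimal Python sketch. This is not the patch itself: the function name `fallback_step`, its parameters, and the way the Newton and bisection steps are formed here are assumptions made for illustration, and the actual Fortran code in LAED4 may compute them differently. The sketch only shows the idea of replacing the bisection step by the geometric mean of the two candidate steps.

```python
import math


def fallback_step(tau, w, dw, lower, upper):
    """Hypothetical sketch of the overshoot fallback (not the LAED4 code).

    tau          current iterate inside the bracket [lower, upper]
    w, dw        objective value and its derivative at tau (dw assumed nonzero)
    lower, upper current bounds on the root
    """
    # Plain Newton step: per the description, too short in this branch.
    newton = -w / dw

    # Bisection step toward the bound on the side of the root:
    # too long when tau is far from the cell boundary.
    bound = upper if w < 0.0 else lower
    bisect = (bound - tau) / 2.0

    # Assumption of this sketch: if the two steps disagree in direction,
    # the geometric mean is undefined, so keep plain bisection.
    if newton * bisect <= 0.0:
        return bisect

    # Geometric mean of the magnitudes, carrying the common sign.
    return math.copysign(math.sqrt(newton * bisect), bisect)
```

For example, with a Newton step of 1e-4 and a bisection step of 0.5, the combined step is about 7.07e-3, which lies between the two. When both steps coincide, the geometric mean returns that same value, matching the remark that nothing changes close to the cell boundary.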

| iter | W | relative position |
|------|--------------|-------------------------|
| 3 | -1.51832E+01 | 9.3224424664962507E-012 |
| 4 | -7.25413E+00 | 2.8321247865713130E-011 |
| 5* | 8.18857E+01 | 3.0407842273038627E-006 |
| 6 | 4.09510E+01 | 1.5161499861086540E-006 |
| 7 | -2.46969E+00 | 1.0185715189871375E-010 |
| 8 | -1.22757E+00 | 2.1967449887941141E-010 |
| 9* | 3.65801E-01 | 1.3670968645180088E-008 |
| 10 | 1.60693E-01 | 7.1757532425621201E-009 |
| 11 | -4.84878E-02 | 2.6748870916100162E-009 |
| 12 | -5.24972E-03 | 3.3293874740295193E-009 |
| 13 | 7.41175E-05 | 3.4204917465753645E-009 |
| 14 | 1.46778E-08 | 3.4192075727019548E-009 |
| 15 | 1.07595E-12 | 3.4192073183864720E-009 |

The iterations marked with * enter the fallback branch. There, the objective function value W is much better than with plain bisection (compare with the table in the bug report). Instead of 31 iterations and 3 overshoots as in the bug report, we now need 15 iterations with only 2 overshoots.

Checklist

  • The documentation has been updated.
  • If the PR solves a specific issue, it is set to be closed on merge.

@angsch angsch linked an issue Feb 16, 2026 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Convergence failure in secular equation DLAED4
