Person: Zvonimir Bujanovic

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related research domains (1)

Algorithm

In mathematics and computer science, an algorithm (ˈælɡərɪðəm) is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation.

People doing similar research (163)

Courses taught by this person

No results

Related publications (3)

Zvonimir Bujanovic, Daniel Kressner

The Schur decomposition of a square matrix A is an important intermediate step of state-of-the-art numerical algorithms for addressing eigenvalue problems, matrix functions, and matrix equations. This work is concerned with the following task: Compute a (more) accurate Schur decomposition of A from a given approximate Schur decomposition. This task arises, for example, in the context of parameter-dependent eigenvalue problems and mixed precision computations. We have developed a Newton-like algorithm that requires the solution of a triangular matrix equation and an approximate orthogonalization step in every iteration. We prove local quadratic convergence for matrices with mutually distinct eigenvalues and observe fast convergence in practice. In a mixed low-high precision environment, our algorithm essentially reduces to only four high-precision matrix-matrix multiplications per iteration. When refining double to quadruple precision, it often needs only 3-4 iterations, which reduces the time of computing a quadruple precision Schur decomposition by up to a factor of 10-20.
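To make the refinement idea concrete, here is a minimal NumPy sketch of one Newton-like step under simplifying assumptions: a real matrix with real, mutually distinct eigenvalues, and everything done in double precision rather than mixed precision. The function names (`solve_strictly_lower`, `refine_step`, `lower_residual`) are hypothetical, and using a single Newton-Schulz step as the approximate orthogonalization is an assumption of this sketch, not a statement of the authors' implementation.

```python
import numpy as np


def solve_strictly_lower(T, E):
    """Solve stril(T @ L - L @ T) = -stril(E) for a strictly lower triangular L.

    T must be upper triangular with mutually distinct diagonal entries.
    """
    n = T.shape[0]
    L = np.zeros_like(T)
    for i in range(n - 1, 0, -1):  # rows, bottom-up
        for j in range(i):         # columns, left to right
            rhs = -E[i, j] - T[i, i + 1:] @ L[i + 1:, j] + L[i, :j] @ T[:j, j]
            L[i, j] = rhs / (T[i, i] - T[j, j])
    return L


def refine_step(A, Q):
    """One refinement step for an approximate orthogonal Schur factor Q of A."""
    n = A.shape[0]
    A_hat = Q.T @ A @ Q
    T = np.triu(A_hat)       # current approximate triangular factor
    E = np.tril(A_hat, -1)   # what is left below the diagonal
    L = solve_strictly_lower(T, E)
    W = L - L.T              # skew-symmetric first-order correction
    M = Q @ (np.eye(n) + W)
    # One Newton-Schulz step pulls M back towards an orthogonal matrix.
    return M @ (3.0 * np.eye(n) - M.T @ M) / 2.0


def lower_residual(A, Q):
    """Frobenius norm of the strictly lower triangular part of Q^T A Q."""
    return np.linalg.norm(np.tril(Q.T @ A @ Q, -1))


# Synthetic test problem with a known Schur form (illustrative data only).
rng = np.random.default_rng(0)
n = 6
T0 = np.triu(0.3 * rng.standard_normal((n, n)), 1) + np.diag(np.arange(1.0, n + 1))
Q0, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q0 @ T0 @ Q0.T

# Perturb the exact Schur basis to obtain an approximate one.
K = 1e-4 * rng.standard_normal((n, n))
Q, _ = np.linalg.qr(Q0 @ (np.eye(n) + K - K.T))

res_before = lower_residual(A, Q)
for _ in range(4):
    Q = refine_step(A, Q)
res_after = lower_residual(A, Q)
orth_error = np.linalg.norm(Q.T @ Q - np.eye(n))
```

The correction `W` is chosen so that the strictly lower triangular part of the transformed matrix vanishes to first order, which is why the division by `T[i, i] - T[j, j]` requires distinct eigenvalues.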

Zvonimir Bujanovic, Daniel Kressner

The QZ algorithm for computing eigenvalues and eigenvectors of a matrix pencil A - λB requires that the matrices first be reduced to Hessenberg-triangular (HT) form. The current method of choice for HT reduction relies entirely on Givens rotations regrouped and accumulated into small dense matrices which are subsequently applied using matrix multiplication routines. A nonvanishing fraction of the total flop-count must nevertheless still be performed as sequences of overlapping Givens rotations alternately applied from the left and from the right. The many data dependencies associated with this computational pattern lead to inefficient use of the processor and poor scalability. In this paper, we therefore introduce a fundamentally different approach that relies entirely on (large) Householder reflectors partially accumulated into block reflectors, by using (compact) WY representations. Even though the new algorithm requires more floating point operations than the state-of-the-art algorithm, extensive experiments on both real and synthetic data indicate that it is still competitive, even in a sequential setting. The new algorithm is conjectured to have better parallel scalability, an idea which is partially supported by early small-scale experiments using multithreaded BLAS. The design and evaluation of a parallel formulation is future work.
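The compact WY representation mentioned above can be illustrated independently of the HT reduction itself. The sketch below accumulates k Householder reflectors H_j = I - tau_j v_j v_j^T into a single block reflector I - V T V^T with upper triangular T, so that the product can be applied with matrix-matrix multiplications. The helper name `compact_wy_factor` and the random test data are assumptions for illustration; this is not the authors' reduction algorithm.

```python
import numpy as np


def compact_wy_factor(V, taus):
    """Return upper triangular T with H_1 H_2 ... H_k = I - V @ T @ V.T,

    where H_j = I - taus[j] * v_j v_j^T and v_j is column j of V.
    """
    n, k = V.shape
    T = np.zeros((k, k))
    for j in range(k):
        T[j, j] = taus[j]
        if j > 0:
            # New column couples reflector j with the already accumulated ones.
            T[:j, j] = -taus[j] * (T[:j, :j] @ (V[:, :j].T @ V[:, j]))
    return T


rng = np.random.default_rng(1)
n, k = 8, 3
V = rng.standard_normal((n, k))
taus = 2.0 / np.sum(V * V, axis=0)  # tau_j = 2 / (v_j^T v_j) gives true reflectors

# Reference: multiply the reflectors together one at a time.
Q_explicit = np.eye(n)
for j in range(k):
    H_j = np.eye(n) - taus[j] * np.outer(V[:, j], V[:, j])
    Q_explicit = Q_explicit @ H_j

T = compact_wy_factor(V, taus)
Q_blocked = np.eye(n) - V @ T @ V.T

# Applying the block reflector to a matrix C needs only matrix products.
C = rng.standard_normal((n, 5))
C_blocked = C - V @ (T @ (V.T @ C))
```

In practice the block reflector is never formed as a dense n-by-n matrix; only the last line's pattern, three matrix products with tall thin factors, is used.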

Zvonimir Bujanovic, Daniel Kressner

Any symmetric matrix can be reduced to antitriangular form in finitely many steps by orthogonal similarity transformations. This form reveals the inertia of the matrix and has found applications in, e.g., model predictive control and constraint preconditioning. Originally proposed by Mastronardi and Van Dooren, the existing algorithm for performing the reduction to antitriangular form is primarily based on Householder reflectors and Givens rotations. The poor memory access pattern of these operations implies that the performance of the algorithm is bound by the memory bandwidth. In this work, we develop a block algorithm that performs all operations almost entirely in terms of level 3 BLAS operations, which feature a more favorable memory access pattern and lead to better performance. These performance gains are confirmed by numerical experiments that cover a wide range of different inertia.
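For readers unfamiliar with the term, the inertia of a symmetric matrix is the triple counting its negative, zero, and positive eigenvalues, and it is unchanged by orthogonal similarity transformations. The sketch below computes it from a full eigenvalue decomposition, which is only a reference check and not the antitriangular reduction described above. The function name `inertia`, the tolerance rule, and the example matrices are assumptions for illustration.

```python
import numpy as np


def inertia(A, tol=None):
    """Return (n_negative, n_zero, n_positive) for a symmetric matrix A."""
    eigenvalues = np.linalg.eigvalsh(A)
    if tol is None:
        # Assumed tolerance: eigenvalues this close to zero count as zero.
        scale = max(1.0, float(np.max(np.abs(eigenvalues))))
        tol = A.shape[0] * np.finfo(float).eps * scale
    n_neg = int(np.sum(eigenvalues < -tol))
    n_pos = int(np.sum(eigenvalues > tol))
    n_zero = A.shape[0] - n_neg - n_pos
    return n_neg, n_zero, n_pos


A_diag = np.diag([-2.0, 0.0, 3.0])

# A symmetric matrix that is already antitriangular:
# every entry above the anti-diagonal is zero.
A_anti = np.array([[0.0, 0.0, 1.0],
                   [0.0, 2.0, 1.0],
                   [1.0, 1.0, 1.0]])

# An orthogonal similarity transformation leaves the inertia unchanged.
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
A_rotated = Q.T @ A_anti @ Q
```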