Support AD on CUDA #376

@Ryo-wtnb11

Hi developers.

I recently noticed that the new version of TensorKit supports running on CUDA, and I got a massive speedup when I tried it. It's wonderful!
However, when combining GPU-backed TensorMaps with AD (Zygote), the backward pass fails: several pullback functions rely on scalar indexing and therefore implicitly assume CPU arrays. Here is a minimal example:

using TensorKit
using Zygote
using CUDA
using cuTENSOR
using Adapt


V = Rep[U₁](1//2 => 2, -1//2 => 2)
A = randn(ComplexF64, V ← V)
A_gpu = adapt(CuArray, A)

# Forward works
D, U = eigh_trunc((A_gpu + A_gpu') / 2; trunc = truncrank(2))

@show D

# Backward fails
Zygote.gradient(A_gpu) do a
    D, U = eigh_trunc((a + a') / 2; trunc = truncrank(2))
    return real(tr(D))
end

Essentially the same issue occurs in many of the tensor-manipulation functions, such as flip() and twist().
Scalar indexing may well be the best approach on the CPU with the help of Strided.jl, so perhaps we need to implement dedicated pullbacks in the CUDA extension? That would be annoying...
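
For illustration, here is a minimal sketch of the general pattern, using plain CuArrays rather than TensorKit's actual pullbacks. The helpers `scale_loop!` and `scale_bcast!` are hypothetical names made up for this example, and running it assumes a CUDA-capable device. The loop version touches one element at a time, which is fine for an `Array` but is rejected for a `CuArray` once scalar indexing is disallowed. The broadcast version expresses the same operation without any element-wise access.

```julia
using CUDA

# Hypothetical helper: element-wise product written as an explicit loop.
# Each s[i], x[i], and y[i] access is a scalar index.
function scale_loop!(y, x, s)
    for i in eachindex(x)
        y[i] = s[i] * x[i]
    end
    return y
end

# Hypothetical helper: the same operation as a broadcast,
# which runs on the GPU without scalar indexing.
function scale_bcast!(y, x, s)
    y .= s .* x
    return y
end

# Make scalar indexing of GPU arrays an error instead of a warning.
CUDA.allowscalar(false)

x = CUDA.rand(Float64, 8)
s = CUDA.ones(Float64, 8)
y = similar(x)

scale_bcast!(y, x, s)   # works on the GPU

loop_failed = false
try
    scale_loop!(y, x, s)   # expected to throw on the first scalar access
catch err
    global loop_failed = true
    println("loop version failed: ", sprint(showerror, err))
end
```

`CUDA.@allowscalar` can temporarily permit scalar access while debugging, but it is slow, so a GPU-friendly pullback would presumably need to be written in the broadcast (or kernel) style instead.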
