Precompiling a model #185

mabuni1998 · 2025-09-06T22:03:38Z

mabuni1998
Sep 6, 2025

Hi,

I have recently begun using your package, and I think it is awesome so far! I am experimenting with quite large problems (see below), and I am wondering what approaches you would suggest to circumvent the long compile times that occur in the first run of the optimizer (I assume that the long first run time is because the Jacobian and Hessian functions are getting compiled). Would you simply include the code for generating the model in a module and make a precompilation step where we call the optimizer with for example max_iter=1. This seems like it might be a bit overkill...?

Also, what would you consider to be the limit of the package now? Like, what are the largest possible models that one could consider? I have below a smaller model which runs fine on my personal computer, and the larger one I have yet to generate results on a cluster. Both have like on the order of 20 to 100 constraints signatures.

Smaller model:

Problem name: Generic
All variables: ████████████████████ 52120 All constraints: ████████████████████ 48096
free: ███████████████████⋅ 48096 free: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0
lower: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0 lower: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0
upper: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0 upper: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0
low/upp: ██⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 4024 low/upp: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0
fixed: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0 fixed: ████████████████████ 48096
infeas: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0 infeas: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0
nnzh: ( 99.73% sparsity) 3628840 linear: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0
nonlinear: ████████████████████ 48096
nnzj: ( 99.95% sparsity) 1152096

Larger:

Problem name: Generic
All variables: ████████████████████ 187368 All constraints: ████████████████████ 184920
free: ████████████████████ 184920 free: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0
lower: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0 lower: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0
upper: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0 upper: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0
low/upp: █⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 2448 low/upp: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0
fixed: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0 fixed: ████████████████████ 184920
infeas: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0 infeas: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0
nnzh: ( 99.80% sparsity) 35065824 linear: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0
nonlinear: ████████████████████ 184920
nnzj: ( 99.97% sparsity) 9922520

sshin23 · 2025-09-07T17:51:30Z

sshin23
Sep 7, 2025
Maintainer

Hi @mabuni1998, this depends on your use case. If you want to make it an application that users can directly call, I think it could be useful to include a precompilation script. Have you checked PackageCompilers.jl? What solver are you using along with ExaModels.jl? On CPU or GPU?

Regarding the limit, I haven't experimented with how many constraint signatures can be handled. Probably 100 constraint signatures would be ok, though the compilation can take quite some time. Please feel free to make a PR on ExaModelsExamples.jl so that we can use it as a benchmarking instance in the future

0 replies

mabuni1998 · 2025-09-08T07:24:57Z

mabuni1998
Sep 8, 2025
Author

For now, it's just for personal use, but I might, for example, want to tweak the weighting in my objective function or stuff like that (using your parameters). So I was thinking about precompilation only to minimize the start time of my optimizations. I have run mostly on CPU, both on single and multi-core, but have also played with GPU locally (right now my cluster has CUDA drivers 13.0, which CUDSS.jl does not allow - we need the newest CUDA.jl version for that).

Regarding the benchmarking and pull request, I can put in the smaller model, perhaps? It runs in a reasonable time locally. The larger model took like 8 hours on a cluster using 8 cores.

I am working with quantum optimal control, so the (linear) ODE we are trying to optimize can get really large, really fast. Right now, the systems have 20 and 100 states, respectively, where each state has a different dynamic, which therefore gives 20 to 100 different signatures. I wanted to hear your thoughts on the limitations, because I thought the usage of GPU might enable us to push to even larger systems (maybe like 1000 states). But it seems that the complexity in compiling the long dynamical expressions might not make this approach viable?

0 replies

sshin23 · 2025-09-08T13:45:20Z

sshin23
Sep 8, 2025
Maintainer

Hi @mabuni1998, indeed 1000 constraint signature can be prohibitive. Do you have a code you can share? I can take a look at whether we can reduce the number of constraint signatures. Have you been using constraint!? This is quite useful for reducing constraint signatures

0 replies

mabuni1998 · 2025-09-09T07:45:45Z

mabuni1998
Sep 9, 2025
Author

I haven't been using constraint!, perhaps you can take a look at my code below and see whether it would make sense for me? Any inputs are welcome. I generate the dynamics equations symbolically, but they are just generated by multiplying a matrix M(u,v) (depends on control and variable) times the state at time step ti (the dynamics are dx/dt=M(u,v) x). I am wondering if it would be more efficient to use your bivariate function definition and just derive the derivatives with respect to some matrix by hand. But maybe this won't work well with the exa variables?

For the small problem, the code is attached below (this runs reasonably fast). I have also attached file which defines the larger problem, but this might be painful to run (takes 5-10 min to create the examodel, as for solving it might take several hours). Here, my guess is that the dynamics functions get too large expression wise and the GPU compiled code is too large for CUDA to compile (it complains: ERROR: Kernel invocation uses too much parameter memory.
52.219 KiB exceeds the 31.996 KiB limit imposed by sm_86 / PTX v8.8.

Larger model file:
examodels_github_larger.zip

).

using NLPModelsIpopt
using ExaModels

N_cav = 4
N_phot = 2

#Define initial and final states
initial_states =  [[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
 [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]

target_states =  [[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
 [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]

#Define bounds on controls and variables
control_lower_bounds = vcat([(0) for i in 1:N_cav], [(-10.0) for i in 1:N_cav])
control_upper_bounds = vcat([(10.0) for i in 1:N_cav], [(10.0) for i in 1:N_cav])

variable_lower_bounds = vcat([(-pi) for i in 1:N_cav^2])
variable_upper_bounds = vcat([( Float64(pi)) for i in 1:N_cav^2])

#Determine number of controls and variables
n_controls,n_param= 2*N_cav,N_cav^2

#Define hyper parameters
control_scale = 0.01
fidelity_weight = 1000
T_final = 10.0
grid_size = 200


#Define dynamics functions
function dyn_1(x, u, v, ti, ni)
    return -(-2.0000000000000004 - 2.0000000000000004u[5,ti] - 2.0000000000000004(u[1,ti]^2)*cos(v[1]))*x[11,ti,ni] + 1.4142135623730951u[1,ti]*u[2,ti]*x[12,ti,ni]*cos(v[6]) + 1.4142135623730951u[1,ti]*u[2,ti]*x[2,ti,ni]*cos(v[5]) + 1.4142135623730951u[1,ti]*u[3,ti]*x[14,ti,ni]*cos(v[8]) + 1.4142135623730951u[1,ti]*u[3,ti]*x[4,ti,ni]*cos(v[7]) + 1.4142135623730951u[1,ti]*u[4,ti]*x[17,ti,ni]*cos(v[10]) + 1.4142135623730951u[1,ti]*u[4,ti]*x[7,ti,ni]*cos(v[9])
end
function dyn_2(x, u, v, ti, ni)
    return (u[5,ti] + u[6,ti] + (u[1,ti]^2)*cos(v[1]) + (u[2,ti]^2)*cos(v[2]))*x[12,ti,ni] - 1.4142135623730951u[1,ti]*u[2,ti]*x[1,ti,ni]*cos(v[5]) + 1.4142135623730951u[1,ti]*u[2,ti]*x[11,ti,ni]*cos(v[6]) + 1.4142135623730951u[1,ti]*u[2,ti]*x[13,ti,ni]*cos(v[6]) + 1.4142135623730951u[1,ti]*u[2,ti]*x[3,ti,ni]*cos(v[5]) + u[1,ti]*u[3,ti]*x[15,ti,ni]*cos(v[8]) + u[1,ti]*u[3,ti]*x[5,ti,ni]*cos(v[7]) + u[1,ti]*u[4,ti]*x[18,ti,ni]*cos(v[10]) + u[1,ti]*u[4,ti]*x[8,ti,ni]*cos(v[9]) + u[2,ti]*u[3,ti]*x[14,ti,ni]*cos(v[12]) + u[2,ti]*u[3,ti]*x[4,ti,ni]*cos(v[11]) + u[2,ti]*u[4,ti]*x[17,ti,ni]*cos(v[14]) + u[2,ti]*u[4,ti]*x[7,ti,ni]*cos(v[13])
end
function dyn_3(x, u, v, ti, ni)
    return 1.4142135623730951u[1,ti]*u[2,ti]*x[12,ti,ni]*cos(v[6]) - 1.4142135623730951u[1,ti]*u[2,ti]*x[2,ti,ni]*cos(v[5]) - (-2.0000000000000004 - 2.0000000000000004u[6,ti] - 2.0000000000000004(u[2,ti]^2)*cos(v[2]))*x[13,ti,ni] + 1.4142135623730951u[2,ti]*u[3,ti]*x[15,ti,ni]*cos(v[12]) + 1.4142135623730951u[2,ti]*u[3,ti]*x[5,ti,ni]*cos(v[11]) + 1.4142135623730951u[2,ti]*u[4,ti]*x[18,ti,ni]*cos(v[14]) + 1.4142135623730951u[2,ti]*u[4,ti]*x[8,ti,ni]*cos(v[13])
end
function dyn_4(x, u, v, ti, ni)
    return -(-u[5,ti] - u[7,ti] - (u[1,ti]^2)*cos(v[1]) - (u[3,ti]^2)*cos(v[3]))*x[14,ti,ni] + u[1,ti]*u[2,ti]*x[15,ti,ni]*cos(v[6]) + u[1,ti]*u[2,ti]*x[5,ti,ni]*cos(v[5]) - 1.4142135623730951u[1,ti]*u[3,ti]*x[1,ti,ni]*cos(v[7]) + 1.4142135623730951u[1,ti]*u[3,ti]*x[11,ti,ni]*cos(v[8]) + 1.4142135623730951u[1,ti]*u[3,ti]*x[16,ti,ni]*cos(v[8]) + 1.4142135623730951u[1,ti]*u[3,ti]*x[6,ti,ni]*cos(v[7]) + u[1,ti]*u[4,ti]*x[19,ti,ni]*cos(v[10]) + u[1,ti]*u[4,ti]*x[9,ti,ni]*cos(v[9]) + u[2,ti]*u[3,ti]*x[12,ti,ni]*cos(v[12]) - u[2,ti]*u[3,ti]*x[2,ti,ni]*cos(v[11]) + u[3,ti]*u[4,ti]*x[17,ti,ni]*cos(v[16]) + u[3,ti]*u[4,ti]*x[7,ti,ni]*cos(v[15])
end
function dyn_5(x, u, v, ti, ni)
    return u[1,ti]*u[2,ti]*x[14,ti,ni]*cos(v[6]) - u[1,ti]*u[2,ti]*x[4,ti,ni]*cos(v[5]) + u[1,ti]*u[3,ti]*x[12,ti,ni]*cos(v[8]) - u[1,ti]*u[3,ti]*x[2,ti,ni]*cos(v[7]) + (u[6,ti] + u[7,ti] + (u[2,ti]^2)*cos(v[2]) + (u[3,ti]^2)*cos(v[3]))*x[15,ti,ni] + 1.4142135623730951u[2,ti]*u[3,ti]*x[13,ti,ni]*cos(v[12]) + 1.4142135623730951u[2,ti]*u[3,ti]*x[16,ti,ni]*cos(v[12]) - 1.4142135623730951u[2,ti]*u[3,ti]*x[3,ti,ni]*cos(v[11]) + 1.4142135623730951u[2,ti]*u[3,ti]*x[6,ti,ni]*cos(v[11]) + u[2,ti]*u[4,ti]*x[19,ti,ni]*cos(v[14]) + u[2,ti]*u[4,ti]*x[9,ti,ni]*cos(v[13]) + u[3,ti]*u[4,ti]*x[18,ti,ni]*cos(v[16]) + u[3,ti]*u[4,ti]*x[8,ti,ni]*cos(v[15])
end
function dyn_6(x, u, v, ti, ni)
    return 1.4142135623730951u[1,ti]*u[3,ti]*x[14,ti,ni]*cos(v[8]) - 1.4142135623730951u[1,ti]*u[3,ti]*x[4,ti,ni]*cos(v[7]) + 1.4142135623730951u[2,ti]*u[3,ti]*x[15,ti,ni]*cos(v[12]) - 1.4142135623730951u[2,ti]*u[3,ti]*x[5,ti,ni]*cos(v[11]) - (-2.0000000000000004 - 2.0000000000000004u[7,ti] - 2.0000000000000004(u[3,ti]^2)*cos(v[3]))*x[16,ti,ni] + 1.4142135623730951u[3,ti]*u[4,ti]*x[19,ti,ni]*cos(v[16]) + 1.4142135623730951u[3,ti]*u[4,ti]*x[9,ti,ni]*cos(v[15])
end
function dyn_7(x, u, v, ti, ni)
    return (u[5,ti] + u[8,ti] + (u[1,ti]^2)*cos(v[1]) + (u[4,ti]^2)*cos(v[4]))*x[17,ti,ni] + u[1,ti]*u[2,ti]*x[18,ti,ni]*cos(v[6]) + u[1,ti]*u[2,ti]*x[8,ti,ni]*cos(v[5]) + u[1,ti]*u[3,ti]*x[19,ti,ni]*cos(v[8]) + u[1,ti]*u[3,ti]*x[9,ti,ni]*cos(v[7]) - 1.4142135623730951u[1,ti]*u[4,ti]*x[1,ti,ni]*cos(v[9]) + 1.4142135623730951u[1,ti]*u[4,ti]*x[10,ti,ni]*cos(v[9]) + 1.4142135623730951u[1,ti]*u[4,ti]*x[11,ti,ni]*cos(v[10]) + 1.4142135623730951u[1,ti]*u[4,ti]*x[20,ti,ni]*cos(v[10]) + u[2,ti]*u[4,ti]*x[12,ti,ni]*cos(v[14]) - u[2,ti]*u[4,ti]*x[2,ti,ni]*cos(v[13]) + u[3,ti]*u[4,ti]*x[14,ti,ni]*cos(v[16]) - u[3,ti]*u[4,ti]*x[4,ti,ni]*cos(v[15])
end
function dyn_8(x, u, v, ti, ni)
    return u[1,ti]*u[2,ti]*x[17,ti,ni]*cos(v[6]) - u[1,ti]*u[2,ti]*x[7,ti,ni]*cos(v[5]) + u[1,ti]*u[4,ti]*x[12,ti,ni]*cos(v[10]) - u[1,ti]*u[4,ti]*x[2,ti,ni]*cos(v[9]) - (-u[6,ti] - u[8,ti] - (u[2,ti]^2)*cos(v[2]) - (u[4,ti]^2)*cos(v[4]))*x[18,ti,ni] + u[2,ti]*u[3,ti]*x[19,ti,ni]*cos(v[12]) + u[2,ti]*u[3,ti]*x[9,ti,ni]*cos(v[11]) + 1.4142135623730951u[2,ti]*u[4,ti]*x[10,ti,ni]*cos(v[13]) + 1.4142135623730951u[2,ti]*u[4,ti]*x[13,ti,ni]*cos(v[14]) + 1.4142135623730951u[2,ti]*u[4,ti]*x[20,ti,ni]*cos(v[14]) - 1.4142135623730951u[2,ti]*u[4,ti]*x[3,ti,ni]*cos(v[13]) + u[3,ti]*u[4,ti]*x[15,ti,ni]*cos(v[16]) - u[3,ti]*u[4,ti]*x[5,ti,ni]*cos(v[15])
end
function dyn_9(x, u, v, ti, ni)
    return u[1,ti]*u[3,ti]*x[17,ti,ni]*cos(v[8]) - u[1,ti]*u[3,ti]*x[7,ti,ni]*cos(v[7]) + u[1,ti]*u[4,ti]*x[14,ti,ni]*cos(v[10]) - u[1,ti]*u[4,ti]*x[4,ti,ni]*cos(v[9]) + u[2,ti]*u[3,ti]*x[18,ti,ni]*cos(v[12]) - u[2,ti]*u[3,ti]*x[8,ti,ni]*cos(v[11]) + u[2,ti]*u[4,ti]*x[15,ti,ni]*cos(v[14]) - u[2,ti]*u[4,ti]*x[5,ti,ni]*cos(v[13]) - (-u[7,ti] - u[8,ti] - (u[3,ti]^2)*cos(v[3]) - (u[4,ti]^2)*cos(v[4]))*x[19,ti,ni] + 1.4142135623730951u[3,ti]*u[4,ti]*x[10,ti,ni]*cos(v[15]) + 1.4142135623730951u[3,ti]*u[4,ti]*x[16,ti,ni]*cos(v[16]) + 1.4142135623730951u[3,ti]*u[4,ti]*x[20,ti,ni]*cos(v[16]) - 1.4142135623730951u[3,ti]*u[4,ti]*x[6,ti,ni]*cos(v[15])
end
function dyn_10(x, u, v, ti, ni)
    return 1.4142135623730951u[1,ti]*u[4,ti]*x[17,ti,ni]*cos(v[10]) - 1.4142135623730951u[1,ti]*u[4,ti]*x[7,ti,ni]*cos(v[9]) + 1.4142135623730951u[2,ti]*u[4,ti]*x[18,ti,ni]*cos(v[14]) - 1.4142135623730951u[2,ti]*u[4,ti]*x[8,ti,ni]*cos(v[13]) + 1.4142135623730951u[3,ti]*u[4,ti]*x[19,ti,ni]*cos(v[16]) - 1.4142135623730951u[3,ti]*u[4,ti]*x[9,ti,ni]*cos(v[15]) - (-2.0000000000000004 - 2.0000000000000004u[8,ti] - 2.0000000000000004(u[4,ti]^2)*cos(v[4]))*x[20,ti,ni]
end
function dyn_11(x, u, v, ti, ni)
    return (-2.0000000000000004 - 2.0000000000000004u[5,ti] - 2.0000000000000004(u[1,ti]^2)*cos(v[1]))*x[1,ti,ni] + 1.4142135623730951u[1,ti]*u[2,ti]*x[12,ti,ni]*cos(v[5]) - 1.4142135623730951u[1,ti]*u[2,ti]*x[2,ti,ni]*cos(v[6]) + 1.4142135623730951u[1,ti]*u[3,ti]*x[14,ti,ni]*cos(v[7]) - 1.4142135623730951u[1,ti]*u[3,ti]*x[4,ti,ni]*cos(v[8]) + 1.4142135623730951u[1,ti]*u[4,ti]*x[17,ti,ni]*cos(v[9]) - 1.4142135623730951u[1,ti]*u[4,ti]*x[7,ti,ni]*cos(v[10])
end
function dyn_12(x, u, v, ti, ni)
    return (-u[5,ti] - u[6,ti] - (u[1,ti]^2)*cos(v[1]) - (u[2,ti]^2)*cos(v[2]))*x[2,ti,ni] - 1.4142135623730951u[1,ti]*u[2,ti]*x[1,ti,ni]*cos(v[6]) - 1.4142135623730951u[1,ti]*u[2,ti]*x[11,ti,ni]*cos(v[5]) + 1.4142135623730951u[1,ti]*u[2,ti]*x[13,ti,ni]*cos(v[5]) - 1.4142135623730951u[1,ti]*u[2,ti]*x[3,ti,ni]*cos(v[6]) + u[1,ti]*u[3,ti]*x[15,ti,ni]*cos(v[7]) - u[1,ti]*u[3,ti]*x[5,ti,ni]*cos(v[8]) + u[1,ti]*u[4,ti]*x[18,ti,ni]*cos(v[9]) - u[1,ti]*u[4,ti]*x[8,ti,ni]*cos(v[10]) + u[2,ti]*u[3,ti]*x[14,ti,ni]*cos(v[11]) - u[2,ti]*u[3,ti]*x[4,ti,ni]*cos(v[12]) + u[2,ti]*u[4,ti]*x[17,ti,ni]*cos(v[13]) - u[2,ti]*u[4,ti]*x[7,ti,ni]*cos(v[14])
end
function dyn_13(x, u, v, ti, ni)
    return -1.4142135623730951u[1,ti]*u[2,ti]*x[12,ti,ni]*cos(v[5]) - 1.4142135623730951u[1,ti]*u[2,ti]*x[2,ti,ni]*cos(v[6]) + (-2.0000000000000004 - 2.0000000000000004u[6,ti] - 2.0000000000000004(u[2,ti]^2)*cos(v[2]))*x[3,ti,ni] + 1.4142135623730951u[2,ti]*u[3,ti]*x[15,ti,ni]*cos(v[11]) - 1.4142135623730951u[2,ti]*u[3,ti]*x[5,ti,ni]*cos(v[12]) + 1.4142135623730951u[2,ti]*u[4,ti]*x[18,ti,ni]*cos(v[13]) - 1.4142135623730951u[2,ti]*u[4,ti]*x[8,ti,ni]*cos(v[14])
end
function dyn_14(x, u, v, ti, ni)
    return (-u[5,ti] - u[7,ti] - (u[1,ti]^2)*cos(v[1]) - (u[3,ti]^2)*cos(v[3]))*x[4,ti,ni] + u[1,ti]*u[2,ti]*x[15,ti,ni]*cos(v[5]) - u[1,ti]*u[2,ti]*x[5,ti,ni]*cos(v[6]) - 1.4142135623730951u[1,ti]*u[3,ti]*x[1,ti,ni]*cos(v[8]) - 1.4142135623730951u[1,ti]*u[3,ti]*x[11,ti,ni]*cos(v[7]) + 1.4142135623730951u[1,ti]*u[3,ti]*x[16,ti,ni]*cos(v[7]) - 1.4142135623730951u[1,ti]*u[3,ti]*x[6,ti,ni]*cos(v[8]) + u[1,ti]*u[4,ti]*x[19,ti,ni]*cos(v[9]) - u[1,ti]*u[4,ti]*x[9,ti,ni]*cos(v[10]) - u[2,ti]*u[3,ti]*x[12,ti,ni]*cos(v[11]) - u[2,ti]*u[3,ti]*x[2,ti,ni]*cos(v[12]) + u[3,ti]*u[4,ti]*x[17,ti,ni]*cos(v[15]) - u[3,ti]*u[4,ti]*x[7,ti,ni]*cos(v[16])
end
function dyn_15(x, u, v, ti, ni)
    return -u[1,ti]*u[2,ti]*x[14,ti,ni]*cos(v[5]) - u[1,ti]*u[2,ti]*x[4,ti,ni]*cos(v[6]) - u[1,ti]*u[3,ti]*x[12,ti,ni]*cos(v[7]) - u[1,ti]*u[3,ti]*x[2,ti,ni]*cos(v[8]) + (-u[6,ti] - u[7,ti] - (u[2,ti]^2)*cos(v[2]) - (u[3,ti]^2)*cos(v[3]))*x[5,ti,ni] - 1.4142135623730951u[2,ti]*u[3,ti]*x[13,ti,ni]*cos(v[11]) + 1.4142135623730951u[2,ti]*u[3,ti]*x[16,ti,ni]*cos(v[11]) - 1.4142135623730951u[2,ti]*u[3,ti]*x[3,ti,ni]*cos(v[12]) - 1.4142135623730951u[2,ti]*u[3,ti]*x[6,ti,ni]*cos(v[12]) + u[2,ti]*u[4,ti]*x[19,ti,ni]*cos(v[13]) - u[2,ti]*u[4,ti]*x[9,ti,ni]*cos(v[14]) + u[3,ti]*u[4,ti]*x[18,ti,ni]*cos(v[15]) - u[3,ti]*u[4,ti]*x[8,ti,ni]*cos(v[16])
end
function dyn_16(x, u, v, ti, ni)
    return -1.4142135623730951u[1,ti]*u[3,ti]*x[14,ti,ni]*cos(v[7]) - 1.4142135623730951u[1,ti]*u[3,ti]*x[4,ti,ni]*cos(v[8]) - 1.4142135623730951u[2,ti]*u[3,ti]*x[15,ti,ni]*cos(v[11]) - 1.4142135623730951u[2,ti]*u[3,ti]*x[5,ti,ni]*cos(v[12]) + (-2.0000000000000004 - 2.0000000000000004u[7,ti] - 2.0000000000000004(u[3,ti]^2)*cos(v[3]))*x[6,ti,ni] + 1.4142135623730951u[3,ti]*u[4,ti]*x[19,ti,ni]*cos(v[15]) - 1.4142135623730951u[3,ti]*u[4,ti]*x[9,ti,ni]*cos(v[16])
end
function dyn_17(x, u, v, ti, ni)
    return (-u[5,ti] - u[8,ti] - (u[1,ti]^2)*cos(v[1]) - (u[4,ti]^2)*cos(v[4]))*x[7,ti,ni] + u[1,ti]*u[2,ti]*x[18,ti,ni]*cos(v[5]) - u[1,ti]*u[2,ti]*x[8,ti,ni]*cos(v[6]) + u[1,ti]*u[3,ti]*x[19,ti,ni]*cos(v[7]) - u[1,ti]*u[3,ti]*x[9,ti,ni]*cos(v[8]) - 1.4142135623730951u[1,ti]*u[4,ti]*x[1,ti,ni]*cos(v[10]) - 1.4142135623730951u[1,ti]*u[4,ti]*x[10,ti,ni]*cos(v[10]) - 1.4142135623730951u[1,ti]*u[4,ti]*x[11,ti,ni]*cos(v[9]) + 1.4142135623730951u[1,ti]*u[4,ti]*x[20,ti,ni]*cos(v[9]) - u[2,ti]*u[4,ti]*x[12,ti,ni]*cos(v[13]) - u[2,ti]*u[4,ti]*x[2,ti,ni]*cos(v[14]) - u[3,ti]*u[4,ti]*x[14,ti,ni]*cos(v[15]) - u[3,ti]*u[4,ti]*x[4,ti,ni]*cos(v[16])
end
function dyn_18(x, u, v, ti, ni)
    return -u[1,ti]*u[2,ti]*x[17,ti,ni]*cos(v[5]) - u[1,ti]*u[2,ti]*x[7,ti,ni]*cos(v[6]) - u[1,ti]*u[4,ti]*x[12,ti,ni]*cos(v[9]) - u[1,ti]*u[4,ti]*x[2,ti,ni]*cos(v[10]) + (-u[6,ti] - u[8,ti] - (u[2,ti]^2)*cos(v[2]) - (u[4,ti]^2)*cos(v[4]))*x[8,ti,ni] + u[2,ti]*u[3,ti]*x[19,ti,ni]*cos(v[11]) - u[2,ti]*u[3,ti]*x[9,ti,ni]*cos(v[12]) - 1.4142135623730951u[2,ti]*u[4,ti]*(x[10,ti,ni] + x[3,ti,ni])*cos(v[14]) - 1.4142135623730951u[2,ti]*u[4,ti]*x[13,ti,ni]*cos(v[13]) + 1.4142135623730951u[2,ti]*u[4,ti]*x[20,ti,ni]*cos(v[13]) - u[3,ti]*u[4,ti]*x[15,ti,ni]*cos(v[15]) - u[3,ti]*u[4,ti]*x[5,ti,ni]*cos(v[16])
end
function dyn_19(x, u, v, ti, ni)
    return -u[1,ti]*u[3,ti]*x[17,ti,ni]*cos(v[7]) - u[1,ti]*u[3,ti]*x[7,ti,ni]*cos(v[8]) - u[1,ti]*u[4,ti]*x[14,ti,ni]*cos(v[9]) - u[1,ti]*u[4,ti]*x[4,ti,ni]*cos(v[10]) - u[2,ti]*u[3,ti]*x[18,ti,ni]*cos(v[11]) - u[2,ti]*u[3,ti]*x[8,ti,ni]*cos(v[12]) - u[2,ti]*u[4,ti]*x[15,ti,ni]*cos(v[13]) - u[2,ti]*u[4,ti]*x[5,ti,ni]*cos(v[14]) + (-u[7,ti] - u[8,ti] - (u[3,ti]^2)*cos(v[3]) - (u[4,ti]^2)*cos(v[4]))*x[9,ti,ni] - 1.4142135623730951u[3,ti]*u[4,ti]*x[10,ti,ni]*cos(v[16]) - 1.4142135623730951u[3,ti]*u[4,ti]*x[16,ti,ni]*cos(v[15]) + 1.4142135623730951u[3,ti]*u[4,ti]*x[20,ti,ni]*cos(v[15]) - 1.4142135623730951u[3,ti]*u[4,ti]*x[6,ti,ni]*cos(v[16])
end
function dyn_20(x, u, v, ti, ni)
    return -1.4142135623730951u[1,ti]*u[4,ti]*x[17,ti,ni]*cos(v[9]) - 1.4142135623730951u[1,ti]*u[4,ti]*x[7,ti,ni]*cos(v[10]) - 1.4142135623730951u[2,ti]*u[4,ti]*x[18,ti,ni]*cos(v[13]) - 1.4142135623730951u[2,ti]*u[4,ti]*x[8,ti,ni]*cos(v[14]) - 1.4142135623730951u[3,ti]*u[4,ti]*x[19,ti,ni]*cos(v[15]) - 1.4142135623730951u[3,ti]*u[4,ti]*x[9,ti,ni]*cos(v[16]) + (-2.0000000000000004 - 2.0000000000000004u[8,ti] - 2.0000000000000004(u[4,ti]^2)*cos(v[4]))*x[10,ti,ni]
end

#List of dynamic functions
funcs = [dyn_1,dyn_2,dyn_3,dyn_4,dyn_5,dyn_6,dyn_7,dyn_8,dyn_9,dyn_10,dyn_11,dyn_12,dyn_13,dyn_14,dyn_15,dyn_16,dyn_17,dyn_18,dyn_19,dyn_20]

#Objective for target state
function infidelity(x,target,j,grid_size;weight=1)
    return weight * (1-sum([x[i,grid_size,j]*target[i] for i in 1:length(target)]))^2
end


function build_exa_model(funcs, initial_states,target_states,n_controls,n_param,grid_size;backend=nothing,
    control_lower_bounds=nothing,control_upper_bounds=nothing,
    control_deriv_lower_bounds=nothing,control_deriv_upper_bounds=nothing,
    control_deriv_deriv_lower_bounds=nothing,control_deriv_deriv_upper_bounds=nothing,
    variable_lower_bounds=nothing,variable_upper_bounds=nothing,
    fidelity_weight=100.0,control_scale=0.01,T_final=10)

    println("Grid size: $grid_size")
    x_dim = length(initial_states[1])
    n_states = length(initial_states)
    if isnothing(control_lower_bounds)
        control_lower_bounds_exa = -Inf
    else
        control_lower_bounds_exa = [control_lower_bounds[i] for (i, j) ∈ Base.product(1:n_controls, 0:grid_size)]
    end
    if isnothing(control_upper_bounds)
        control_upper_bounds_exa = Inf
    else
        control_upper_bounds_exa = [control_upper_bounds[i] for (i, j) ∈ Base.product(1:n_controls, 0:grid_size)]
    end
    if isnothing(control_deriv_lower_bounds)
        control_deriv_lower_bounds_exa = -Inf
    else
        control_deriv_lower_bounds_exa = [control_deriv_lower_bounds[i] for (i, j) ∈ Base.product(1:n_controls, 0:grid_size)]
    end
    if isnothing(control_deriv_upper_bounds)
        control_deriv_upper_bounds_exa = Inf
    else
        control_deriv_upper_bounds_exa = [control_deriv_upper_bounds[i] for (i, j) ∈ Base.product(1:n_controls, 0:grid_size)]
    end
    if isnothing(control_deriv_deriv_lower_bounds)
        control_deriv_deriv_lower_bounds_exa = -Inf
    else
        control_deriv_deriv_lower_bounds_exa = [control_deriv_deriv_lower_bounds[i] for (i, j) ∈ Base.product(1:n_controls, 0:grid_size)]
    end
    if isnothing(control_deriv_deriv_upper_bounds)
        control_deriv_deriv_upper_bounds_exa = Inf
    else
        control_deriv_deriv_upper_bounds_exa =[control_deriv_deriv_upper_bounds[i] for (i, j) ∈ Base.product(1:n_controls, 0:grid_size)]
    end

    if isnothing(variable_lower_bounds)
        variable_lower_bounds_exa = -Inf*ones(n_param)
    else
        variable_lower_bounds_exa = variable_lower_bounds
    end
    if isnothing(variable_upper_bounds)
        variable_upper_bounds_exa = Inf*ones(n_param)
    else
        variable_upper_bounds_exa = variable_upper_bounds
    end
    

    #convert_psi_to_x.(initial_states)

    n_states = length(initial_states)
    #funcs = eval.(funcs)

    if backend ==nothing
        c = ExaCore(;backend=backend)
    elseif backend == CPU()
        c = ExaCore(Float64,backend=backend)
    else
        #If the backend is GPU should one use Float32?
        c = ExaCore(Float64,backend=backend)
    end
    
    println("defining x")
    x = variable(c, x_dim, 0:grid_size, n_states,lvar=-Inf,uvar=Inf;start=0.1)

    

    ddu = variable(c, n_controls, 0:grid_size;
                lvar = control_deriv_deriv_lower_bounds_exa,  # Control amplitude bounds
                uvar = control_deriv_deriv_upper_bounds_exa,start=rand(8))
    du = variable(c, n_controls, 0:grid_size;
                lvar = control_deriv_lower_bounds_exa,  # Control amplitude bounds
                uvar = control_deriv_upper_bounds_exa,start=0.1)
    u = variable(c, n_controls, 0:grid_size;
                lvar = control_lower_bounds_exa,  # Control amplitude bounds
                uvar = control_upper_bounds_exa,start=0.1)

    v = variable(c, n_param;
                    lvar = variable_lower_bounds_exa,  # Variable bounds
                    uvar = variable_upper_bounds_exa,start=0.1)
    
    
    dt = T_final / grid_size
    
    
    #Define dynamic constraints
    println("defining dynamic constraints")
    itr1 = ExaModels.convert_array(collect(Base.product(0:(grid_size - 1), 1:n_states)),backend)
    for i in 1:x_dim
        constraint(c, x[i,j+1,k] - x[i,j,k] - funcs[i](x,u,v,j+1,k)*dt for (j, k) ∈ itr1)
    end
    
    #Define initial_states constraints
    itr2 = ExaModels.convert_array(collect(1:x_dim),backend)
    for j in 1:length(initial_states)
        constraint(c,x[i,0,j] for i in itr2;lcon=initial_states[j], ucon=initial_states[j])
    end
    itr_controls = ExaModels.convert_array(collect(1:n_controls),backend)
    
    #Define boundary conditions for controls
    constraint(c,u[i,0] for i in itr_controls;lcon=0, ucon=0)
    constraint(c,u[i,grid_size] for i in itr_controls;lcon=0, ucon=0)

    #Define control dynamics
    itr3 = ExaModels.convert_array(collect(Base.product(0:(grid_size - 1),1:n_controls)),backend)
    constraint(c, du[j,i+1] - du[j,i] - ddu[j,i+1]*dt for (i,j) in itr3)
    constraint(c, u[j,i+1] - u[j,i] - du[j,i+1]*dt for (i,j) in itr3)

    #Lagrange cost on controls
    itr4 = ExaModels.convert_array(collect(Base.product(1:n_controls, 1:grid_size)),backend)
    objective(c,control_scale*dt*(u[i,j]^2 + du[i,j]^2 + ddu[i,j]^2) for (i,j) in itr4)
    
    #Infidelity cost on target states
    for j in 1:n_states
        objective(c,infidelity(x,target_states[j],j,grid_size,weight=fidelity_weight))
    end


    return ExaModel(c),x,u,du,ddu,v,dt
end


m, x,u,du,ddu,v,dt = build_exa_model(funcs, initial_states,target_states,n_controls,n_param,500;backend=nothing,
    control_lower_bounds=control_lower_bounds,control_upper_bounds=control_upper_bounds,
    variable_lower_bounds=variable_lower_bounds,variable_upper_bounds=variable_upper_bounds,
    fidelity_weight=fidelity_weight,control_scale=control_scale,T_final=T_final)

sol = ipopt(m,max_iter=1000)

x_final = solution(sol,x)
x_final[4,end,1]
x_final[8,end,2]
x_final[7,end,3]
x_final[5,end,4]

0 replies

sshin23 · 2025-09-09T14:38:55Z

sshin23
Sep 9, 2025
Maintainer

@mabuni1998 Most likely, you can model it with just a few constraint signatures if you properly use constraint!. You can try to model them by doing (roughly)

c = ExaCore()
c1 = constraint(c, constraint_dimension)
constraint!(
    c, c1,
    i.con_index => i.coefficient * x[i.x_index] * u[i.u_index1] * u[i.u_index2] * cos( v[i.v_index]) 
    for i in array_you_construct_a_priori
)

0 replies

sshin23 · 2025-09-09T14:39:43Z

sshin23
Sep 9, 2025
Maintainer

Please refer to our documentation for constraint! and let us know if it works

0 replies

mabuni1998 · 2025-09-09T14:40:24Z

mabuni1998
Sep 9, 2025
Author

Thanks, I will try that and get back.

0 replies

sshin23 · 2025-09-09T15:00:38Z

sshin23
Sep 9, 2025
Maintainer

@mabuni1998 note that I converted this into a discussion

0 replies

mabuni1998 · 2025-09-09T15:05:00Z

mabuni1998
Sep 9, 2025
Author

@sshin23 That is perfectly alright. Just for my understanding, will it change the performance if I somehow reduce the number of constraint signatures, or is that just to reduce the compilation time?

0 replies

sshin23 · 2025-09-09T15:07:41Z

sshin23
Sep 9, 2025
Maintainer

On the CPU, the performance is likely to be relatively unchanged. On the GPU, this can make a significant difference in runtime, as it reduces the number of GPU kernels launched. The most significant benefit would be being able to scale up without worrying about a compilation time blowup

0 replies

Precompiling a model #185

Uh oh!

mabuni1998 Sep 6, 2025

Replies: 10 comments

Uh oh!

sshin23 Sep 7, 2025 Maintainer

Uh oh!

mabuni1998 Sep 8, 2025 Author

Uh oh!

sshin23 Sep 8, 2025 Maintainer

Uh oh!

mabuni1998 Sep 9, 2025 Author

Uh oh!

Uh oh!

sshin23 Sep 9, 2025 Maintainer

Uh oh!

sshin23 Sep 9, 2025 Maintainer

Uh oh!

mabuni1998 Sep 9, 2025 Author

Uh oh!

sshin23 Sep 9, 2025 Maintainer

Uh oh!

mabuni1998 Sep 9, 2025 Author

Uh oh!

sshin23 Sep 9, 2025 Maintainer

mabuni1998
Sep 6, 2025

sshin23
Sep 7, 2025
Maintainer

mabuni1998
Sep 8, 2025
Author

sshin23
Sep 8, 2025
Maintainer

mabuni1998
Sep 9, 2025
Author

sshin23
Sep 9, 2025
Maintainer

sshin23
Sep 9, 2025
Maintainer

mabuni1998
Sep 9, 2025
Author

sshin23
Sep 9, 2025
Maintainer

mabuni1998
Sep 9, 2025
Author

sshin23
Sep 9, 2025
Maintainer