Using @snoop_inference to emit manual precompile directives
In a few cases, it may be inconvenient or impossible to precompile using a workload. Some examples might be:
- an application that opens graphical windows
- an application that connects to a database
- an application that creates, deletes, or rewrites files on disk
In such cases, one alternative is to create a manual list of precompile directives using Julia's precompile(f, argtypes) function.
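As a minimal sketch of what a single directive looks like, the snippet below requests compilation of one method for one concrete argument type. The function howbig_demo is hypothetical and exists only for illustration; precompile itself is the Base function named above.

```julia
# Hypothetical function, defined only for this illustration
howbig_demo(x::Real) = x > 1 ? "big" : "small"

# Argument types are passed as a tuple of types.
# The call returns a Bool indicating whether compilation succeeded.
ok = precompile(howbig_demo, (Float64,))
println("directive succeeded: ", ok)
```

Concrete argument types are used here because the directive is meant to match the exact signature that will be called at run time.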
Manual precompile directives are much more likely to "go stale" as the package is developed: precompile does not throw an error if a method for the given argtypes cannot be found. They are also more likely to be dependent on the Julia version, operating system, or CPU architecture. Whenever possible, it's safer to use a workload.
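The silent failure mode can be sketched as follows. The function area_demo and the helper checked_precompile are hypothetical names introduced for this example; the only library behavior relied on is that precompile returns false, rather than throwing, when no method matches.

```julia
# Hypothetical function: imagine an earlier version also accepted a String
area_demo(r::Float64) = pi * r^2

stale = precompile(area_demo, (String,))   # no matching method: returns false, no error
fresh = precompile(area_demo, (Float64,))  # matching method: returns true

# One possible way to surface stale directives during development (an assumption,
# not part of SnoopCompile): warn whenever a directive fails.
function checked_precompile(f, argtypes)
    success = precompile(f, argtypes)
    success || @warn "precompile directive failed" f argtypes
    return success
end

checked_precompile(area_demo, (String,))   # emits a warning
```

Checking the return value in this way turns a stale directive into something visible in the logs instead of a silent no-op.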
precompile directives have to be emitted by the module that owns the method and/or types. SnoopCompile comes with a tool, parcel, that splits out the "root-most" precompilable MethodInstances into their constituent modules. These typically correspond to the bottom row of boxes in the flame graph. When some of those bottom-row MethodInstances are not naively precompilable, the list will instead include MethodInstances from higher up in the call tree.
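To make the notion of an "owning" module concrete, the sketch below looks up which module defines the method that a given call dispatches to. This only illustrates ownership; it is not how parcel is implemented. OwnerDemo and its function f are hypothetical, while which and parentmodule are standard Base functions.

```julia
# Hypothetical module standing in for a package under your control
module OwnerDemo
    f(x) = x
end

# Method selected for f(::Int), and the module that defines it
own_method = which(OwnerDemo.f, (Int,))
own_module = parentmodule(own_method)

# The same lookup for a Base method
base_method = which(string, (String, Int, String))
base_module = parentmodule(base_method)

println(own_module, " owns ", own_method.name)
println(base_module, " owns ", base_method.name)
```

A directive for the first method belongs in OwnerDemo, whereas a directive for the second would have to live in Base.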
Let's use SnoopCompile.parcel on our OptimizeMe demo:
julia> using SnoopCompileCore, SnoopCompile # here we need the SnoopCompile path for the next line (normally you should wait until after data collection is complete)
julia> include(joinpath(pkgdir(SnoopCompile), "examples", "OptimizeMe.jl"))
Main.var"Main".OptimizeMe
julia> tinf = @snoop_inference OptimizeMe.main();
lotsa containers:
julia> ttot, pcs = SnoopCompile.parcel(tinf);
julia> ttot
0.064550989
julia> pcs
4-element Vector{Pair{Module, Tuple{Float64, Vector{Tuple{Float64, Core.MethodInstance}}}}}:
 Core => (1.924e-6, [(1.924e-6, MethodInstance for (NamedTuple{(:sizehint,)})(::Tuple{Int64}))])
 Base.Multimedia => (4.198e-6, [(4.198e-6, MethodInstance for MIME(::String))])
 Base => (0.0029538160000000006, [(1.623e-6, MethodInstance for LinearIndices(::Tuple{Base.OneTo{Int64}})), (1.623e-6, MethodInstance for IOContext(::IOBuffer, ::IOContext{Base.PipeEndpoint})), (3.767e-6, MethodInstance for IOContext(::IOContext{Base.PipeEndpoint}, ::Base.ImmutableDict{Symbol, Any})), (6.042e-6, MethodInstance for Base.indexed_iterate(::Pair{Symbol, Any}, ::Int64, ::Int64)), (6.111e-6, MethodInstance for Base.indexed_iterate(::Tuple{Int64, Int64}, ::Int64, ::Int64)), (6.363e-6, MethodInstance for Base.indexed_iterate(::Tuple{Any, Int64}, ::Int64, ::Int64)), (6.582e-6, MethodInstance for Base.indexed_iterate(::Tuple{String, Bool}, ::Int64, ::Int64)), (6.953e-6, MethodInstance for getindex(::Tuple{Int64, Int64}, ::Int64)), (7.224e-6, MethodInstance for getindex(::Tuple{Base.OneTo{Int64}}, ::Int64)), (8.666e-6, MethodInstance for getproperty(::Module, ::Symbol)) … (2.5759e-5, MethodInstance for getproperty(::UnionAll, ::Symbol)), (2.6229e-5, MethodInstance for getproperty(::DataType, ::Symbol)), (2.9025e-5, MethodInstance for getproperty(::BitVector, ::Symbol)), (3.1599e-5, MethodInstance for getproperty(::Vector, ::Symbol)), (9.8513e-5, MethodInstance for LinearIndices(::Vector{Float64})), (0.00025691099999999997, MethodInstance for haskey(::IOContext{Base.PipeEndpoint}, ::Symbol)), (0.000266438, MethodInstance for print(::IOContext{Base.PipeEndpoint}, ::Char)), (0.000328927, MethodInstance for get(::IOContext{Base.PipeEndpoint}, ::Symbol, ::Type{Any})), (0.00040065500000000003, MethodInstance for get(::IOContext{Base.PipeEndpoint}, ::Symbol, ::Bool)), (0.0013133589999999998, MethodInstance for string(::String, ::Int64, ::String))])
 Main.var"Main".OptimizeMe => (0.023183100999999998, [(7.4248e-5, MethodInstance for Main.var"Main".OptimizeMe.howbig(::Float64)), (0.023108853, MethodInstance for Main.var"Main".OptimizeMe.main())])
ttot shows the total amount of time spent on type-inference. parcel discovered precompilable MethodInstances for four modules, Core, Base.Multimedia, Base, and OptimizeMe, that might benefit from precompile directives. These are listed in increasing order of inference time.
Let's look specifically at OptimizeMe, since that's under our control:
julia> pcmod = pcs[end]
Main.var"Main".OptimizeMe => (0.023183100999999998, Tuple{Float64, Core.MethodInstance}[(7.4248e-5, MethodInstance for Main.var"Main".OptimizeMe.howbig(::Float64)), (0.023108853, MethodInstance for Main.var"Main".OptimizeMe.main())])
julia> tmod, tpcs = pcmod.second;
julia> tmod
0.023183100999999998
julia> tpcs
2-element Vector{Tuple{Float64, Core.MethodInstance}}:
 (7.4248e-5, MethodInstance for Main.var"Main".OptimizeMe.howbig(::Float64))
 (0.023108853, MethodInstance for Main.var"Main".OptimizeMe.main())
This indicates the amount of time spent specifically on OptimizeMe, plus the list of calls that could be precompiled in that module.
We could look at the other modules (packages) similarly.
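A loop over every module might be sketched as below. To keep the example runnable without SnoopCompile, it uses a small stand-in called mock_pcs with the same shape as pcs (module => (time, list of (time, item))), where plain strings replace the real MethodInstances; both mock_pcs and summarize_parcels are hypothetical names.

```julia
# Stand-in with the same shape as `pcs`; Strings replace Core.MethodInstance objects.
# The numbers are copied from the output shown earlier.
mock_pcs = [
    Core => (1.924e-6, [(1.924e-6, "(NamedTuple{(:sizehint,)})(::Tuple{Int64})")]),
    Base.Multimedia => (4.198e-6, [(4.198e-6, "MIME(::String)")]),
]

# Collect (module, total inference time, number of entries) for each module
function summarize_parcels(pcs)
    return [(mod, tmod, length(tpcs)) for (mod, (tmod, tpcs)) in pcs]
end

for (mod, tmod, n) in summarize_parcels(mock_pcs)
    println(mod, ": ", tmod, " s across ", n, " entries")
end
```

With real data, passing pcs in place of mock_pcs should work the same way, since the function only relies on the pair-and-tuple structure.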
SnoopCompile.write
You can generate files that contain ready-to-use precompile directives using SnoopCompile.write:
julia> SnoopCompile.write("/tmp/precompiles_OptimizeMe", pcs)
Core: no precompile statements out of 1.924e-6
Base.Multimedia: no precompile statements out of 4.198e-6
Base: precompiled 0.0013133589999999998 out of 0.0029538160000000006
Main.var"Main".OptimizeMe: precompiled 0.023108853 out of 0.023183100999999998
You'll now find a directory /tmp/precompiles_OptimizeMe, and inside you'll find files for modules that could have precompile directives added manually. The contents of the last of these should be recognizable:
function _precompile_()
    ccall(:jl_generating_output, Cint, ()) == 1 || return nothing
    Base.precompile(Tuple{typeof(main)})   # time: 0.4204474
end
The first ccall line ensures we only pay the cost of running these precompile directives if we're building the package; this is relevant mostly if you're running Julia with --compiled-modules=no, which can be a convenient way to disable precompilation and examine packages in their "native state." (It would also matter if you've set __precompile__(false) at the top of your module, but if so why are you reading this?)
This file is ready to be moved into the OptimizeMe repository and included into your module definition.
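The wiring inside the module might look like the sketch below. OptimizeMeSketch is a hypothetical stand-in for the real package, and the _precompile_ function is written inline so the example runs on its own. In a real package it would come from the generated file via include; the file name shown in the comment is an assumption, so check the output directory for the actual name.

```julia
module OptimizeMeSketch   # hypothetical stand-in for the real package module

main() = sum(abs2, 1:10)

# In a real package, replace this inline definition with something like
#     include("precompile_OptimizeMe.jl")   # file name is an assumption
function _precompile_()
    # Only run the directives while Julia is generating a precompile cache
    ccall(:jl_generating_output, Cint, ()) == 1 || return nothing
    Base.precompile(Tuple{typeof(main)})
end

# Call once, after all methods referenced by the directives have been defined
_precompile_()

end # module
```

Placing the call at the end of the module body matters because each directive can only succeed if the method it names already exists.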
You might also consider submitting some of the other files (or their precompile directives) to the packages you depend on.