Reference
Data collection
SnoopCompileCore.@snoopr — Macro
list = @snoopr expr
Capture method cache invalidations triggered by evaluating expr. list is a sequence of invalidated Core.MethodInstances together with "explanations," consisting of integers (encoding depth) and strings (documenting the source of an invalidation).
Unless you are working at a low level, you essentially always want to pass list directly to SnoopCompile.invalidation_trees.
Extended help
list is in a format where the "reason" comes after the items. Method deletion results in the sequence
[zero or more (mi, "invalidate_mt_cache") pairs..., zero or more (depth1 tree, loctag) pairs..., method, loctag] with loctag = "jl_method_table_disable"
where mi means a MethodInstance and depth1 means a sequence starting at depth=1.
Method insertion results in the sequence
[zero or more (depth0 tree, sig) pairs..., same info as with method deletion except loctag = "jl_method_table_insert"]
SnoopCompileCore.@snoopi_deep — Macro
tinf = @snoopi_deep commands
Produce a profile of Julia's type inference, recording the amount of time spent inferring every MethodInstance processed while executing commands. Each fresh entrance to type inference (whether executed directly in commands or because a call was made by runtime dispatch) also collects a backtrace so the caller can be identified.
tinf is a tree, each node containing data on a particular inference "frame" (the method, argument-type specializations, parameters, and even any constant-propagated values). Each node reports the exclusive/inclusive times, where the exclusive time corresponds to the time spent inferring this frame in and of itself, whereas the inclusive time includes the time needed to infer all the callees of this frame.
The top-level node in this profile tree is ROOT. Uniquely, its exclusive time corresponds to the time spent not in Julia's type inference (codegen, llvm_opt, runtime, etc.).
There are many different ways of inspecting and using the data stored in tinf. The simplest is to load the AbstractTrees package and display the tree with AbstractTrees.print_tree(tinf).
See also: flamegraph, flatten, inference_triggers, SnoopCompile.parcel, runtime_inferencetime.
Example
julia> tinf = @snoopi_deep begin
sort(rand(100)) # Evaluate some code and profile julia's type inference
end
InferenceTimingNode: 0.110018224/0.131464476 on InferenceFrameInfo for Core.Compiler.Timings.ROOT() with 2 direct children
SnoopCompileCore.@snoopi — Macro
inf_timing = @snoopi commands
inf_timing = @snoopi tmin=0.0 commands
Execute commands while snooping on inference. Returns an array of (t, linfo) tuples, where t is the amount of time spent inferring linfo (a MethodInstance).
Methods that take less time than tmin will not be reported.
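A minimal sketch of collecting and inspecting @snoopi data; the workload inside the block is purely illustrative, and any code you want to profile can go there:

```julia
using SnoopCompileCore

# Profile inference while running a workload; keep only entries taking ≥ 10 ms
inf_timing = @snoopi tmin=0.01 begin
    sort(rand(100))   # illustrative workload
end

# Each entry is a (seconds, MethodInstance) tuple
for (t, linfo) in inf_timing
    println(round(t; digits=4), "  ", linfo)
end
```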
SnoopCompileCore.@snoopc — Macro
@snoopc "compiledata.csv" begin
    # Commands to execute, in a new process
end
Causes the Julia compiler to log all functions compiled in the course of executing the commands to the file "compiledata.csv". This file can be used as the input to SnoopCompile.read.
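A sketch of the full @snoopc workflow, from logging through parceling (the path and the workload are illustrative placeholders):

```julia
using SnoopCompile

# Log all functions compiled while executing the commands (run in a fresh process)
@snoopc "/tmp/compiledata.csv" begin
    sort(rand(100))   # illustrative workload
end

# Read back the log: compile times and the corresponding function signatures
times, calls = SnoopCompile.read("/tmp/compiledata.csv")
pc = SnoopCompile.parcel(calls)   # assign precompile statements to owning modules
```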
Analysis of invalidations
SnoopCompile.uinvalidated — Function
umis = uinvalidated(invlist)
Return the unique invalidated MethodInstances. invlist is obtained from SnoopCompileCore.@snoopr. This is similar to filtering for MethodInstances in invlist, except that it discards any tagged "invalidate_mt_cache". These can typically be ignored because they are nearly inconsequential: they do not invalidate any compiled code, they only transiently affect an optimization of runtime dispatch.
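For instance (a hedged sketch; the method definition below deliberately commits type piracy purely so that it triggers invalidations):

```julia
using SnoopCompileCore, SnoopCompile

invlist = @snoopr begin
    # Deliberate piracy, only to provoke invalidations — don't do this in real code
    Base.:(+)(a::Integer, b::AbstractString) = a + parse(Int, b)
end

umis = uinvalidated(invlist)   # unique MethodInstances, "invalidate_mt_cache" entries dropped
@show length(umis)
```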
SnoopCompile.invalidation_trees — Function
trees = invalidation_trees(list)
Parse list, as captured by SnoopCompileCore.@snoopr, into a set of invalidation trees, where parent nodes were called by their children.
Example
julia> f(x::Int) = 1
f (generic function with 1 method)
julia> f(x::Bool) = 2
f (generic function with 2 methods)
julia> applyf(container) = f(container[1])
applyf (generic function with 1 method)
julia> callapplyf(container) = applyf(container)
callapplyf (generic function with 1 method)
julia> c = Any[1]
1-element Array{Any,1}:
1
julia> callapplyf(c)
1
julia> trees = invalidation_trees(@snoopr f(::AbstractFloat) = 3)
1-element Array{SnoopCompile.MethodInvalidations,1}:
inserting f(::AbstractFloat) in Main at REPL[36]:1 invalidated:
mt_backedges: 1: signature Tuple{typeof(f),Any} triggered MethodInstance for applyf(::Array{Any,1}) (1 children) more specific
See the documentation for further details.
SnoopCompile.filtermod — Function
thinned = filtermod(module, trees::AbstractVector{MethodInvalidations})
Select just the cases of invalidating a method defined in module.
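For example, to focus on invalidations of methods defined by a package you maintain (SomePkg is a placeholder name):

```julia
using SnoopCompileCore, SnoopCompile

invlist = @snoopr using SomePkg       # SomePkg is a placeholder package
trees = invalidation_trees(invlist)

# Keep only the invalidations of methods that SomePkg itself defines
thinned = filtermod(SomePkg, trees)
```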
SnoopCompile.findcaller — Function
methinvs = findcaller(method::Method, trees)
Find a path through trees that reaches method. Returns a single MethodInvalidations object.
Examples
Suppose you know that loading package SomePkg triggers invalidation of f(data). You can find the specific source of invalidation as follows:
f(data) # run once to force compilation
m = @which f(data)
using SnoopCompile
trees = invalidation_trees(@snoopr using SomePkg)
methinvs = findcaller(m, trees)
If you don't know which method to look for, but know some operation that has suffered added latency, you can look for methods using @snoopi. For example, suppose that loading SomePkg makes the next using statement slow. You can find the source of trouble with
julia> using SnoopCompile
julia> trees = invalidation_trees(@snoopr using SomePkg);
julia> tinf = @snoopi using SomePkg # this second `using` will need to recompile code invalidated above
1-element Array{Tuple{Float64,Core.MethodInstance},1}:
(0.08518409729003906, MethodInstance for require(::Module, ::Symbol))
julia> m = tinf[1][2].def
require(into::Module, mod::Symbol) in Base at loading.jl:887
julia> findcaller(m, trees)
inserting ==(x, y::SomeType) in SomeOtherPkg at /path/to/code:100 invalidated:
backedges: 1: superseding ==(x, y) in Base at operators.jl:83 with MethodInstance for ==(::Symbol, ::Any) (16 children) more specific
Analysis of @snoopi_deep
SnoopCompile.flamegraph — Function
flamegraph(tinf::InferenceTimingNode; tmin=0.0, excluded_modules=Set([Main]), mode=nothing)
Convert the call tree of inference timings returned from @snoopi_deep into a FlameGraph. Returns a FlameGraphs.FlameGraph structure that represents the timing trace recorded for type inference.
Frames that take less than tmin seconds of inclusive time (total time including the frame and all of its children) will not be included in the resulting FlameGraph. This can be helpful if you have a very big profile, to save on processing time.
Non-precompilable frames are marked in reddish colors. excluded_modules can be used to mark methods defined in modules to which you cannot or do not wish to add precompiles.
mode controls how frames are named in tools like ProfileView. nothing uses the default of just the qualified function name, whereas supplying mode=Dict(method => count), counting the number of specializations of each method, will cause the number of specializations to be included in the frame name.
Example
We'll use SnoopCompile.flatten_demo, which runs @snoopi_deep on a workload designed to yield reproducible results:
julia> tinf = SnoopCompile.flatten_demo()
InferenceTimingNode: 0.002148974/0.002767166 on InferenceFrameInfo for Core.Compiler.Timings.ROOT() with 1 direct children
julia> fg = flamegraph(tinf)
Node(FlameGraphs.NodeData(ROOT() at typeinfer.jl:75, 0x00, 0:3334431))
julia> ProfileView.view(fg); # Display the FlameGraph in a package that supports it
You should be able to reconcile the resulting flamegraph to print_tree(tinf) (see flatten).
The empty horizontal periods in the flamegraph correspond to times when something other than inference is running. The total width of the flamegraph is set from the ROOT node.
SnoopCompile.flatten — Function
flatten(tinf; tmin = 0.0, sortby=exclusive)
Flatten the execution graph of InferenceTimingNodes returned from @snoopi_deep into a Vector of InferenceTiming frames, each encoding the time needed for inference of a single MethodInstance. By default, results are sorted by exclusive time (the time for inferring the MethodInstance itself, not including any inference of its callees); other options are sortby=inclusive, which includes the time needed for the callees, or sortby=nothing to obtain them in the order they were inferred (depth-first order).
Example
We'll use SnoopCompile.flatten_demo, which runs @snoopi_deep on a workload designed to yield reproducible results:
julia> tinf = SnoopCompile.flatten_demo()
InferenceTimingNode: 0.002148974/0.002767166 on InferenceFrameInfo for Core.Compiler.Timings.ROOT() with 1 direct children
julia> using AbstractTrees; print_tree(tinf)
InferenceTimingNode: 0.00242354/0.00303526 on InferenceFrameInfo for Core.Compiler.Timings.ROOT() with 1 direct children
└─ InferenceTimingNode: 0.000150891/0.000611721 on InferenceFrameInfo for SnoopCompile.FlattenDemo.packintype(::Int64) with 2 direct children
├─ InferenceTimingNode: 0.000105318/0.000105318 on InferenceFrameInfo for MyType{Int64}(::Int64) with 0 direct children
└─ InferenceTimingNode: 9.43e-5/0.000355512 on InferenceFrameInfo for SnoopCompile.FlattenDemo.dostuff(::MyType{Int64}) with 2 direct children
├─ InferenceTimingNode: 6.6458e-5/0.000124716 on InferenceFrameInfo for SnoopCompile.FlattenDemo.extract(::MyType{Int64}) with 2 direct children
│ ├─ InferenceTimingNode: 3.401e-5/3.401e-5 on InferenceFrameInfo for getproperty(::MyType{Int64}, ::Symbol) with 0 direct children
│ └─ InferenceTimingNode: 2.4248e-5/2.4248e-5 on InferenceFrameInfo for getproperty(::MyType{Int64}, x::Symbol) with 0 direct children
└─ InferenceTimingNode: 0.000136496/0.000136496 on InferenceFrameInfo for SnoopCompile.FlattenDemo.domath(::Int64) with 0 direct children
Note the printing of getproperty(::SnoopCompile.FlattenDemo.MyType{Int64}, x::Symbol): it shows the specific Symbol, here :x, that getproperty was inferred with. This reflects constant propagation in inference.
Then:
julia> flatten(tinf; sortby=nothing)
8-element Vector{SnoopCompileCore.InferenceTiming}:
InferenceTiming: 0.002423543/0.0030352639999999998 on InferenceFrameInfo for Core.Compiler.Timings.ROOT()
InferenceTiming: 0.000150891/0.0006117210000000001 on InferenceFrameInfo for SnoopCompile.FlattenDemo.packintype(::Int64)
InferenceTiming: 0.000105318/0.000105318 on InferenceFrameInfo for SnoopCompile.FlattenDemo.MyType{Int64}(::Int64)
InferenceTiming: 9.43e-5/0.00035551200000000005 on InferenceFrameInfo for SnoopCompile.FlattenDemo.dostuff(::SnoopCompile.FlattenDemo.MyType{Int64})
InferenceTiming: 6.6458e-5/0.000124716 on InferenceFrameInfo for SnoopCompile.FlattenDemo.extract(::SnoopCompile.FlattenDemo.MyType{Int64})
InferenceTiming: 3.401e-5/3.401e-5 on InferenceFrameInfo for getproperty(::SnoopCompile.FlattenDemo.MyType{Int64}, ::Symbol)
InferenceTiming: 2.4248e-5/2.4248e-5 on InferenceFrameInfo for getproperty(::SnoopCompile.FlattenDemo.MyType{Int64}, x::Symbol)
InferenceTiming: 0.000136496/0.000136496 on InferenceFrameInfo for SnoopCompile.FlattenDemo.domath(::Int64)
julia> flatten(tinf; tmin=1e-4) # sorts by exclusive time (the time before the '/')
4-element Vector{SnoopCompileCore.InferenceTiming}:
InferenceTiming: 0.000105318/0.000105318 on InferenceFrameInfo for SnoopCompile.FlattenDemo.MyType{Int64}(::Int64)
InferenceTiming: 0.000136496/0.000136496 on InferenceFrameInfo for SnoopCompile.FlattenDemo.domath(::Int64)
InferenceTiming: 0.000150891/0.0006117210000000001 on InferenceFrameInfo for SnoopCompile.FlattenDemo.packintype(::Int64)
InferenceTiming: 0.002423543/0.0030352639999999998 on InferenceFrameInfo for Core.Compiler.Timings.ROOT()
julia> flatten(tinf; sortby=inclusive, tmin=1e-4) # sorts by inclusive time (the time after the '/')
6-element Vector{SnoopCompileCore.InferenceTiming}:
InferenceTiming: 0.000105318/0.000105318 on InferenceFrameInfo for SnoopCompile.FlattenDemo.MyType{Int64}(::Int64)
InferenceTiming: 6.6458e-5/0.000124716 on InferenceFrameInfo for SnoopCompile.FlattenDemo.extract(::SnoopCompile.FlattenDemo.MyType{Int64})
InferenceTiming: 0.000136496/0.000136496 on InferenceFrameInfo for SnoopCompile.FlattenDemo.domath(::Int64)
InferenceTiming: 9.43e-5/0.00035551200000000005 on InferenceFrameInfo for SnoopCompile.FlattenDemo.dostuff(::SnoopCompile.FlattenDemo.MyType{Int64})
InferenceTiming: 0.000150891/0.0006117210000000001 on InferenceFrameInfo for SnoopCompile.FlattenDemo.packintype(::Int64)
InferenceTiming: 0.002423543/0.0030352639999999998 on InferenceFrameInfo for Core.Compiler.Timings.ROOT()
As you can see, sortby affects not just the order but also the selection of frames; with exclusive times, dostuff did not on its own rise above the threshold, but it does when using inclusive times.
See also: accumulate_by_source.
SnoopCompileCore.exclusive — Function
exclusive(frame)
Return the time spent inferring frame, not including the time needed for any of its callees.
SnoopCompileCore.inclusive — Function
inclusive(frame)
Return the time spent inferring frame and its callees.
SnoopCompile.accumulate_by_source — Function
accumulate_by_source(flattened; tmin = 0.0, by=exclusive)
Add together the inference timings for all MethodInstances of a single Method. flattened is the output of flatten. Returns a list of (t, method) tuples.
When the accumulated time for a Method is large but each instance is small, it indicates that the method is being inferred for many specializations (which might include specializations with different constants).
Example
We'll use SnoopCompile.flatten_demo, which runs @snoopi_deep on a workload designed to yield reproducible results:
julia> tinf = SnoopCompile.flatten_demo()
InferenceTimingNode: 0.002148974/0.002767166 on InferenceFrameInfo for Core.Compiler.Timings.ROOT() with 1 direct children
julia> accumulate_by_source(flatten(tinf))
7-element Vector{Tuple{Float64, Method}}:
(6.0012999999999996e-5, getproperty(x, f::Symbol) in Base at Base.jl:33)
(6.7714e-5, extract(y::SnoopCompile.FlattenDemo.MyType) in SnoopCompile.FlattenDemo at /pathto/SnoopCompile/src/deep_demos.jl:35)
(9.421e-5, dostuff(y) in SnoopCompile.FlattenDemo at /pathto/SnoopCompile/src/deep_demos.jl:44)
(0.000112057, SnoopCompile.FlattenDemo.MyType{T}(x) where T in SnoopCompile.FlattenDemo at /pathto/SnoopCompile/src/deep_demos.jl:34)
(0.000133895, domath(x) in SnoopCompile.FlattenDemo at /pathto/SnoopCompile/src/deep_demos.jl:40)
(0.000154382, packintype(x) in SnoopCompile.FlattenDemo at /pathto/SnoopCompile/src/deep_demos.jl:36)
(0.003165266, ROOT() in Core.Compiler.Timings at compiler/typeinfer.jl:75)
Compared to the output from flatten, the two inference passes on getproperty have been consolidated into a single aggregate call.
loctrigs = accumulate_by_source(itrigs::AbstractVector{InferenceTrigger})
Aggregate inference triggers by location (function, file, and line number) of the caller.
Example
We collect data using SnoopCompile.itrigs_demo:
julia> itrigs = inference_triggers(SnoopCompile.itrigs_demo())
2-element Vector{InferenceTrigger}:
Inference triggered to call MethodInstance for double(::UInt8) from calldouble1 (/pathto/SnoopCompile/src/parcel_snoopi_deep.jl:762) inlined into MethodInstance for calldouble2(::Vector{Vector{Any}}) (/pathto/SnoopCompile/src/parcel_snoopi_deep.jl:763)
Inference triggered to call MethodInstance for double(::Float64) from calldouble1 (/pathto/SnoopCompile/src/parcel_snoopi_deep.jl:762) inlined into MethodInstance for calldouble2(::Vector{Vector{Any}}) (/pathto/SnoopCompile/src/parcel_snoopi_deep.jl:763)
julia> accumulate_by_source(itrigs)
1-element Vector{SnoopCompile.LocationTrigger}:
calldouble1 at /pathto/SnoopCompile/src/parcel_snoopi_deep.jl:762 (2 callees from 1 callers)
SnoopCompile.inference_triggers — Function
itrigs = inference_triggers(tinf::InferenceTimingNode; exclude_toplevel=true)
Collect the "triggers" of inference, each a fresh entry into inference via a call dispatched at runtime. All the entries in itrigs are previously uninferred, or are freshly inferred for specific constant inputs.
exclude_toplevel determines whether calls made from the REPL, include, or test suites are excluded.
Example
We'll use SnoopCompile.itrigs_demo, which runs @snoopi_deep on a workload designed to yield reproducible results:
julia> tinf = SnoopCompile.itrigs_demo()
InferenceTimingNode: 0.004490576/0.004711168 on InferenceFrameInfo for Core.Compiler.Timings.ROOT() with 2 direct children
julia> itrigs = inference_triggers(tinf)
2-element Vector{InferenceTrigger}:
Inference triggered to call MethodInstance for double(::UInt8) from calldouble1 (/pathto/SnoopCompile/src/deep_demos.jl:86) inlined into MethodInstance for calldouble2(::Vector{Vector{Any}}) (/pathto/SnoopCompile/src/deep_demos.jl:87)
Inference triggered to call MethodInstance for double(::Float64) from calldouble1 (/pathto/SnoopCompile/src/deep_demos.jl:86) inlined into MethodInstance for calldouble2(::Vector{Vector{Any}}) (/pathto/SnoopCompile/src/deep_demos.jl:87)
julia> edit(itrigs[1]) # opens an editor at the spot in the caller
julia> ascend(itrigs[2]) # use Cthulhu to inspect the stacktrace (caller is the second item in the trace)
Choose a call for analysis (q to quit):
> double(::Float64)
calldouble1 at /pathto/SnoopCompile/src/deep_demos.jl:86 => calldouble2(::Vector{Vector{Any}}) at /pathto/SnoopCompile/src/deep_demos.jl:87
calleach(::Vector{Vector{Vector{Any}}}) at /pathto/SnoopCompile/src/deep_demos.jl:88
...
SnoopCompile.callerinstance — Function
mi = callerinstance(itrig::InferenceTrigger)
Return the MethodInstance mi of the caller in the selected stackframe in itrig.
SnoopCompile.callingframe — Function
itrigcaller = callingframe(itrig::InferenceTrigger)
"Step out" one layer of the stacktrace, referencing the caller of the current frame of itrig.
You can retrieve the proximal trigger of inference with InferenceTrigger(itrigcaller).
Example
We collect data using SnoopCompile.itrigs_demo:
julia> itrig = inference_triggers(SnoopCompile.itrigs_demo())[1]
Inference triggered to call MethodInstance for double(::UInt8) from calldouble1 (/pathto/SnoopCompile/src/parcel_snoopi_deep.jl:762) inlined into MethodInstance for calldouble2(::Vector{Vector{Any}}) (/pathto/SnoopCompile/src/parcel_snoopi_deep.jl:763)
julia> itrigcaller = callingframe(itrig)
Inference triggered to call MethodInstance for double(::UInt8) from calleach (/pathto/SnoopCompile/src/parcel_snoopi_deep.jl:764) with specialization MethodInstance for calleach(::Vector{Vector{Vector{Any}}})
SnoopCompile.skiphigherorder — Function
itrignew = skiphigherorder(itrig; exact::Bool=false)
Attempt to skip over frames of higher-order functions that take the callee as a function-argument. This can be useful if you're analyzing inference triggers for an entire package and would prefer to assign triggers to package code rather than to Base functions like map!, broadcast, etc.
Example
We collect data using SnoopCompile.itrigs_higherorder_demo:
julia> itrig = inference_triggers(SnoopCompile.itrigs_higherorder_demo())[1]
Inference triggered to call MethodInstance for double(::Float64) from mymap! (/pathto/SnoopCompile/src/parcel_snoopi_deep.jl:706) with specialization MethodInstance for mymap!(::typeof(SnoopCompile.ItrigHigherOrderDemo.double), ::Vector{Any}, ::Vector{Any})
julia> callingframe(itrig) # step out one (non-inlined) frame
Inference triggered to call MethodInstance for double(::Float64) from mymap (/pathto/SnoopCompile/src/parcel_snoopi_deep.jl:710) with specialization MethodInstance for mymap(::typeof(SnoopCompile.ItrigHigherOrderDemo.double), ::Vector{Any})
julia> skiphigherorder(itrig) # step out to frame that doesn't have `double` as a function-argument
Inference triggered to call MethodInstance for double(::Float64) from callmymap (/pathto/SnoopCompile/src/parcel_snoopi_deep.jl:711) with specialization MethodInstance for callmymap(::Vector{Any})
By default skiphigherorder is conservative, and insists on being sure that it's the callee being passed to the higher-order function. Higher-order functions that do not get specialized (e.g., with ::Function argument types) will not be skipped over. You can pass exact=false to allow ::Function to also be passed over, but keep in mind that this may falsely skip some frames.
SnoopCompile.InferenceTrigger — Type
InferenceTrigger(callee::MethodInstance, callerframes::Vector{StackFrame}, btidx::Int, bt)
Organize information about the "triggers" of inference. callee is the MethodInstance requiring inference; callerframes, btidx, and bt contain information about the caller. callerframes are the frame(s) of the call site that triggered inference; it's a Vector{StackFrame}, rather than a single StackFrame, due to the possibility that the caller was inlined into something else, in which case the first entry is the direct caller and the last entry corresponds to the MethodInstance into which it was ultimately inlined. btidx is the index in bt, the backtrace collected upon entry into inference, corresponding to callerframes.
InferenceTriggers are created by calling inference_triggers. See also: callerinstance and callingframe.
SnoopCompile.runtime_inferencetime — Function
ridata = runtime_inferencetime(tinf::InferenceTimingNode)
ridata = runtime_inferencetime(tinf::InferenceTimingNode, profiledata; lidict)
Compare runtime and inference-time on a per-method basis. ridata[m::Method] returns (trun, tinfer, nspecializations), measuring the approximate amount of time spent running m, the time spent inferring m, and the number of type-specializations, respectively. trun is estimated from profiling data, which the user is responsible for capturing before the call. Typically tinf is collected via @snoopi_deep on the first call (in a fresh session) to a workload, and the profiling data is collected on a subsequent call. In some cases you may want to repeat the workload several times to collect enough profiling samples.
profiledata and lidict are obtained from Profile.retrieve().
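The two-phase workflow described above can be sketched as follows (the workload function is an illustrative placeholder):

```julia
using SnoopCompileCore, SnoopCompile, Profile

workload() = sort(rand(1000))        # illustrative workload

# Phase 1 (fresh session): capture the inference cost of the first call
tinf = @snoopi_deep workload()

# Phase 2: capture the runtime cost on subsequent (already-compiled) calls,
# repeating to accumulate enough profiling samples
@profile for _ in 1:100
    workload()
end
profiledata, lidict = Profile.retrieve()

ridata = runtime_inferencetime(tinf, profiledata; lidict)
```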
SnoopCompile.parcel — Function
pc = parcel(calls; subst=[], exclusions=[])
Assigns each compile statement to the module that owns the function. Perform string substitution via subst=["Module1"=>"Module2"], and omit functions in particular modules with exclusions=["Module3"]. On output, pc[:Module2] contains all the precompiles assigned to Module2.
Use SnoopCompile.write(prefix, pc) to generate a series of files in directory prefix, one file per module.
ttot, pcs = SnoopCompile.parcel(tinf::InferenceTimingNode)
Parcel the "root-most" precompilable MethodInstances into separate modules. These can be used to generate precompile directives to cache the results of type inference, reducing latency on first use.
Loosely speaking, a MethodInstance is precompilable if the module that owns the method also has access to all the types it needs to precompile the instance. When the root node of an entrance to inference is not itself precompilable, parcel examines the children (and possibly the children's children...) until it finds the first node on each branch that is precompilable. MethodInstances are then assigned to the module that owns the method.
ttot is the total inference time; pcs is a list of module => (tmod, pclist) pairs. For each module, tmod is the amount of inference time affiliated with methods owned by that module; pclist is a list of (t, mi) time/MethodInstance tuples.
See also: SnoopCompile.write.
Example
We'll use SnoopCompile.itrigs_demo, which runs @snoopi_deep on a workload designed to yield reproducible results:
julia> tinf = SnoopCompile.itrigs_demo()
InferenceTimingNode: 0.004490576/0.004711168 on InferenceFrameInfo for Core.Compiler.Timings.ROOT() with 2 direct children
julia> ttot, pcs = SnoopCompile.parcel(tinf);
julia> ttot
0.000220592
julia> pcs
1-element Vector{Pair{Module, Tuple{Float64, Vector{Tuple{Float64, Core.MethodInstance}}}}}:
SnoopCompile.ItrigDemo => (0.000220592, [(9.8986e-5, MethodInstance for double(::Float64)), (0.000121606, MethodInstance for double(::UInt8))])
Since there was only one module, ttot is the same as tmod. The ItrigDemo module had two precompilable MethodInstances, each listed with its corresponding inclusive time.
SnoopCompile.write — Function
write(prefix::AbstractString, pc::Dict; always::Bool = false)
Write each module's precompiles to a separate file. If always is true, the generated function will always run the precompile statements when called; otherwise the statements will only be executed during package precompilation.
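A sketch of going from an inference profile to per-module precompile files (the workload and output directory are illustrative placeholders):

```julia
using SnoopCompileCore, SnoopCompile

tinf = @snoopi_deep sort(rand(100))     # illustrative workload
ttot, pcs = SnoopCompile.parcel(tinf)   # group precompilable MethodInstances by module

# Write one precompile file per module into the given directory
SnoopCompile.write("/tmp/precompiles", pcs)
```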
Other utilities
SnoopCompile.read — Function
SnoopCompile.read("compiledata.csv")
Reads the log file produced by the compiler and returns the functions as a pair of arrays. The first array is the amount of time required to compile each function; the second is the corresponding function + types. The functions are sorted in order of increasing compilation time. (The time does not include the cost of nested compiles.)
SnoopCompile.format_userimg — Function
pc = format_userimg(calls; subst=[], exclusions=[])
Generates precompile directives intended for your base/userimg.jl script. Use SnoopCompile.write(filename, pc) to create a file that you can include into userimg.jl.
Demos
SnoopCompile.flatten_demo — Function
tinf = SnoopCompile.flatten_demo()
A simple demonstration of @snoopi_deep. This demo defines a module
module FlattenDemo
    struct MyType{T} x::T end
    extract(y::MyType) = y.x
    function packintype(x)
        y = MyType{Int}(x)
        return dostuff(y)
    end
    function domath(x)
        y = x + x
        return y*x + 2*x + 5
    end
    dostuff(y) = domath(extract(y))
end
It then "warms up" (forces inference on) all of Julia's Base methods needed for domath, to ensure that these MethodInstances do not need to be inferred when we collect the data. It then returns the results of
@snoopi_deep FlattenDemo.packintype(1)
See flatten for an example usage.
SnoopCompile.itrigs_demo — Function
tinf = SnoopCompile.itrigs_demo()
A simple demonstration of collecting inference triggers. This demo defines a module
module ItrigDemo
    @noinline double(x) = 2x
    @inline calldouble1(c) = double(c[1])
    calldouble2(cc) = calldouble1(cc[1])
    calleach(ccs) = (calldouble2(ccs[1]), calldouble2(ccs[2]))
end
It then "warms up" (forces inference on) calldouble2(::Vector{Vector{Any}}), calldouble1(::Vector{Any}), and double(::Int):
cc = [Any[1]]
ItrigDemo.calleach([cc,cc])
Then it collects and returns inference data using
cc1, cc2 = [Any[0x01]], [Any[1.0]]
@snoopi_deep ItrigDemo.calleach([cc1, cc2])
This does not require any new inference for calldouble2 or calldouble1, but it does force inference on double with two new types. See inference_triggers to see what gets collected and returned.
SnoopCompile.itrigs_higherorder_demo — Function
tinf = SnoopCompile.itrigs_higherorder_demo()
A simple demonstration of handling higher-order methods with inference triggers. This demo defines a module
module ItrigHigherOrderDemo
    double(x) = 2x
    @noinline function mymap!(f, dst, src)
        for i in eachindex(dst, src)
            dst[i] = f(src[i])
        end
        return dst
    end
    @noinline mymap(f::F, src) where F = mymap!(f, Vector{Any}(undef, length(src)), src)
    callmymap(src) = mymap(double, src)
end
The key feature of this set of definitions is that the function double gets passed as an argument through mymap and mymap! (the latter two are higher-order functions).
It then "warms up" (forces inference on) callmymap(::Vector{Any}), mymap(::typeof(double), ::Vector{Any}), mymap!(::typeof(double), ::Vector{Any}, ::Vector{Any}), and double(::Int):
ItrigHigherOrderDemo.callmymap(Any[1, 2])
Then it collects and returns inference data using
@snoopi_deep ItrigHigherOrderDemo.callmymap(Any[1.0, 2.0])
which forces inference for double(::Float64). See skiphigherorder for an example using this demo.