Performance when Scripting with Kotlin
Jul 3, 2023 · 5 minute read · Tags: code, kotlin
Not too long ago, I came across Kaushik’s blog post and repo about Kotlin scripting. Around the same time, I was trying to automate the cleanup of some images I had. Given a set of transparent pngs, I wanted to check whether the last line of each image was a solid line and, if it was and nothing was touching it, remove it. While I typically turn to Pillow and Python for image manipulation scripts, I decided to give writing a Kotlin script a try instead.
I wrote the script (using javax.imageio) and tested it and everything was great. I then proceeded to combine my Kotlin script with the shell to run it on the set of images like this:
# look at directories 351, 352, ... 360
for i in `seq 351 360`; do
  DIR="/path/$i"
  cd "$DIR"
  # we have 15 images in each directory, sub_image_0.png, sub_image_1.png, ...
  for j in `seq 0 14`; do
    IMAGE="sub_image_$j.png"
    kotlin /Users/ahmedre/Documents/code/kotlin/line_finder/finder.main.kts "$IMAGE"
  done
  cd -
done
The script loops over a set of numbered directories and over the 15 images inside each one. For each image, I run the Kotlin script, passing the image as a parameter. To my surprise, running this took ~2.5 minutes! 150 images isn’t a massive number, and these aren’t very large images, so why did it take so long?
KScript
At this point, I remembered having come across KScript. It provides a wrapper around kotlinc, caching script compilation among other things. I replaced /usr/bin/env kotlin with /usr/bin/env kscript in the script's shebang line, and replaced kotlin with kscript in the shell loop. This brought the time down to ~1 minute and 12 seconds.
I continued reading and found that KScript has a package option for deploying scripts as standalone binaries. I ran kscript --package finder.main.kts and used the resulting finder.main binary in place of the kscript command in the for loop above. This brought the time down to ~29 seconds.
Compiled Jar
Maybe it’s slow due to Kotlin scripting, I thought. What if I make a compiled jar instead? I modified the script (adding a main method, etc.) and built a jar with kotlinc finder.kt -include-runtime -d finder.jar, then used java -jar finder.jar in place of the existing Kotlin command in the loop. Running this took ~28 seconds, about the same as the packaged script.
This makes sense, since it seems that KScript is precompiling and caching the compiled script.
The Culprit
Bringing the run down to 28 seconds is great, but it still felt way too long for a set of 150 images. As I was thinking about this while rerunning the script, I noticed something interesting: every few runs of the script, I’d get a dock icon in macOS, which would then disappear, followed by another one. This brought me to a realization: what if the reason this is so slow is that we are processing a single file per run, causing the jvm to spawn once for each image processed?
Going back to the original script and measuring the time of the entire method, I saw that processing an image takes roughly 75ms. Based on this, the expected time would be 75ms * 150 images ≈ 11.25 seconds. Moreover, adding the time command in front of each run in the initial shell loop shows that each run takes a bit longer than that (over 110ms). The combined signal from these should have been enough for me to consider this optimization sooner.
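As a rough sketch, the arithmetic above can be reproduced in the shell. The helper names expected_total_s and per_image_ms are hypothetical (they are not part of the original script), and the totals passed in are simply the wall-clock times reported in this post.

```shell
#!/bin/sh
# Hypothetical helpers for the back-of-the-envelope numbers in this post.
# awk is used because POSIX shell arithmetic is integer-only.

# expected_total_s PER_IMAGE_MS IMAGE_COUNT
# Prints the total time in seconds if each image cost PER_IMAGE_MS.
expected_total_s() {
  awk -v ms="$1" -v n="$2" 'BEGIN { printf "%.2f\n", (ms * n) / 1000 }'
}

# per_image_ms TOTAL_SECONDS IMAGE_COUNT
# Prints the average wall-clock time per image in whole milliseconds.
per_image_ms() {
  awk -v total="$1" -v n="$2" 'BEGIN { printf "%d\n", (total * 1000) / n }'
}

expected_total_s 75 150   # image processing only
per_image_ms 150 150      # kotlin script, ~2.5 minutes in total
per_image_ms 72 150       # kscript, ~1 minute 12 seconds
per_image_ms 29 150       # packaged kscript
per_image_ms 28 150       # compiled jar
```

The first call gives the 11.25 seconds mentioned above, and the remaining calls come out at about 1000, 480, 193, and 186 ms per image. Every one of those is well above the ~75ms of actual work, which is the gap that points at per-process overhead.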
What if we modified the script to run on each directory of images instead of on each image? I modified the script to handle a directory at a time, and re-ran the loop, without the inner loop from 0-14. The updated times were really surprising.
- Running the Kotlin Script with kotlin: 15 seconds.
- Running the Kotlin Script with kscript: 16 seconds.
- Running the Kotlin Script with a packaged kscript: 10 seconds.
- Running with a compiled Jar: 11 seconds.
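The per-directory loop behind these numbers might look roughly like the sketch below. The function name run_per_directory is hypothetical, and it assumes the script was changed to accept a directory path as its last argument, which the post describes but does not show.

```shell
#!/bin/sh
# run_per_directory BASE FIRST LAST CMD [ARGS...]
# Hypothetical helper: runs CMD once per numbered directory
# (BASE/FIRST ... BASE/LAST), appending the directory as the last argument.
# This launches one process per directory instead of one per image.
run_per_directory() {
  base="$1"
  first="$2"
  last="$3"
  shift 3
  i="$first"
  while [ "$i" -le "$last" ]; do
    "$@" "$base/$i"
    i=$((i + 1))
  done
}

# Usage, assuming finder.main.kts now takes a directory instead of an image:
#   run_per_directory /path 351 360 \
#     kotlin /Users/ahmedre/Documents/code/kotlin/line_finder/finder.main.kts
```

Passing the command as trailing arguments means the same helper works for kotlin, kscript, the packaged binary, or java -jar finder.jar without any other change.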
The real issue here was the cost of starting up a jvm for each file. Partially combining the files (to do 10 jvm process starts instead of 150) brought the time from ~28 seconds to close to 10 seconds (or, with a vanilla Kotlin script, from ~2.5 minutes to 15 seconds). Using this information, I combined the 10 runs into a single run by further modifying the code to support nested directories, which brought the run time down to ~1.5 seconds (compiled), or ~5 seconds using Kotlin scripting [1]. Not bad!
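One way to see the startup cost in isolation is to time repeated launches of a process that does no real work. The sketch below is an assumption about how one could measure this, not something from the original experiment; time_launches is a hypothetical helper, and it only has whole-second resolution because it relies on date +%s.

```shell
#!/bin/sh
# time_launches COUNT CMD [ARGS...]
# Hypothetical helper: runs CMD COUNT times, discarding its output,
# and prints the elapsed wall-clock time in whole seconds.
time_launches() {
  count="$1"
  shift
  start=$(date +%s)
  n=0
  while [ "$n" -lt "$count" ]; do
    "$@" >/dev/null 2>&1
    n=$((n + 1))
  done
  end=$(date +%s)
  echo $((end - start))
}

# Usage: compare 150 launches of a JVM that exits immediately with a single one.
#   time_launches 150 java -version
#   time_launches 1 java -version
```

If the first number is close to the total run time of the per-image loop, most of that time is process startup rather than image processing.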
An Untested Idea
Kotlin Multiplatform is very powerful, and provides us the ability to compile Kotlin for non-JVM platforms (by going through LLVM). KMP could easily allow us to take our script and compile it for non-JVM platforms, therefore making it native. In other words, given something like finder.kt, I could do something like:
# install kotlin-native - on macOS, we can do:
brew install kotlin-native
# for the first time running kotlin-native on macOS, we'd have to clear the
# quarantine extended attribute.
xattr -d com.apple.quarantine '/opt/homebrew/Caskroom/kotlin-native/1.8.21/kotlin-native-macos-aarch64-1.8.21/konan/nativelib'/*
# now we can run it through the kotlin-native compiler
# won't work if we have any java.* or android.* imports
kotlinc-native finder.kt -o finder
This would give us a native command line application, without the jvm. We’d expect the performance to be better than what we’ve seen so far, since there is no jvm to start. I didn’t test this approach, however, because ImageIO is a jvm-only construct. To do this, I’d have to use Skia, or expect/actual methods for reading and manipulating pixels on the various platforms.
Takeaways
There are three key takeaways here:
- I was again reminded that the adages of “Measure before optimizing,” and “Premature optimization is the root of all evil” are both true. Measure first, and it becomes clear where to spend time to get the most impact.
- There is a cost to spinning up a jvm. Keep this in mind when writing Kotlin scripts.
- Converting a vanilla Kotlin script to a native script is very compelling, and something I will consider in the future.
[1] This is ~10ms per image, which is a lot less than the estimate of 75ms per image (since that 75ms was a measure of the entire execution).