Introduction to Sikuli: A GUI Automation Tool

Sikuli is an interesting tool that can appeal to both novice and seasoned automation professionals. It automates almost anything you see on screen using screenshots. This open source tool uses image recognition to identify and control GUI components. In this post, we introduce the tool; we will explore it further in subsequent posts.

Sikuli relies on the GUI for automation. While writing a Sikuli script, you just provide a screenshot of a GUI element along with the corresponding action (e.g. click, type). For example, to click an on-screen button, take a screenshot of that button and pass it to the click action. The whole command reads as ‘click(button.png)’, where click is a Sikuli function and button.png is the screenshot. Simple, isn’t it? The screenshots are stored inside the script’s own folder, script.sikuli.
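To make the screenshot-plus-action idea concrete, here is a minimal sketch of what such a script might look like. Inside the Sikuli IDE, click() and wait() are supplied by the tool and act on the real screen. The stand-in definitions below are our own assumption, added only so the flow can be followed in plain Python, and login_form.png is a hypothetical screenshot name.

```python
# Minimal sketch of a Sikuli-style script.
# In the Sikuli IDE (Jython), click() and wait() come from the tool itself.
# The stand-ins below are hypothetical: they only record what would happen,
# so the sequence of steps can be followed without a real screen.

performed = []  # log of (action, screenshot) pairs, for illustration only


def wait(image, timeout=None):
    """Stand-in: Sikuli would pause until `image` appears on screen."""
    performed.append(("wait", image))


def click(image):
    """Stand-in: Sikuli would locate `image` on screen and click it."""
    performed.append(("click", image))


# The script itself: each step pairs a screenshot with an action.
wait("login_form.png", 10)  # hypothetical screenshot of the form to wait for
click("button.png")         # the button screenshot from the example above

print(performed)
```

In a real Sikuli script only the last two calls would remain, since the tool provides the functions.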

At a very high level, Sikuli can be divided into two parts:

  • Integrated Development Environment (IDE): Helps in preparing scripts by taking screenshots.
  • API/Sikuli Script: A Jython and Java library for GUI interaction and keyboard/mouse events.

Both of these components are now part of SikuliX, where X stands for eXperimental.

Development of Sikuli:
It started as a research project at the User Interface Design group at MIT. It is now maintained and developed by Raimund Hocke with the help of the open source community.

Name:
The name Sikuli is derived from a Huichol Indian word that means God’s Eye, the power to see and understand things unknown.

Benefits:
Below are some benefits of this tool.

  • Can be used to automate both web and desktop applications, since it relies on screenshots. If an application has a GUI, Sikuli can be used to automate it.
  • Can be used to automate mobile apps through emulators, although there is no native mobile support.
  • It is an open source tool, and it is highly useful when integrated with other tools like Selenium. Integrated with Selenium, it can address some of Selenium’s major quirks, such as handling modal boxes and browser dialog boxes.
  • Can easily deal with Flash/Flex objects.
  • Comes with basic text recognition (OCR), so it can be used to read text in images.
  • Supports various platforms, namely Windows XP+, Mac OS 10.6+ and most Linux/Unix flavors.
  • Can easily be used to automate web elements that have dynamic IDs and XPaths.
  • Supports a wide variety of programming and scripting languages such as Java, Jython, JRuby, Scala and Groovy.
  • Easy to set up and use.

For more details about this tool’s testing capabilities, check out the videos on its site.

Drawbacks:
Here are a few gripes with this otherwise awesome automation tool.

  • Maintaining scripts can be a real pain if your application’s GUI is ever changing.
  • Even a change in a text label can cause a script to fail.
  • The window under test needs to be visible on screen, unlike Selenium, which does not have this limitation.
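One common way to soften these failures is to check that a screenshot is actually on screen before acting on it. The sketch below assumes Sikuli’s exists() returns a falsy value when the image is not found. The helper name click_if_visible and the fake screen are hypothetical, introduced only so the idea can be run without a real GUI.

```python
def click_if_visible(image, exists, click):
    """Click `image` only when it is currently found on screen.

    `exists` and `click` are passed in so the helper can be tried without a
    real screen; in a Sikuli script they would be the tool's own functions.
    Returns True if the click was performed, False otherwise.
    """
    if exists(image):
        click(image)
        return True
    return False


# Fake screen for demonstration only (hypothetical screenshot names).
visible = {"save_button.png"}  # images the fake screen currently "shows"
clicked = []                   # images the fake click() was asked to click


def fake_exists(image):
    return image in visible


def fake_click(image):
    clicked.append(image)


found_save = click_if_visible("save_button.png", fake_exists, fake_click)
found_old = click_if_visible("old_label.png", fake_exists, fake_click)
print(found_save, found_old, clicked)
```

A guard like this does not remove the maintenance cost of changing screenshots, but it lets a script report a missing element instead of stopping abruptly.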

We hope you enjoyed reading through this Sikuli introduction. We plan to cover much more about this amazing tool in future posts. Subscribe to our feeds so you never miss them.
