One of the coolest features in ARKit 2 is image tracking. It lets you hand ARKit a set of reference images. When one of those images is recognized, you get the chance to display cool content on top of it, such as videos, sounds or animations. This is a perfect use case for museums, product promotions and commercials. It can also be used for playing videos when a bottle of whisky is detected, which is what we are going to do today.
In order to do this, you will need Xcode 10 (at the moment of writing, in its last beta days). Xcode 10 is needed because it ships with ARKit 2, which has the image tracking feature. To start, create a new Augmented Reality project with SceneKit as the content technology.
Next, add the two videos that we will be playing. Those can be found in the git repo, with a link at the end of this tutorial. Otherwise, feel free to add your own videos. Don’t forget to add the required camera permission (the NSCameraUsageDescription key) in the Info.plist file. Then, in the view controller, define two static properties for the URLs of the two videos that we are going to play.
We will also keep a dictionary called players, which maps each type of whisky to its corresponding video player.
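A minimal sketch of these properties could look like the following. The video file names and the use of plain strings as dictionary keys are assumptions made for illustration; the keys are assumed to match the names of the reference images, so that a detected image can be mapped to its player later on.

```swift
import ARKit
import AVFoundation

class ViewController: UIViewController {
    @IBOutlet var sceneView: ARSCNView!

    // Assumed file names; replace them with the videos you added to the project.
    static let ballantinesVideoURL = Bundle.main.url(forResource: "ballantines",
                                                     withExtension: "mp4")
    static let jackDanielsVideoURL = Bundle.main.url(forResource: "jack_daniels",
                                                     withExtension: "mp4")

    // Maps a whisky (keyed by the assumed reference image name) to its video player.
    lazy var players: [String: AVPlayer] = {
        var result: [String: AVPlayer] = [:]
        if let url = ViewController.ballantinesVideoURL {
            result["ballantines"] = AVPlayer(url: url)
        }
        if let url = ViewController.jackDanielsVideoURL {
            result["jack_daniels"] = AVPlayer(url: url)
        }
        return result
    }()
}
```

An enum for the whisky type would work just as well as the key; strings are used here only to keep the lookup by image name short.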
Next, in the viewDidLoad method, we need to set the delegate of the sceneView to be the ViewController. We will need the delegate methods to add and display the videos. We will also add an observer for each player, which will be called whenever its video has finished. When that happens, we want to seek the video back to the beginning, so that it is ready the next time it starts.
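Here is one way this setup might be written. The observers property and the block-based observer API are choices made for this sketch; the class header and the players dictionary are repeated only so that the snippet stands on its own.

```swift
import ARKit
import AVFoundation

class ViewController: UIViewController, ARSCNViewDelegate {
    @IBOutlet var sceneView: ARSCNView!

    // Assumed to be filled with one player per whisky, as described above.
    var players: [String: AVPlayer] = [:]

    // Hypothetical storage that keeps the observer tokens alive.
    private var observers: [NSObjectProtocol] = []

    override func viewDidLoad() {
        super.viewDidLoad()

        // Receive the ARSCNViewDelegate callbacks in this view controller.
        sceneView.delegate = self

        // When a video reaches its end, rewind it so it can be played again.
        for player in players.values {
            let token = NotificationCenter.default.addObserver(
                forName: .AVPlayerItemDidPlayToEndTime,
                object: player.currentItem,
                queue: .main
            ) { [weak player] _ in
                player?.seek(to: .zero)
            }
            observers.append(token)
        }
    }
}
```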
In the viewWillAppear method, we will set up the image tracking configuration.
The image tracking configuration expects a set of reference images, which are specified in a resource group in the asset catalog. For our case, we will specify two images to be tracked. After we have set up the image tracking configuration, we will run the session with that configuration.
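The configuration step can be sketched roughly like this. The group name "AR Resources" is an assumption (it is the default name Xcode suggests for a new AR resource group), so use whatever name your group has in the asset catalog.

```swift
import ARKit

class ViewController: UIViewController {
    @IBOutlet var sceneView: ARSCNView!

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)

        let configuration = ARImageTrackingConfiguration()

        // Load the reference images from the asset catalog group (assumed name).
        guard let referenceImages = ARReferenceImage.referenceImages(
            inGroupNamed: "AR Resources", bundle: nil) else {
            print("No reference images found in the asset catalog")
            return
        }

        configuration.trackingImages = referenceImages
        // We have two labels, so allow both to be tracked at the same time.
        configuration.maximumNumberOfTrackedImages = 2

        sceneView.session.run(configuration)
    }

    override func viewWillDisappear(_ animated: Bool) {
        super.viewWillDisappear(animated)
        // Stop the camera and tracking while the view is off screen.
        sceneView.session.pause()
    }
}
```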
ARKit works best with flat images, and this will be a challenge for our whisky bottle recognition, since bottles are curved. That’s why we are taking only the labels, and only those parts of them that are straight and easy to recognize. In order for the images to be detected faster, ARKit expects images with a higher resolution, a well-distributed colour histogram and as many details as possible. Our Ballantines label satisfies this, but the Jack Daniels label could be a bit better. However, as you can see in the video, ARKit still recognizes this label as well. Also, make sure to specify the expected physical width of those images in the Attributes inspector.
Next, let’s go into the interesting part, which is implementing the delegate methods for the scene view.
In the nodeForAnchor delegate method, we check whether the anchor is of type ARImageAnchor. If it is, that means that ARKit has detected one of the images that we have specified. Next, we take the player for the detected image. We create an SCNPlane with the size of the detected image and set the player as the contents of that plane. Then, we start the player and attach the plane to a newly created plane node. We rotate the plane node by 90 degrees around the x axis (-pi/2), so that it lies flat on the detected image. Finally, we add this node to a parent node that we created at the beginning of the method.
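The steps above might translate into something like the following sketch. It assumes that the players dictionary is keyed by the reference image name, which is an illustration choice and not necessarily how the original project does the lookup.

```swift
import ARKit
import AVFoundation

class ViewController: UIViewController, ARSCNViewDelegate {
    @IBOutlet var sceneView: ARSCNView!

    // Assumed to be keyed by the reference image name.
    var players: [String: AVPlayer] = [:]

    func renderer(_ renderer: SCNSceneRenderer, nodeFor anchor: ARAnchor) -> SCNNode? {
        // Parent node that ARKit will keep aligned with the anchor.
        let node = SCNNode()

        guard let imageAnchor = anchor as? ARImageAnchor,
              let name = imageAnchor.referenceImage.name,
              let player = players[name] else {
            return node
        }

        // A plane with the same physical size as the detected image.
        let size = imageAnchor.referenceImage.physicalSize
        let plane = SCNPlane(width: size.width, height: size.height)

        // Use the video player as the plane's texture and start playback.
        plane.firstMaterial?.diffuse.contents = player
        player.play()

        // SCNPlane is vertical by default; rotate it to lie on the image.
        let planeNode = SCNNode(geometry: plane)
        planeNode.eulerAngles.x = -.pi / 2

        node.addChildNode(planeNode)
        return node
    }
}
```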
This will play the video after an image is recognized. However, once the video has finished for the first time, it is not played again. In order to solve this, we will implement the didUpdateNode:forAnchor delegate method, which is called every time a node is updated. First, we check if the anchor is of type ARImageAnchor. If it is, we check whether the node for that anchor is currently visible in the AR scene. If it is, and the player is not playing, we start the player. This restarts the video whenever the image is in the field of view of the phone.
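A possible implementation of this check is sketched below. Testing visibility with isNode(_:insideFrustumOf:) and treating a rate of 0 as "not playing" are assumptions about how the check can be done; as before, the players dictionary is assumed to be keyed by the reference image name.

```swift
import ARKit
import AVFoundation

class ViewController: UIViewController, ARSCNViewDelegate {
    @IBOutlet var sceneView: ARSCNView!

    // Assumed to be keyed by the reference image name.
    var players: [String: AVPlayer] = [:]

    func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
        guard let imageAnchor = anchor as? ARImageAnchor,
              let name = imageAnchor.referenceImage.name,
              let player = players[name],
              let pointOfView = renderer.pointOfView else {
            return
        }

        // Is the node for this image inside the camera's viewing frustum?
        let isVisible = renderer.isNode(node, insideFrustumOf: pointOfView)

        // A rate of 0 means the player is paused or has finished.
        if isVisible && player.rate == 0 {
            player.play()
        }
    }
}
```

Because the end-of-video observer has already rewound the player, calling play() here starts the video from the beginning.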
That’s everything we need to do in order to have image tracking enabled in our ARKit app. As you can see, it’s pretty simple and it looks pretty cool. You can find the source code of this project here.