Here at Norigin we are building TV Streaming Experiences for a wide range of Large Screen devices. Recently, we decided to open source some of the code we use in our TV App framework, since we felt certain components could be beneficial across all Smart TV development projects.
Our first open source project addresses Smart TV Navigation with React.
When developing for Smart TVs (or Connected TVs), game consoles and set-top boxes, one thing sets them apart: the user input method. Normally it is a remote control with directional keys. Some TVs, such as LG, may also offer pointer input (Magic Remote), and Apple TV has a directional touch pad.
This way of navigating is called Spatial Navigation (or Directional Navigation). To interact with an element on the screen we have to move the focus to it (navigate to it) and press a selection button (OK, Enter etc.) while it is focused. There should be only one focused element on the screen at any time. As developers we have to implement the logic for this type of navigation ourselves, as there is no default implementation, at least not on the Web platforms. The complexity of this feature is often underestimated, and it can be quite challenging in certain scenarios. There is always a risk of introducing bugs such as having more than one focused element on the screen or losing focus completely. It requires a strict and robust state management system to keep track of what is focused on the screen and how to transfer the focus when transitioning between screens or modal elements. Luckily React offers quite a few ways to organise and manage state, so let’s have a look at a few patterns for implementing spatial navigation.
Most common patterns
Distributed Navigation Logic
Perhaps the most straightforward pattern. Each component keeps track of which child component is currently focused, and also handles the key events to decide what to focus in response. While this method gives full control over the navigation logic inside one component, it is not the most scalable solution. Navigation logic has to be implemented for every component, so it gets spread across the whole application and might take up close to 15% of the codebase. It also means that navigation logic needs to be tested for every single component, and improvements made in one component don’t benefit other use cases. Another issue with this approach is that every component needs to be aware of its parent component’s logic (e.g. expecting some prop from the parent to indicate that it is focused now) as well as the structure of its child components. This becomes tricky when developing UI components in isolation from each other, when components might be frequently rearranged and ideally should not be aware of each other. When components are replaced or moved, the navigation logic has to be updated in all relevant places. A simplified implementation of distributed navigation for a Home screen that renders multiple rows of gallery items would handle vertical key events and switch rows in the Home screen itself, while every row handles horizontal key events and switches between gallery items.
Focus Maps are another common pattern when working with spatial navigation. A component might have a Focus Map: an object that is pre-calculated for each direction and contains the focus keys (focus IDs or indices) of the elements that need to be focused in response to key press events. This keeps the parent component free of key handling, because that is done in the child components. The parent component is still responsible for constructing the Focus Map.
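As a rough illustration of the distributed pattern (a framework-free sketch, not the original example from this article), the snippet below models the Home screen and its rows as plain classes instead of React components. The names `HomeScreen`, `GalleryRow`, `handleKey` and `focusedItem` are assumptions made for this sketch.

```typescript
type Key = "up" | "down" | "left" | "right";

// Each row owns its own horizontal navigation state and key handling.
class GalleryRow {
  items: string[];
  focusedIndex = 0;

  constructor(items: string[]) {
    this.items = items;
  }

  handleKey(key: Key): void {
    if (key === "left") {
      this.focusedIndex = Math.max(0, this.focusedIndex - 1);
    } else if (key === "right") {
      this.focusedIndex = Math.min(this.items.length - 1, this.focusedIndex + 1);
    }
  }
}

// The screen owns vertical navigation and has to know about its rows.
class HomeScreen {
  rows: GalleryRow[];
  focusedRow = 0;

  constructor(rows: GalleryRow[]) {
    this.rows = rows;
  }

  handleKey(key: Key): void {
    if (key === "up") {
      this.focusedRow = Math.max(0, this.focusedRow - 1);
    } else if (key === "down") {
      this.focusedRow = Math.min(this.rows.length - 1, this.focusedRow + 1);
    } else {
      // Horizontal keys are passed down to the focused row.
      this.rows[this.focusedRow].handleKey(key);
    }
  }

  focusedItem(): string {
    const row = this.rows[this.focusedRow];
    return row.items[row.focusedIndex];
  }
}

const home = new HomeScreen([
  new GalleryRow(["a1", "a2", "a3"]),
  new GalleryRow(["b1", "b2"]),
]);
home.handleKey("right");
home.handleKey("down");
home.handleKey("right");
```

Even in this tiny sketch the navigation rules live in two places, and the screen has to know how its rows are structured, which is exactly the scaling problem described above.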
This method is just a way to delegate key handling to child components. In some scenarios it also makes it easier to define special cases like focusing the Side Menu. For example, GalleryRow decides when to use the “left” focus map entry, and the logic for this is straightforward: it is used when the first item in a Row is focused and a left key press occurs. The parent doesn’t need to handle this logic and can just focus the Side Menu right away when onSetFocus is called with the Side Menu focus key param. However, it is still just another variant of Distributed Navigation Logic and doesn’t scale well.
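To make the Focus Map idea more concrete, here is a minimal sketch under a few assumptions: the parent builds one map per row, keyed by direction, and the row consults it only when it cannot move inside itself. `onSetFocus` comes from the text above; `buildFocusMaps`, `handleRowKey`, `RowState` and the focus key strings are hypothetical names chosen for this example.

```typescript
type Direction = "up" | "down" | "left" | "right";
type FocusMap = Partial<Record<Direction, string>>;

// The parent pre-calculates, per row, which focus key each direction leads to.
function buildFocusMaps(rowKeys: string[], sideMenuKey: string): Record<string, FocusMap> {
  const maps: Record<string, FocusMap> = {};
  rowKeys.forEach((key, index) => {
    const map: FocusMap = { left: sideMenuKey };
    if (index > 0) map.up = rowKeys[index - 1];
    if (index < rowKeys.length - 1) map.down = rowKeys[index + 1];
    maps[key] = map;
  });
  return maps;
}

interface RowState {
  focusedIndex: number;
  itemCount: number;
}

// The row handles keys itself and asks the parent to move focus only when
// it runs out of items in the requested direction.
function handleRowKey(
  direction: Direction,
  row: RowState,
  focusMap: FocusMap,
  onSetFocus: (focusKey: string) => void
): void {
  if (direction === "left" && row.focusedIndex > 0) {
    row.focusedIndex -= 1;
    return;
  }
  if (direction === "right" && row.focusedIndex < row.itemCount - 1) {
    row.focusedIndex += 1;
    return;
  }
  const target = focusMap[direction];
  if (target !== undefined) onSetFocus(target);
}

const focusMaps = buildFocusMaps(["row-0", "row-1"], "side-menu");
const firstRow: RowState = { focusedIndex: 1, itemCount: 3 };
const requestedFocus: string[] = [];
const recordFocus = (focusKey: string): void => {
  requestedFocus.push(focusKey);
};

handleRowKey("left", firstRow, focusMaps["row-0"], recordFocus); // moves inside the row
handleRowKey("left", firstRow, focusMaps["row-0"], recordFocus); // asks for the Side Menu
handleRowKey("down", firstRow, focusMaps["row-0"], recordFocus); // asks for the next row
```

The parent only reacts to `onSetFocus`, but it still has to rebuild the maps whenever the layout changes.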
To organise the spatial navigation logic within the app we can use helpers such as Grid to handle directional key events and manage the state of focused child components. This helps to encapsulate the navigation logic and makes it easy to wrap any component inside FocusableComponent. These helpers can store the current focus key in a context, and every child FocusableComponent can subscribe to this context to see when it gets focused. This pattern is used in BBC T.A.L. and is the middle ground between Distributed and Centralised Logic. The downside of this pattern is that you have to follow a strict structure in your component tree and organise components into rows, columns or grids, as well as wrap every focusable component. If you have a dynamic layout or use A/B testing in your app, it can get hard to maintain, since you need to update the structure of rows/columns whenever components are moved around.
Doing it Smart
Since we are making apps for Smart TVs, our navigation system also has to be Smart, otherwise it wouldn’t work ¯\_(ツ)_/¯
In our company we care a lot about Developer Experience (or DX). We are constantly working towards improving it and making our devs’ lives easier. Good DX brings better motivation, which leads to better code quality and makes us more efficient. The main motivation for creating our own solution for spatial navigation was to have excellent DX when implementing this feature in any TV app. All the solutions above are still far from the perfect scenario you can imagine from the developer’s point of view. So what is the perfect way of implementing it in code? Could we make a smart system that lets us just say “I want these components on the screen to be focusable” and then figures out by itself how to navigate between them? Why do we have to care about rows, columns etc. if all our components are already on the screen and we know their dimensions and coordinates? How can we avoid handling the parent-child focus propagation manually? The inspiration for this came from an article by Netflix. This approach was partially used in the react-tv-navigation package (kudos to the contributors!).
Wrapping Focusable components
What is the minimal effort needed to make a component focusable? One way is to create a wrapping component.
This requires another level of nesting in JSX, as well as introducing the wrapper in render functions. Another way is to use a HOC (higher-order component).
We went with the second option, and we are using recompose to create the HOC.
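The HOC idea can be sketched without React or recompose. In the snippet below a “component” is simply a function from props to a string, and the higher-order function injects the focus-related props. This is an illustration of the pattern only, not the real HOC: `withFocusable`, `Component`, `FocusableProps` and `currentFocusKey` are names assumed for this sketch.

```typescript
// A stand-in for a React component: props in, rendered output out.
type Component<P> = (props: P) => string;

interface FocusableProps {
  focusKey: string;
  focused: boolean;
}

// Stand-in for the globally tracked focus key (an assumption of this sketch).
let currentFocusKey = "menu-home";

// Wraps a component and injects focusKey and focused, so callers do not
// need an extra wrapper element around it.
function withFocusable<P extends object>(
  focusKey: string,
  Wrapped: Component<P & FocusableProps>
): Component<P> {
  return (props: P) =>
    Wrapped({ ...props, focusKey, focused: focusKey === currentFocusKey });
}

const MenuItem: Component<{ title: string } & FocusableProps> = ({ title, focused }) =>
  focused ? `[${title}]` : title;

const HomeItem = withFocusable<{ title: string }>("menu-home", MenuItem);
const SearchItem = withFocusable<{ title: string }>("menu-search", MenuItem);
```

The wrapped component receives `focused` as a regular prop and stays unaware of how focus is tracked.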
So now that we have a focusable component, what is the minimal functionality it needs to be enhanced with?
First of all, it needs a focused state to indicate when it is focused. Each focusable component also needs a focusKey to identify it. In order to navigate between focusable components, something needs to store the global state of the currently focused component on the screen. We didn’t want to have any navigation logic inside the components. Each focusable component needs to register itself in some global system and report its own coordinates on mount, and remove itself from this system on unmount. So the next step is to create a global system or service that keeps the list of all focusable components and manages the state of the currently focused component.
Centralized Navigation Logic
The Spatial Navigation Service keeps the focus key of the currently focused component. It also serves as storage for all focusable components on the screen. Since we are not handling any navigation logic in the components, this logic is implemented in the Service (centralized). The basic idea is quite straightforward: when the user presses a directional key, the Service tries to find the best candidate to be focused next in that direction, based on the shortest distance between the currently focused item and a potential target item. The algorithm itself is quite advanced and was inspired by this implementation.
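A heavily simplified version of “pick the closest candidate in that direction” could look like the sketch below. It compares centre points with plain Euclidean distance, which is an assumption made for brevity; the algorithm referenced above is more advanced. `Layout`, `center`, `isInDirection` and `findNext` are hypothetical names.

```typescript
type Direction = "up" | "down" | "left" | "right";

interface Layout {
  focusKey: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

function center(layout: Layout): { x: number; y: number } {
  return { x: layout.x + layout.width / 2, y: layout.y + layout.height / 2 };
}

// A candidate qualifies only if its centre lies strictly in the pressed direction.
function isInDirection(from: Layout, to: Layout, direction: Direction): boolean {
  const a = center(from);
  const b = center(to);
  switch (direction) {
    case "left":
      return b.x < a.x;
    case "right":
      return b.x > a.x;
    case "up":
      return b.y < a.y;
    case "down":
      return b.y > a.y;
  }
}

// Returns the closest qualifying candidate, or undefined if there is none.
function findNext(current: Layout, candidates: Layout[], direction: Direction): Layout | undefined {
  const origin = center(current);
  let best: Layout | undefined;
  let bestDistance = Infinity;
  for (const candidate of candidates) {
    if (candidate.focusKey === current.focusKey) continue;
    if (!isInDirection(current, candidate, direction)) continue;
    const target = center(candidate);
    const distance = Math.hypot(target.x - origin.x, target.y - origin.y);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

const layouts: Layout[] = [
  { focusKey: "a", x: 0, y: 0, width: 100, height: 100 },
  { focusKey: "b", x: 120, y: 0, width: 100, height: 100 },
  { focusKey: "c", x: 0, y: 120, width: 100, height: 100 },
  { focusKey: "d", x: 240, y: 0, width: 100, height: 100 },
];
```

With these layouts, pressing right on `a` selects `b` rather than `d`, because `b` is the nearer of the two candidates to the right.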
The Service also provides an interface for any focusable component to imperatively set focus to any other component, or onto itself.
Each focusable component needs to be connected to the Service. This can be done either via the React Context API or by passing some kind of reference to the component to the Service. Initially we implemented it with Context, wrapping the whole App in another HOC that served as the Context Provider.
Each component was subscribed to the Context, and whenever the current focus key changed, each focusable component compared the new focus key with its own focus key to determine whether it was focused. This approach didn’t work well for us, because each focusable HOC on the screen needs to re-render in response to a Context update. That doesn’t mean each wrapped component needs to re-render, if its focused state didn’t change. However, on low-end devices it caused performance issues, because the React tree reconciliation caused by HOC updates still takes time. In the end we went with a more imperative approach. Instead of Context, the Spatial Navigation Service gets access to each component’s state handlers (created by the withStateHandlers recompose HOC), for example onUpdateFocus. When a focusable component is mounted, it passes a reference to this handler to the Service. This way we ensure that only two components are updated at a time: the one that gained focus and the one that lost it.
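The imperative approach can be sketched as a small registry of handlers. This is a minimal illustration under assumptions, not the actual Service: `onUpdateFocus` is taken from the text, while the method names `addFocusable`, `removeFocusable` and `setFocus` are chosen for this example.

```typescript
type FocusHandler = (focused: boolean) => void;

class SpatialNavigationService {
  private handlers = new Map<string, FocusHandler>();
  private focusKey: string | null = null;

  // Called when a focusable component mounts.
  addFocusable(focusKey: string, onUpdateFocus: FocusHandler): void {
    this.handlers.set(focusKey, onUpdateFocus);
  }

  // Called when a focusable component unmounts.
  removeFocusable(focusKey: string): void {
    this.handlers.delete(focusKey);
    if (this.focusKey === focusKey) this.focusKey = null;
  }

  // Notifies only the component losing focus and the one gaining it.
  setFocus(focusKey: string): void {
    if (focusKey === this.focusKey || !this.handlers.has(focusKey)) return;
    if (this.focusKey !== null) this.handlers.get(this.focusKey)?.(false);
    this.focusKey = focusKey;
    this.handlers.get(focusKey)?.(true);
  }
}

// Record every handler call so the number of updates is visible.
const updates: Record<string, boolean[]> = { a: [], b: [], c: [] };
const service = new SpatialNavigationService();
for (const key of ["a", "b", "c"]) {
  service.addFocusable(key, (focused) => {
    updates[key].push(focused);
  });
}
service.setFocus("a");
service.setFocus("b");
```

After these two calls component `c` has not been touched at all, whereas a Context-based subscription would have notified every subscriber on each change.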
Even though each focusable component reports its own dimensions and position on the screen, the UI is not flat; it has a certain hierarchy. We can have focusable elements inside scrolling lists or other wrappers/containers, so relying only on global screen coordinates to measure the distance is not enough:
In this case, when we try to navigate to the left, the next element according to the global coordinates is one of the menu items, but the expected behaviour would be to focus the next element to the left in the scrolling row, which is off screen (marked with a dashed border).
In order to improve this, we have to structure our UI into a Focusable Tree. We can make scrollable lists, or even the whole page, focusable components. In the example above we can make the Menu focusable (green border) as well as the scrollable list (blue border).
If we restrict the directional logic to prioritise sibling components first, the system will focus the next item in the scrollable list (so we can scroll to it afterwards).
But what if there are no good candidates left to focus amongst the siblings? This can be solved by delegating the directional action to the parent focusable component:
In this scenario the system attempts to focus the sibling element to the left, but there is none. It delegates the “left” action to the parent list component, which then attempts to focus its sibling to the left, which is the Menu.
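The siblings-first rule with delegation to the parent can be sketched as a loop that walks up the focusable tree. This is an illustration under simplifying assumptions: every node is reduced to a centre point, and `FocusNode`, `parentKey` and `findTarget` are hypothetical names.

```typescript
type Direction = "up" | "down" | "left" | "right";

interface FocusNode {
  focusKey: string;
  parentKey: string | null;
  x: number; // centre point, to keep the sketch short
  y: number;
}

function isInDirection(from: FocusNode, to: FocusNode, direction: Direction): boolean {
  switch (direction) {
    case "left":
      return to.x < from.x;
    case "right":
      return to.x > from.x;
    case "up":
      return to.y < from.y;
    case "down":
      return to.y > from.y;
  }
}

function distance(a: FocusNode, b: FocusNode): number {
  return Math.hypot(a.x - b.x, a.y - b.y);
}

// Looks for the closest sibling in the given direction; if there is none,
// delegates the same directional action to the parent and repeats.
function findTarget(nodes: FocusNode[], fromKey: string, direction: Direction): string | null {
  let current = nodes.find((node) => node.focusKey === fromKey);
  while (current !== undefined) {
    const origin = current;
    let best: FocusNode | undefined;
    for (const candidate of nodes) {
      const isSibling =
        candidate.parentKey === origin.parentKey && candidate.focusKey !== origin.focusKey;
      if (!isSibling || !isInDirection(origin, candidate, direction)) continue;
      if (best === undefined || distance(origin, candidate) < distance(origin, best)) {
        best = candidate;
      }
    }
    if (best !== undefined) return best.focusKey;
    current = nodes.find((node) => node.focusKey === origin.parentKey);
  }
  return null;
}

const tree: FocusNode[] = [
  { focusKey: "page", parentKey: null, x: 300, y: 200 },
  { focusKey: "menu", parentKey: "page", x: 50, y: 200 },
  { focusKey: "list", parentKey: "page", x: 400, y: 200 },
  { focusKey: "menu-home", parentKey: "menu", x: 50, y: 100 },
  { focusKey: "menu-search", parentKey: "menu", x: 50, y: 300 },
  { focusKey: "item-1", parentKey: "list", x: 300, y: 200 },
  { focusKey: "item-2", parentKey: "list", x: 500, y: 200 },
];
```

Here pressing left on `item-1` finds no sibling inside `list`, so the action is retried from `list` itself and resolves to `menu`.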
But focusing the Menu itself is not really enough. Intuitively we expect it to focus some Menu Item. This is done via down-tree propagation.
Even though it sounds complicated with the up-tree and down-tree propagation, you don’t have to worry about it since this is all done automagically by the Service.
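For the curious, down-tree propagation can be pictured as descending from the focused container until a leaf is reached. The sketch below always picks the first child, which is an assumption made for simplicity; the text does not say how the Service chooses the child. `FocusNode` and `resolveFocusTarget` are hypothetical names.

```typescript
interface FocusNode {
  focusKey: string;
  children: FocusNode[];
}

// Walks down from a container to a focusable leaf, taking the first child
// at every level (a simplification chosen for this sketch).
function resolveFocusTarget(node: FocusNode): FocusNode {
  let current = node;
  while (current.children.length > 0) {
    current = current.children[0];
  }
  return current;
}

const menu: FocusNode = {
  focusKey: "menu",
  children: [
    { focusKey: "menu-home", children: [] },
    { focusKey: "menu-search", children: [] },
  ],
};

const list: FocusNode = {
  focusKey: "list",
  children: [{ focusKey: "item-1", children: [] }],
};

const page: FocusNode = { focusKey: "page", children: [menu, list] };
```

Focusing `menu` therefore ends up on `menu-home`, and focusing a leaf simply returns the leaf itself.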
Putting it all together
Here is an example of a simple implementation with the Menu, Menu Items and Scrollable List with Items inside:
In a real-world scenario something might go wrong: the focus might jump somewhere you don’t expect, or disappear for some reason. To make it easy to understand what’s happening, we have implemented two debug modes: console debug and visual debug.
The first one outputs helpful console statements that show what is going on when the Service tries to focus the next element in the direction of navigation. Visual debug draws borders around each focusable component and highlights the points that are used to calculate the distance between components when navigating between them.
We are constantly improving our navigation system as we find new use cases or simply try to make things even easier for developers. If you feel inspired, check it out on GitHub and of course feel free to contribute!
Thanks to Espen Erikstad.