— PROJECT NAME
Video Intelligence
— ROLE
Interface Design
Frontend Development
— DATE
May 2017
During 2016-2017, Google was promoting its Cloud Machine Learning APIs. Google created a series of demos to showcase the power of these new services, with the goal of inspiring developers and organizations to create new products using the ML APIs.
I was hired as a consultant to design and develop an application that would showcase the Video Intelligence API. The application was presented during the opening keynote address at the Google Cloud Next annual developer conference.
DESIGN CONCEPT
To demo the API, we wanted to showcase how keywords were automatically detected and how users could quickly jump to a timestamp within each video.
Below is the initial sketch Sara presented me with and the high-fidelity designs I created for the demo. Because this was a Google demo, the design followed Material Design to stay in line with Google’s design requirements.
From a UX point of view, we needed the video to stay visible while a user scrolled through the list of keyword results. This allowed the user to continue watching the video and jump to different timestamps while exploring the list of keywords.
We also included a search option so users could find multiple videos with a specific keyword. The results needed to include the frequency of each keyword, along with the ability to play a preview of the video while still viewing other results.
HOW IT WORKS
The Video Intelligence API enables users to annotate videos stored locally or in Cloud Storage, or live-streamed, with contextual information at the level of the entire video, per segment, per shot, and per frame. The API returns a collection of JSON objects, each describing a keyword identified in the video and the locations where it appears.
[{
  "description": "Dog",
  "language_code": "en-us",
  "locations": [ {
    "segment": {
      "start_time_offset": 7090474,
      "end_time_offset": 8758738
    },
    "confidence": 0.99793893,
    "level": "SHOT_LEVEL"
  }]
}, … ]
For each keyword occurrence, the JSON gives the start and end offsets of the segment in microseconds, along with a confidence score for that detection.
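To make the jump-to-timestamp idea concrete, here is a minimal TypeScript sketch of how a response shaped like the sample above could be turned into a list of seek targets. It is not the code used in the demo: the type names, the `toSeekTargets` and `formatTimestamp` helpers, and the 0.5 confidence cutoff are all assumptions made for illustration.

```typescript
// Hypothetical types mirroring the sample response above.
interface Segment {
  start_time_offset: number; // microseconds from the start of the video
  end_time_offset: number;   // microseconds from the start of the video
}

interface KeywordLocation {
  segment: Segment;
  confidence: number;
  level: string;
}

interface KeywordAnnotation {
  description: string;
  language_code: string;
  locations: KeywordLocation[];
}

// One clickable entry in the keyword list (hypothetical shape).
interface SeekTarget {
  keyword: string;
  startSeconds: number;
  endSeconds: number;
  label: string; // "m:ss" text shown next to the keyword
  confidence: number;
}

const MICROSECONDS_PER_SECOND = 1000000;

// Format a time in seconds as "m:ss", dropping the fractional part.
function formatTimestamp(seconds: number): string {
  const whole = Math.floor(seconds);
  const minutes = Math.floor(whole / 60);
  const secs = whole % 60;
  return minutes + ":" + (secs < 10 ? "0" + secs : String(secs));
}

// Flatten the annotations into seek targets sorted by start time.
// minConfidence is an assumed cutoff, not a value taken from the API.
function toSeekTargets(
  annotations: KeywordAnnotation[],
  minConfidence: number = 0.5
): SeekTarget[] {
  const targets: SeekTarget[] = [];
  for (const annotation of annotations) {
    for (const location of annotation.locations) {
      if (location.confidence < minConfidence) {
        continue; // skip low-confidence detections
      }
      const startSeconds =
        location.segment.start_time_offset / MICROSECONDS_PER_SECOND;
      const endSeconds =
        location.segment.end_time_offset / MICROSECONDS_PER_SECOND;
      targets.push({
        keyword: annotation.description,
        startSeconds: startSeconds,
        endSeconds: endSeconds,
        label: formatTimestamp(startSeconds),
        confidence: location.confidence,
      });
    }
  }
  return targets.sort((a, b) => a.startSeconds - b.startSeconds);
}

// Usage with the "Dog" entry from the sample response.
const sampleAnnotations: KeywordAnnotation[] = [
  {
    description: "Dog",
    language_code: "en-us",
    locations: [
      {
        segment: { start_time_offset: 7090474, end_time_offset: 8758738 },
        confidence: 0.99793893,
        level: "SHOT_LEVEL",
      },
    ],
  },
];

const sampleTargets = toSeekTargets(sampleAnnotations);
console.log(sampleTargets[0].keyword, sampleTargets[0].label); // Dog 0:07
```

In a browser, clicking a keyword could then seek the player by assigning `startSeconds` to the `currentTime` property of the page's video element, which takes a value in seconds. Converting the offsets once, up front, keeps the click handler trivial.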