Apps can now see! What more can Google Cloud Vision API do?

If you are a pet lover, you have probably done this already: when you feel stressed at work or elsewhere and your pet isn’t with you, you look for some canine love on your phone. Google has been your best friend in this task for a while now. Google Photos can sift through the myriad memories on your phone and instantly find the ones featuring your pet.

Google Cloud Vision API

Google accomplishes this enviable task using machine learning. Now this powerful image classification technology has gone mainstream!

Google recently released its Cloud Vision API to developers everywhere. Developers can use Google’s powerful machine learning apparatus to build applications that can see and understand the content of static images that a user takes or stores in Google Cloud Storage. Developers can also call the REST API to use Cloud Vision in environments other than Google Cloud services. Google has positioned the Cloud Vision API as an ‘Image Analytics-as-a-Service’ offering.
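To make the REST route a little more concrete, here is a minimal sketch of how a request body for the images:annotate method could be assembled in Python. The helper name build_annotate_request and the sample bytes are illustrative assumptions; authentication (for example an API key) and the actual HTTP call are deliberately left out.

```python
import base64
import json

# Public REST endpoint for image annotation. A real call also needs
# credentials, which this sketch leaves out.
ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, feature_types, max_results=5):
    """Return a JSON-serializable body for one images:annotate request.

    Hypothetical helper: the image is sent inline as base64 text, and each
    requested feature (e.g. LABEL_DETECTION) gets the same maxResults cap.
    """
    return {
        "requests": [
            {
                "image": {
                    "content": base64.b64encode(image_bytes).decode("ascii")
                },
                "features": [
                    {"type": feature, "maxResults": max_results}
                    for feature in feature_types
                ],
            }
        ]
    }


# Placeholder bytes stand in for a real JPEG or PNG file.
body = build_annotate_request(
    b"fake image bytes", ["LABEL_DETECTION", "TEXT_DETECTION"]
)
payload = json.dumps(body)  # what an HTTP client would POST to ANNOTATE_URL
```

Several features can be requested for the same image in one call, which is why the helper takes a list of feature types.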

The key functionalities of Cloud Vision API:

  1. Image Recognition & Identification: See and understand the content of images
  2. Image Classification: Sort images into categories
  3. Sentiment Analysis: Detect emotions on faces
  4. Text Recognition: Recognize words printed in an image (this extends to several languages besides English)
  5. Text Extraction: Extract text from images

Use Cases for Google Cloud Vision API:  

7. Store only your best images

Imagine an application that will auto-delete those ‘not-so-spick’ pics.

An app that does this job for users will be lapped up easily. Google’s image sentiment analysis can read your face, if not your mind.
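As a rough illustration, an app could decide which photos to keep by looking at the likelihood values the API reports for each detected face. The sketch below works on already-parsed faceAnnotations entries; the is_keeper rule and its thresholds are assumptions made up for this example, not something the API prescribes.

```python
# Likelihood values reported in faceAnnotations, ordered weakest to strongest.
LIKELIHOOD_ORDER = [
    "UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY",
]


def at_least(value, threshold):
    """True if a likelihood value is at or above the given threshold."""
    return LIKELIHOOD_ORDER.index(value) >= LIKELIHOOD_ORDER.index(threshold)


def is_keeper(face_annotations):
    """Hypothetical rule: keep a photo only if it contains faces, at least
    one face looks joyful, and no face is likely to be blurred."""
    if not face_annotations:
        return False
    any_joy = any(
        at_least(face.get("joyLikelihood", "UNKNOWN"), "LIKELY")
        for face in face_annotations
    )
    any_blur = any(
        at_least(face.get("blurredLikelihood", "UNKNOWN"), "POSSIBLE")
        for face in face_annotations
    )
    return any_joy and not any_blur


# Hand-written sample annotations, not real API output.
sharp_smile = [{"joyLikelihood": "VERY_LIKELY", "blurredLikelihood": "VERY_UNLIKELY"}]
blurry_smile = [{"joyLikelihood": "LIKELY", "blurredLikelihood": "LIKELY"}]
```

An app would probably want to flag rejected photos for review rather than delete them outright, since the likelihoods are estimates.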

6. Insights from machine sight

If you have an enterprise image bank, don’t let it just sit there. Get actionable insights! In an information-driven world, data mining can be more valuable than mineral mining; as the old adage goes, information is wealth. One way to use the Cloud Vision API is for pattern recognition. For example, the US Postal Service employs machine learning to identify handwriting. Another example is Aerosense, a subsidiary of Sony Mobile Communications Inc. and an early adopter of the Cloud Vision API, which used it on a large database of images to gain meaningful insights.

5. Image Moderation

  • Being able to recognize the contents of an image means that websites that allow image uploads by users can easily isolate and disallow inappropriate content. Facebook currently relies on users flagging objectionable images, but with a machine learning algorithm it can disallow or delete images by itself. ‘Photofy’, an early user of the limited release version, noted that the API did flag violent and adult content on user-created photos in accordance with their abuse policies.
  • Another way to use image moderation is to restrict the images users upload to those that fit your website’s niche. Say you host images of exquisite and beautiful flowers: you can set up your algorithm to accept only images of flowers.
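Both moderation ideas above can be sketched against a parsed API response. The example assumes the request asked for SAFE_SEARCH_DETECTION and LABEL_DETECTION; the function names, the thresholds, and the list of allowed labels are assumptions chosen for illustration.

```python
# Likelihood values used by safeSearchAnnotation, weakest to strongest.
LIKELIHOOD_ORDER = [
    "UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY",
]


def is_objectionable(safe_search, threshold="LIKELY"):
    """True if adult or violent content is at or above the threshold."""
    limit = LIKELIHOOD_ORDER.index(threshold)
    return any(
        LIKELIHOOD_ORDER.index(safe_search.get(key, "UNKNOWN")) >= limit
        for key in ("adult", "violence")
    )


def matches_niche(label_annotations, allowed_labels, min_score=0.7):
    """True if any sufficiently confident label is in the allowed set."""
    return any(
        label["description"].lower() in allowed_labels
        and label["score"] >= min_score
        for label in label_annotations
    )


def accept_upload(response, allowed_labels):
    """Hypothetical policy: reject objectionable images, then require
    that the image fits the site's niche."""
    if is_objectionable(response.get("safeSearchAnnotation", {})):
        return False
    return matches_niche(response.get("labelAnnotations", []), allowed_labels)


# Hand-written sample responses, not real API output.
flower_photo = {
    "safeSearchAnnotation": {"adult": "VERY_UNLIKELY", "violence": "VERY_UNLIKELY"},
    "labelAnnotations": [{"description": "Flower", "score": 0.95}],
}
violent_photo = {
    "safeSearchAnnotation": {"adult": "UNLIKELY", "violence": "VERY_LIKELY"},
    "labelAnnotations": [{"description": "Flower", "score": 0.90}],
}
car_photo = {
    "safeSearchAnnotation": {"adult": "VERY_UNLIKELY", "violence": "VERY_UNLIKELY"},
    "labelAnnotations": [{"description": "Car", "score": 0.97}],
}
ALLOWED = {"flower", "rose", "petal"}
```

Where to set the thresholds is a policy decision; a stricter site might reject anything rated POSSIBLE or higher.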

4. Build Metadata for large image databases

Google’s label/entity detection feature picks out the primary element (an object) in an image by comparing it with a broad set of object categories. This can be leveraged to build metadata for your enterprise image catalog. The Cloud Vision API thus transforms your enterprise image collection into an asset that can support image-based searches or recommendations.
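One simple way to turn labels into searchable metadata is an inverted index from label to image. The sketch below assumes each image has already been annotated with LABEL_DETECTION and that the parsed responses are at hand; the file names, scores, and the build_label_index helper are made up for the example.

```python
from collections import defaultdict


def build_label_index(responses_by_image, min_score=0.6):
    """Map each lower-cased label to the set of images it was found in.

    Labels scoring below min_score (an arbitrary cutoff for this sketch)
    are ignored.
    """
    index = defaultdict(set)
    for image_id, response in responses_by_image.items():
        for label in response.get("labelAnnotations", []):
            if label["score"] >= min_score:
                index[label["description"].lower()].add(image_id)
    return index


# Hand-written sample responses keyed by a hypothetical file name.
responses = {
    "img1.jpg": {"labelAnnotations": [
        {"description": "Dog", "score": 0.98},
        {"description": "Grass", "score": 0.71},
    ]},
    "img2.jpg": {"labelAnnotations": [
        {"description": "Car", "score": 0.93},
    ]},
    "img3.jpg": {"labelAnnotations": [
        {"description": "Dog", "score": 0.88},
        {"description": "Beach", "score": 0.40},  # below the cutoff
    ]},
}

index = build_label_index(responses)
dog_images = sorted(index["dog"])  # images a search for "dog" would return
```

The same index could feed a recommendation feature, for example by suggesting images that share labels with the one a user is viewing.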

3. Text extraction & analysis

Design an app that lets users identify food items containing allergenic ingredients with a simple snap. Once a user takes a snap of the product labelling, the app can extract and analyze the ingredients text and detect whether it lists anything the user is sensitive to. The Cloud Vision API supports Optical Character Recognition (OCR) to retrieve text in a wide variety of languages.
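Here is a minimal sketch of the analysis step, assuming the label photo was sent with the TEXT_DETECTION feature and the response has been parsed. In that response the first textAnnotations entry carries the full detected text. The allergen list and helper names are assumptions, and the word matching is deliberately naive (single words only).

```python
import re


def extract_text(response):
    """Return the full detected text, or an empty string if none was found."""
    annotations = response.get("textAnnotations", [])
    return annotations[0]["description"] if annotations else ""


def find_allergens(text, allergens):
    """Return the user's allergens that appear as whole words in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(a for a in allergens if a.lower() in words)


# Hand-written sample response, not real API output.
label_response = {
    "textAnnotations": [
        {
            "locale": "en",
            "description": "INGREDIENTS: Wheat flour, sugar, peanuts,\nsoy lecithin, salt",
        },
    ]
}
user_allergens = {"peanuts", "milk", "soy"}

found = find_allergens(extract_text(label_response), user_allergens)
```

A real app would also need to handle synonyms, plurals, and multi-word ingredients such as "tree nuts", which this sketch does not attempt.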

2. Landmark, Logo & Label Detection

The Cloud Vision API can identify popular natural and man-made structures, as well as company and product logos, in images.
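To show what an app might do with these results, the sketch below pulls landmark and logo names out of a parsed response, assuming the request asked for LANDMARK_DETECTION and LOGO_DETECTION. The summarize_entities helper, the score cutoff, and the sample values are assumptions for illustration.

```python
def summarize_entities(response, min_score=0.5):
    """Collect landmark names (with coordinates when present) and logo names."""
    summary = {"landmarks": [], "logos": []}
    for landmark in response.get("landmarkAnnotations", []):
        if landmark.get("score", 0.0) >= min_score:
            locations = landmark.get("locations", [])
            lat_lng = locations[0]["latLng"] if locations else None
            summary["landmarks"].append((landmark["description"], lat_lng))
    for logo in response.get("logoAnnotations", []):
        if logo.get("score", 0.0) >= min_score:
            summary["logos"].append(logo["description"])
    return summary


# Hand-written sample response; the brand name is a placeholder.
sample_response = {
    "landmarkAnnotations": [
        {
            "description": "Eiffel Tower",
            "score": 0.92,
            "locations": [{"latLng": {"latitude": 48.858461, "longitude": 2.294351}}],
        }
    ],
    "logoAnnotations": [
        {"description": "ExampleBrand", "score": 0.81},
        {"description": "FaintLogo", "score": 0.20},  # below the cutoff
    ],
}

summary = summarize_entities(sample_response)
```

The coordinates returned with a landmark could be used, for instance, to group a photo library by place.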

1. Realtime People Counter

A people counter can measure how many people are moving through a particular passage/entrance and also ascertain the direction of their movement. This can be used in retail stores and shopping malls to count footfalls and measure conversion rates easily.  
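Since the API analyzes static images, one rough way to approximate a people counter is to send periodic camera snapshots with the FACE_DETECTION feature and count the faces in each. The sketch below works on parsed responses; the helper names are assumptions, and it does not attempt to detect direction of movement, which would need tracking across frames.

```python
def count_people(response):
    """Approximate head count for one snapshot: one face, one person."""
    return len(response.get("faceAnnotations", []))


def occupancy_summary(frame_responses):
    """Summarize counts over a sequence of snapshot responses."""
    counts = [count_people(response) for response in frame_responses]
    return {"per_frame": counts, "peak": max(counts, default=0)}


# Hand-written sample responses for three snapshots; the empty dicts
# stand in for individual face annotations.
frames = [
    {"faceAnnotations": [{}, {}]},
    {},  # no faces detected in this snapshot
    {"faceAnnotations": [{}, {}, {}]},
]

stats = occupancy_summary(frames)
```

People facing away from the camera would be missed, so counts from this approach are best treated as a lower bound.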

The limited release version of the Cloud Vision API piqued the interest of several enterprises. With the release of the beta version to developers everywhere, we are certain to see many new and innovative use cases in the future.

Murali Dodda is a Cloud Technology Specialist with over 15 years of experience. He graduated from the prestigious IIT Madras. Murali provides 'technology and business leadership' to startups and has overseen successful exits for several of them. He is currently leading a team of technologists at Bitmin, a hot new startup delivering cloud services. Murali uses his weekends to catch up on the latest developments in technology innovation, product development, and entrepreneurship domains. Being an enthusiastic blogger, he shares exciting developments & his experiences with designing & deploying cloud strategies through his blog. If you want an inside view of cloud deployment for real-world clients, follow this blog.
