What I have learned

Describing the visual world

Senior software engineer at Microsoft, Saqib Shaikh, and Seeing AI user, Sean Randall, tell OT how the app was developed to help blind and visually impaired people achieve independence

Taking a picture with an iPhone

What were your ambitions with Seeing AI?

Saqib Shaikh (SS): As someone who is blind, I've long imagined a device that would describe what's going on around me, similar to when I'm with a friend. While I have a successful career and travel the world independently, there are still those frustrating moments when assistance is useful. While we are still striving to realise my original idea, conceived back at university, artificial intelligence (AI) has brought us closer. I wanted to create a system that describes the visual world using the technology available today that can grow as the technology advances in the years to come.

When developing the app, what were the main things you had to consider for blind and visually impaired users?

SS: We focused on creating a streamlined experience that works well with the other aids people use, such as the VoiceOver screen reader. When testing with our early users, we found that for many this was their first experience of using a camera app, so we included detailed text and video tutorials.

What daily tasks were targeted as key areas of development?

SS: We quickly decided that our early priority areas would be around reading text, recognising people and identifying products.

Were there any hurdles during development and testing?

SS: During early testing we found that, out of necessity, users were using the app in poor lighting conditions. This led us to implement a system where the torch will automatically turn on when it is too dark.

Was there anything that needed changing after trials?

SS: We had constant feedback from our testers and kept iterating on new solutions. For example, users liked being able to identify products based on the barcode but struggled to find where the barcode was. This led us to create a system where the app will play beeping sounds to indicate how close you are to the barcode – a bit like the children’s game hot-or-cold.

Since its launch, has there been any feedback that Microsoft is working on?

SS: We're continually getting messages from users all around the world and look forward to improving Seeing AI based on this feedback.

When you first heard about Seeing AI, what impact did you think it could have on your daily life?

Sean Randall (SR): I already used different apps to identify objects and read text, so the initial thrill for me was face recognition. I work in an office half the time, and knowing who has walked into the room without being able to see them had never been possible for me before. Of course, I eventually found myself using almost everything the app has to offer.

How did it work when you tested the app?

SR: It was incredible. My familiarity with other apps providing some similar services helped, but Seeing AI’s speed, efficiency and range of features keep me coming back time after time.

In what ways has this enhanced your independence?

SR: I use an old phone exclusively for face recognition; it sits on my desk facing the door, and I have it paired to a Braille display. It is sometimes triggered by people walking past the glass window in the door, but if they open the door and walk into the office, I get their name if I’ve already put them into the app. It’s unobtrusive, silent and accurate. I am able to greet tens of people by name, even if they don’t speak to me first.

I use Seeing AI on my main phone to:

  • Tell whether lights are on or off
  • Read expiration dates on food packages, identify printed material on screens and paper and determine addressees of material
  • Verify whether the photo I’m sending to my sighted family or colleagues shows what I think it does
  • And now to read handwriting (this Christmas was the first time that I could identify the sender of handwritten Christmas cards on my own).

What other areas would you like to see the app explore?

SR: My top feature requests are batch processing and auto-capture of pages in document mode. I would also like a travel mode where, with a live camera view, you could be told about pre-selected objects as you’re walking around. For that to happen convincingly, the camera needs to be divorced from the phone, but I would happily wear a headcam, or mount something on my shoulder or similar, so that I could have all this information hands-free.

Would you recommend apps like this to other users, and why?

SR: As a blind person, they make daily life so much smoother, quicker and more independent. At work, home, or anywhere in between, I can almost guarantee that if I don’t pull Seeing AI out once within a given hour, it’s probably because I’m asleep.