An initial review from a musician’s point of view
I recently gained access to the early(ish) beta of Be My Eyes' virtual volunteer feature, now called "Be My AI." My initial impressions of this artificial intelligence tool for blind people are positive.
AI is making its way into every corner of the digital space. Some people are fearful of it, but I see AI as simply a tool, one which can be used for good or evil. Most sighted people likely take for granted their ability to see and evaluate their surroundings. Beyond that, they likely don't recognize the freedom they have to be selective about the detail they take in, and their ability to search for specific details amongst the available visual information. Be My AI is a useful accessibility tool that gives blind people the ability to take in visual information and search it for specifics.
What is it and what does it do?
Be My AI is a new feature of Be My Eyes, a mobile application for blind people which allows the user to take a picture and have that picture analyzed by the AI. The interface takes the form of a text chat. The user takes a picture and Be My AI responds, as if in a text message, with a detailed description of what it sees. The user is then given the opportunity to ask questions via a text box. Another picture, presumably from a new angle, can also be sent for further description.
The questions feature is really exciting. You can ask it for more specific detail and, if it is visible, Be My AI will try to respond. It will also tell you if there isn't enough information in the photo and suggest taking another picture or calling a sighted volunteer.
From a musician's point of view
I am impressed. I gave it a picture of me in my studio. In the picture there are two guitars hanging on the wall. Besides giving me a nice description of their colors, Be My AI recognized that one is a bass guitar and the other is a double cutaway. I see this as a fun and useful way to get information when a suitably knowledgeable person with vision is not available. Asking specific questions can uncover surprisingly precise detail.
I also sent it a picture of a Native Instruments Komplete Kontrol s25 keyboard in my studio. It was able to read the labels on the controls and, to an extent, describe the buttons and controls in terms of sections, which may help in understanding how to use the keyboard. Its description of the keyboard was not perfect. While it did read labels, it sometimes read the words without context, and without explaining that certain functions required the press of another button. This may be beyond the scope of its capabilities. A sighted person could have told me that certain buttons have two functions depending on whether shift is pressed. That's why the "call a volunteer" feature is available.
It is also important to remember, and Be My Eyes states this clearly, that the AI should not be trusted in critical or dangerous situations. It has what have been termed "hallucinations." But it is certainly useful at other times, as long as the user keeps in mind that its answers may be inaccurate.
In a non-musical context, I was able to get descriptions of my surroundings, and it recognized what kinds of trees were around and that there was a tomato plant with red tomatoes in my yard. It does a good job describing things spatially, using terms such as "on the right," "in front of," and "in the foreground/background." It also does a good job using words such as "may" or "might be," which convey that it isn't completely sure of what it is seeing.
I'm sure we're just seeing the first green shoots of technological possibility emerging into reality. I'm excited to keep exploring my world with Be My AI as an accessibility tool.