Show Number 80 - Dr. Joshua Miele

Listen to Show 80

Dr. Joshua Miele works at the Smith-Kettlewell Eye Research Institute and is involved with some amazing projects. In this interview, I talk to him about audio descriptions and an innovative new way they can be provided to the blind and others that would benefit from them. Listen to this podcast to learn more, and attend their webinar on January 24th.

The video description research discussed in this podcast was funded by the National Eye Institute of the U.S. National Institutes of Health grant titled: "Innovative Techniques for Creating and Distributing Described Video for the Blind". Grant #1 R01 EY020925-01. The Video Description Research and Development Center is funded by The Department of Education OSEP Grant #H327J110005.

Useful Links:

Sign up for the webinar

Josh Miele's web page which describes some of his projects

The main site for the Smith-Kettlewell Video Description Research and Development Center

An interview Josh did with KQED about his map development project and other good info

A video showing how Josh uses his iPhone to aid mobility

A video demonstrating the WearaBraille


Beth: Hello, and welcome to the Disability411 Podcast, show number 80. I’m your host, Beth Case, and yes, I know it’s been an inexcusably long time since my last episode, but as most of you know, I’m getting my doctorate and sometimes school just takes all of my time and attention. But I have four new shows all lined up for Disability411. So, we are back; we are back with a vengeance. I appreciate those of you who have hung in and stayed subscribed to the feed, and if you just help me to spread the word and let your friends and colleagues know that Disability411 is still here and that we have a lot of great new content coming out, I would really appreciate it.

So, to kick off our revitalization, I have a wonderful interview with Dr. Joshua Miele from the Smith-Kettlewell Eye Research Institute. They have so much really cool stuff going on there that I have asked him if he wouldn’t mind coming back and doing some additional interviews with me just to let you know about some of the great new technology that they are developing.

But in this interview, I talked to him about a new project they have in developing ways for people to create audio descriptions for the blind in videos, and it’s really exciting. It’s very, very cool. They also have a series of seminars and webinars that I encourage all of you to participate in. One of them is coming up very soon on January 24th. So, listen to this interview and if it sounds like it’s something that you’d like to find out more about or if you’d just like to learn more about audio descriptions in general, then visit for the link and information on how you can sign up for that webinar. So, enough of me rambling and let’s get right to the interview.

I’m on the phone right now with Dr. Joshua Miele, and Josh, thank you for taking the time to talk to me today.

Josh: It is my pleasure really to be here and I’m looking forward to chatting with you.

Beth: Yeah, this will be fun. So, why don’t we just start out with you telling us a little bit about yourself, your background, where you work, what you do, and that sort of thing?

Josh: Sure. Yeah. I think of myself as – I basically call myself an information accessibility researcher. There’s all sorts of aspects to accessibility, and as things have changed in the world of accessibility, information accessibility has become more and more important. So, that’s really what I focus on, and it gives me a very broad set of things that I find absolutely fascinating. Of course, I’m blind myself, so it’s all pretty self-serving. I work on stuff that I need, frankly, and it has always been that way.

I had a couple of semesters of work here as a graduate student at the Smith-Kettlewell Eye Research Institute where I am now. It’s an independent nonprofit research institute in San Francisco, doing work on all sorts of visually related research. There are a number of scientists here that research the visual system and how it works and all sorts of things like motion perception and infant vision development and eye muscles and how they work. Then there are a bunch of ophthalmologists that are interested in the pathology of vision, what can go wrong. In particular, strabismus is a big area of research here.

And then there’s a whole group working on the rehabilitation engineering research or what I call -- in more of a modern vein -- “accessibility research”, and that’s the group I work with. We do all sorts of R&D on technology that blind and visually impaired people would find useful in a wide variety of contexts. My area is focused on things like maps and spatial information, including street maps for orientation and mobility and audio-tactile graphics, which are tactile diagrams that can be made to talk based on where you touch them. I’ve done a bunch of different types of things there. I’ve also been involved with alternate input devices. We’ve developed a thing called the WearaBraille, which is a prototype of an accelerometer-based virtual wireless Braille keyboard, which is totally goofy.

Beth: That’s fabulous!

Josh: Thank you. Yeah, the WearaBraille is a really cool project, but basically it’s all about information accessibility and that’s where the video stuff comes in as well. I mean, video accessibility is a type of information that, until I think recently, has been thought of as sort of eye thing. Video description is great for entertainment, but so much of the energy of the accessibility world is focused on education and employment and entertainment is – you know, I call them the 3 E’s. Entertainment often sort of gets a back seat. It takes a lesser role to the other two, but of course, there’s a great deal of stuff about education and employment in video accessibility as well. Whether they’re educational videos or documentaries or training videos for the workplace, description of what’s going on in the visual component of the video is really quite important. And the social aspects of being able to enjoy even just pure entertainment video is, I think, much more important than has been recognized in the past really.

Beth: That really brings us to why I wanted to talk to you tonight, because I wanted you to tell us more about your Video Description Research and Development Center.

Josh: Yes, the very charismatically named Video Description Research and Development Center also known as the VDRDC.

Beth: Just flows off the tongue, doesn’t it?

Josh: Yeah, it really does. Everybody just ... you know, you say it once and people remember it like crazy. We call it the VDRDC as well. The reason it’s called that is actually because it’s funded through a Department of Education grant through the Office of Special Education Programs, and that’s what they asked us to propose. So we called it what they wanted us to call it, and that is essentially. … Even though it doesn’t roll off the tongue, it does describe exactly what we’re doing. So, it is, at the very least, descriptive, which is what we’re all about, right?

Beth: Exactly.

Josh: The VDRDC is doing a bunch of innovative research and development around new technologies for video description. We’re also doing a number of interesting outreach and dissemination activities where we basically – we have a set of technologies that we’re interested in investigating. But we also are really concerned with making sure that the use cases, the types of things that the people in the community need around video description, in particular, educators, because this is a Department of Education grant. So, we really want to make sure that educators and students have input to the process, and we also want to make sure that they know what it is that we are doing. We’re interested in making video description more accessible in general to the teacher in the classroom and the students. So, a number of projects are really – I’ll talk about the technologies that we’re doing, which are really cool -- but a bunch of the stuff is really just about connecting with the stakeholders, which are the people who are actually using description or who actually need description and making sure that they have access to the resources that they need, even the current, existing resources, so that they can have access to video description for education in whatever context they happen to be in.

Beth: Before we get too much into some of the details of what you’re working on, I think people are finally getting the idea that videos have to be captioned for their deaf students, and they’re familiar with that. They can see the captioning on their television, and most programs, most televisions now can display captions. And I think, as someone who has worked in post-secondary services, faculty and administration are finally starting to get the message that, yes, we need to caption our educational videos. But I’m not sure that they are very familiar with video descriptions and exactly what that is. So, can you just give a very basic idea to someone who maybe has never encountered audio descriptions before?

Josh: I sure can, and that’s exactly the fact. What you just pointed out is that people aren’t really familiar with it, and that’s exactly where we are in trying to help people have a better understanding of it. Video description is, at its heart, using descriptive language to provide access to the non-sighted viewer – the blind viewer – of what’s going on in the screen. Much of that stuff doesn’t need to be described because it is obvious from the existing sound track. So, it’s really only certain things that need to be voiced and the way it’s done is essentially with a voice over. You add a voice over channel to the audio. You hear all the existing sound track plus some additional descriptive annotations that would be recorded by a describer.

For example, in a documentary, the describer might read out loud any on-screen text so when a person is being interviewed on-screen, there may be a text label underneath that person that says who they are and where they work. It’s that type of thing that a describer can read aloud and record onto the sound track, and the blind viewer gets to hear those descriptions along with the sound track.

So, it is really an art. There are a number of different guidelines that are available for different types of materials that tell how the description should be done, but really, it’s a very simple process where you just think, ‘OK, what is it that the blind person needs to know about this in order to have it be helpful?’ And you have to be succinct and clear, because there’s also some other audio that you don’t want to step on. You don’t want to interrupt the existing program too much to provide information about what’s on-screen. So, it’s a balancing act and it’s a very interesting one because there’s a lot of decisions to be made.

Beth: Are you developing tools to help people do this to their own videos? Or is your focus more on awareness and linking them up to professional resources that can do it for them?

Josh: Like I said, we have sort of a two-pronged approach. One is doing technology R&D where we are developing technologies that can be used by teachers and by parents to provide access to their own videos. So, you can actually, as a teacher for example, describe a video that you are going to be using in your classroom, and then your student can have access to that description. We’re working on tools that would allow that. And there are existing tools that will allow that as well. We’ve got some twists on that that we think are going to really improve the usability of these materials. The outreach stuff that we are doing is also important because basically, if the teachers don’t know about it, it’s not going to get used.

Beth: Exactly.

Josh: And also, similarly if we, as the researchers don’t listen to the teachers about what it is that they need, we’re going to design the wrong tools. So it’s really a dialogue that we’re holding with the community as to both what they need and what we’re doing.

Beth: So, give us a little sneak peek. What are some of the things that you have in development?

Josh: Well, you were just asking about being able to record your own descriptions. There are a range of tools for watching the video and recording a description, and then you can burn that onto a DVD or put it onto a video tape or put that recording onto the web so that the description is available with the video. We’re actually developing a really cool set of tools called the Descriptive Video Exchange; we call it DVX. That is a tool for crowdsourcing video description. So, basically it’s kind of like a wiki for description where anybody can describe anything for anybody else, and if everybody is using the DVX tools, then everybody, everywhere has access to the descriptions without the need for re-distributing any of the video material. The way it works is we store on a server all of the descriptive information and the timing information so that when somebody uses the DVX video player to watch a video, that video player automatically checks our server and looks for descriptions that are available, and it re-synchronizes the descriptions at playback time. So, it works with DVDs, it works with streamed videos and any other kind of video that you might mention. We’re even looking at ways of doing it with things like Netflix and Apple TV and stuff like that.
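
The playback-time lookup Josh describes can be pictured with a minimal Python sketch. It is only an illustration, not the DVX implementation: the names `DescriptionClip`, `DESCRIPTION_STORE`, `fetch_descriptions`, and `clips_due`, the video identifier, and the in-memory dictionary standing in for the server are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DescriptionClip:
    start_seconds: float  # position in the video where the clip should begin
    text: str             # stand-in for the recorded description audio


# Hypothetical stand-in for the description server: descriptions and their
# timing are stored separately from the video, keyed by a video identifier.
DESCRIPTION_STORE = {
    "example-documentary": [
        DescriptionClip(47.5, "Archive photograph of a band on stage."),
        DescriptionClip(12.0, "On-screen text: the speaker's name and workplace."),
    ],
}


def fetch_descriptions(video_id):
    """Return the stored clips for a video in playback order (empty if none)."""
    clips = DESCRIPTION_STORE.get(video_id, [])
    return sorted(clips, key=lambda clip: clip.start_seconds)


def clips_due(clips, previous_position, current_position):
    """Return clips whose start time was crossed since the last player update."""
    return [
        clip
        for clip in clips
        if previous_position < clip.start_seconds <= current_position
    ]
```

A player built this way would call `fetch_descriptions` once when a video is loaded and `clips_due` on each position update, so the video itself is never modified or redistributed.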

Beth: It’s a separate file. You’re not making any changes to the source material. Then you don’t have to worry about things like copyright.

Josh: Exactly. We also don’t have to worry about getting the specially described version into the hands of the people who need it, because simply by using the software, anybody will have access to all of the described material on the system. So, if I record for example, a description of the Ken Burns jazz documentary; if I do a description of that, which really could just be mostly reading the names of the people that are talking, then when you watch that video – whether you get the DVD set or stream it from somewhere, if you watch that using the DVX player, when you put the DVD in, it automatically checks the server to see if there’s a description. Lo and behold, I have recorded a description, and it’ll say, ‘there is one description recorded by Josh Miele.’ And maybe it would have ratings. We think that there ultimately may be many people contributing these descriptions and there will be a whole system for figuring out which ones are better than others, user ratings and other sorts of ranking criteria.

So, basically, you would, simply by using the mainstream video, be able to access my description of it or any number of other descriptions that might even be better than mine. So DVX is a way of dealing with the copyright issues in that you don’t need to re-distribute or modify anybody’s copyrighted material. But it also addresses the issue of getting the described information into the hands of the people who need it. So, it’s a distribution approach as well. And finally, it is a production approach because it uses the power of crowdsourcing to describe materials that would otherwise never be described. I mean, we believe that there are people who would be enthusiastic about 1950s monster movies, right? Or about 1920s silent movies or other movies that have a great deal of cultural significance but probably wouldn’t receive any funding to be described. So, we think that by harnessing the power of the crowd, we will gain access to a lot more description than we ever would be able to otherwise. So, it really has three benefits to it: getting around the copyright issues, the distribution and the production.

Another aspect to it is that, because we’re synchronizing the descriptions at playback time, we can actually give the viewer a great deal of control over the description presentation that is absent when the descriptions are just recorded onto the soundtrack. For example, doing it this way, we can allow the viewer to not only turn the descriptions on and off but, for example, you could do speech compression on just the description so that it would speed the descriptive clips up to fit into smaller spaces. Or you might hear only an audio signal that there’s a description available but not necessarily play it unless you hit a button to play the current description. Or maybe there’s a description that is partially stepping on some dialogue, so you might want to stop the description before it’s actually finished but listen to the dialogue. You can silence the description without silencing the dialogue. So there’s a lot of individual control. Another thing that we’re building into it is the ability to stop video playback during a description and then resume playback immediately after the description is finished so that you can have an “extended” description, the extended description being much longer than you might otherwise have time for. Because it’s all tightly synchronized and coordinated by the DVX player, it would be really easy to do any or all of those things in combination depending on what the user wants.

At the bottom of this discussion is really the fact that description is, by its very nature, an editorial process. You have to decide what it is you’re interested in focusing on and what it is you’re not going to describe. As they say, the picture being worth a thousand words, you don’t have time for a thousand words. So you are going to be picking and choosing what you want to describe. And different people will choose to describe different things. Even if we’re following guidelines, different people will choose to describe different things because they see different things as having different values or need for emphasis.

Getting back to the nature of research and development, what we’re doing is looking at how these innovative ideas meet reality and what it is that consumers need. When a consumer actually tries to use these features or record a set of descriptions, what types of things do they need? How can we use technology to make these ideas be effective? Or conversely, maybe some of these ideas are actually not effective. Maybe not all of these ideas are going to work out. I’m giving you a sneak peek of the types of things that we are investigating and the types of ideas that we have. But that’s why it’s called R&D, because some of it just isn’t going to work out. But we have high hopes for quite a bit of it.

Beth: I’m getting very excited about this idea but with the realization that this is still in the research phase. So, let’s go and shift gears to the other part of your project where you’re doing the outreach and also getting input from the consumers. What do you have going on there?

Josh: Well, the Video Description Research and Development Center has a set of extraordinary partners. This set of partners is called the Description Leadership Network, and it’s basically a group of organizations that have a strong interest in the outcomes that we’re working on. They’re groups that stand to benefit a great deal from innovation in video description and want to be at the table. These are organizations like The American Council of the Blind, The National Federation of the Blind, The American Foundation for the Blind. And we also have a number of companies that are in the description business such as CaptionMax and Dicapta and Narrative Television Network. We also have partners that are involved with museum accessibility. The Metropolitan Museum of Art, for example, is one of our partners. We have technology partners like the Ideal Group.

You can find out, by the way, all about our partners by going to, and we have a list of description leadership network partners and what they’re doing with us.

So, one of our partners is the Described and Captioned Media Program (DCMP), and they are actually spearheading our webinar series. We’ve got a series of webinars for teachers focusing on description in the classroom. We’re going to have one webinar per semester over the next couple of years so a series of four are planned at this point. The first one is actually coming right up on January 24th, and folks that are interested in signing up for that – this very first of the four webinars, the teachers, by the way they’re free –

Beth: Important point.

Josh: Important point. People like that, but if you go to

Beth: As always, all of the links and everything that he’s talking about go in our show notes at

Josh: So, this first webinar is actually going to cover some important basics about using description in the classroom and in education in general. So, we’re going to give them a background in what is description? How is description beneficial for blind and visually impaired students? And by the way, for other students as well, there’s new research showing that students with different types of reading and print disabilities can also benefit from description. That’s kind of a cool synergy as well, but the idea is that we want teachers to be able to get access to the materials that are already out there. There’s already a great set of resources of existing described educational materials.

So, really this first webinar is going to talk about what are the benefits of description; how do you use description in a classroom context; and where do you get described materials? Future webinars will talk about things like how do you record your own descriptions? Where can you find out more of the guidelines for describing different types of materials? And then, of course, we will ultimately also be doing quite a bit of talking about the types of technologies that we’re developing at the VDRDC and how those can benefit teachers and students as well.

Beth: If somebody is listening to this podcast after the date, will it be recorded for people to watch later?

Josh: It definitely will be. Yeah, you can go to that same URL for the registration and there’ll be links to the previous webinars. You’ll also be able to get to the webinars after the fact through So there’ll be plenty of ways to get access to that stuff.

Beth: OK. The project that we talked about is really great if the student has the ability to watch the video through your DVX software. What if they’re in an assembly or some kind of large gathering where they’re showing a movie?

Josh: Oh. OK. I mentioned that we are working on a number of different projects at the VDRDC, and one of them we call “COVA,” which stands for Choreographed and Orchestrated Video Annotation. I don’t know if you are familiar with apps like Shazam or SoundHound.

Beth: Yeah, where you can hold your phone up, let’s say, to a radio if you don’t know what the name of that song is you’re listening to. You just hold your phone up to it, and it can identify the song and the artist. And sometimes, you can even click on the button and buy the song, right?

Josh: Exactly! Exactly! So, we are interested in using that same type of technology so that if you’ve got a mobile device, you could actually listen to the soundtrack of a movie that’s playing where you are and your phone or mobile device would be able not only to identify the movie but would know where in the movie you are.

Beth: Really?

Josh: So, it would get both the identification and timing information from the audio that it can hear in the room. So, you don’t need to have any special video player. All you need is a smartphone or a mobile type device that would be able to run a special app that would basically do the identification and synchronization and would be able to pull descriptive clips off of the web, very similarly to what we were talking about before, the DVX. The clips could be stored on a server, and all that needs to happen is that they need to be synchronized with the video. There’s no need for any special hardware or software to be incorporated into the video player. We imagine this type of technology could be used not only in school assemblies that you have mentioned but also in movie theaters that don’t have-

Beth: I was just going to ask that. I was like, do you see this working where someone could just go to the movie theater and, like, have their phone with the headset and they could get their audio descriptions without having to have a special showing. That would be great.

Josh: I absolutely do see it that way. It would also be a great benefit to the theaters because right now, any theater that supports description needs to have special equipment installed and needs to maintain special receivers that they hand out to the blind patrons. And they need to make sure that those are all in good working order with batteries and so on, but if you bring your own device, then you should be able to get access to it. And of course, maybe everyone doesn’t have their own device, although I think in the next couple of years, that’s going to be pretty rare. But if it is based on standard hardware like Android or iPhone, the theater is much more likely to be able to support that kind of stuff, based on the consumer model rather than a proprietary or specialized kind of hardware.

Beth: Maybe the theater could even have a couple of phones that they could check out for the movie if you didn’t have your own phone.

Josh: Exactly! Exactly! The thing is that the Choreographed and Orchestrated part refers to the idea that these things don’t need to be so tightly coupled in the same sense that the dancer listens to the music and doesn’t dance based on what she or he is hearing. In that same way, we’re interested in using the sound and possibly the video of a presentation to be able to synchronize with descriptive playback. You can learn much more about the COVA project at That is a particularly cool technology that we’re interested in researching much more over the next year or two.

Beth: You have so many really cool projects going on, and we have got to get you back on to talk about some of your other things. But I know that for research and funded projects, it’s really important to make sure that we give credit where credit is due so I want you to tell us about your funding sources again.

Josh: Yeah, the VDRDC is funded by the Department of Education’s Office of Special Education Programs. That’s a two-year grant, and we’re really delighted about it. The Descriptive Video Exchange Research is also funded by the National Eye Institute which is part of the NIH (The National Institutes of Health). Of course, Smith-Kettlewell has a number of funding sources and one of the major projects and groups here is the Rehabilitation Engineering Research Center which is funded by the National Institute on Disability and Rehabilitation Research (NIDRR) and they’re also a Department of Education agency. You know, we are really lucky we have really cool research, and it’s wonderful that we are able to keep it going based on these funders. So, it is not to be taken for granted.

Beth: Absolutely. So, if anyone would like to learn more about your project, follow what you’re doing, find out more about webinars or other things you may have going on, what can they do?

Josh: Well, we have a number of websites that people can go to. The best place to learn more about the video description project, I think, is to go to There are links to all the video-related stuff. To learn more about my other information accessibility projects, I think the best place to go would be That talks more about the non-video related stuff, more about the maps and the audio tactile graphics and the WearaBraille. And of course, for the webinars, you can go to either – you can get there from -- but you can also go to the Described and Captioned Media Program at

Beth: I will definitely be signing up for that webinar myself. So again, we have been very lucky to have Dr. Joshua Miele with us from Smith-Kettlewell. Thank you so much for spending time with me tonight.

Josh: Thank you very much. It’s been a pleasure.

Beth: Doesn’t that sound exciting? He really has a lot of very cool projects going on. If you go to our website at, I will have all of the links that he mentioned, including the link to the Smith-Kettlewell Eye Research Institute, Dr. Miele’s website, the website where you can register for the webinar, as well as some other related websites and videos that I think you would be interested in. So, go check that out. You can also find all of our past shows and all of the transcripts for this and our past shows at our website. And again, please help spread the word that Disability411 is back. We’re going to have another show very soon.

The Disability411 podcast is protected by the Creative Commons Attribution Non-commercial Share Alike 3.0 United States License which means you can share our podcast, you just can't make any money off it. Visit our website at to find show notes, past shows and transcripts of all the episodes as well as useful links, blogs and much more. Email us at Music brought to you by the Brobdingnagian Bards and used with permission.