Echo360 has partnered with Amazon to provide transcription services for captures and videos in Echo360. ASR stands for "automatic speech recognition" and is a service provided by Amazon that uses computers to translate speech into text and sync the text with the video. Beyond the feature that sends videos to Amazon for automatic transcription, Echo360 also provides an interface for manually uploading transcription files to captures, bypassing the automated service altogether.
The ASR toggle on the Institution Features page turns the automatic transcription service on or off. What does that mean?
- If the toggle is ON, all captures are sent to Amazon for automatic transcription at the time the video or capture is published.
- If the toggle is OFF, admins and instructors can manually transcribe and upload the transcription files (in WebVTT format) to captures and videos as needed. Nothing is sent for transcription while the toggle is off.
- If the toggle WAS on but is turned off, captures/videos that received automated transcriptions during the time it was on retain those transcriptions; they are not removed. Future published captures are not sent for transcription.
Manual transcription for video media is always available. See Manually Adding Transcriptions to Captures for more information.
The sections below provide some of the more technical details about the transcription service. Unless otherwise specified, "transcriptions" in these sections refers specifically to automatic, machine-generated transcriptions provided by the Amazon ASR service. Manual transcriptioning (creating and uploading a WebVTT transcription file) is identified where it applies.
What is the difference between Transcriptions and Closed Captions?
Transcriptioning is different from "captioning" in several ways. Transcriptioning is "speech to text" and does not include the sound effects and non-speech elements typically included in closed captions. Furthermore, an automatic speech recognition transcription service is unlikely to meet the accuracy levels required of closed captions for hearing-impaired individuals. This is particularly true for low-volume captures, for captures where background noise interferes with the audio, and possibly for non-native speakers whose accent is strong enough to cause word-recognition problems for the transcription service.
All that being said, transcriptions can be applied much faster than closed captions and typically cost less to generate. Transcriptions may be a "good enough" way to offer both visual and audio content for lectures, and in particular may work as an interim measure during the interval (sometimes a day or more, depending on required accuracy levels) while closed captions are being applied.
Alternatively, because both closed caption files and transcription files use the WebVTT standard, you can have the automatic transcriptions generated and then edit them for accuracy. The edited file can then be uploaded as closed captions, and also uploaded to replace the original transcription file.
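For readers who have not worked with WebVTT before, the following is a minimal sketch of what such a file contains: a `WEBVTT` header line followed by timed cues. The function names and the sample cue text are hypothetical and are not part of Echo360 or Amazon tooling; the sketch only illustrates the basic file layout, not every feature of the standard.

```python
def format_timestamp(seconds):
    """Format a time offset in seconds as a WebVTT timestamp (hh:mm:ss.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues):
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples."""
    lines = ["WEBVTT", ""]  # header line, then a blank line
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


# Hypothetical cue text, for illustration only.
sample = build_webvtt([
    (0, 2.5, "Welcome to today's lecture."),
    (2.5, 6, "We will start with a short review."),
])
print(sample)
```

A file built this way could be saved with a `.vtt` extension and edited in any text editor before being uploaded.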
When do transcriptions get applied?
Transcriptions are requested at the time the item is published to a section. This differs from third-party closed captions, which are requested when the capture is created or manually sent for caption processing (if closed captioning is configured for your institution).
Furthermore, transcriptions are requested for ALL video media on publish. This includes appliance-generated captures, instructor-initiated ad hoc captures, and uploaded videos.
For example, if an instructor generates an ad hoc capture but selects "Library" as the Publish-to location, that video will not be transcribed until it is published to a class in a section. If an instructor uploads a video directly to the Class List in a section, that video is sent for automatic transcription, and the transcriptions are displayed once processing finishes.
Also keep in mind that "availability" of a capture is not the same as "publishing". You can publish a video while making it not yet available to students, but the video is still published and will therefore trigger transcriptioning at that time (if ASR is turned on).
If a video is edited while it is currently published, it will be sent for transcription once the edits are complete and the user clicks Save. This means that the video may be transcribed twice (once on initial publish; once on edit while published).
Important: Taken together, the two points above mean that if your standard procedure (or your instructors') is to generate a capture and auto-publish it to a section, but keep it unavailable to students for a period while the instructor edits it, that video will be transcribed twice: once when it is initially published, then again after the edits are complete and saved.
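The workflow above can be sketched as a small simulation. This is only an illustrative model of the rules described in this section, not Echo360 code; the `Video` class and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Video:
    """Hypothetical stand-in for a capture or uploaded video."""
    published_to_section: bool = False
    available_to_students: bool = False
    asr_requests: int = 0


def publish_to_section(video, asr_on):
    """Publishing to a section requests transcription when ASR is on."""
    video.published_to_section = True
    if asr_on:
        video.asr_requests += 1


def set_availability(video, available):
    """Availability is separate from publishing and never triggers ASR."""
    video.available_to_students = available


def save_edit(video, asr_on):
    """Saving an edit requests transcription only if the video is published."""
    if video.published_to_section and asr_on:
        video.asr_requests += 1


# Workflow A: auto-publish, keep unavailable, then edit while published.
a = Video()
publish_to_section(a, asr_on=True)
set_availability(a, False)
save_edit(a, asr_on=True)

# Workflow B: edit first (for example, from the Library), then publish.
b = Video()
save_edit(b, asr_on=True)
publish_to_section(b, asr_on=True)

print(a.asr_requests, b.asr_requests)
```

In this model, workflow A produces two transcription requests and workflow B produces one, which is why editing before publishing uses fewer transcription hours.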
If a Manual Transcription is applied to a capture/video before it is published, and THEN the video is published (and the ASR toggle is ON), the video is sent for and will receive automatic transcription. In this case, the uploaded transcription is considered the "original" and the automated one is an update. Reverting to the original would return the originally uploaded transcription. While we don't expect this to be a common use-case, we wanted to note it here for you. See Managing Transcriptions for Captures for more information.
How long does it take for a transcription to appear?
It takes at least 30 minutes for a video to receive automatic transcriptions, longer for videos that are more than an hour in length and/or if the transcription service is processing a large number of requests at the time.
Currently, transcriptions are only viewable from within the classroom and cannot be seen in the content details playback panel. To check whether a transcription has been applied to a video, go to the content details page and find the Transcript entry on the right side of the bottom section of the page (below Captions). The Transcript entry reads "Add" if there is no transcription and "Update" if there is one.
In what instances are videos NOT automatically transcribed?
Transcriptions are not provided for videos longer than 4 hours. This is an Amazon restriction. If you publish a video that is longer than 4 hours, it will not be automatically transcribed. You can manually transcribe these videos and upload the transcription file to apply one. Alternatively, you can edit longer videos into shorter segments, which will then be transcribed when they are published.
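If you choose to split a long video, the arithmetic for picking cut points can be sketched as follows. This is a hypothetical helper, not an Echo360 feature; it assumes a segment of exactly 4 hours is still accepted (the restriction above applies to videos longer than 4 hours) and splits the video into the fewest roughly equal parts. In practice you would probably nudge each cut point to a natural break in the lecture.

```python
import math

# 4 hours in seconds, per the limit described above.
MAX_ASR_SECONDS = 4 * 60 * 60


def split_points(duration_seconds, max_seconds=MAX_ASR_SECONDS):
    """Return (start, end) offsets in whole seconds that cover the video,
    using the fewest segments that are each no longer than max_seconds."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    count = math.ceil(duration_seconds / max_seconds)
    base, extra = divmod(duration_seconds, count)
    segments = []
    start = 0
    for i in range(count):
        # Spread any leftover seconds over the first few segments.
        length = base + (1 if i < extra else 0)
        segments.append((start, start + length))
        start += length
    return segments


# A 5-hour video (18000 seconds) becomes two 2.5-hour segments.
print(split_points(5 * 60 * 60))
```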
Transcriptions are not "back-applied" to captures already published. Captures and videos already published in the system at the time the ASR Toggle is turned on must be RE-published to get automatic transcriptions added to them.
NOTE: If you remove and then re-publish a capture to a class to obtain transcriptions, you will remove student video view data from the section analytics for that class/video. A BETTER OPTION is to create a "holding class" solely for temporarily publishing videos (or use an expired section), publish the older, non-transcribed videos to that class, and then remove them. The act of publishing triggers the automatic transcription; the video does not need to be left in the class for more than several seconds. All currently published versions of the video will show the transcriptions.
Captures/videos that have already been auto-transcribed and are later re-published to a new location are NOT sent for automatic transcription again, as long as the file has not changed. The ASR service sees that the video has an automated transcription, compares the audio file byte-for-byte, and as long as it is the same as the file originally submitted (has not been edited) the video is not re-submitted for transcriptions.
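The check described above can be pictured with the following sketch. It is an illustration of a byte-for-byte comparison in general, not the actual implementation used by the ASR service; the function names are hypothetical, and in-memory streams stand in for the audio files.

```python
import io


def same_bytes(stream_a, stream_b, chunk_size=1024 * 1024):
    """Compare two binary streams byte-for-byte, one chunk at a time."""
    while True:
        chunk_a = stream_a.read(chunk_size)
        chunk_b = stream_b.read(chunk_size)
        if chunk_a != chunk_b:
            return False
        if not chunk_a:  # both streams ended at the same point
            return True


def should_resubmit(has_auto_transcript, submitted_audio, current_audio):
    """Resubmit when there is no automated transcription yet,
    or when the audio differs from what was originally submitted."""
    if not has_auto_transcript:
        return True
    return not same_bytes(submitted_audio, current_audio)


# Placeholder bytes stand in for real audio data.
original = b"audio-bytes-as-first-submitted"
edited = b"audio-bytes-after-an-edit"

unchanged = should_resubmit(True, io.BytesIO(original), io.BytesIO(original))
changed = should_resubmit(True, io.BytesIO(original), io.BytesIO(edited))
print(unchanged, changed)
```

Under this model, re-publishing an unedited, already-transcribed video to a new location does not trigger a new request, while any change to the audio does.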
Edited videos that are not currently published are not sent for transcription on edit. They will be transcribed (or re-transcribed if applicable) if/when they are published.
Echo360’s ASR offering is a paid service, and each institution has an allocation of transcription hours included as part of its Echo360 contract. Allocations are based on the annual contract period and are reset each year.
ASR usage is based on the number of capture hours published to courses (the number of hours of video being transcribed). If your contractual allocation does not provide sufficient transcription coverage, you can pre-purchase additional hours by contacting your regional account team.
Echo360 contract-allocated transcription hours will not roll over year-over-year. However, any additional hours you purchase above the contract-granted ones that are not consumed will not expire or reset.
If/When you reach your ASR Allocation limit, your Echo360 account representative will notify you as soon as possible. At that point, you will be asked if you wish to purchase more hours to continue using the service. If you do not, the ASR service will be turned off for your institution.
Existing automated transcriptions will always remain with the media they have been applied to; they are not removed regardless of whether you continue to use the ASR service or not. The ASR Allocation simply determines whether you have transcription hours available in your account, and therefore whether or not more media can be sent for automatic transcription. Manual transcription and upload is always available.
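To make the difference between contract-allocated hours and purchased hours concrete, here is a rough bookkeeping sketch. The class, its field names, and the numbers are hypothetical. It also assumes that contract hours are consumed before purchased hours; that order is an assumption made for illustration, so confirm the actual accounting with your account team. The sketch does not model what happens once the limit is reached.

```python
from dataclasses import dataclass


@dataclass
class AsrAllocation:
    """Hypothetical model of an institution's transcription hours."""
    contract_hours_per_year: float
    contract_hours_left: float
    purchased_hours_left: float = 0.0

    def hours_available(self):
        return self.contract_hours_left + self.purchased_hours_left

    def consume(self, hours):
        """Deduct published capture hours (assumption: contract hours first)."""
        if hours > self.hours_available():
            raise ValueError("not enough hours; behavior past the limit is not modeled")
        from_contract = min(hours, self.contract_hours_left)
        self.contract_hours_left -= from_contract
        self.purchased_hours_left -= hours - from_contract

    def start_new_contract_year(self):
        """Contract hours reset; unused contract hours do not roll over.
        Purchased hours are left untouched because they do not expire."""
        self.contract_hours_left = self.contract_hours_per_year


# Illustrative numbers only.
allocation = AsrAllocation(
    contract_hours_per_year=100.0,
    contract_hours_left=100.0,
    purchased_hours_left=20.0,
)
allocation.consume(110.0)             # uses 100 contract hours and 10 purchased hours
allocation.start_new_contract_year()  # contract hours return to 100
print(allocation.contract_hours_left, allocation.purchased_hours_left)
```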