Having a dictionary is only a small step in the overall process. Once the dictionary is in a usable format, the data has to be uploaded to a (My)SQL database, because a flat file is not really useful for individual lookups and the intention is to end up with sound files for words split into their parts. In addition, the individual syllables and letters have to be added.
The best option is to put the data in a dedicated database. A comparison between indexed and non-indexed lookups (for the complete Chitanka DB) was presented in the previous article. The raw DB is 70 MB as a download and 700 MB unpacked. The sound files will take a significant amount of space, so the approach will be to generate the basic sound files for letters and syllables first, and afterwards to generate sound files only for the words that are actually encountered.
So the first part is to upload the data to the DB and the second is to generate the sound files.
A simple approach for uploading the syllables and generating their sound files is described in “Connecting to MS SQL Server DB”. The script is slow, but for now it is better to have something that needs improvement than something that does not work at all.
The initial syllable and word-part files are 30-60 kB each as WAV, so they have to be converted. 6000 syllables and parts take around 300 MB, which can be compressed to a much smaller size and therefore means less data transfer. Converting to MP3 brings the total down to about 30 MB. A pretty lame approach was chosen for the conversion: using VLC portable to convert the files. It is easy to script and I already have it downloaded.
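# $inputFolder, $outputFolder and $processName (path to the VLC executable) must be set before this point.
$outputExtension = ".mp3"
$bitrate = 22
$channels = 1
$filesToConvert = Get-ChildItem -Path $inputFolder -Filter "*.wav"
$startTime = Get-Date
foreach ($inputFile in $filesToConvert)
{
    # The output keeps the input base name, so the syllable text still identifies the file.
    $outputFileName = [System.IO.Path]::Combine($outputFolder, $inputFile.BaseName + $outputExtension)
    # Run VLC without a GUI, transcode to mono MP3 (raw muxer, so the result is a plain MP3 stream) and quit.
    $processArgs = "-I dummy -vvv `"$($inputFile.FullName)`" --sout=#transcode{acodec=mp3,ab=$bitrate,channels=$channels}:standard{access=file,mux=raw,dst=`"$outputFileName`"} vlc://quit"
    Start-Process $processName $processArgs -Wait
}
# Report how long the whole batch took.
Write-Host "Converted $(@($filesToConvert).Count) files in $((Get-Date) - $startTime)"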
The speed is not great, because the full application is started for every file, which is definitely not a long-term approach, but it converts the files overnight.
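As a rough sanity check of the storage figures quoted above, the arithmetic can be written out. This is only a sketch: the variable names are made up for the illustration, the inputs are the approximate numbers from the text, and 1 MB is assumed to be 1000 kB.

```powershell
# Approximate figures taken from the text; all variable names are illustrative.
$fileCount  = 6000   # syllables and word parts
$wavMinKB   = 30     # smallest WAV file, in kB
$wavMaxKB   = 60     # largest WAV file, in kB
$mp3TotalMB = 30     # total size after conversion to MP3

# Total WAV size range in MB (assuming 1 MB = 1000 kB).
$wavMinMB = $fileCount * $wavMinKB / 1000
$wavMaxMB = $fileCount * $wavMaxKB / 1000

# Average size of a single MP3 file in kB.
$mp3AvgKB = $mp3TotalMB * 1000 / $fileCount

Write-Output "WAV total: $wavMinMB to $wavMaxMB MB"
Write-Output "Average MP3 size: $mp3AvgKB kB per file"
```

With these inputs the WAV total comes out between 180 and 360 MB, which matches the "around 300 MB" estimate, and an MP3 file averages about 5 kB.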
As the file name is the same as the syllable text, the links to the files do not need to be uploaded to the DB separately (file path + name + extension is enough to find the corresponding file). Probably the file-and-folder approach will be used instead of a DB table for the audio files. Another option is to keep them in audio sprite(s), but this would complicate subsequent updates. The files are tiny (for the syllables at least), so file access time might be significant. Keeping them as individual files offers easier portability, so this approach will probably be chosen.
After some initial testing, it is clear that the voice parameters and the volume have to be adjusted, and that some syllables are incorrect, because the rules for breaking a word across lines were used instead of the real syllables. (For example, in A[some syllable] the A cannot be left alone on a line.)
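The lookup by file name can be sketched as a small helper. The function name `Get-SyllableAudioPath`, its parameters and the folder in the usage line are hypothetical and only illustrate the "path + name + extension" idea; they are not part of the existing scripts.

```powershell
# Hypothetical helper: builds the expected audio file path for a syllable.
# No DB lookup is needed, because the file name is the syllable text itself.
function Get-SyllableAudioPath {
    param(
        [Parameter(Mandatory)][string]$Syllable,
        [Parameter(Mandatory)][string]$AudioFolder,
        [string]$Extension = '.mp3'
    )
    return [System.IO.Path]::Combine($AudioFolder, $Syllable + $Extension)
}

# Usage (illustrative folder name): check whether the file for a syllable exists yet.
$path = Get-SyllableAudioPath -Syllable 'ба' -AudioFolder 'audio'
$exists = Test-Path -LiteralPath $path
```

The trade-off is that the syllable text has to be a valid file name, which holds for plain letters but would need extra handling if other characters ever appear.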
The same approach can also be used for words: a word is requested from the web API, and the local scripts create the audio file, convert it to MP3 and save it to a dedicated folder. The word files can be cached in that folder, so generating a word for the first time might be a little slow, but the speed will generally improve as more words have already been generated.
There is also a difference in how a word is perceived in a sentence and on its own, and the same holds for syllables within a word.
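The caching idea could look roughly like the sketch below. It assumes the speech synthesis and MP3 conversion are wrapped in a script block passed as `-Generator`; that script block, the function name `Get-WordAudioFile` and the folder layout are hypothetical stand-ins, not the actual scripts.

```powershell
# Hypothetical cache wrapper: returns the path of the MP3 for a word,
# generating the file only when it is not in the cache folder yet.
function Get-WordAudioFile {
    param(
        [Parameter(Mandatory)][string]$Word,
        [Parameter(Mandatory)][string]$CacheFolder,
        # Stand-in for the real synthesis + conversion step.
        # It receives the word and the target path and must create the file there.
        [Parameter(Mandatory)][scriptblock]$Generator
    )
    if (-not (Test-Path -LiteralPath $CacheFolder)) {
        New-Item -ItemType Directory -Path $CacheFolder -Force | Out-Null
    }
    $target = [System.IO.Path]::Combine($CacheFolder, $Word + '.mp3')
    if (-not (Test-Path -LiteralPath $target)) {
        # Cache miss: generate once; later requests reuse the file.
        & $Generator $Word $target | Out-Null
    }
    return $target
}
```

Only the first request for a word pays the generation cost; every later request is a plain file-existence check, which matches the expectation that the speed improves as more words are cached.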
Example links to some of the audio files (as this is Bulgarian language project, the words are in Bulgarian):
A lot of optimization is still required, but it will be done later.