Validating audio data
And the farmers did not mind – all but one in the sample of almost 350 households consented to the use of audio recorders. The audio files served as irrefutable proof of the authenticity of household visits. They showed the date, time and length of each interview.
I never use the sndhdr utility, so it may be better to use another one. Would using sndhdr simply duplicate what's being done with [...]? Great update.
The only problem (for this specific case) is that I need the check to run before the file is written to disk, because our web app renders uploaded content in real time.
A user could upload a file with malicious code and easily pass the above validations. My question is: what library or approach should I use to do this? This means someone could still include other data behind a valid header.
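As a rough illustration of a header check that runs on bytes still in memory, the sketch below compares the first few bytes of an upload against well-known container signatures. The function name `sniff_audio_type` and the short list of formats are assumptions made for this example, not part of the original question, and the check is hand-rolled rather than based on sndhdr (which was removed from the standard library in Python 3.13). As noted above, a matching header says nothing about what follows it.

```python
from typing import Optional


def sniff_audio_type(head: bytes) -> Optional[str]:
    """Guess an audio container from the first bytes of a file.

    Hypothetical helper: inspects only the header, so data appended
    after a valid header will not be noticed.
    """
    # WAV: "RIFF" <4-byte size> "WAVE"
    if len(head) >= 12 and head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    if head[:4] == b"OggS":   # Ogg container (e.g. Vorbis)
        return "ogg"
    if head[:4] == b"MThd":   # Standard MIDI file
        return "midi"
    if head[:4] == b".snd":   # Sun/NeXT AU (audio/basic)
        return "au"
    # MP4 family: <4-byte box size> "ftyp"
    if len(head) >= 8 and head[4:8] == b"ftyp":
        return "mp4"
    if head[:3] == b"ID3":    # MP3 starting with an ID3v2 tag
        return "mp3"
    # Bare MPEG audio frame: 11 sync bits set
    if len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0:
        return "mp3"
    return None
```

With an upload object that supports `read()` and `seek()`, one would typically read the first 12 bytes, pass them to the function, and then seek back to 0 before any further processing.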
What I want to do next is validate that the audio file is, indeed, an audio file (before it writes to disk).
    [...] 12mb )")
    if file.content_type not in ['audio/mpeg', 'audio/mp4', 'audio/basic', 'audio/x-midi', 'audio/vorbis', 'audio/x-pn-realaudio', 'audio/vnd.rn-realaudio', 'audio/wav', 'audio/x-wav']:
        raise ValidationError("Sorry, we do not support that audio MIME type.")
The alternative is to verify the whole file, which costs more CPU but enforces a stricter policy.
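To make the "verify the whole file" idea concrete, here is a minimal sketch for the WAV case only, working on bytes held in memory so nothing touches disk. It uses the standard-library `wave` module; the function name `verify_wav_bytes` is an assumption for this example. It checks that the RIFF size field matches the actual length (so bytes appended after the container are rejected) and that every declared frame can be read. It does not inspect other chunks inside the RIFF container, and compressed formats such as MP3 would need a real decoder instead.

```python
import io
import struct
import wave


def verify_wav_bytes(data: bytes) -> bool:
    """Return True only if `data` parses as a complete PCM WAV file.

    Hypothetical helper for illustration; stricter and slower than a
    header check because it reads every audio frame.
    """
    if len(data) < 12 or data[:4] != b"RIFF":
        return False
    # The RIFF size field counts every byte after the first 8.
    riff_size = struct.unpack("<I", data[4:8])[0]
    if riff_size + 8 != len(data):
        return False  # truncated file, or extra bytes after the container
    try:
        with wave.open(io.BytesIO(data), "rb") as w:
            n_frames = w.getnframes()
            expected = n_frames * w.getnchannels() * w.getsampwidth()
            frames = w.readframes(n_frames)
    except (wave.Error, EOFError):
        return False  # not a WAV the wave module can parse
    return len(frames) == expected
```

The trade-off is the one described above: the whole upload has to be read and parsed, which costs CPU and memory in proportion to the file size.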
The use of audio transcripts helped identify any problematic areas of the questionnaire, which in the end led to greater consistency of the data collected.
The audio files were also useful for identifying and revising incorrect data – missing, illogical, inconsistent or outlier values, etc. My fieldwork relied on paper and pencil interviewing. The use of audio recording, however, can also benefit any computer-assisted data collection. But for many projects, the use of audio recording can help collect consistently high-quality data at a low cost. And of course, a rigorous study of the impacts of audio recording on respondents’ behavior and answers is needed for this solution to become widely accepted.
Anybody who has worked with original data has a long list of associated complaints, and they range from missing observations to outliers, to poor or non-existent codebooks. Markus Goldstein discusses this very issue in his blog post this summer, citing both his experiences doing fieldwork in Ghana and a recent paper by Arden Finn and Vimal Ranchhod in the context of South Africa.