As a media practitioner, I am interested in language media documentation, particularly for indigenous, endangered and low-resource languages, as a democracy-making tool.
In 2015, I founded OpenSpeaks, an open resource for citizen documentors and archivists for participatory language media documentation and archiving. Since 2014, I have written and directed nine nonfiction films.
I have also created the largest speech data repository in the Odia language containing over 61,000 recordings, all under a Public Domain Release, for use in Automatic Speech Recognition (ASR) research and application development.
OpenSpeaks was originally incubated in 2015 inside Wikimedia Commons (Wikipedia’s sister project and an open multimedia repository). I expanded it through 2017 to accommodate a wide range of OERs and open source software, among other resources. I produced three documentaries with support from the National Geographic Society under the ambit of OpenSpeaks during 2017-2019. Like all open projects, it is now maintained by a volunteer-led community at Wikiversity, another Wikipedia sister project and an open learning platform. In October 2020, I received a small grant from Creative Commons to further OpenSpeaks’ curriculum on open content, Creative Commons Licenses, and content framework. OpenSpeaks also won me widespread support, including the ONA 2017 MJ Bear Fellowship and a place in the Mozilla Open Leadership Series. I have spoken at two TEDx events, Creative Commons Global Summit 2019 and 2020, Wikimania 2019 and Celtic Knot Conference 2018, among others.
My personal interest in the journalistic documentation of oral history led me to make most of my films, including three documentaries as a 2017 National Geographic Explorer and the 2021 documentary “MarginalizedAadhaar” as the Yoti Digital Identity Fellow. These films are all under Creative Commons Licenses. My first produced films were two educational web series on the Hindi and Kannada-language Wikipedias.
In 2019, I was awarded the Yoti Digital Identity Fellowship, which helped me found a research project titled “MarginalizedAadhaar” and study the exclusion of some of India’s marginalized groups due to the use of the biometric ID Aadhaar. As a result, I made the 2021 documentary “MarginalizedAadhaar”. The film premiered at the re:publica Conference 2021, held virtually and in person in Berlin, where I also delivered a featured talk. The film and the research outcomes have been a part of academic and other studies, and I have presented them at many venues: Othering & Belonging Conference 2021 (UC Berkeley), 7th International Conference on Language Documentation and Conservation (ICLDC; University of Hawaiʻi at Mānoa), Sciences Po, Paris (as a part of a course taught by Prateek Sibal), Response-ability Tech 2021 Summit and an episode of the ID16.9 podcast series by Biometric Update.
This documentary by [Subhashish] focuses on issues of exclusion of marginalised communities[..].
—Endangered Languages Archive
This Film is Flipping Brilliant! Congratulations [Subhashish] & thank you. Your film eloquently distills the pros + cons of biometric digital ID Aadhaar featuring expertise of Savita Bailur [and] Sunil Abraham [and] others. I’d like use it with masters students.
Good job! Must watch.
The importance of understanding the conditions of surveillance and citizens’ rights is vividly brought to light in Subhashish Panigrahi’s provocative film Marginalized Aadhaar and its accompanying text. Through Panigrahi’s work, we learn that the effects of surveillance—no matter how well intended—can be uneven and often discriminatory. In India, the state’s intervention to deliver basic amenities to all, using biometric monitoring, perversely enacts a statelessness for those already at the margins—the technical and political entangle to determine who should count and who is beyond counting.
—Kandrea Wade, Alex Taylor, Daniela Rosner, Mikael Wiberg
The collection and use of private data, especially biometric data, signals a panopticon state that empowers the state to constantly surveil citizens.
This revealing documentary by one of our Digital identity Fellows, @subhapa, highlights exclusion within India’s Aadhaar implementation, and is well worth a watch.
Other films: Public Domain Day (honorable mention in the Public Domain Day Short Film Contest Highlights Works of 1925), Who Owns the Content? (screened at “13th Native Spirit Indigenous Film Festival – North Americas”, SOAS World Language Institute, London) and Karinding [DOI: 10.5240/15FD-F65B-F01B-846E-EB71-E].
In 2021, I started a pilot under the OpenSpeaks project to build a voice data repository in the Odia language and its northern dialect Baleswaria. The goal was to create speech data as a foundational layer for speech synthesis research and application development. Using Lingua Libre, I finished recording pronunciations of 61,000 words, including over 6,100 words in Baleswaria, all under a Public Domain (Creative Commons CC0 1.0) release. This is the largest repository of public-domain speech data in Odia.
Discover how @subhapa built the largest repository of public-domain voice data in the Odia language in India. Congrats on this milestone, and thank you for supporting the free knowledge movement.
I have also recorded the pronunciation of another 4,000+ sentences in Odia on Mozilla Common Voice. This was a part of my Mozilla Festival 2022 presentation.
In 2020, I produced the podcast series “O Foundation Conversations”, which included discussions around the diversity of societies, languages, cultures, geopolitics, and conflicts. The series ran for five episodes and is currently on hiatus.
As a part of my Digital Identity Fellowship, I produced a podcast called “MarginalizedAadhaar” and co-produced a show with colleagues.
I led the development of Project Ol Chiki at the Centre for Internet and Society’s Access to Knowledge program, which resulted in the creation of the font family “Guru Gomke” in the Ol Chiki writing system (used to write the Santali language), input methods for typing in Santali, and supporting educational resources, all openly licensed. Indian type designer Pooja Saxena designed the typeface, and Wikimedians Jnanaranjan Sahu and Nasim Ali created the tools for the input methods. Many noted Santali-language speakers contributed throughout the development. I led the overall project development and contributed to designing the input method and the OERs. As an open source project, Guru Gomke has received many other contributions over time.
The Odia alphabet, used primarily for my native language Odia, has fewer typefaces than other South Asian writing systems. Furthermore, there are not enough display types. During 2009 and 2010, I experimented with designing a few Odia typefaces such as eOdissa Anamana, eOdissa Bahuda, eOdissaBOXUni, eOdissaKaanthaUni and eOdissa-Majhi-Uni. Teaching myself type design in a rather amateur way helped me connect with many professional type designers, particularly during Typography Day 2013. Those connections led to many collaborations, including Project Ol Chiki mentioned above. I created the first pangram in Odia, which was used in “An Introductory Manual of Odia Calligraphy” by Aksharaya, an educational collective studying Indian scripts. I have advised on the development of Baloo Bhaina, Baloo Bhaina 2 and a few other fonts by Ek Type.
I have taught short-term courses and chapters for university and graduate students. Some of these include a class titled “MarginalizedAadhaar: Aadhaar and exclusion of marginalised communities in India” as a part of the course “Aadhar: India’s digital identity project for a billion people” taught by Prateek Sibal at Sciences Po, Paris; a course on how citizen documentors and archivists go about documenting languages at the Department of Humanities and Social Sciences at the Indian Institute of Technology Madras; a class on documenting and archiving endangered languages using citizen science strategies and tools for the Srishti Institute of Art Design and Technology Bangalore; a course on South Asian language Wikipedias at the Christ University Bangalore; and another at the Indian Institute of Mass Communications Dhenkanal.