
At times, we need to step in and protect the anonymity of someone sharing an experience in an Active Sensemaking collection. We ask people not to share identifiable details about themselves or the people they write about, but some projects center on sensitive or volatile subjects where identifying people or places could inflame the situation even further.
For topics like this, we review both the experience text and its story title to catch identifying details. Until recently, we did this review manually: someone read every entry and redacted identifying information. Here’s an example that I created:
This kind of thing happens all the time. The male bosses, like Keven Johnstone and Tom Temple, constantly belittle the female assistants. I try not to get involved but it is so demoralizing that these women who are just trying to make a living have to put up with this kind of crap.
becomes…
This kind of thing happens all the time. The male bosses constantly belittle the female assistants. I try not to get involved but it is so demoralizing that these women who are just trying to make a living have to put up with this kind of crap.
We take out the names because we are not promoting any kind of individual retribution; we are aiming to capture the overall flavor of what’s happening, along with the signifier data that goes with it.
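If you wanted to prototype this kind of name-flagging outside a dedicated tool, a general-purpose NER library could do it. Here is a minimal sketch using spaCy’s pretrained English model (an assumption on my part; our project used the built-in NER described below, not spaCy). It marks detected person names with a placeholder, leaving a human to smooth the sentence afterwards:

```python
import spacy

# Requires: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def flag_people(text: str) -> str:
    """Replace any spans tagged PERSON with a placeholder for human review."""
    doc = nlp(text)
    out = text
    # Work backwards so character offsets stay valid as we edit the string.
    for ent in reversed(doc.ents):
        if ent.label_ == "PERSON":
            out = out[:ent.start_char] + "[redacted]" + out[ent.end_char:]
    return out

story = ("This kind of thing happens all the time. The male bosses, "
         "like Keven Johnstone and Tom Temple, constantly belittle "
         "the female assistants.")
# Should print the story with the two names replaced by [redacted].
print(flag_people(story))
```

The placeholder approach matters: automated deletion alone leaves awkward fragments, so a person still rewrites the flagged sentence, as in the before-and-after example above.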
I was trying out a new tool for working with the story data called ATLAS.ti.
I noticed it had Named Entity Recognition (NER) capabilities. ATLAS.ti defines this as “Automatically search and code for any people, organizations, locations, or miscellaneous objects (e.g., works of art, languages, political parties, books, etc.).” In addition, the software can work with the following languages: English, German, Spanish, and Portuguese.
I picked a current project with 2,114 entries and imported the data as a project into ATLAS.ti. I went through both the Title and Experience fields for all entries, looking specifically for individuals; I wasn’t interested in organizations, locations, or miscellaneous objects. The software went through all entries in two and a half minutes and identified eight people: seven we didn’t want identified and one who was the respondent. To export the entries containing these names, I went to the Code Manager and exported a report of the affected experiences along with the SID (each entry’s unique identifier). From there we redacted the names from the data.
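For anyone curious about the shape of that detect-and-report step, here is a hedged sketch using spaCy as a stand-in for ATLAS.ti’s NER. The entry fields (`sid`, `title`, `experience`) are hypothetical names based on the description above, not an actual export format:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

entries = [
    {"sid": "S-0001",
     "title": "Belittled at work",
     "experience": "The male bosses, like Keven Johnstone and Tom Temple, "
                   "constantly belittle the female assistants."},
    # ... 2,113 more entries in the real project
]

# Scan both Title and Experience, keep only PERSON entities, and report
# each flagged entry with its SID so a human can do the actual redaction.
for entry in entries:
    names = {ent.text
             for field in ("title", "experience")
             for ent in nlp(entry[field]).ents
             if ent.label_ == "PERSON"}
    if names:
        print(entry["sid"], "->", sorted(names))
```

In our actual process, the report exported from the Code Manager played this role, and the redaction itself was still done by a person reviewing each flagged entry.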
The entire time invested, from set-up to conclusion, was 30 minutes. What a great savings! Our last manual review, for 400 entries, cost us around $500 USD plus administrative overhead time. My commercial license for ATLAS.ti was $642 USD annually. Extrapolating that manual cost (about $1.25 per entry) from 400 entries to 2,114 entries gives approximately $2,600 USD. Of course, you could find ways to do this less expensively, but for me the speed was the winner, along with the quality of the results. For those running narrative projects, this is something to consider.