Equitable Accessibility

Why does Equitable Accessibility matter? What does it really do for scientific datasets?


About 16% of the population of the world currently experiences significant disability. The figures below speak to the specifics.

Worldwide hearing loss statistics
Worldwide neurodiversity statistics

The way in which we perceive and interact with the world is like a multi-modal dataset: sound, vision, touch, movement, taste, smell, etc. For researchers with disabilities, their dataset may have noise in a given mode, missing data in a given mode, or a mode missing entirely. However, when we do multi-modal data analysis, removing certain modes and deeply exploring others often reveals totally new insights into the data, ones that are difficult to get when looking at all modes. There are numerous studies into this topic, including:

  • Abdolrahmani, A., Storer, K. M., Roy, A. R. M., Kuber, R., & Branham, S. M. (2020). Blind leading the sighted: drawing design insights from blind users towards more productivity-oriented voice interfaces. ACM Transactions on Accessible Computing (TACCESS), 12(4), 1-35.
  • Dobel, C., Nestler-Collatz, B., Guntinas-Lichius, O., Schweinberger, S. R., & Zäske, R. (2020). Deaf signers outperform hearing non-signers in recognizing happy facial expressions. Psychological research, 84, 1485-1494.
  • Han, C., Mitra, P., & Billah, S. M. (2024, May). Uncovering Human Traits in Determining Real and Spoofed Audio: Insights from Blind and Sighted Individuals. In Proceedings of the CHI Conference on Human Factors in Computing Systems (pp. 1-14).
  • Pang, W., Xing, H., Zhang, L., Shu, H., & Zhang, Y. (2020). Superiority of blind over sighted listeners in voice recognition. The Journal of the Acoustical Society of America, 148(2), EL208-EL213.
  • Grant, A., & Kara, H. (2021). Considering the Autistic advantage in qualitative research: the strengths of Autistic researchers. Contemporary Social Science, 16(5), 589-603.
  • Taylor, H., & Vestergaard, M. D. (2022). Developmental dyslexia: disorder or specialization in exploration?. Frontiers in psychology, 13, 889245.
  • Schippers, L. M., Horstman, L. I., Velde, H. V. D., Pereira, R. R., Zinkstok, J., Mostert, J. C., … & Hoogman, M. (2022). A qualitative and quantitative study of self-reported positive characteristics of individuals with ADHD. Frontiers in Psychiatry, 13, 922788.
  • Bury, S. M., Hedley, D., Uljarević, M., & Gal, E. (2020). The autism advantage at work: A critical and systematic review of current evidence. Research in Developmental Disabilities, 105, 103750.
  • Hatak, I., Chang, M., Harms, R., & Wiklund, J. (2021). ADHD symptoms, entrepreneurial passion, and entrepreneurial performance. Small business economics, 57, 1693-1713.

Upholding respect among colleagues and peers is essential for innovation. A commitment to mutual respect enhances the quality of discourse and promotes effective teamwork in the field of data science and related areas.

There are many studies on the relationship between tools, respect, and teaming for success:

  • Castro, Franz, et al. “Experiences of researchers with disabilities at academic institutions in the United States.” Plos one 19.8 (2024): e0299612.
  • Croker, Anne, and Joy Higgs. “The RESPECT model of collaboration.” Collaborating in Healthcare: Reinterpreting Therapeutic Relationships. Rotterdam: SensePublishers, (2016). 43-54.
  • Dobni, C. Brooke. “The DNA of innovation.” Journal of Business Strategy 29, no. 2 (2008): 43-50.
  • Dovey, Ken. “The role of trust in innovation.” The learning organization 16, no. 4 (2009): 311-325.
  • Hwang, I‐Ting, et al. “How people with intellectual and developmental disabilities on collaborative research teams use technology: A rapid scoping review.” Journal of Applied Research in Intellectual Disabilities 35.1 (2022): 88-111.
  • Frohman, Alan L. “Managers at work: Building a culture for innovation.” Research-Technology Management 41, no. 2 (1998): 9-12.
  • Wu, Jia Rung, et al. “Employer practices for integrating people with disabilities into the workplace: a scoping review.” Rehabilitation Research, Policy, and Education 37.1 (2023): 60-79.

What can a data creator control?

A data creator controls many facets and can apply equitable and accessible practices throughout. Explore some practical tips below.


File formats were previously presented on the Data Cleaning and Standardization page. To recall, these are the main questions to consider about your file formats:

  • Open or proprietary?
  • Common or low use?
  • Supported by many software platforms or only one?
  • Freestanding or reliant on embedded programs, files, or scripts?
  • Lossless or lossy?

Common file formats are far more likely to have screen reader integration and other support systems. The more software platforms support a file type, the more likely it is that at least one of them can zoom in on visual data, change the font for text data, etc. High-use, open data formats are more likely to be supported by state-of-the-art tools from accommodations research.

Language forms the basis of human interoperability for shareable data. A data creator has a great deal of control over the language they choose to use for documentation and metadata, and the standards they choose to adhere to. In our data documentation discussion we presented the importance of not making assumptions about your data user from an interdisciplinary standpoint. This is also important from an equitable accessibility standpoint.

The critical concept here is clarity and precision. Datasets and their associated metadata/documentation should avoid words and phrases with “double meanings” as much as possible, because they can skew both understanding and interpretation of data. We list a few guiding principles to consider.

  1. Avoid using idiomatic expressions. Because open data is intended for an international audience, researchers without a shared cultural background may misinterpret idioms or turns of phrase.
    • Common Example: “Set the stage” is a fairly common idiom referring to “creating the necessary conditions and context for something to happen” (Top 20 commonly used idioms for research writing). However, in domains like materials science and chemistry there are actual physical stages frequently set up as a part of methodology. Be precise in describing methodology and careful with reference to common experimental phrases.
  2. Avoid analogy.
    • Common Example: “master/slave” is used in numerous contexts, including particle-in-cell flow computation, parallel runtime execution models, periodic boundary conditions, file systems, hard drive technology, grid computing, and graph traversal. Because it is an analogy, it covers too many potential relationships, making it difficult for data users to be precise. Use alternatives that clearly define the desired relationship, such as “hierarchical”, “primary-secondary”, “active-passive”, “origin-clone” or “local-remote.”
    • Common Example: “dummy value” is particularly common in data science, but has multiple potential meanings for what the value is and how it is represented in a given data set. The use of the word “dummy” can be misleading. Words such as “placeholder value”, “sample value”, or “test data/test value” more precisely capture the nature of the value in the data context.
  3. Avoid overly general adjectives.
    • Common Example: Words describing age are particularly susceptible to ambiguity. Avoid adjectives like “elderly”, “young”, or “middle aged” in favor of exact age demographics (persons over 65, persons between the ages of 12 and 14, etc.). As with idioms, international interoperability is also key. Different countries define different age ranges for “child” or “adult.” Similarly, defining age based on a culture-specific phenomenon (e.g. “high-school aged” or “elementary school aged”) should be avoided.
  4. Be careful with commonly used words that have special meanings in data science.
    • Common Example: “Minority” has a specific meaning in data science: a part that is less than half of a whole. However, it has also historically been used to refer to traditionally underserved or underrepresented communities. Use the right word(s) to describe your specific data.
    • Common Example: “Normal” is a deeply overloaded term in data science, math, and statistics, and should be avoided if possible. When using the word “normal” to denote, for example, a person without disabilities, it is clearer to use words such as “nondisabled person,” “sighted person,” “hearing person”, “neurotypical person” etc. with specific reference to your study. When using “normal” to refer to a non-anomalous state for a given experiment, it is better to use “non-anomalous” or “desired state.”
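
As a rough illustration of how these principles could be checked in practice, the sketch below scans a piece of documentation for a few of the terms discussed above and reports the suggested alternatives. The function name `find_ambiguous_terms` and the term list are hypothetical and intentionally incomplete; a real project would maintain its own list.

```python
import re

# Terms and suggested alternatives drawn from the guidelines above.
# This mapping is illustrative only, not an exhaustive standard.
AMBIGUOUS_TERMS = {
    "master/slave": "primary-secondary, active-passive, or local-remote",
    "dummy value": "placeholder value, sample value, or test value",
    "elderly": "an exact age range, such as persons over 65",
    "middle aged": "an exact age range",
    "normal": "non-anomalous, desired state, or a specific group name",
}


def find_ambiguous_terms(text):
    """Return (term, suggestion) pairs for each flagged term found in text."""
    findings = []
    for term, suggestion in AMBIGUOUS_TERMS.items():
        # Match the whole term only (so "normalized" is not flagged), ignoring case.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            findings.append((term, suggestion))
    return findings


documentation = "Replace each dummy value with the normalized mean."
for term, suggestion in find_ambiguous_terms(documentation):
    print(f"Found '{term}': consider {suggestion}")
```

A check like this only flags candidates; whether a flagged word is actually ambiguous still depends on the context of your data.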

There are many comprehensive resources on language standards in scientific contexts; Cambridge Proofreading, LetPub, and APA Style are great starting resources.

Reading large blocks of uninterrupted text is overwhelming for everyone, but particularly so for neurodiverse researchers. Formatting can therefore be especially important for this group. Some principles to follow are:

  • Separate ideas with whitespace
  • Use bullets
  • Bold rather than italicize important words
  • Be consistent

There are numerous naming standards in existence. In particular, there is a common set of machine-interoperable naming conventions in use:

Common Machine-interoperable naming conventions:

  • camelCase: First word lower case, first letter of each subsequent word upper case.
  • PascalCase: First letter of every word upper case.
  • snake_case: All lower case, words separated by underscore.
  • SCREAMING_SNAKE_CASE: All upper case, words separated by underscore

Of these four, snake_case and SCREAMING_SNAKE_CASE are the most accessible for researchers with dyslexia or low vision, and for those using screen readers.

SCREAMING_SNAKE_CASE, however, can be overwhelming and overstimulating for neurodiverse researchers.

Our recommendation? snake_case!
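
If existing column or variable names already use one of the other conventions, they can be converted programmatically. The following is a minimal sketch, assuming names contain only letters, digits, and underscores; the helper name `to_snake_case` and the sample names are hypothetical.

```python
import re


def to_snake_case(name):
    """Convert a camelCase, PascalCase, or SCREAMING_SNAKE_CASE name to snake_case."""
    # Insert an underscore between a lowercase letter or digit and an uppercase letter.
    with_breaks = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    # Split a run of capitals from a following capitalized word, e.g. "HTTPResponse".
    with_breaks = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", with_breaks)
    return with_breaks.lower()


for original_name in ["sampleTemperature", "SampleTemperature", "SAMPLE_TEMPERATURE"]:
    print(original_name, "->", to_snake_case(original_name))
```

Names that are already in snake_case pass through unchanged, so the helper can be applied to a whole list of headers without checking each one first.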

Text Components

Use a sans serif font of at least 12-14 pt to make your figure easier to read for people with dyslexia. Larger font is also important for researchers with low vision! You can do this in matplotlib:

import matplotlib.pyplot as plt

plt.rcParams.update({'font.size': 12})

Note that the default font is already sans serif in matplotlib.

An additional low vision recommendation is to set your resolution high enough that the image is still clear when zoomed in. Which of these images is clearer to you?

You can also control this in matplotlib when saving a figure, either by setting a higher DPI or by saving to a vector format:

plt.savefig('data_image.png', dpi=300)
plt.savefig('data_image.svg')

Color

There are multiple types of color vision deficiency (CVD):

  • Protanopia – Reduces sensitivity to red light
  • Deuteranopia – Reduces sensitivity to green light
  • Tritanopia – Reduces sensitivity to blue light
  • Achromatopsia – A total loss or reduction of all three colors

Three different types of Color Vision Deficiency (CVD) (Image from https://www.orcam.com/en-us/blog/color-blindness)

It is important to design data visualizations, presentations, etc., with color deficiency in mind. For example:

Adding Additional Cues for Color Vision Deficiency (CVD)

The use of symbols and patterns can make otherwise inaccessible images available to those with CVD. The two plots below show how small changes in colors and line patterns can make data significantly more accessible.
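
One possible way to add such cues in matplotlib is to give every series its own line pattern and marker shape, so that the series remain distinguishable even when their colors are not. The sketch below is illustrative only: the series names and values are invented, and the `Agg` backend is selected simply so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

x_values = [0, 1, 2, 3, 4]
# Hypothetical series, invented for illustration only.
series = {
    "sample_a": [0, 1, 4, 9, 16],
    "sample_b": [0, 2, 4, 6, 8],
    "sample_c": [8, 6, 5, 4.5, 4],
}
# Pair each series with its own line pattern and marker shape.
line_styles = ["-", "--", ":"]
markers = ["o", "s", "^"]

figure, axes = plt.subplots()
for (label, y_values), line_style, marker in zip(series.items(), line_styles, markers):
    axes.plot(x_values, y_values, linestyle=line_style, marker=marker, label=label)
axes.set_xlabel("x")
axes.set_ylabel("y")
axes.legend()  # the legend shows pattern and marker together with color
```

Because the legend reproduces both the line pattern and the marker, a reader does not need to rely on color to match a line to its label.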

This is a particularly pervasive problem in generated images for data using target boxes. While the standard vision view of the plots below may be accessible to most, those with deuteranopia will not be able to see the target boxes at all.

A better standard is to employ a perceptually uniform colormap. According to matplotlib's documentation, “For many applications, a perceptually uniform colormap is the best choice; i.e. a colormap in which equal steps in data are perceived as equal steps in the color space. Researchers have found that the human brain perceives changes in the lightness parameter as changes in the data much better than, for example, changes in hue. Therefore, colormaps which have monotonically increasing lightness through the colormap will be better interpreted by the viewer.”

Perceptually uniform colormaps that are CVD-friendly

While the above figure displays several CVD-friendly options, we recommend cividis. You can switch the default color cycle to the CVD-friendly tableau-colorblind10 style and set the default colormap to cividis using the code below.

plt.style.use('tableau-colorblind10')
plt.rcParams['image.cmap'] = 'cividis'

Large blocks of color can be painful for data users with visual hypersensitivities or chronic migraines. An example can be seen in the collapsible menu below, but please use caution if you have visual hypersensitivities.

  • A real example of a potentially overstimulating image included in a scientific paper
  • Alternative Text

    Have you ever attempted to load a webpage with lots of images on your phone in a bad service area? When the images do not fully load, they will frequently show a “holder” location with a small icon and some words. Which of these images would be more helpful to you?

    An example of no alt text vs. included alt text

    Alternative text (alt text) is a written alternative to an image, video, audio captions, link, etc. It is normally a short description of the content or information associated with that piece of media. Alt text is particularly helpful for screen reader users, those with low internet bandwidth, and mobile device users.

    When writing alt text, consider the following:

    • Convey the content
    • Mention color (if it is important to understanding the image)
    • Share humor
    • Transcribe text

    Some possible alt text options for this picture may be:

    • An image of the planet Earth depicting the portion of the world (28%) with vision impairment
    • A pie chart of planet Earth where a 28% slice represents the portion of the population with vision impairment

    Have you ever gotten code from another researcher that looks like this?

    class thisdoessomething:
        def func(self, x, y):
            return y**2 + y*x + x**3
        def next(self, y, x, h):
            y_j = y + h*self.func(x, y)
            x_j = x + h
            return y_j, x_j

    This is not very accessible. If you know the formula for Euler’s Method, you may recognize what this is doing, but how could it be made more approachable?

    class EULERS_METHOD:
        def derivative_function(self, x, y):
            # Right-hand side of the differential equation dy/dx = f(x, y)
            return y**2 + y*x + x**3

        def numerical_approximation(self, current_function_value,
                                    current_step_value, step_size):
            # One Euler step: next value = current value + step size * slope
            approximate_value = current_function_value + step_size * (
                self.derivative_function(current_step_value,
                                         current_function_value))
            next_step_value = current_step_value + step_size
            return approximate_value, next_step_value

    This code is much more explicit! Not only do we have a clear name for the class and its purpose, we also now know exactly what each variable is intended to do.
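
To see the explicit naming in use, the sketch below repeats the same Euler update over several steps. It is a standalone illustration rather than part of the class above: the function `euler_steps`, the starting point y(0) = 1, and the step size 0.1 are all assumptions chosen for the example.

```python
def derivative_function(x, y):
    # Right-hand side of dy/dx = f(x, y), the same expression used in the class above.
    return y**2 + y*x + x**3


def euler_steps(initial_function_value, initial_step_value, step_size, number_of_steps):
    """Return a list of (step_value, approximate_value) pairs, starting point included."""
    current_step_value = initial_step_value
    current_function_value = initial_function_value
    trajectory = [(current_step_value, current_function_value)]
    for _ in range(number_of_steps):
        slope = derivative_function(current_step_value, current_function_value)
        # Euler update: advance the function value along the current slope.
        current_function_value = current_function_value + step_size * slope
        current_step_value = current_step_value + step_size
        trajectory.append((current_step_value, current_function_value))
    return trajectory


# Assumed starting point y(0) = 1 and step size 0.1, chosen only for illustration.
trajectory = euler_steps(1.0, 0.0, 0.1, 2)
for step_value, approximate_value in trajectory:
    print(f"x = {step_value:.1f}, y = {approximate_value:.4f}")
```

Keyword-style names such as `step_size` and `number_of_steps` let a reader follow the call without looking up the argument order.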